Motion estimation approaches enable robust prediction of successive camera poses when a camera undergoes erratic motion, a condition under which a constant-velocity model predicts poorly. However, motion estimation itself inevitably introduces pose errors, which lead to an inconsistent map. To solve this problem, we propose a novel 3D visual SLAM approach that performs both motion estimation and stochastic filtering by combining visual odometry with Rao-Blackwellized particle filtering. First, to ensure that the process noise and the measurement noise are independent (with a single sensor they are in fact dependent), we divide the observations (i.e., image features) into two categories: common features observed in consecutive key-frame images, and new features detected in the current key-frame image. In addition, we propose a key-frame SLAM that uses a data-driven proposal distribution to reduce error accumulation. We demonstrate the accuracy of the proposed method in terms of the consistency of the global map.
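The division of observations into common and new features can be illustrated with a minimal sketch. It assumes each image feature carries a persistent integer track ID and is stored as a pixel coordinate pair; the function name `split_features`, the `Feature` type, and the dictionary layout are hypothetical choices made for illustration and are not taken from the proposed method.

```python
from typing import Dict, Tuple

# Hypothetical representation: a feature observation is a pixel coordinate (u, v).
Feature = Tuple[float, float]


def split_features(
    prev_kf: Dict[int, Feature],
    curr_kf: Dict[int, Feature],
) -> Tuple[Dict[int, Tuple[Feature, Feature]], Dict[int, Feature]]:
    """Partition the current key-frame's features by track ID (assumed available).

    Returns:
        common: features seen in both key-frames, as (previous, current) pairs.
        new:    features detected only in the current key-frame.
    """
    common = {
        fid: (prev_kf[fid], obs)
        for fid, obs in curr_kf.items()
        if fid in prev_kf
    }
    new = {
        fid: obs
        for fid, obs in curr_kf.items()
        if fid not in prev_kf
    }
    return common, new


if __name__ == "__main__":
    previous = {1: (10.0, 20.0), 2: (30.0, 40.0)}
    current = {2: (31.0, 41.0), 3: (50.0, 60.0)}
    common, new = split_features(previous, current)
    print("common:", common)
    print("new:", new)
```

By construction the two groups are disjoint, so no single image measurement appears in both categories.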