Fast and Robust Initialization for Visual-Inertial

File name: Fast and Robust Initialization for Visual-Inertial SLAM.pdf

Category: VR

Development tool:

File size: 970kb

Downloads: 0

Upload date: 2019-07-02

Uploader: ye_shen_********


Description:

Abstract: Visual-inertial SLAM (VI-SLAM) requires a good initial estimation of the initial velocity, the orientation with respect to gravity, and the gyroscope and accelerometer biases. In this paper we build on the initialization method proposed by Martinelli [1] and extended by Kaiser et al. [2], modifying it to be more general and efficient. We improve accuracy with several rounds of visual-inertial bundle adjustment, and robustify the method with novel observability and consensus tests that discard erroneous solutions. Our results on the EuRoC dataset show that, while the original method produces scale errors up to 156%, our method is able to consistently initialize in less than two seconds with scale errors around 5%, which can be further reduced to less than 1% by performing visual-inertial bundle adjustment after ten seconds.

However, the above expressions have an important drawback. Each time that $\mathbf{b}^g$ or $\mathbf{b}^a$ are modified, all the IMU integration has to be recomputed, which is very time consuming. To solve this problem we have adopted the linear preintegration correction from [7] [8], splitting these expressions into bias-dependent and non-dependent terms using the delta terms $\Delta\mathbf{R}_{1,j}$, $\Delta\mathbf{v}_{1,j}$, $\Delta\mathbf{p}_{1,j}$, which are directly computed from IMU measurements and which are defined as follows:

$$\Delta\mathbf{R}_{1,j} \triangleq \mathbf{R}_1^T \mathbf{R}_j \tag{5}$$

$$\Delta\mathbf{v}_{1,j} \triangleq \mathbf{R}_1^T \left( \mathbf{v}_j - \mathbf{v}_1 - \mathbf{g}\,\Delta t_{1,j} \right) \tag{6}$$

$$\Delta\mathbf{p}_{1,j} \triangleq \mathbf{R}_1^T \left( \mathbf{p}_j - \mathbf{p}_1 - \mathbf{v}_1 \Delta t_{1,j} - \tfrac{1}{2}\,\mathbf{g}\,\Delta t_{1,j}^2 \right) \tag{7}$$

If during intermediate steps $\mathbf{b}^g$ changes more than 0.2 rad/sec from the value used for preintegration, the preintegration is recomputed with this new bias; otherwise, delta terms are directly updated using their Jacobians w.r.t. biases ($\partial\Delta\mathbf{R}_{1,j}/\partial\mathbf{b}^g$, $\partial\Delta\mathbf{v}_{1,j}/\partial\mathbf{b}^g$, $\partial\Delta\mathbf{v}_{1,j}/\partial\mathbf{b}^a$, $\partial\Delta\mathbf{p}_{1,j}/\partial\mathbf{b}^g$, $\partial\Delta\mathbf{p}_{1,j}/\partial\mathbf{b}^a$). This way, we relinearize each time we get too far from the linearization point. These Jacobians can be found in [8]. We rewrite eq. (1) using expressions (5), (6) and (7):

$$\lambda_{1^i}^i \mathbf{u}_{1^i}^i - \lambda_j^i \mathbf{u}_j^i - \mathbf{v}_1 \Delta t_{1^i,j} - \mathbf{g}\,\frac{\Delta t_{1,j}^2 - \Delta t_{1,1^i}^2}{2} = \Delta\mathbf{p}_{1,j} - \Delta\mathbf{p}_{1,1^i} + \left( \Delta\mathbf{R}_{1,j} - \Delta\mathbf{R}_{1,1^i} \right) \mathbf{t}_{BC}, \qquad \forall i = 1 \ldots m,\ \forall j \in \mathcal{C}^i \setminus 1^i \tag{8}$$

Now, we can add the gravity magnitude information. Instead of adding it as a constraint on the gravity magnitude as done in [2], we prefer to model the gravity by means of a rotation matrix parametrized by only two angles $(\alpha, \beta)$ (rotation around the z-axis has no effect) and the vector $\mathbf{g}_I = (0, 0, -g)$, thus we remain in an unconstrained problem:

$$\mathbf{g} = \mathrm{Exp}(\alpha, \beta, 0)\,\mathbf{g}_I \tag{9}$$

Equation (8) becomes

$$\lambda_{1^i}^i \mathbf{u}_{1^i}^i - \lambda_j^i \mathbf{u}_j^i - \mathbf{v}_1 \Delta t_{1^i,j} = \mathbf{s}_{1^i,j}(\mathbf{b}^g, \mathbf{b}^a, \alpha, \beta) \tag{10}$$

where

$$\mathbf{s}_{1^i,j}(\mathbf{b}^g, \mathbf{b}^a, \alpha, \beta) = \mathrm{Exp}(\alpha, \beta, 0)\,\mathbf{g}_I\,\frac{\Delta t_{1,j}^2 - \Delta t_{1,1^i}^2}{2} + \Delta\mathbf{p}_{1,j} - \Delta\mathbf{p}_{1,1^i} + \left( \Delta\mathbf{R}_{1,j} - \Delta\mathbf{R}_{1,1^i} \right) \mathbf{t}_{BC} \tag{11}$$

Now, each time biases are updated, we do not need to preintegrate all the measurements again, but only to update them by means of Jacobians. Neglecting accelerometer bias as in [2], the only unknowns are the depths $\lambda_j^i$, $\mathbf{v}_1$, $\alpha$, $\beta$ and $\mathbf{b}^g$. Stacking equations for all possible values of $i$ and $j$ we build an overdetermined sparse linear system, with only three non-zero elements per row, such as:

$$\mathbf{A}(\mathbf{b}^g)\,\mathbf{x} = \mathbf{s}(\mathbf{b}^g, \alpha, \beta) \tag{12}$$

where $\mathbf{x} = (\mathbf{v}_1, \{\lambda_j^i\})$ is the unknown vector with linear dependence. To jointly find the linearly and non-linearly dependent parameters, we solve the following unconstrained minimization:

$$(\mathbf{b}^g, \alpha, \beta)^{*} = \arg\min_{\mathbf{b}^g, \alpha, \beta} \left( \min_{\mathbf{x}} \left\| \mathbf{A}(\mathbf{b}^g)\,\mathbf{x} - \mathbf{s}(\mathbf{b}^g, \alpha, \beta) \right\|^2 \right) \tag{13}$$

The cost function $c(\mathbf{b}^g, \alpha, \beta) = \min_{\mathbf{x}} \| \mathbf{A}(\mathbf{b}^g)\,\mathbf{x} - \mathbf{s}(\mathbf{b}^g, \alpha, \beta) \|^2$ is evaluated for each $(\mathbf{b}^g, \alpha, \beta)$ using the following scheme:

1) Update $\Delta\mathbf{R}_{1,j}$ and $\Delta\mathbf{p}_{1,j}$. Using [8], we do not need to reintegrate all IMU measurements each time that $\mathbf{b}^g$ changes. We simply update the delta terms using their Jacobians w.r.t. bias, which is an important computational saving.
2) Compute $\mathbf{A}(\mathbf{b}^g)$ and $\mathbf{s}(\mathbf{b}^g, \alpha, \beta)$ and build the linear system.
3) Solve $\mathbf{x} = \mathbf{A}(\mathbf{b}^g) \backslash \mathbf{s}(\mathbf{b}^g, \alpha, \beta)$ using conjugate gradient, which is suitable for sparse systems.
4) Compute $c(\mathbf{b}^g, \alpha, \beta) = \| \mathbf{A}(\mathbf{b}^g)\,\mathbf{x} - \mathbf{s}(\mathbf{b}^g, \alpha, \beta) \|^2$.

The computational cost of evaluating $c(\mathbf{b}^g, \alpha, \beta)$ comes, first, from solving a sparse system with no more than $3 + m \times n$ unknowns and $3 \times (n - 1) \times m$ equations and, second, from integrating inertial measurements along the initialization time. However, using the formulation from [8] we can avoid reintegration, integrating IMU measurements only once and updating the preintegrated terms by means of a linear approximation.

To optimize $c(\mathbf{b}^g, \alpha, \beta)$ and find the correct gyro bias and gravity direction we use the Levenberg-Marquardt algorithm, where Jacobians of the cost function are computed numerically. As a result, not only the IMU initialization parameters are found ($\mathbf{g}$, $\mathbf{b}^g$ and $\mathbf{v}_1$) but also the position of the tracked points ($\lambda_j^i \mathbf{u}_j^i$). We highlight that not all $M$ tracked features have been used during this initialization, but only a small set of $m$ features, aiming to reduce computational complexity. However, the solutions found after this step are not accurate enough to launch the system, and further intermediate stages are required.

III. IMPROVED SOLUTION

A. First BA and observability test

After finding the initial parameters ($\mathbf{g}$, $\mathbf{b}^g$, $\mathbf{v}_1$, $\lambda_j^i$) we build a visual-inertial BA problem with the same $n$ body poses and $m$ points from the previous step (see figure 2). We set the z axis in the estimated gravity direction. All body poses have six optimizable variables in $\mathfrak{se}(3)$, except the first one, which has only two (pitch and roll), since translation and yaw have been fixed in order to remove the four gauge freedoms inherent to the visual-inertial problem (initial position and yaw). Body velocities are also included in the optimization task, and they evolve according to the inertial measurements. Initial estimations for each vertex are added using results from the MK-solution. In addition, the accelerometer bias $\mathbf{b}^a$ is included in this optimization but, similarly to $\mathbf{b}^g$, it is assumed to be constant for all frames. The previous $\mathbf{b}^g$ estimation from the MK-solution step is included by means of a prior, and $\mathbf{b}^a$ is forced to be close to zero. We call this optimization first BA or simply BA1. The analytic expressions for the Jacobians found in [8] are used for IMU residuals, while Jacobians for the reprojection error have been derived analytically, taking into account that we are optimizing body pose and not camera pose.

Fig. 2: Graph for the first visual-inertial BA. The body poses and points included in the optimization are the same used in the initial solution.

Usually this optimization provides a better initialization solution. However, if the motion performed gives low observability of the IMU variables, the optimization can converge to arbitrarily bad solutions. For example, this happens in case of pure rotational motion or non-accelerated motions [1]. In order to detect these failure cases we propose an observability test, where we analyze the uncertainty associated to the estimated variables. This could be done by analyzing the covariance matrix of the estimated variables and checking if its singular values are small enough. However, this would require inverting the information matrix, i.e. the Hessian matrix from the first BA, which has high dimension ($3m + 6 + 9n - 4$), and would be computationally too expensive. Instead, we perform the observability test by imposing a minimal threshold on all singular values of the Hessian matrix associated to our first BA. The Hessian can be built from the Jacobian matrices associated to each edge in the graph, as explained next.

Denote $\{x_1 \ldots x_p\}$ the set of $p$ states, and $\{e_1 \ldots e_q\}$ the set of $q$ measurements which appear in the first BA. Let us call $\mathcal{S}_i$ the set of measurements where state $i$ is involved. The Hessian block matrix for states $i$ and $j$, taking a first order approximation, can be built as follows:

$$\mathbf{H}_{i,j} \approx \sum_{e \in \mathcal{S}_i \cap \mathcal{S}_j} \mathbf{J}_{i,e}^T\, \boldsymbol{\Omega}_e\, \mathbf{J}_{j,e} \tag{14}$$

where $\boldsymbol{\Omega}_e$ stands for the information matrix of the $e$ measurement, and $\mathbf{J}_{i,e}$ for the Jacobian of the $e$ measurement w.r.t. the $i$-th state. In order to have a non-zero $(i, j)$ block matrix, there must be an edge between the $i$ and $j$ nodes in the graph (a measurement depending on both variables), as shown in figure 3.

Fig. 3: Example of Hessian matrix (H matrix, log10 scale) for an initial map with 5 keyframes (KF) and 20 map points (MP). One can distinguish different blocks, outlined with dashed lines. In the top-left part, we have the diagonal blocks of each keyframe (red), blocks relating consecutive keyframes, due to the IMU measurements (blue), and blocks relating keyframes and IMU biases (pink). In the bottom-right part, there are only the diagonal blocks of the map points (orange). Out-of-diagonal terms relate map points with the keyframes that observe them (brown). In this example all cameras observe all features.

Applying the SVD decomposition to $\mathbf{H}$ and looking at the smallest singular value, one can determine if the performed motion guarantees observability of all the IMU variables. Hence we discard all initializations where the smallest Hessian singular value falls below a threshold denoted $t_{obs}$. If this observability test is not passed, we discard the initialization attempt. Examples of a successful and a rejected case are shown in figure 4.

Fig. 4: Singular values of the information matrix (H singular values, log10, against singular value index) for a successful initialization and a failure case on the EuRoC V103 sequence. The successful case has a RMSE ATE error of 3.16 in the initialization trajectory, and corresponds to a translation and rotation motion. The failure case has an error of 64.99 and corresponds to an almost pure rotational motion. We draw the observability threshold used, $t_{obs} = 0.1$.

B. Consensus test and second BA

As we have noted before, not all $M$ tracked features have been used in the MK-solution and first BA steps, but only $m$ features. To take advantage of these extra unused tracked points, we propose to perform a consensus test in order to detect initializations which have been performed using spurious data, such as badly tracked features.

First, the 3D point position of each unused track is triangulated between the two most distant frames which saw the point, by means of least-squares triangulation using an SVD decomposition [12]. Only tracks with parallax greater than 0.01 radians are used. Then we re-project each 3D point into all the frames which observe it, compute the residual re-projection error, and perform a $\chi^2$ (95%) test with $2n_i - 3$ degrees of freedom, where $n_i$ is the number of frames which observe this point. The consensus test is performed by counting the percentage of inliers: if it is bigger than a threshold $t_{cons}$ we consider that the proposed solution is accurate; if not, we discard the initialization attempt.

If the consensus test is successful, we perform a second BA (or simply BA2) including the $m$ points used in the initial solution plus all the points which have been triangulated and detected as inliers, having a total of $M$ points. The graph for this optimization is built similarly to the BA1 case but with more points.

TABLE I: Parameters of our initialization algorithm
  Total number of tracks, M                              (illegible)
  Track-length test (in pixels)                          200
  Tracks used for MK-solution                            20
  (label partly illegible) ... used for MK-solution      (illegible)
  Observability test: singular value threshold t_obs     0.1
  Consensus test: inlier threshold t_cons                (illegible)

Fig. 5: Initializations found along the EuRoC V101 trajectory, after the observability and consensus tests. In blue, ground truth trajectory; in green, estimated initialization trajectories whose RMSE ATE error is lower than 5%; in red, those with a bigger error. Our method was able to find 511 correct initializations along the whole trajectory running in real time.
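The two-angle gravity model of eq. (9) can be illustrated with a minimal Python sketch. It is not the authors' implementation (which uses Eigen in C++): the helper names `so3_exp` and `gravity_from_angles` are hypothetical, `Exp` is assumed to be the SO(3) exponential map computed with the Rodrigues formula, and the magnitude 9.81 m/s^2 is an assumed value since the text only calls it g.

```python
import math

GRAVITY = 9.81  # assumed magnitude in m/s^2; the text only calls it g


def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]


def mat_mul(a, b):
    """Multiply two 3x3 matrices given as lists of rows."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def so3_exp(rx, ry, rz):
    """Rodrigues formula: rotation matrix of the rotation vector (rx, ry, rz)."""
    theta = math.sqrt(rx * rx + ry * ry + rz * rz)
    eye = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    if theta < 1e-12:
        return eye  # negligible rotation
    kx, ky, kz = rx / theta, ry / theta, rz / theta
    skew = [[0.0, -kz, ky], [kz, 0.0, -kx], [-ky, kx, 0.0]]
    skew2 = mat_mul(skew, skew)
    s, c = math.sin(theta), 1.0 - math.cos(theta)
    return [[eye[r][col] + s * skew[r][col] + c * skew2[r][col]
             for col in range(3)] for r in range(3)]


def gravity_from_angles(alpha, beta):
    """Eq. (9): g = Exp(alpha, beta, 0) * g_I with g_I = (0, 0, -g)."""
    g_inertial = [0.0, 0.0, -GRAVITY]
    return mat_vec(so3_exp(alpha, beta, 0.0), g_inertial)


def norm(v):
    """Euclidean norm of a 3-vector."""
    return math.sqrt(sum(x * x for x in v))


if __name__ == "__main__":
    g = gravity_from_angles(0.3, -0.2)
    print("gravity vector:", g)
    print("magnitude:", norm(g))
```

Because the vector is only ever rotated, its magnitude stays fixed for any (alpha, beta), which is why no explicit norm constraint is needed in the minimization of eq. (13). A rotation purely about the z-axis leaves (0, 0, -g) unchanged, so a third angle would add nothing.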
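The consensus test of Section III-B can be sketched in a few lines of Python, under stated assumptions. The function names are hypothetical, the 95% chi-square quantile is approximated with the Wilson-Hilferty formula as a stand-in for an exact quantile function, the reprojection noise `sigma_px` of 1 pixel is an assumption, and the value passed as `t_cons` below is only illustrative because the threshold value is not legible in Table I.

```python
import math

Z_95 = 1.6448536269514722  # standard normal quantile for a one-sided 95% level


def chi2_threshold_95(dof):
    """Approximate 95% chi-square quantile (Wilson-Hilferty approximation)."""
    if dof < 1:
        raise ValueError("need at least one degree of freedom")
    a = 2.0 / (9.0 * dof)
    return dof * (1.0 - a + Z_95 * math.sqrt(a)) ** 3


def track_is_inlier(residuals_px, sigma_px=1.0):
    """residuals_px: list of (du, dv) reprojection errors, one pair per frame."""
    n_frames = len(residuals_px)
    dof = 2 * n_frames - 3  # two residuals per frame minus three for the 3D point
    if dof < 1:
        return False  # a point seen in fewer than two frames cannot be tested
    stat = sum(du * du + dv * dv for du, dv in residuals_px) / sigma_px ** 2
    return stat <= chi2_threshold_95(dof)


def consensus_test(tracks, t_cons):
    """Accept the initialization if the inlier ratio is bigger than t_cons."""
    if not tracks:
        return False, 0.0
    inliers = sum(1 for residuals in tracks if track_is_inlier(residuals))
    ratio = inliers / len(tracks)
    return ratio > t_cons, ratio


if __name__ == "__main__":
    good = [(0.5, 0.5)] * 3   # small residuals in three frames
    bad = [(3.0, 3.0)] * 3    # large residuals in three frames
    accepted, ratio = consensus_test([good] * 9 + [bad], t_cons=0.8)
    print("accepted:", accepted, "inlier ratio:", ratio)
```

The triangulation of each unused track and the parallax check are left out here. The sketch starts from reprojection residuals that are assumed to be already computed.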
C. Map initialization

After this second BA, the keyframe poses are accurate enough, but we only have a few points to initialize the map. Before launching the whole ORB-SLAM Visual-Inertial system, we triangulate new points aiming to densify the point cloud and to ease the subsequent tracking operation. Since we already have the keyframe poses, we extract ORB features in each keyframe and perform an epipolar search in each of the others, using the ORB descriptor. All these new points, together with the $M$ points from BA2, are promoted to map points, and the $n$ frames used for initialization are promoted to map keyframes. The covisibility graph [13] of this new map is also created, taking into account the observations of points.

IV. EXPERIMENTS

The most important parameters of our method are shown in table I. Our implementation uses ORB-SLAM visual-inertial [5] with its three threads for tracking, mapping and loop closing. Initialization is performed in a parallel thread, thus it has no effect on the real-time tracking thread. For the MK-solution we use the Eigen C++ library, while for the graph optimization of BA1 and BA2 we use the g2o C++ library [14]. Experiments have been run on the V1 dataset from EuRoC [9] using an Intel Core i7-7700 computer with 32 GB of memory.

A. Results

The EuRoC dataset provides stereo images and synchronized IMU measurements for three different indoor environments, with different complexity. We have tested our method on environment V1 from EuRoC at three difficulty levels. We run two different experiments.

In a first experiment, we try to initialize as often as possible in real time. Along the whole trajectory, every time the tracking thread has $m$ tracks with length $l$, if the initialization thread is idle, a new initialization attempt is launched. Figure 5 shows the initializations found for trajectory V101 after the observability and consensus tests. We show in red the trajectories which have a RMSE ATE [15] error bigger than 5% of the initialization trajectory length. We can see in the figure that our initialization algorithm is successful almost along the whole trajectory. The parts without initializations are due to rejection by the observability or consensus test.

In table II we show the main numerical results of these experiments with the three V1 sequences. RMSE ATE [15] is expressed as a percentage of the length of the initialization trajectory. Below each sequence name we show successful initializations over the total number.

TABLE II: Results of exhaustive initialization tests over the three V1 EuRoC sequences. Column groups: after the track-length test (RMSE ATE in %, scale error in %), after the observability and consensus tests (RMSE ATE in %, scale error in %), CPU time (ms), trajectory time (s).

  Sequence                    Method                  ATE(%)       Scale(%)   ATE(%)   Scale(%)   CPU (ms)      Traj. (s)
  V101 (511/728)              MK-solution             9.176        32.998     7.749    25.104     95.082        2.235
                              MK-solution + BA1       3.977        10.719     2.352    6.471      104.114       2.235
                              MK-solution + BA1&2     3.270        8.816      2.036    5.496      120.983       2.235
  V102 (101/395)              MK-solution             12.025       156.751    6.760    48.926     60.285        0.968
                              MK-solution + BA1       6.338        25.252     2.541    7.195      (illegible)   0.968
                              MK-solution + BA1&2     15.149       20.341     1.935    5.497      84.443        0.968
  V103 (illegible, of 336)    MK-solution             47.928       128.008    6.634    21.691     62.160        1.070
                              MK-solution + BA1       71.774       28.160     2.475    6.836      73.301        1.070
                              MK-solution + BA1&2     (illegible)  24.55      1.870    5.259      84.676        1.070

The first thing to notice is the large number of initialization attempts. For example, in sequence V101, which lasts 130 seconds, up to 728 initializations are computed, and 511 of them have passed the observability and consensus tests. The table shows that the original Martinelli-Kaiser solution obtains average scale errors between 32.9% and 156.7% on these sequences. This error can be reduced to between 8.8% and 24.5% by applying the two rounds of visual-inertial BA proposed here. More interestingly, applying the novel observability and consensus tests, inaccurate initializations are consistently rejected, and the average scale error is reduced to around 5% for all sequences, a very significant improvement over the original method. The ATE error is also drastically reduced after both tests.

Considering the initialization time, we see an evident difference between V101, which requires initialization trajectories of 2.2 seconds on average, and V102 and V103, where 1 second is enough. In these last two sequences motion is faster and the track-length test is satisfied in less time than in the first sequence, where the quad-copter is flying at low speed.

Regarding the computational cost, the average CPU time required to solve the initialization is less than 85 ms for sequences V102 and V103, and around 121 ms for V101, due to the longer preintegration period. In all cases, the MK-solution step takes around 75% of the total initialization CPU time.

In table III we show computational times for our method, which uses preintegration with first order bias correction from [8]. Compared with using the original formulation from Martinelli and Kaiser, computing time is reduced by 60%.

TABLE III: Comparison of running time for MK-solution + BA1 + BA2, repeating IMU integration in each iteration and using preintegration with first order bias correction [8]. V101 EuRoC dataset, CPU time (ms).

  Method                          Mean      Std      Max
  Reintegrating each time         301.302   91.974   678.886
  Using first order correction    120.983   27.609   214.989

In a second experiment, we launch visual-inertial ORB-SLAM [5] and we retrieve the RMSE ATE and the scale error just after the proposed initialization, and after performing full visual-inertial BA at 5 seconds and 10 seconds from the first keyframe timestamp. We can see in table IV that all three sequences converge to a scale error smaller than 1% after 10 seconds, confirming that our initialization method is accurate enough to launch visual-inertial SLAM. An example of Visual-Inertial ORB-SLAM [5] using our proposed initialization can be found in the accompanying video.

TABLE IV: Results of VI-SLAM using our initialization (average errors on five executions are shown). V1 EuRoC dataset; RMSE ATE in m, scale error in %.

                     After initialization      After BA 5 s              After BA 10 s
  Sequence           RMSE ATE   Scale error    RMSE ATE   Scale error    RMSE ATE   Scale error
  V1_01 easy         0.0183     4.99           0.0200     1.85           0.0170     0.84
  V1_02 medium       0.0364     7.38           0.0076     3.67           0.0162     0.71
  V1_03 difficult    0.0043     4.80           0.0129     2.50           0.0120     0.27

Compared with the initialization method proposed in [5], our method requires trajectories of 1 or 2 seconds instead of 15 seconds, uses less CPU time, and is able to successfully initialize in sequence V103, where the previous method failed.

V. CONCLUSIONS

We have proposed a fast joint monocular-inertial initialization method based on the work of Martinelli [1] and Kaiser et al. [2]. We have adapted it to be more general, allowing incomplete feature tracks, and more computationally efficient using the IMU preintegration method of Forster et al. [8]. Our results show that the original Martinelli-Kaiser technique does not provide a good enough initialization in most practical scenarios, hence we have proposed two visual-inertial BA steps to improve the solution and two novel tests to detect bad initializations. These techniques have proven to be worth it, reducing scale error down to 5% and rejecting bad initializations. Solutions found after those steps are good enough to launch Visual-Inertial ORB-SLAM [5] and converge to very accurate maps.

In summary, we have developed a fast method for joint initialization of monocular-inertial SLAM, using trajectories of 1 to 2 seconds, that is much more accurate and robust than the original technique [2], with a maximum computational cost of 215 ms.

As future work we would like to investigate the adaptation of the initialization method to the stereo case, taking into account that scale is directly observable from the images. We are also interested in taking advantage of gyroscope readings for tracking, even before the initialization has been performed. Finally, we would like to test the initialization performance in more difficult scenarios with our own acquired sequences.

REFERENCES

[1] A. Martinelli, "Closed-form solution of visual-inertial structure from motion," International Journal of Computer Vision, vol. 106, no. 2, pp. 138-152, 2014.
[2] J. Kaiser, A. Martinelli, F. Fontana, and D. Scaramuzza, "Simultaneous state initialization and gyroscope bias calibration in visual inertial aided navigation," IEEE Robotics and Automation Letters, vol. 2, no. 1, pp. 18-25, 2017.
[3] M. Li and A. I. Mourikis, "High-precision, consistent EKF-based visual-inertial odometry," The International Journal of Robotics Research, vol. 32, no. 6, pp. 690-711, 2013.
[4] S. Leutenegger, S. Lynen, M. Bosse, R. Siegwart, and P. Furgale, "Keyframe-based visual-inertial odometry using nonlinear optimization," The International Journal of Robotics Research, vol. 34, no. 3, pp. 314-334, 2015.
[5] R. Mur-Artal and J. D. Tardos, "Visual-inertial monocular SLAM with map reuse," IEEE Robotics and Automation Letters, vol. 2, no. 2, pp. 796-803, 2017.
[6] T. Qin and S. Shen, "Robust initialization of monocular visual-inertial estimation on aerial robots," in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017, pp.
[7] T. Lupton and S. Sukkarieh, "Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions," IEEE Transactions on Robotics, vol. 28, no. 1, pp. 61-76, 2012.
[8] C. Forster, L. Carlone, F. Dellaert, and D. Scaramuzza, "IMU preintegration on manifold for efficient visual-inertial maximum-a-posteriori estimation," in Robotics: Science and Systems, 2015.
[9] M. Burri, J. Nikolic, P. Gohl, T. Schneider, J. Rehder, S. Omari, M. W. Achtelik, and R. Siegwart, "The EuRoC micro aerial vehicle datasets," The International Journal of Robotics Research, vol. 35, no. 10, pp. 1157-1163, 2016.
[10] E. Rublee, V. Rabaud, K. Konolige, and G. Bradski, "ORB: An efficient alternative to SIFT or SURF," in IEEE International Conference on Computer Vision (ICCV), 2011, pp. 2564-2571.
[11] B. D. Lucas, T. Kanade et al., "An iterative image registration technique with an application to stereo vision," in Int. Joint Conf. on Artificial Intelligence (IJCAI), 1981, pp. 674-679.
[12] R. Szeliski, Computer vision: algorithms and applications. Springer Verlag, London, 2011.
[13] R. Mur-Artal, J. Montiel, and J. D. Tardos, "ORB-SLAM: a versatile and accurate monocular SLAM system," IEEE Transactions on Robotics, vol. 31, no. 5, pp. 1147-1163, 2015.
[14] R. Kümmerle, G. Grisetti, H. Strasdat, K. Konolige, and W. Burgard, "g2o: A general framework for graph optimization," in IEEE International Conference on Robotics and Automation (ICRA), 2011, pp. 3607-3613.
[15] J. Sturm, N. Engelhard, F. Endres, W. Burgard, and D. Cremers, "A benchmark for the evaluation of RGB-D SLAM systems," in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2012, pp. 573-580.

(Automatically generated by the system; the contents can be previewed before downloading.)