In computer vision, estimating the motion of a moving camera is a fundamental problem.
Technologies such as autonomous drones and self-driving cars are attracting increasing attention, creating a pressing need for fast, efficient algorithms that support onboard video processing and return accurate, timely information at low computational cost.
Moreover, this estimate of camera movement, known as "pose estimation," is a critical component of target tracking on board moving vehicles or platforms.
At Brigham Young University, researchers have developed a technique to considerably decrease the computation time and complexity of pose estimation by smartly “seeding” an algorithm that is already being used in the computer vision industry.
The study results have been reported in IEEE/CAA Journal of Automatica Sinica—a joint publication by the Institute of Electrical and Electronics Engineers and the Chinese Academy of Sciences.
Pose estimation algorithms use frames of a video feed from a moving camera to produce hypotheses about how the camera moved between consecutive frames. To date, algorithms used for pose estimation have typically needed to generate about 5 to 10 hypotheses for the camera's motion from the data offered by the video feed.
These hypotheses were scored by how well they fit the data, and the highest-scoring hypothesis was chosen as the pose estimate. Unfortunately, generating multiple hypotheses is computationally expensive and leads to slower run times for robust pose estimation.
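The scoring step can be illustrated with a small sketch: candidate camera motions are expressed as essential matrices and ranked by how well the point correspondences between two frames satisfy the epipolar constraint. The synthetic scene and the algebraic residual below are illustrative choices for the sketch, not the paper's exact data or metric.

```python
import numpy as np

rng = np.random.default_rng(0)

def skew(t):
    """Skew-symmetric matrix so that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def epipolar_error(E, x1, x2):
    """Mean algebraic epipolar residual |x2^T E x1| over N correspondences.

    x1, x2: (N, 3) homogeneous points in normalized image coordinates.
    A hypothesis that matches the true camera motion drives this toward 0."""
    return float(np.mean(np.abs(np.einsum("ni,ij,nj->n", x2, E, x1))))

# Synthetic scene: 3D points in front of camera 1.
X = rng.uniform(-1.0, 1.0, (50, 3)) + np.array([0.0, 0.0, 5.0])

# True motion: a small yaw rotation plus a sideways translation.
a = 0.1
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([1.0, 0.0, 0.0])

x1 = X / X[:, 2:]        # projections in camera 1
X2 = X @ R.T + t         # the same points in camera-2 coordinates
x2 = X2 / X2[:, 2:]      # projections in camera 2

# Candidate motion hypotheses as essential matrices E = [t]_x R.
E_true = skew(t) @ R
E_wrong = skew(np.array([0.0, 0.0, 1.0]))  # forward-translation guess, no rotation

hypotheses = [E_wrong, E_true]
scores = [epipolar_error(E, x1, x2) for E in hypotheses]
best = hypotheses[int(np.argmin(scores))]  # the true-motion hypothesis wins
```

In a full pipeline such as OpenCV's five-point solver, each hypothesis would itself be generated from a minimal sample of correspondences; here the two candidates are simply given to keep the scoring step in focus.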
The team discovered a technique to seed, or offer hints to, an algorithm already used in computer vision by carrying information from one frame to the next, greatly reducing the need to generate multiple hypotheses.
“At each iteration, we use the current best hypothesis to seed the algorithm,” the researchers write. The reduction in the number of required hypotheses directly lowers complexity and computation time: “we show that this approach significantly reduces the number of hypotheses that must be generated and scored to estimate the pose, thus allowing real-time execution of the algorithm.”
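The seeding idea can be sketched in a few lines. This is a minimal, hypothetical interface (the `generate_hypotheses`, `score`, and `refine` callables and the toy 1-D data are stand-ins, not the paper's implementation): the previous best estimate is refined against each new frame, and a fresh batch of hypotheses is generated only when the seed no longer fits the data.

```python
def seeded_estimate(frame_data, generate_hypotheses, score, refine, tol):
    """Sketch of seeded pose estimation (assumed interface, not the paper's code).

    Reuse the previous best estimate as a seed and refine it against each
    new frame; fall back to generating and scoring a full set of hypotheses
    only when the refined seed's score exceeds the tolerance."""
    seed = None
    estimates = []
    for data in frame_data:
        if seed is not None:
            candidate = refine(seed, data)
            if score(candidate, data) < tol:  # seed still explains the data
                seed = candidate
                estimates.append(candidate)
                continue
        # No usable seed: generate several hypotheses and keep the best-scoring one.
        hypotheses = generate_hypotheses(data)
        seed = min(hypotheses, key=lambda h: score(h, data))
        estimates.append(seed)
    return estimates

# Toy 1-D stand-in: "poses" are scalars, each frame's data is the true pose.
poses = seeded_estimate(
    frame_data=[1.0, 1.1, 5.0],
    generate_hypotheses=lambda d: [d - 0.5, d, d + 0.5],
    score=lambda h, d: abs(h - d),
    refine=lambda seed, d: seed,  # trivial refinement for the demo
    tol=0.2,
)
# Frames 1 and 2 reuse the seed with a single candidate each; the jump at
# frame 3 makes the seed score poorly and triggers hypothesis regeneration.
```

The payoff shown here matches the quoted claim: in the common case only one candidate per frame is scored, and the expensive generate-and-score path runs only when the motion changes abruptly.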
The researchers compared their seeding technique against other state-of-the-art pose estimation algorithms to measure how the reduced number of hypotheses affected estimation accuracy.
After 100 iterations, the error of the seeding methods that use prior information is comparable to that of the OpenCV five-point polynomial solver, even though only one hypothesis is generated per iteration instead of an average of about four.
Moreover, timing comparisons showed that the researchers’ algorithm substantially outperformed the other state-of-the-art methods: in most cases, the new algorithm was about 10 times faster.
The researchers then modified their algorithm to support target tracking and tested it on board a multi-rotor UAV. The algorithm successfully tracked several targets at a resolution of 640 x 480 pixels, and the results were consistent with their earlier analysis.
The complete algorithm runs in 29 milliseconds per frame, meaning it can operate in real time at about 34 frames per second (FPS).
As a next step, the researchers intend to apply their algorithm to 3D scene reconstruction and to more complex tracking techniques.
White, J. H., & Beard, R. W. (2020). An iterative pose estimation algorithm based on epipolar geometry with application to multi-target tracking. IEEE/CAA Journal of Automatica Sinica. https://doi.org/10.1109/JAS.2020.1003222