Structure from Motion for aerial images using Bundler

Introduction

Bundler is an open source Structure from Motion (SfM) package for reconstructing 3D scenes from unordered image sets. It builds on several computer vision algorithms and packages, such as SIFT, ANN, Sparse Bundle Adjustment (SBA) and Nistér's 5-point algorithm. As its website shows, Bundler has been successfully used on unordered image collections taken from the Internet. The work described here takes a different approach: the input is an ordered set of frames taken from a short video sequence. Moreover, the videos were recorded from aerial vehicles with non-calibrated cameras, and no Exif tags were available for initializing the reconstructions, so Bundler has no focal length information to rely on. Although this is a very tough scenario for Bundler, some successful results were obtained in this work by forcing the initial focal length to be of the same order of magnitude as the frame width in pixels.

Aims

This work explores the use of state-of-the-art open source SfM software with videos recorded from aerial vehicles. In all cases the cameras are non-calibrated and Exif tags are not available. The aims of the project can be summarized in the following points:

Test Bundler with aerial videos.
Test the same scenarios but using forced focal length initializations.
Validate Bundler results.
Check estimated focal lengths and radial distortions.
Georeference Bundler results.
Place the 3D cloud results on top of orthoimages.

Results

Bundler results are displayed with Meshlab and visually checked for consistency. Since the frames are taken from a video sequence, the line linking all cameras must follow a trajectory that is coherent with the aircraft's flight. As the images below show, the aircraft's camera trajectory is easily identified in Bundler's results. The first image shows Bundler's reconstruction before georeferencing. The second image shows the result after georeferencing it and placing it on top of an orthoimage of the area.
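
The visual check of the camera trajectory can be complemented with a numerical one. The following is a minimal sketch, not the procedure used in this work: it assumes the "bundle.out" v0.3 layout described in Bundler's documentation (one line with "f k1 k2", three lines for the rotation R and one line for the translation t per camera) and recovers each camera centre as $-R^T t$. The function names `read_cameras` and `step_lengths` are hypothetical.

```python
import math


def read_cameras(bundle_text):
    """Parse the camera blocks of a Bundler bundle file given as a string.

    Assumed layout (Bundle file v0.3): a comment line, then
    "<num_cameras> <num_points>", then five lines per camera.
    """
    lines = [ln for ln in bundle_text.splitlines()
             if ln.strip() and not ln.lstrip().startswith("#")]
    num_cameras, _num_points = map(int, lines[0].split())
    cameras = []
    for i in range(num_cameras):
        block = lines[1 + 5 * i: 1 + 5 * (i + 1)]
        f, k1, k2 = map(float, block[0].split())
        R = [list(map(float, row.split())) for row in block[1:4]]
        t = list(map(float, block[4].split()))
        # Camera centre in world coordinates: C = -R^T t
        center = [-sum(R[r][c] * t[r] for r in range(3)) for c in range(3)]
        cameras.append({"f": f, "k1": k1, "k2": k2, "center": center})
    return cameras


def step_lengths(cameras):
    """Distances between consecutive camera centres along the sequence."""
    centers = [cam["center"] for cam in cameras]
    return [math.dist(a, b) for a, b in zip(centers, centers[1:])]
```

For frames sampled at a regular rate from a video, the step lengths would be expected to vary smoothly; an isolated large jump suggests a badly registered camera. The same parsed values also give direct access to the estimated focal lengths and radial distortion coefficients.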

To georeference the scene easily, recognizable objects are identified both in the orthoimage and in the point cloud. The adjustment is fairly easy when structures, specifically buildings, can be identified.
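
This page does not state how the adjustment between the point cloud and the orthoimage was computed. One possible way to do it, sketched here under stated assumptions, is to fit a least-squares 2D similarity transform (scale, rotation and translation) to the hand-picked correspondences. The sketch assumes the cloud has already been leveled so that only its horizontal coordinates need aligning, and that at least two matched points are available. The function names are hypothetical.

```python
import cmath
import math


def fit_similarity_2d(cloud_xy, ortho_xy):
    """Least-squares fit of w = a*z + b, with points written as complex numbers.

    cloud_xy: (x, y) points picked in the leveled point cloud.
    ortho_xy: the matching (x, y) points picked in the orthoimage.
    Returns the complex pair (a, b); |a| is the scale and arg(a) the rotation.
    """
    if len(cloud_xy) != len(ortho_xy) or len(cloud_xy) < 2:
        raise ValueError("need at least two matched point pairs")
    z = [complex(x, y) for x, y in cloud_xy]
    w = [complex(x, y) for x, y in ortho_xy]
    z_mean = sum(z) / len(z)
    w_mean = sum(w) / len(w)
    num = sum((zi - z_mean).conjugate() * (wi - w_mean) for zi, wi in zip(z, w))
    den = sum(abs(zi - z_mean) ** 2 for zi in z)
    if den == 0:
        raise ValueError("cloud points must not all coincide")
    a = num / den
    b = w_mean - a * z_mean
    return a, b


def apply_similarity_2d(a, b, xy):
    """Map one cloud point (x, y) into orthoimage coordinates."""
    w = a * complex(xy[0], xy[1]) + b
    return (w.real, w.imag)


def scale_and_rotation_deg(a):
    """Scale factor and rotation angle in degrees encoded in a."""
    return abs(a), math.degrees(cmath.phase(a))
```

Building corners are convenient correspondences because they are easy to locate in both the orthoimage and the cloud. With more than two pairs the fit averages out picking errors.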

These reconstruction results were obtained by forcing the initial value of the focal length to be close to the width of the image in pixels. In the following two graphs

The first graph shows how the focal length estimate for the first camera pair evolves over the SBA optimization iterations. The number of SBA iterations is shown on the horizontal axis and the focal length in pixels on the vertical axis. After some iterations the estimate tends to converge to a stable value.

All previous images were taken from the same scenario. When loading the input frames, the focal length was forced to a value defined with respect to the image width. If initial estimates are not provided with the set of images (from Exif tags), the focal length values obtained with Bundler are far more stable when they are initialized to a forced value close to the real one. The focal length in pixels ($f_{pix}$) and the image width in pixels ($w_{pix}$) are related by the following equation:

$f_{pix} = r w_{pix}$

where $r$ is a ratio. This ratio is a scalar value that, depending on the camera configuration, can be assumed to be close to 1. The ratio was set to $r \approx 1.43$ for the scenario shown above, providing visually acceptable reconstructions. Since the camera calibration is totally unknown, the ratio was chosen after analysing Bundler's initial output with unforced focal lengths and then used to rerun Bundler with forced focal lengths. This approach may be valid under certain circumstances, when approximate results are enough. If accurate 3D scenes are a must, then another approach should be taken, such as knowing the focal length range of the camera (camera zoom), knowing the exact focal length for the images, or adding constraints on the scenario.
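
As a small worked example of the relation above, the following sketch computes the forced focal length for a given frame width and formats one entry per frame for Bundler's image list. The frame width of 640 pixels and the file names are hypothetical, and the "<image name> 0 <focal length in pixels>" line format is an assumption about Bundler's list file that should be checked against the Bundler version in use.

```python
def forced_focal_length(image_width_px, ratio=1.43):
    """Return f_pix = r * w_pix, the forced initial focal length in pixels."""
    if image_width_px <= 0 or ratio <= 0:
        raise ValueError("image width and ratio must be positive")
    return ratio * image_width_px


def image_list_lines(image_names, image_width_px, ratio=1.43):
    """Format one list entry per frame, all sharing the same forced focal length.

    Assumed line format: "<image name> 0 <focal length in pixels>".
    """
    focal = forced_focal_length(image_width_px, ratio)
    return [f"{name} 0 {focal:.5f}" for name in image_names]


if __name__ == "__main__":
    # Hypothetical frames extracted from a 640-pixel-wide video.
    frames = ["frame_0001.jpg", "frame_0002.jpg"]
    for line in image_list_lines(frames, image_width_px=640):
        print(line)
```

With $r = 1.43$ and $w_{pix} = 640$, the forced focal length is $1.43 \times 640 = 915.2$ pixels.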

[Note: the information in this section will be expanded and better explained; updates coming soon]

References

David Núñez Clemente. "Validación de una herramienta de reconstrucción de estructura para su aplicación a vídeos tomados desde un avión no tripulado (UAV)" [Validation of a structure reconstruction tool for its application to videos taken from an unmanned aerial vehicle (UAV)]. Final degree project, Telecommunication Engineering, Universidad de Alcalá, July 2010.

Spanish [pdf][slides]

English [slides]

Links

More information about Bundler can be found at Bundler's homepage.

Very nice SfM results can be seen in Building Rome in a Day.

If you need software to display PLY files, go to the Meshlab homepage.

If you need dense multi-view stereo reconstructions from Bundler output files, have a look at PMVS and CMVS.

Acknowledgements

First, I would like to thank INTA for granting me the Rafael Calvo Rodés scholarship, through which I started in the computer vision field, and especially Severino Fernández, who was my supervisor during that time. I thank Noah Snavely for kindly answering the emails I sent him with questions about Bundler, and Pablo Fernandez Alcantarilla for the very instructive discussions we had about Bundler's algorithms and SfM. I also thank the Meshlab team for their application, which has been essential for displaying and analysing Bundler's PLY files.

Last updated on Feb 16, 2011