Week 1: 2D Greetings
March 4, 2024
Welcome to my 2D blog about 3D models! Over the next few weeks, I'll be documenting a journey into clouds: not the kind in the sky, but the point clouds that will be the foundation of my senior project. To commemorate the Disney trip my fellow seniors and I just came back from, and to introduce what point clouds look like, my next blog will include a video of a dense point cloud generated from our Mickey ears. I was going to include it in this one, but I'm currently getting errors while building the OpenSFM library.
Structure from Motion Photogrammetry
Photogrammetry is the art of using 2D images to extract information about a scene's 3D qualities, such as the depth, shapes, and sizes of the objects in it.
Structure from motion is a photogrammetry technique that builds a 3D model by estimating the X, Y, and Z coordinates of points in a scene from how those points shift between photos taken from different positions. The resulting 3D model is called a point cloud because it is composed of discrete points rather than the continuous surfaces seen in animations. Since accurately reconstructing a 3D model requires many images of the same location from different perspectives, structure from motion can take several hours. When I ran it on 60 images of a 5-square-foot plot of land in my backyard, it took around 2-3 hours.
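To make the geometry concrete, here's a minimal sketch (in Python, since OpenSFM is largely Python) of the simplest special case: two side-by-side cameras, where a point's depth can be recovered from its disparity, i.e. how far it shifts between the two images. The numbers and function name here are made up for illustration; this is not OpenSFM's actual pipeline, which handles many cameras at arbitrary positions.

```python
# Simplest two-camera case of recovering depth from image shift.
# All numbers are hypothetical; this is not OpenSFM's pipeline.

def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth (in meters) of a point seen by two side-by-side cameras.

    disparity_px is how many pixels the point shifts between the two
    images. Distant points shift less -- the same parallax effect that
    makes faraway mountains look stationary from a moving car.
    """
    return focal_px * baseline_m / disparity_px

focal_px = 1000.0   # focal length expressed in pixels
baseline_m = 0.5    # distance between the two cameras

# A nearby point shifts a lot; a faraway one barely moves.
print(depth_from_disparity(focal_px, baseline_m, 50.0))  # 10.0 m away
print(depth_from_disparity(focal_px, baseline_m, 5.0))   # 100.0 m away
```

Structure from motion generalizes this idea: with enough overlapping photos, it estimates the camera positions and the 3D points together, which is a big part of why it takes hours instead of seconds.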
The variation in photos is caused by the phenomenon known as parallax, which is why faraway mountains appear nearly stationary while roadside trees whiz past, even when you're driving at 90 mph on the highway. The Apple website extensively uses parallax to convince you to make your annual purchase of their software update wrapped in an iPhone case with one more camera.
Here’s another example of a website that uses parallax to create a visual novel experience: https://www.sbs.com.au/theboat/.
Purpose
My main motivation for optimizing photogrammetry is to make this technology more practical for disaster response. Since emergency responders are not always trained to interpret the maps generated with techniques like LIDAR and SLAM, photogrammetry's photorealism (it preserves the colors and textures of the source pictures) makes it very useful. Additionally, you don't need fancy (and expensive) equipment, unlike with LIDAR; a simple phone camera and a cheap drone are enough.
Goal
I'll be investigating whether replacing NumPy arrays with Bit-sliced Indexing (BSI) as the primary data structure can speed up the algorithm used in OpenSFM, an open-source implementation of structure from motion photogrammetry. Since both data structures run many of their operations in linear time, the real question is whether BSI can shrink the constant factor enough to produce better results. Speeding up these low-level bitwise operations would make creating 3D models more time-efficient and increase the chances of a 3D model being useful in urgent scenarios.
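To give a flavor of what BSI looks like, here is a toy sketch I wrote for illustration. The helper names are my own invention, and a real implementation would use packed machine words (or a bitmap library) rather than Python's big integers. The idea is to store bit j of every value in its own bitmap ("slice"), so that a comparison over the whole array becomes a handful of bitwise operations over the slices:

```python
# Toy bit-sliced index: Python's big ints stand in for packed bitmaps.
# Helper names are made up for illustration, not from any library.

def build_bsi(values, bits=8):
    """Slice j gets a bitmap with bit i set iff bit j of values[i] is set."""
    slices = [0] * bits
    for i, v in enumerate(values):
        for j in range(bits):
            if (v >> j) & 1:
                slices[j] |= 1 << i
    return slices

def greater_than(slices, threshold, bits=8):
    """Bitmap of the positions whose value is > threshold.

    One pass from the most significant slice down, using only
    bitwise ops -- no per-element comparisons.
    """
    gt, eq = 0, ~0  # nothing "greater" yet; everything "equal so far"
    for j in reversed(range(bits)):
        if (threshold >> j) & 1:
            eq &= slices[j]        # values with a 0 here drop to "less"
        else:
            gt |= eq & slices[j]   # equal-so-far values with a 1 here win
            eq &= ~slices[j]
    return gt

values = [3, 200, 17, 99, 150]
mask = greater_than(build_bsi(values), 100)
print([v for i, v in enumerate(values) if (mask >> i) & 1])  # [200, 150]
```

The appeal is that a single bitwise operation processes one whole machine word of elements at a time, which is exactly the kind of constant-factor win I'm hoping to measure against NumPy.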
Currently, it takes a few hours to run OpenSFM on up to 100 photos. My goal is to reduce that to around an hour, fast enough to demo on the day of senior project presentations. (Early advertisement: come to my presentation for a chance to get your own 3D model of anything you can bring to school, including yourself!)
Preview of Upcoming Blog
I cloned the starter code and partly set up the dependencies needed to run OpenSFM for the Mickey ears demo. Next week, I will finish setting it up, and then it will be all about the math, because I'm going to have to figure out how to break OpenSFM's equations down into BSI functions.
Sources:
https://www.sbs.com.au/theboat/