Facebook made it clear that it sees its future in virtual reality with its multi-billion-dollar purchase of Oculus. Now that the Rift is finally shipping, and millions are at least experimenting with inexpensive alternatives like Google Cardboard, the company has turned its attention to creating more content for those devices. At its annual developer conference, F8, it took a big step in this direction by announcing that it has not only designed what it considers the best 360-degree camera rig, but that it will open source both the hardware and the software. All you’ll need is $30,000 worth of components and a couple of engineers to assemble one of your own.
Leaving aside the creative challenges in shooting immersive video, the technical challenges themselves are fairly daunting. They fall on both the hardware and the software — although Surround 360 project lead Brian Cabral makes it clear that getting the hardware right is the preferred approach, as it makes the software’s task much more tractable. He identifies three major challenges Facebook had to address to ensure that Surround 360 video would be high-quality and artifact-free: creating a sense of depth, synchronizing the cameras, and aligning them. In each case, Facebook has invested in a hardware design that addresses the issue. Better yet, the company is planning to give away its hardware and software designs, so that they can be implemented and improved by anyone willing to take on the challenge.
One reason that binocular vision and head movement are such powerful tools for humans and other animals to determine depth is that objects at different depths move differently in our field of view — when either they move, or we move our head. Unfortunately, it isn’t possible to capture video simultaneously from every possible angle and eye position, as we’re limited to a discrete number of cameras and finite bandwidth. This means the captured video needs to be post-processed to create and output two 360-degree spherical videos — one to show to each eye with a 3D viewing device like the Gear VR, Oculus Rift, or Vive.
Because the Surround 360 has 50% overlap between adjacent cameras, it gets two different looks at each object, making it possible to mathematically construct a stereo view. Using a technique called optical flow to analyze the captured images and create a depth map of objects, Surround’s software is able to calculate the appropriate stereo disparity for each point of the image sphere, and use that to create an appropriate spherical video frame for viewing by each eye.
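To get a feel for why a depth map is enough to synthesize a stereo pair, consider the basic geometry: the closer an object is, the larger the angular disparity between the two eyes’ views. The sketch below illustrates only that principle — the baseline value is a typical human interpupillary distance, an assumption here, not a Surround 360 specification, and this is in no way Facebook’s actual pipeline.

```python
# Simplified sketch: turning a depth estimate into stereo disparity.
# The 6.4 cm baseline is an assumed, typical interpupillary distance;
# this illustrates the principle, not Facebook's implementation.

import math

IPD = 0.064  # assumed virtual eye separation, in meters

def disparity_radians(depth_m):
    """Angular disparity between left- and right-eye views of a point
    at the given depth, directly in front of the viewer."""
    # Each eye is offset by half the baseline, so the point subtends
    # an angle of atan((IPD/2) / depth) at each eye.
    return 2 * math.atan((IPD / 2) / depth_m)

# Nearby objects shift more between the eyes than distant ones:
near = disparity_radians(0.5)   # object 0.5 m away
far = disparity_radians(50.0)   # object 50 m away
print(near > far)  # True: disparity shrinks with depth
```

This falloff with distance is also why distant scenery looks nearly flat even in stereo — past a certain depth, the disparity is too small to matter.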
This feature — allowing 3D experiences — is perhaps the biggest differentiator between high-end capture rigs like Jaunt VR, GoPro Odyssey, Surround, and others when compared with consumer-oriented or mid-priced offerings like the newly announced GoPro Omni (which is “only” $5,000).
Optical flow works by analyzing multiple image frames captured over time to develop a model of the most likely motion for objects in the scene. Since the camera may be moving, and different objects may be moving at different speeds, the process is a complex one. Worse, in some cases it is not possible to determine with certainty what object motion caused a particular shift of that object in subsequent frames. To give the software a chance to accurately build these stereo models, it is essential that all cameras in the rig capture images at precisely the same time — a feature referred to as gen-lock. For Surround 360, that means 17 sensors that all have to fire at once.
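The core idea behind optical flow can be reduced to a toy example: test candidate displacements between two frames and keep the one that best aligns them. Real optical flow estimators (including whatever Surround 360’s stitcher uses) are vastly more sophisticated — dense, two-dimensional, and regularized — but this 1-D sketch shows the principle:

```python
# Toy illustration of the idea behind optical flow: estimate how far a
# pattern shifted between two frames by testing candidate displacements
# and scoring each with a sum of squared differences. A 1-D sketch of
# the principle only, not a production flow estimator.

def estimate_shift(frame_a, frame_b, max_shift=5):
    """Return the displacement that best aligns frame_b with frame_a."""
    best_shift, best_error = 0, float("inf")
    n = len(frame_a)
    for s in range(-max_shift, max_shift + 1):
        # Sum of squared differences over the overlapping region
        error = sum(
            (frame_a[i] - frame_b[i + s]) ** 2
            for i in range(max(0, -s), min(n, n - s))
        )
        if error < best_error:
            best_shift, best_error = s, error
    return best_shift

# A 1-D "edge" that moves 2 pixels to the right between frames:
frame1 = [0, 0, 0, 9, 9, 9, 0, 0, 0, 0]
frame2 = [0, 0, 0, 0, 0, 9, 9, 9, 0, 0]
print(estimate_shift(frame1, frame2))  # 2
```

Notice that if the two frames were captured at slightly different times on different cameras, the measured shift would mix true object motion with timing error — which is exactly why gen-lock matters.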
Another non-starter in many cameras is the use of a rolling shutter. Rolling shutters expose each frame one piece at a time, so rapid motion can cause tearing within a frame. For traditional video this isn’t always visible, and can often be somewhat corrected in post-processing. However, when multiple cameras are being used to generate a 360 stereo image, rolling-shutter artifacts make stitching them together nearly impossible. So the Surround design relies on pricier global shutters (where the entire image is captured at once) to avoid this issue.
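A quick back-of-the-envelope calculation shows how much skew a rolling shutter can introduce. The readout time and object speed below are illustrative assumptions, not the specs of any particular camera:

```python
# Back-of-the-envelope sketch of why rolling shutter causes "tearing":
# rows are exposed at slightly different times, so a moving object lands
# at a different horizontal position on each row. All numbers here are
# illustrative assumptions, not real camera specs.

readout_time_s = 1 / 60        # assumed time to read the frame, top to bottom
object_speed_px_per_s = 1200   # assumed horizontal object speed in pixels/s

# By the time the last row is read, the object has moved this far:
skew_px = object_speed_px_per_s * readout_time_s
print(round(skew_px))  # 20 — a vertical edge is sheared by ~20 pixels
```

A 20-pixel shear might pass unnoticed in a single video stream, but when a stitcher tries to match that edge against an adjacent camera that read its rows in a different order or at a different moment, the geometry no longer lines up.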
Similarly, misaligned cameras are a recipe for messy images and additional stitching headaches, since the software must then attempt to align frames after the fact (you can see this issue in panoramas taken with a phone or other handheld camera). Thick metal plates and rigid connections help the Surround 360 make sure its cameras are properly aligned, and that they stay aligned.
The camera rig features 17 4.1MP industrial-grade imagers from Point Grey. Fourteen are arranged around a circle with a 50% overlapping field of view between neighbors. Each of them has a wide-angle 7mm f/2.4 lens. They are complemented by three fisheye lenses — one on top and two on the bottom. The fisheyes have a 2.7mm focal length (giving them a 185-degree field of view) and an adjustable aperture of f/1.8–f/16. The rig has two lenses on the bottom so that they can in essence shoot around a support pole, making it disappear in the final image. All the cameras feature global shutters to eliminate the tearing artifacts typical of less-expensive rolling-shutter cameras.
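The 50% overlap figure falls directly out of the ring geometry: with 14 cameras spaced evenly around a circle, each camera must cover twice the angular spacing to its neighbor. A quick check (an illustration of the arithmetic, not Facebook’s published design math):

```python
# Rough geometry check: with 14 cameras evenly spaced around a circle
# and 50% overlap between neighbors, each camera must cover twice the
# angular spacing between adjacent cameras. Illustrative arithmetic
# only, not Facebook's design documentation.

num_ring_cameras = 14
spacing_deg = 360 / num_ring_cameras    # ~25.7 degrees between adjacent cameras
required_hfov_deg = 2 * spacing_deg     # 50% overlap doubles the coverage
print(round(required_hfov_deg, 1))      # 51.4 degrees of horizontal FOV per camera
```

That ~51-degree requirement is comfortably within reach of a moderate wide-angle lens, which is consistent with the rig’s choice of 7mm lenses on its ring cameras rather than fisheyes.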
Thick aluminum top and bottom plates both help ensure rigidity — necessary for proper alignment — and serve as heat sinks to allow extended operation. The structural plates are then covered with an 18-gauge powder-coated steel cover.
The camera’s software — controlled by a web interface to the unit’s Linux-powered CPU — is used to control frame rate, exposure, and other settings. To ensure all the cameras fire at the same time, Hirose GPIO cables are used, with the top camera serving as the master and the other 16 slaved to it. The system’s software runs overnight on servers to generate stereo 8K (as well as 4K and 6K) 360-degree output, with the option to also generate monoscopic versions — yet another example of why GPU makers like AMD and Nvidia are so excited about VR.
There are quite a few other high-end rigs for capturing 360-degree stereo video, like Samsung’s Project Beyond, JauntVR’s proprietary rig, and Nokia’s OZO. Google’s Jump project uses the GoPro Odyssey, a custom rig built from off-the-shelf GoPro cameras. All of these are expensive, and so far, all are proprietary. While Facebook’s current design is also expensive — with component cost estimated at around $30,000 — it will be entirely open source, which should encourage innovation both in features and in cost reduction. For example, one of the cost drivers for the Surround 360 is its 8K output capability. That drives the use of 4MP sensors, and the expensive RAID array of SSDs needed to offload the raw data. A 4K-only version could likely be much less expensive, but still allow for immersive experiences suitable for many viewing conditions.
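To see why raw capture pushes the rig toward a RAID of SSDs, it helps to estimate the sustained data rate. The frame rate and bit depth below are assumptions for the sake of the estimate, not published Surround 360 specs:

```python
# Rough, assumption-laden estimate of the rig's raw capture bandwidth.
# Frame rate and bytes-per-pixel are assumptions, not published specs;
# only the sensor count (17) and resolution (4.1MP) come from the rig.

sensors = 17
megapixels = 4.1
bytes_per_pixel = 1.5   # assuming 12-bit packed raw
fps = 30                # assumed capture rate

bytes_per_second = sensors * megapixels * 1e6 * bytes_per_pixel * fps
print(f"{bytes_per_second / 1e9:.1f} GB/s")  # prints "3.1 GB/s"
```

Sustaining roughly 3 GB/s of writes is beyond any single consumer drive of the era, which makes the striped-SSD requirement — and its contribution to the $30,000 bill of materials — easy to understand.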
For a slick explanation of the system, Facebook has provided this video introduction.