It is nearly impossible to overstate the enthusiasm for deep-learning-based AI among most of the computer science community and big chunks of the tech industry. Talk to nearly any CS professor and you get an overwhelming sense that just about every problem can now be solved, and every task automated. One even quipped, “The only thing we need to know is which job you want us to eliminate next.” Clearly there is a lot of hubris baked into these attitudes. But with the rapid advances in self-driving vehicles, warehouse robots, diagnostic assistants, and speech and facial recognition, there is certainly plenty of reason for computer scientists to get cocky.
And no one is better at being cocky than Nvidia CEO Jen-Hsun Huang. On stage, he is always something of a breathless whirlwind, and as he recapped the recent, largely Nvidia-powered, advances in AI and what they portend for the future, it reminded me of a late-night infomercial, or perhaps Steve Jobs revealing one more thing. In this case, though, Nvidia has a lot more than one thing up its sleeve. It is continuing to push forward with its AI-focused hardware, software, and solutions offerings, many of which were either announced or showcased at this year’s GTC.
For anyone who still thinks of Nvidia as a consumer graphics card company, the DGX-1 should put that idea to rest. A $129,000 supercomputer with 8 tightly-coupled state-of-the-art Pascal-architecture GPUs, it is nearly 10 times faster at supervised learning than Nvidia’s flagship unit a year ago. For those who want something a little less cutting edge, and a lot less expensive, Nvidia offers the M40 for high-end training, and the M4 for high-performance and low-power AI runtimes.
Nvidia has supported AI, and especially neural net, developers for a while with its Deep Learning SDK. At GTC Nvidia announced version 5 of its neural network library, cuDNN. In addition to supporting the new Tesla P100 GPU, the new version promises faster performance and reduced memory usage. It also adds support for Recurrent Neural Networks (RNNs), which are particularly useful for applications that work with time-series data, such as audio and video signals (speech recognition, for example).
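What makes RNNs a fit for time-series work is that they carry a hidden state from one time step to the next, so each output can depend on everything seen so far. Here is a minimal sketch of a plain (Elman-style) recurrent cell in NumPy; it is purely illustrative and has nothing to do with cuDNN’s actual API, and all the weight shapes are arbitrary choices for the example:

```python
import numpy as np

def rnn_forward(xs, W_xh, W_hh, b_h):
    """Run a basic recurrent cell over a sequence of input vectors.

    The hidden state h is carried across time steps -- that recurrence
    is what lets the network model sequential data like audio frames.
    """
    h = np.zeros(W_hh.shape[0])
    states = []
    for x in xs:                           # one pass per time step
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
        states.append(h)
    return np.array(states)

rng = np.random.default_rng(0)
seq = rng.normal(size=(5, 3))              # 5 time steps, 3 features each
W_xh = rng.normal(scale=0.1, size=(4, 3))  # input-to-hidden weights
W_hh = rng.normal(scale=0.1, size=(4, 4))  # hidden-to-hidden (recurrent) weights
b_h = np.zeros(4)

states = rnn_forward(seq, W_xh, W_hh, b_h)
print(states.shape)                        # one 4-dim hidden state per step
```

Libraries like cuDNN exist precisely because loops like this, run over long sequences and large batches, are far faster when fused and executed on a GPU.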
cuDNN isn’t a competitor to the big neural net developer tools. Instead, it serves as a base layer for accelerated implementations of popular tools like Google TensorFlow, UC Berkeley’s Caffe, University of Montreal’s Theano, and NYU’s Torch. However, Nvidia does have its own neural net runtime offering, the Nvidia GPU Inference Engine (GIE). Nvidia claims over 20 images per second, per watt for GIE running on either a Tesla M4 or Jetson TX1. cuDNN 5, GIE, and the updated Deep Learning SDK are all being made available as part of an update to Nvidia’s ComputeWorks.
TensorFlow in particular got a big shout-out from Huang during his keynote. He applauded the fact that it is open source (as several of the other tools are) and is helping “democratize AI.” Because the source is accessible, Nvidia was able to adapt a version for the DGX-1, which he and Google’s TensorFlow lead Rajat Monga showed running (well, showed a monitor session logged into a server someplace that was running it).
The always-fascinating poster session in the GTC lobby featured dozens of different research efforts that use Nvidia GPUs and one of these deep-learning engines to crack some major scientific problem. Even the winner of the ever-popular Early Stage Companies contest was a deep-learning application: startup Sadako is using a learning network to teach a robot to identify and sort recyclable items in a waste stream. Another crowd favorite at the event, BriSky, is a drone company, but it relies on deep learning to program its drones to automatically perform complex tasks such as inspections and monitoring.
Programming a problem-solving neural network is one thing, but for many applications the final product is a physical vehicle, machine, or robot. Nvidia’s JetPack SDK — the power behind the Jetson TX1 developer kit — provides not just an Ubuntu-hosted development toolchain, but also libraries for integrating computer vision (Nvidia VisionWorks and OpenCV4Tegra), as well as Nvidia GameWorks, cuDNN, and CUDA. Nvidia itself was showcasing some of the cool projects that the combination of the JetPack SDK and Jetson TX1 developer kit has made possible, including an autonomous scaled-down race car and an autonomous (full-size) 3-wheeled personal transport vehicle, both based on work done at MIT.
Huang also pointed to other current examples of how deep learning — made possible by advances in algorithms and increasingly powerful GPUs — is changing our perception of what computers can do. Berkeley’s Brett robot, for example, can learn tasks like putting clothes away, assembling a model, or screwing a cap on a water bottle by simple trial and error — without explicit programming. Similarly, Microsoft’s image recognition system has achieved much higher accuracy than the human benchmark that was the gold standard until as recently as last year. And of course, AlphaGo’s mastery of one of the most mathematically complex board games has generated quite a bit of publicity, even among people who don’t typically follow AI or play Go.
In line with its chin-out approach to new technologies, massive banners all over the GTC proclaimed that Nvidia’s AI software learned to be a better driver than a human in “hours.” I assume they are referring to the 3,000 miles of training that Nvidia’s DAVENET neural network received before it was used to create the demo video we were shown. The statement reeks of hyperbole, of course, since we didn’t see DAVENET do anything especially exciting, avoid any truly dangerous situations, or display any particular gift. But it was shown navigating a variety of on- and off-road routes. If it truly was trained to do that by letting it drive 3,000 miles (over the course of 6 months, according to the video), that is an amazing accomplishment. I’m sure it is only a taste of things to come, and Nvidia plans to be at the center of them.