Qualcomm announced this morning that it’s building its first deep learning software development kit (SDK) for Snapdragon 820 processors. The new SDK (the Snapdragon Neural Processing Engine) runs on top of Qualcomm’s Zeroth Machine Intelligence Platform and is designed to leverage the heterogeneous compute capabilities of the Snapdragon 820.
Before we dive into this topic in more detail, let’s clear up one point of confusion. We first reported on Zeroth more than a year ago, when Qualcomm was discussing including Zeroth as a physical hardware core known as an NPU, or Neural Processing Unit. This core was rumored to be included as standard on all Snapdragon 820 devices. We now know that Qualcomm opted not to ship an NPU with the Snapdragon 820, and the Zeroth brand name refers to a software machine learning platform rather than a specific processing block on the SoC.
Deep learning is a subset of machine learning, which is a method of teaching a computer how to do something rather than explicitly programming it to do so. Early neural networks were fairly shallow, with an input layer, a few hidden layers, and then an output layer. A deep learning network, as the name implies, uses far more layers to calculate the relationship between variables.
Neural networks are widely used in computer vision and have been deployed in that field for several decades, but much of the research into fields like self-driving cars has been made possible by advances in deep learning. A conventional neural network might have a single hidden layer where “weights” are computed for the purpose of facial recognition, speech interpretation, or handwriting analysis.
In an example like this, data is fed in, the network weights it (according to the parameters it has learned through training runs), and then the output is displayed. A deep learning network, in contrast, places many more hidden layers between the input and the output.
Qualcomm, for example, currently uses Zeroth for two technologies: Snapdragon Scene Detect, which classifies objects, items, and people within a visual scene, and Snapdragon Smart Protect, which uses machine learning to look for suspicious behavior that could be a sign that a smartphone has been compromised.
If you’re having trouble grasping how deep learning is useful, consider the following example. Imagine you’re walking down the street and you see a house with the front door standing open. How you interpret this will depend on a great many additional data points: Is there a vehicle obviously being loaded or unloaded? Are there any people visible in or near the entranceway? Do you hear shouting, laughter, or music? Are there any lights on inside the house, and if there are, can you see anything? Is it 5 AM, 12 noon, or 11:30 PM?
The answers to these questions determine how you respond to the situation. If there are people moving in and out of the house and loud music playing, it’s probably a party. If no one is visible and the house is dark, you might be witnessing a break-in, or someone may simply have forgotten to latch the door properly. We assign “weights” to these probabilities and evaluate the situation accordingly, and we do it unconsciously and at extraordinary speed compared with a conventional computer. Conventional neural networks try to duplicate this process. Deep learning networks expand on the basic principles of neural networks, but add more hidden layers and, as a result, are capable of evaluating more complicated scenarios and making more sophisticated determinations.
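For readers who think better in code, the difference between a shallow and a deep network can be sketched in a few lines of Python. This is only an illustration of the general idea, not anything from Qualcomm’s SDK: the function names, the four “open door” cues, and the layer sizes are all made up for the example, and the weights are random rather than learned, so the outputs carry no meaning until a network like this is trained.

```python
import math
import random

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def dense(inputs, weights, biases):
    """One fully connected layer: each neuron takes a weighted sum of the inputs."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def make_network(layer_sizes, seed=0):
    """Build a list of (weights, biases) pairs with random, UNTRAINED weights."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers

def forward(network, inputs):
    """Feed the inputs through every layer in turn and return the final output."""
    activations = inputs
    for weights, biases in network:
        activations = dense(activations, weights, biases)
    return activations

# Hypothetical cues from the open-door example:
# [people visible, music playing, lights on, daytime]
cues = [1.0, 1.0, 1.0, 0.0]

# Shallow: 4 inputs -> 1 hidden layer of 3 neurons -> 1 output.
shallow = make_network([4, 3, 1])
# Deep: the same inputs and output, but four hidden layers of 8 neurons each.
deep = make_network([4, 8, 8, 8, 8, 1])

shallow_out = forward(shallow, cues)
deep_out = forward(deep, cues)
print("weight layers:", len(shallow), "vs", len(deep))
```

Both networks use exactly the same forward pass; the only difference is how many layers the data moves through. In a real system the weights would be set by training on labeled examples, which is the expensive step that deep learning frameworks exist to handle.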
According to Qualcomm, the Snapdragon Neural Processing Engine contains the following features:
Qualcomm is clearly interested in emerging markets like self-driving vehicles, as is Nvidia. The “intelligence” of deep learning has profound implications for how we interface with technology, however, and could potentially lead to a revolution in human-computer interaction.
One of the differences between computers on shows like Star Trek: The Next Generation and our own technology is that Star Trek (and plenty of other sci-fi) depicts a computer that’s both conversationally fluent and capable of interpreting statements that are less than perfectly clear. The replicator knows that when Captain Picard says “Tea, Earl Grey, hot,” he wants his tea served at a specific temperature and does not ask him to explain what “hot” means. (There’s an interesting StackExchange thread on syntax and speech as depicted on Star Trek, for the truly nerdy.)
Deep learning networks could help us build computer programs that are far more capable of parsing human speech than current software. I suspect it’s also the basis for much of the work companies like Facebook and Microsoft are doing on bot research, though Tay’s implosion last month also shows the perils of such research.
With the Zeroth Machine Intelligence Platform and the Snapdragon Neural Processing Engine, Qualcomm is throwing its hat into the ring and betting developers will use the Snapdragon 820’s CPU, DSP, and GPU to build heterogeneous networks that leverage the capabilities of all three processing blocks. The SDK is expected to be available in the back half of this year.