When I was in the second grade, a “frenemy” dared me to dial 911 from a payphone during a school outing. Minutes later, a police officer arrived at the public park, broke up our play, and treated the class to a lecture on the importance of using 911 only in an emergency. The lesson was not lost on me, and years later I can still remember the panic I felt as the police officer’s gaze swept over us like an angry god displeased with his pagan followers.
Unfortunately, thousands of adults routinely place prank calls to the Coast Guard, police, and other emergency service personnel, costing the public both in taxpayer dollars and, not infrequently, human lives. It was for this reason that the US Coast Guard reached out to researchers at Carnegie Mellon University to help identify a prank caller who had been routinely calling in SOS emergencies to which they were obligated to respond, often putting their own first responders in grave danger.
Using cutting-edge machine learning algorithms and signal processing technologies, Dr. Rita Singh and her colleagues pieced together a profile of the prank caller solely from recordings of his voice, including a prediction of his height, his weight, and the size of the room he was calling from. When the Coast Guard finally succeeded in tracking down the perpetrator, he matched the profile generated by the CMU researchers almost perfectly. This generated a flood of interest in Dr. Singh’s work on computer voice recognition.
As Rita Singh explained to me during a recent conversation, even when individuals try to disguise their voice, say by imitating an accent or changing modulation, they will likely fail to fool her algorithms. Our voices identify us almost as well as our fingerprints. A voice contains micro-expressions that happen at a subconscious level and are beyond a person’s control. For instance, the speed at which a person rises from a “t” to an “a” when pronouncing the word “tap” is a micro-expression that cannot be easily faked or altered.
These advances go well beyond identifying a person by their voice, though. Through the creative use of signal processing and machine learning, Rita’s team can identify a person’s use of intoxicants or other substances and, even more surprisingly, the onset of medical conditions the speaker may not even be aware they have. For instance, the biomarker for Parkinson’s can be detected in a person’s voice long before any other symptoms arise. This raises the prospect of using voice recognition in the medical field to diagnose diseases with speech-related biomarkers.