Stressful driving situations come up every day, and as automotive assistants get smarter, their ability to help us navigate through those difficult situations increases. What if these assistants could sense the context of your reactions (and emotions), and respond in a way that can alleviate some of the stress? What if they could respond in a more relevant, effective and empathetic manner using not only voice but also computer vision-based AI, like Affectiva’s Emotion AI?
Today’s in-car intelligent assistants have evolved from basic question-and-answer systems into more intelligent, conversational assistants capable of handling a wide range of tasks. A key part of creating these more relevant experiences is humanizing them: doing so creates better, more effective and enjoyable experiences for drivers, and turns in-car voice platforms into true conversational assistants.
Affectiva is happy to partner with Cerence, which uses our Emotion AI to further humanize its voice and AI-powered in-car intelligent assistants, optimizing how the in-cabin environment looks and feels.
We recently featured Cerence on our Human-Centric AI Podcast, where Stefan Hamerich, Director of Product Management at Cerence, spoke to the topic of designing for personal experiences within the vehicle. Cerence is the global industry leader in creating unique, moving experiences for the mobility world. As an innovation partner to the world’s leading automakers, it is helping transform how a car feels, responds and learns. Its track record is built on more than 20 years of knowledge and more than 350 million cars shipped with Cerence technology. Stefan’s team is responsible for the strategy and roadmap of all of Cerence’s embedded speech input and text products as well as their embedded platform.
In this episode, he shares with us some of his thoughts and ideas around designing in-vehicle systems with the human in mind.
1 - Tell us about your background: how has your career path taken you to Cerence, and what is your role there today?
At university, I really fell in love with communicating with machines. Back then, it was mostly typing, and the machines would answer with a spoken response. Then I went to IBM, and after various acquisitions I landed at Nuance, from which Cerence was later spun off. I still find it extremely cool to talk to a device (such as the car) and get it to follow my directions. After all these years, I find it as important as ever to bring a human “touch” into devices in order to make them more human.
I lead a team of product managers at Cerence; originally I started with Text to Speech (TTS) only. Today, we have all the core technologies and also the Embedded platform under our purview. The core tech consists of the classical speech technology, which includes speech signal enhancement with echo-cancellation, noise reduction, speech recognition, natural language understanding, and more.
At Cerence, we white-label our product, which means that the end consumer will probably never notice us. But we have actually shipped as part of 350 million cars, and our AI-powered assistants and speech capabilities are a large part of that. We are most famous for speech, and recently announced that Cerence is powering the voice and AI-driven features in the Mercedes-Benz User Experience (MBUX). Cerence was also selected by FCA to provide conversational and interactive AI in the new Electric Fiat 500. We are also longtime partners with BMW, and just this month we presented our products together to demonstrate our vision for the impact of conversational AI on the driving experience. Audi’s voice assistant was also built on the Cerence Drive platform, which powers a conversational mobility assistant and infotainment experience that makes it safer and more natural than ever for drivers to interact with their Audi vehicles.
All of these systems we are powering are built on an intuitive, powerful interaction between humans and cars. There’s still so much more that can be done, and for us it’s really about giving people more of that human communication through interactions with these devices. This not only improves road safety, which is where we got started, but will also be more fun, too!
2 - Though CES 2019 feels like many lifetimes ago, can you explain what Affectiva and Cerence did together and take us through that experience?
At CES 2019, we showcased a vehicle in which we were trying to understand passengers’ moods and detect fatigue. Affectiva’s Automotive AI used cameras within the vehicle to analyze facial expressions and tone of voice to help our mobility assistant understand the drivers’ and passengers’ cognitive and emotional states. This was a starting point for us, and from there, we created a complete assistant within the car that could adapt its behavior accordingly, changing both its response style and tone of voice to match the situation. Besides providing more “empathic” assistance, this technology could enhance safety on the road by preventing distracted, drowsy and impaired driving. For example, if occupants in the vehicle were yawning or speaking in a certain way, the assistant reacted with responses like, "Maybe you need to take a break. Would you like some coffee?" or started up some entertainment, such as games like “Name that Tune,” to engage the occupants and raise their attention.
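The adaptive behavior described above can be pictured as a simple mapping from a sensed occupant state to an assistant action. The sketch below is purely illustrative: the class, thresholds, and response strings are hypothetical stand-ins, not Affectiva or Cerence APIs.

```python
# Illustrative sketch: map a sensed occupant state (e.g. derived from
# camera-based fatigue and emotion cues) to an assistant action.
# All names, thresholds, and actions here are hypothetical.
from dataclasses import dataclass

@dataclass
class OccupantState:
    drowsiness: float  # 0.0 (alert) .. 1.0 (very drowsy), e.g. from yawn/eye cues
    valence: float     # -1.0 (negative) .. 1.0 (positive) emotional tone

def choose_response(state: OccupantState) -> str:
    """Pick an assistant action based on the sensed state."""
    if state.drowsiness > 0.7:
        # Strong fatigue cues: suggest a break, as in the CES demo.
        return "Maybe you need to take a break. Would you like some coffee?"
    if state.drowsiness > 0.4:
        # Mild fatigue: engage the occupants to raise attention.
        return "start_game:name_that_tune"
    if state.valence < -0.5:
        # Negative emotional tone: soften the assistant's voice.
        return "switch_to_calm_voice"
    return "no_action"

print(choose_response(OccupantState(drowsiness=0.8, valence=0.0)))
```

A production system would of course blend many more signals and smooth them over time, but the core idea is this kind of state-to-response policy.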
Speaking to the in-vehicle assistant and hearing its response is always nice. But being able to actually experience it, to see how it interacts and how it displays in the car, is key to human-machine interaction.
3 - You recently wrote an article on improving car conversations and road safety, covering the acoustic challenges in larger vehicles and how your ICC solution can address them. Can you tell us a little more about that?
If you have a larger car or van, something with a third row, it can be difficult for those sitting in that final row to communicate with the driver; sometimes, you have to yell. It may be easier for the driver to turn their head towards the back to hear better, but of course that’s extremely dangerous while driving.
We wanted to enable a more pleasant conversation between the humans in the vehicle, regardless of where they are seated: and that is exactly what In-Car Communication (ICC) is doing. It is basically an intercom system that allows the driver to speak to back passengers, and vice versa. To do this, we are leveraging the microphones used for the hands-free or speech systems, in addition to the speakers in the car.
Another issue is latency: as a driver, hearing yourself talking to the back passengers a second or two later does not help your conversation. So we cleaned up the complete signal and played it back over the speakers near-instantaneously, which makes a huge difference in the enjoyment of the conversation. We also implemented a button on the steering wheel so the driver can easily activate the ICC to assist with communication among the occupants. All of these features make the entire in-cabin experience more connected and “human.”
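The latency point above comes down to processing audio in short frames and playing each one back as soon as it is processed. The sketch below is not Cerence’s implementation; it is a minimal illustration of a per-frame loop where the delay is bounded by one frame of audio, with a placeholder standing in for the real enhancement chain (echo cancellation, noise reduction).

```python
# Illustrative ICC-style loop: take short frames from the driver's
# microphone, "clean" them, and route them to the rear speakers
# immediately, so latency stays around one frame, not seconds.
# The cleanup step is a placeholder for real signal enhancement.

FRAME_SAMPLES = 256  # at 16 kHz, 256 samples is 16 ms of audio per frame
GAIN = 1.5           # simple amplification toward the rear cabin

def process_frame(frame):
    """Amplify one frame and clip it to the valid [-1.0, 1.0] range."""
    return [max(-1.0, min(1.0, s * GAIN)) for s in frame]

def icc_loop(mic_frames, play):
    """Per-frame loop: each frame is played as soon as it is processed."""
    for frame in mic_frames:
        play(process_frame(frame))

# Usage with a stubbed speaker that just collects output frames:
out = []
icc_loop([[0.5, -0.8]], out.append)
print(out)  # [[0.75, -1.0]]
```

The design choice to emphasize is the frame size: smaller frames mean lower end-to-end delay at the cost of more per-frame overhead, which is the trade-off any in-car intercom has to tune.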
To hear the full Q+A, listen to the podcast here.