Understanding passenger intents and extracting relevant slots are crucial building blocks towards developing contextual dialogue systems for natural interactions in autonomous vehicles (AV). In this work, we explored AMIE (Automated-vehicle Multi-modal In-cabin Experience), the in-cabin agent responsible for handling certain passenger-vehicle interactions. When passengers give instructions to AMIE, the agent should parse such commands properly and trigger the appropriate functionality of the AV system. In our current explorations, we focused on AMIE scenarios covering setting or changing the destination and route, updating driving behavior or speed, finishing the trip, and other use-cases that support various natural commands.
We collected a multi-modal in-cabin dataset with multi-turn dialogues between the passengers and AMIE using a Wizard-of-Oz scheme via a realistic scavenger hunt game activity. After exploring various recent Recurrent Neural Network (RNN) based techniques, we introduced our hierarchical joint models to recognize passenger intents along with the relevant slots associated with the action to be performed in AV scenarios.