Inventors
Ryan Price, Srinivas Bangalore
Publication date
2022/11/22
Patent office
US
Patent number
11508355
Application number
16172115
Description
Systems and methods are disclosed herein for discerning aspects of user speech to determine user intent and/or other acoustic features of a sound input without the use of an ASR engine. To this end, a processor may receive a sound signal comprising raw acoustic data from a client device, and divides the data into acoustic units. The processor feeds the acoustic units through a first machine learning model to obtain a first output and determines a first mapping, using the first output, of each respective acoustic unit to a plurality of candidate representations of the respective acoustic unit. The processor feeds each candidate representation of the plurality through a second machine learning model to obtain a second output, determines a second mapping, using the second output, of each candidate representation to a known condition, and determines a label for the sound signal based on the second mapping. a