METHOD AND SYSTEM FOR UNSUPERVISED DISCOVERY OF UNIGRAMS IN SPEECH RECOGNITION SYSTEMS

Invention Publication

US20230144379A1 METHOD AND SYSTEM FOR UNSUPERVISED DISCOVERY OF UNIGRAMS IN SPEECH RECOGNITION SYSTEMS Under examination (published)

Patent Title: METHOD AND SYSTEM FOR UNSUPERVISED DISCOVERY OF UNIGRAMS IN SPEECH RECOGNITION SYSTEMS
Application No.: US17520816

Application Date: 2021-11-08
Publication No.: US20230144379A1

Publication Date: 2023-05-11
Inventor: LEV HAIKIN, ARNON MAZZA, EYAL ORBACH, AVRAHAM FAIZAKOF
Applicant: GENESYS CLOUD SERVICES, INC.
Applicant Address: US CA Daly City
Assignee: GENESYS CLOUD SERVICES, INC.
Current Assignee: GENESYS CLOUD SERVICES, INC.
Current Assignee Address: US CA Daly City
Main IPC: G10L15/197
IPC: G10L15/197; G10L15/06; G10L15/22; G10L15/10; G06N20/00

Abstract:

A system and method of automatically discovering unigrams in a speech data element may include receiving a language model that includes a plurality of n-grams, where each n-gram includes one or more unigrams; applying an acoustic machine-learning (ML) model on one or more speech data elements to obtain a character distribution function; applying a greedy decoder on the character distribution function to predict an initial corpus of unigrams; filtering out one or more unigrams of the initial corpus to obtain a corpus of candidate unigrams, where the candidate unigrams are not included in the language model; analyzing the one or more speech data elements to extract at least one n-gram that comprises a candidate unigram; and updating the language model to include the extracted at least one n-gram.
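
The steps listed in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes a CTC-style greedy decoder (most probable character per frame, repeats collapsed, blank symbol dropped), models the language model as a plain set of word tuples, and adds a hypothetical `min_count` frequency threshold as the filtering criterion. The names `greedy_decode`, `discover_ngrams`, `BLANK`, and `min_count` are illustrative and do not come from the source.

```python
from collections import Counter
from typing import Sequence, Set, Tuple

BLANK = "_"  # assumed CTC blank symbol; not specified in the abstract


def greedy_decode(char_probs: Sequence[Sequence[float]], alphabet: str) -> str:
    """Take the most probable character per frame, collapse repeats, drop blanks."""
    out = []
    prev = None
    for frame in char_probs:
        best = max(range(len(frame)), key=frame.__getitem__)
        ch = alphabet[best]
        if ch != prev and ch != BLANK:
            out.append(ch)
        prev = ch
    return "".join(out)


def discover_ngrams(
    transcripts: Sequence[str],
    lm_ngrams: Set[Tuple[str, ...]],
    n: int = 2,
    min_count: int = 2,  # hypothetical filter: keep words seen at least this often
) -> Tuple[Set[str], Set[Tuple[str, ...]]]:
    """Return candidate unigrams missing from the LM and the n-grams containing them."""
    # Every unigram that appears in any n-gram of the language model.
    known = {word for ngram in lm_ngrams for word in ngram}

    # Initial corpus of unigrams, taken from the greedy-decoded transcripts.
    counts = Counter(word for text in transcripts for word in text.split())

    # Filter: keep only unigrams absent from the LM that recur often enough.
    candidates = {w for w, c in counts.items() if w not in known and c >= min_count}

    # Extract every n-gram of the transcripts that contains a candidate unigram.
    found = set()
    for text in transcripts:
        words = text.split()
        for i in range(len(words) - n + 1):
            ngram = tuple(words[i : i + n])
            if candidates.intersection(ngram):
                found.add(ngram)
    return candidates, found


if __name__ == "__main__":
    # Toy character distribution over the alphabet "_ab " (blank, a, b, space).
    frames = [
        [0.1, 0.8, 0.05, 0.05],
        [0.1, 0.8, 0.05, 0.05],
        [0.9, 0.05, 0.03, 0.02],
        [0.1, 0.8, 0.05, 0.05],
        [0.1, 0.1, 0.7, 0.1],
    ]
    print(greedy_decode(frames, "_ab "))

    lm = {("reset", "password"), ("my",)}
    texts = ["reset my zoomcast password", "open zoomcast now"]
    candidates, new_ngrams = discover_ngrams(texts, lm)
    updated_lm = lm | new_ngrams  # update the LM with the extracted n-grams
    print(sorted(candidates), len(updated_lm))
```

In this toy run the made-up word "zoomcast" is the only candidate, because it is absent from the language model and occurs twice. A real system would presumably score or validate candidates more carefully than a raw count before updating the model.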

Published/Granted literature

US11984116B2 Method and system for unsupervised discovery of unigrams in speech recognition systems. Publication/Grant date: 2024-05-14

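IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/08	. Speech classification or search
G10L15/18	.. using natural language modelling
G10L15/183	... using context dependencies, e.g. language models
G10L15/19	.... Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
G10L15/197	..... Probabilistic grammars, e.g. word n-grams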