Domain-specific speech recognizers in a digital medium environment

Invention Grant

US10586528B2 Domain-specific speech recognizers in a digital medium environment (Active)

Patent Title: Domain-specific speech recognizers in a digital medium environment
Application No.: US15423429

Application Date: 2017-02-02
Publication No.: US10586528B2

Publication Date: 2020-03-10
Inventor: Ramesh Radhakrishna Manuvinakurike, Trung Huu Bui, Robert S. N. Dates
Applicant: Adobe Inc.
Applicant Address: US CA San Jose
Assignee: Adobe Inc.
Current Assignee: Adobe Inc.
Current Assignee Address: US CA San Jose
Agency: SBMC
Main IPC: G10L15/06
IPC: G10L15/06; G10L15/197; G10L15/02; G10L15/16; G10L15/22; G06Q10/10

Abstract:

Domain-specific speech recognizer generation with crowd sourcing is described. The domain-specific speech recognizers are generated for voice user interfaces (VUIs) configured to replace or supplement application interfaces. In accordance with the described techniques, the speech recognizers are generated for a respective such application interface and are domain-specific because they are each generated based on language data that corresponds to the respective application interface. This domain-specific language data is used to build a domain-specific language model. The domain-specific language data is also used to collect acoustic data for building an acoustic model. In particular, the domain-specific language data is used to generate user interfaces that prompt crowd-sourcing participants to say selected words represented by the language data for recording. The recordings of these selected words are then used to build the acoustic model. The domain-specific speech recognizers are generated by combining a respective domain-specific language model and crowd-sourced acoustic model.
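
The first two steps in the abstract (building a language model from domain-specific language data, and selecting words from that data for crowd-sourcing prompts) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not name a model type or a word-selection rule, so the bigram model, the frequency-based selection, the function names `build_bigram_model` and `select_prompt_words`, and the sample sentences are all hypothetical. Acoustic model training and the final combination step are not shown.

```python
from collections import Counter


def build_bigram_model(sentences):
    """Estimate P(next word | previous word) from domain sentences.

    Assumption: an unsmoothed bigram model stands in for the
    domain-specific language model; the abstract does not specify one.
    """
    context_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        # Pad with start and end markers so sentence boundaries are modeled.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, nxt)] += 1
    return {
        pair: count / context_counts[pair[0]]
        for pair, count in bigram_counts.items()
    }


def select_prompt_words(sentences, limit):
    """Pick words to show crowd-sourcing participants for recording.

    Assumption: the most frequent words are selected; the abstract only
    says that selected words are presented to participants.
    """
    counts = Counter(word for s in sentences for word in s.lower().split())
    # Sort by descending count, then alphabetically, for a deterministic order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:limit]]


# Hypothetical language data for an image-editing application interface.
domain_sentences = [
    "crop the image",
    "rotate the image",
    "crop the photo",
]

language_model = build_bigram_model(domain_sentences)
prompt_words = select_prompt_words(domain_sentences, 3)
```

In this sketch `prompt_words` holds the words a prompting user interface would ask participants to say, and `language_model` maps word pairs to conditional probabilities that a recognizer could use to rank candidate transcriptions.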

Public/Granted literature

US20180218728A1 Domain-Specific Speech Recognizers in a Digital Medium Environment Public/Granted day: 2018-08-02

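IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/06	. Creation of reference templates; training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice (G10L15/14 takes precedence)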