Patent search ap:("AT&T Intellectual Property I Page L.P.") AND inv:"Patrick HAFFNER"

1.

Invention application
SYSTEM AND METHOD FOR COMBINING FRAME AND SEGMENT LEVEL PROCESSING, VIA TEMPORAL POOLING, FOR PHONETIC CLASSIFICATION (Status: granted)

Publication number: US20160063991A1

Publication date: 2016-03-03

Application number: US14936772

Application date: 2015-11-10

Applicant: AT&T Intellectual Property I, L.P.

Inventor: Sumit CHOPRA, Dimitrios DIMITRIADIS, Patrick HAFFNER

IPC: G10L15/02, G10L15/08

CPC classification number: G10L15/02, G10L15/08, G10L15/16

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for combining frame and segment level processing, via temporal pooling, for phonetic classification. A frame processor unit receives an input and extracts the time-dependent features from the input. A plurality of pooling interface units generates a plurality of feature vectors based on pooling the time-dependent features and selecting a plurality of time-dependent features according to a plurality of selection strategies. Next, a plurality of segmental classification units generates scores for the feature vectors. Each segmental classification unit (SCU) can be dedicated to a specific pooling interface unit (PIU) to form a PIU-SCU combination. Multiple PIU-SCU combinations can be further combined to form an ensemble of combinations, and the ensemble can be diversified by varying the pooling operations used by the PIU-SCU combinations. Based on the scores, the plurality of segmental classification units selects a class label and returns a result.
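The pipeline in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the pooling functions (mean and max), the linear scoring, the rule of summing scores across PIU-SCU combinations, and every name below are assumptions chosen only to make the data flow concrete.

```python
from statistics import mean

# Hypothetical pooling interface units (PIUs): each collapses a segment of
# frame-level feature vectors into one fixed-length vector.
def pool_mean(frames):
    return [mean(column) for column in zip(*frames)]

def pool_max(frames):
    return [max(column) for column in zip(*frames)]

# Hypothetical segmental classification unit (SCU): one linear score per class.
def linear_scores(vector, class_weights):
    return {
        label: sum(w * x for w, x in zip(weights, vector))
        for label, weights in class_weights.items()
    }

def classify_segment(frames, combos):
    """Sum the scores of every PIU-SCU combination and pick the best label.

    Summing is an assumed combination rule, used here for illustration only.
    """
    totals = {}
    for pool, class_weights in combos:
        for label, score in linear_scores(pool(frames), class_weights).items():
            totals[label] = totals.get(label, 0.0) + score
    return max(totals, key=totals.get)

# Three frames of two time-dependent features each (made-up numbers).
frames = [[0.0, 1.0], [2.0, 3.0], [4.0, 2.0]]
weights = {"aa": [1.0, 0.0], "iy": [0.0, 1.0]}
combos = [(pool_mean, weights), (pool_max, weights)]
label = classify_segment(frames, combos)
```

Using a different pooling function in each combination is what diversifies the ensemble in this sketch; in practice each SCU would also have its own trained weights.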

2.

Invention application
System and Method for Combining Frame and Segment Level Processing, Via Temporal Pooling, for Phonetic Classification (Status: granted)
Publication number: US20150058012A1

Publication date: 2015-02-26

Application number: US14537400

Application date: 2014-11-10

Applicant: AT&T Intellectual Property I, L.P.

Inventor: Sumit CHOPRA, Dimitrios DIMITRIADIS, Patrick HAFFNER

IPC: G10L15/08

CPC classification number: G10L15/02, G10L15/08, G10L15/16

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for combining frame and segment level processing, via temporal pooling, for phonetic classification. A frame processor unit receives an input and extracts the time-dependent features from the input. A plurality of pooling interface units generates a plurality of feature vectors based on pooling the time-dependent features and selecting a plurality of time-dependent features according to a plurality of selection strategies. Next, a plurality of segmental classification units generates scores for the feature vectors. Each segmental classification unit (SCU) can be dedicated to a specific pooling interface unit (PIU) to form a PIU-SCU combination. Multiple PIU-SCU combinations can be further combined to form an ensemble of combinations, and the ensemble can be diversified by varying the pooling operations used by the PIU-SCU combinations. Based on the scores, the plurality of segmental classification units selects a class label and returns a result.

3.

Invention application
SYSTEM AND METHOD FOR DYNAMIC FACIAL FEATURES FOR SPEAKER RECOGNITION (Status: under examination, published)

Publication number: US20160078869A1

Publication date: 2016-03-17

Application number: US14953984

Application date: 2015-11-30

Applicant: AT&T Intellectual Property I, L.P.

Inventor: Ann K. SYRDAL, Sumit CHOPRA, Patrick HAFFNER, Taniya MISHRA, Ilija ZELJKOVIC, Eric ZAVESKY

IPC: G10L15/25, G06K9/00, G10L21/06

CPC classification number: G10L15/25, G06F21/32, G06F2221/2103, G06K9/00255, G06K9/00281, G06K9/00288, G06K9/00315, G06K9/00335, G10L17/24, G10L21/06

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for performing speaker verification. A system configured to practice the method receives a request to verify a speaker, generates a text challenge that is unique to the request, and, in response to the request, prompts the speaker to utter the text challenge. Then the system records a dynamic image feature of the speaker as the speaker utters the text challenge, and performs speaker verification based on the dynamic image feature and the text challenge. Recording the dynamic image feature of the speaker can include recording video of the speaker while speaking the text challenge. The dynamic feature can include a movement pattern of head, lips, mouth, eyes, and/or eyebrows of the speaker. The dynamic image feature can relate to phonetic content of the speaker speaking the challenge, speech prosody, and the speaker's facial expression responding to content of the challenge.

4.

Invention application
System and Method for Combining Speech Recognition Outputs From a Plurality of Domain-Specific Speech Recognizers Via Machine Learning (Status: under examination, published)
Publication number: US20140358537A1

Publication date: 2014-12-04

Application number: US14459719

Application date: 2014-08-14

Applicant: AT&T Intellectual Property I, L.P.

Inventor: Mazin GILBERT, Srinivas BANGALORE, Patrick HAFFNER, Robert BELL

IPC: G10L15/32, G10L15/06, G10L15/26

CPC classification number: G10L15/32, G10L15/063, G10L15/26, G10L2015/0638

Abstract: Disclosed herein are systems, methods and non-transitory computer-readable media for performing speech recognition across different applications or environments without model customization or prior knowledge of the domain of the received speech. The disclosure includes recognizing received speech with a collection of domain-specific speech recognizers, determining a speech recognition confidence for each of the speech recognition outputs, selecting speech recognition candidates based on a respective speech recognition confidence for each speech recognition output, and combining selected speech recognition candidates to generate text based on the combination.
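The select-then-combine flow in this abstract can be sketched as follows. This is an illustrative assumption, not the method claimed in the application: the confidence threshold, the confidence-weighted vote over whole hypotheses, and all names and sample data are hypothetical.

```python
from collections import defaultdict

def combine_outputs(outputs, threshold=0.5):
    """Combine hypotheses from several domain-specific recognizers.

    outputs: list of (domain, text, confidence) tuples.
    Candidates below the (assumed) confidence threshold are dropped, and the
    rest vote for their text with a weight equal to their confidence.
    """
    selected = [(text, conf) for _, text, conf in outputs if conf >= threshold]
    if not selected:
        # Nothing passed the threshold: fall back to the most confident output.
        return max(outputs, key=lambda o: o[2])[1]
    votes = defaultdict(float)
    for text, conf in selected:
        votes[text] += conf
    return max(votes, key=votes.get)

# Made-up outputs from four hypothetical domain recognizers.
outputs = [
    ("travel", "book a flight", 0.60),
    ("banking", "book a fight", 0.70),
    ("general", "book a flight", 0.55),
    ("medical", "look at light", 0.20),
]
result = combine_outputs(outputs)
```

Here the two lower-confidence recognizers that agree outvote the single most confident one, which is the kind of effect a combination step is meant to capture. The title indicates that the application learns the combination via machine learning rather than using a fixed vote like this one.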
