-
Publication No.: US12125498B2
Publication Date: 2024-10-22
Application No.: US17570557
Application Date: 2022-01-07
Inventors: Seungbeom Ryu, Sungjae Park, Hyuk Oh, Myeungyong Choi, Junkwon Choi
CPC Classification: G10L25/84, G06N3/045, G10L15/02, G10L15/16, G10L15/22, H04R1/08, H04R3/00, G10L2015/223, H04R2420/07
Abstract: According to various embodiments, an electronic device may include: a microphone; an audio connector; a wireless communication circuit; a processor operatively connected to the microphone, the audio connector, and the wireless communication circuit; and a memory operatively connected to the processor, wherein the memory may store instructions that, when executed, cause the processor to: receive a first audio signal through the microphone, the audio connector, or the wireless communication circuit, extract audio feature information from the first audio signal, and recognize a speech section in a second audio signal, received after the first audio signal through the microphone, the audio connector, or the wireless communication circuit, using the audio feature information.
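For illustration only, the following minimal Python sketch captures the idea in the abstract: summarize a first audio signal into feature statistics, then use those statistics to recognize speech sections in a second signal. The frame-energy features, thresholding rule, and 16 kHz framing are stand-ins for the unspecified "audio feature information", not the patented method.

```python
# Hypothetical sketch: calibrate speech-section detection on a second signal
# using feature statistics extracted from a first signal. Frame energy stands
# in for the abstract's unspecified "audio feature information".
import numpy as np

def frame_energies(signal: np.ndarray, frame_len: int = 400, hop: int = 160) -> np.ndarray:
    """Short-time log energy per frame (assumes 16 kHz mono input)."""
    n_frames = max(0, 1 + (len(signal) - frame_len) // hop)
    frames = np.stack([signal[i * hop: i * hop + frame_len] for i in range(n_frames)])
    return np.log(np.sum(frames ** 2, axis=1) + 1e-10)

def extract_feature_info(first_signal: np.ndarray) -> dict:
    """Summarize the first audio signal (e.g., its ambient energy level)."""
    e = frame_energies(first_signal)
    return {"noise_mean": float(e.mean()), "noise_std": float(e.std())}

def recognize_speech_sections(second_signal: np.ndarray, feature_info: dict,
                              k: float = 3.0) -> np.ndarray:
    """Mark frames of the second signal whose energy exceeds the calibrated floor."""
    e = frame_energies(second_signal)
    threshold = feature_info["noise_mean"] + k * feature_info["noise_std"]
    return e > threshold  # boolean mask of speech frames

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise = 0.01 * rng.standard_normal(16000)                      # first signal: noise only
    speech = np.concatenate([noise, 0.3 * rng.standard_normal(8000), noise])
    info = extract_feature_info(noise)
    print(recognize_speech_sections(speech, info).astype(int))
```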
-
Publication No.: US12125496B1
Publication Date: 2024-10-22
Application No.: US18644959
Application Date: 2024-04-24
Applicant: Sanas.ai Inc.
Inventors: Shawn Zhang, Lukas Pfeifenberger, Jason Wu, Piotr Dura, David Braude, Bajibabu Bollepalli, Alvaro Escudero, Gokce Keskin, Ankita Jha, Maxim Serebryakov
CPC Classification: G10L21/0232, G10L15/02, G10L15/063, G10L25/30, G10L15/16, G10L15/22
Abstract: The disclosed technology relates to methods, voice enhancement systems, and non-transitory computer readable media for real-time voice enhancement. In some examples, input audio data including foreground speech content, non-content elements, and speech characteristics is fragmented into input speech frames. The input speech frames are converted to low-dimensional representations of the input speech frames. One or more of the fragmentation or the conversion is based on an application of a first trained neural network to the input audio data. The low-dimensional representations of the input speech frames omit one or more of the non-content elements. A second trained neural network is applied to the low-dimensional representations of the input speech frames to generate target speech frames. The target speech frames are combined to generate output audio data. The output audio data further includes one or more portions of the foreground speech content and one or more of the speech characteristics.
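As a loose sketch of the two-network pipeline described above (not the claimed models), the Python fragment below frames the audio, maps the frames to low-dimensional representations with one network, and generates target frames with a second network before recombining them. The frame length, latent size, and untrained linear networks are assumptions.

```python
# Illustrative only: a two-stage pipeline in which one network maps input speech
# frames to low-dimensional representations and a second network maps those
# representations to target frames, which are then concatenated into output audio.
import torch
import torch.nn as nn

FRAME = 320   # assumed frame length in samples
LATENT = 64   # assumed low-dimensional representation size

encoder = nn.Sequential(nn.Linear(FRAME, 128), nn.ReLU(), nn.Linear(128, LATENT))
generator = nn.Sequential(nn.Linear(LATENT, 128), nn.ReLU(), nn.Linear(128, FRAME))

def enhance(audio: torch.Tensor) -> torch.Tensor:
    """audio: 1-D waveform tensor; returns output audio rebuilt from target frames."""
    n_frames = audio.numel() // FRAME
    frames = audio[: n_frames * FRAME].reshape(n_frames, FRAME)   # fragmentation
    with torch.no_grad():
        latents = encoder(frames)          # low-dimensional representations
        targets = generator(latents)       # target speech frames
    return targets.reshape(-1)             # combine frames into output audio data

if __name__ == "__main__":
    print(enhance(torch.randn(16000)).shape)   # torch.Size([16000])
```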
-
Publication No.: US12124998B2
Publication Date: 2024-10-22
Application No.: US18476712
Application Date: 2023-09-28
Applicant: Asana, Inc.
Inventor: Steve B. Morin
CPC Classification: G06Q10/103, G06F3/0486, G06F40/40, G06Q10/06, G06T7/33, G10L15/005
Abstract: Systems and methods to generate records within a collaboration environment are described herein. Exemplary implementations may perform one or more of: manage environment state information maintaining a collaboration environment; obtain input information defining digital assets representing sets of content input via a user interface; generate content information characterizing the sets of content represented in the digital assets; generate individual records based on the content information; and/or other operations.
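A hypothetical Python sketch of the described flow, with invented class and field names: digital assets carrying user-input content are characterized into content information, and individual records are generated from that information.

```python
# Loose sketch only; the class names, fields, and characterization rule are
# assumptions, not the patent's terms.
from dataclasses import dataclass
from typing import List

@dataclass
class DigitalAsset:
    asset_id: str
    content: str          # set of content input via a user interface

@dataclass
class Record:
    record_id: str
    title: str
    body: str

def characterize(asset: DigitalAsset) -> dict:
    """Generate content information characterizing the asset's content."""
    first_line = asset.content.strip().splitlines()[0] if asset.content.strip() else ""
    return {"asset_id": asset.asset_id, "title": first_line, "body": asset.content}

def generate_records(assets: List[DigitalAsset]) -> List[Record]:
    """Generate individual records based on the content information."""
    return [Record(record_id=f"rec-{i}", title=info["title"], body=info["body"])
            for i, info in enumerate(map(characterize, assets))]

if __name__ == "__main__":
    print(generate_records([DigitalAsset("a1", "Draft launch plan\nOwner: TBD")]))
```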
-
Publication No.: US12119008B2
Publication Date: 2024-10-15
Application No.: US17655441
Application Date: 2022-03-18
Inventors: Samuel Thomas, Vishal Sunder, Hong-Kwang Kuo, Jatin Ganhotra, Brian E. D. Kingsbury, Eric Fosler-Lussier
IPC Classification: G10L19/00, G06F40/126, G06N3/045, G10L15/00
CPC Classification: G10L19/00, G06F40/126, G06N3/045, G10L15/00
Abstract: Systems, computer-implemented methods, and computer program products to facilitate end-to-end integration of dialogue history for spoken language understanding are provided. According to an embodiment, a system can comprise a processor that executes components stored in memory. The computer executable components comprise a conversation component that encodes speech-based content of an utterance and text-based content of the utterance into a uniform representation.
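The following is an assumption-laden sketch, not the disclosed architecture: separate speech and text encoders project an utterance's audio frames and transcript tokens into a shared fixed-size space, yielding a single "uniform representation".

```python
# Minimal illustration of encoding speech-based and text-based content of one
# utterance into a shared vector; dimensions and pooling are assumptions.
import torch
import torch.nn as nn

DIM = 256  # assumed size of the uniform representation

speech_proj = nn.Linear(80, DIM)        # 80-dim filterbank frames (assumption)
token_embed = nn.Embedding(10000, DIM)  # vocabulary size is an assumption

def encode_utterance(speech_frames: torch.Tensor, token_ids: torch.Tensor) -> torch.Tensor:
    """Encode speech-based and text-based content of one utterance into one vector."""
    speech_vec = speech_proj(speech_frames).mean(dim=0)   # (T, 80) -> (DIM,)
    text_vec = token_embed(token_ids).mean(dim=0)         # (N,)    -> (DIM,)
    return 0.5 * (speech_vec + text_vec)                  # uniform representation

if __name__ == "__main__":
    rep = encode_utterance(torch.randn(120, 80), torch.randint(0, 10000, (12,)))
    print(rep.shape)  # torch.Size([256])
```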
-
Publication No.: US12118981B2
Publication Date: 2024-10-15
Application No.: US17475897
Application Date: 2021-09-15
Applicant: GOOGLE LLC
CPC Classification: G10L13/086, G10L15/22, G10L2015/223, G10L2015/225
Abstract: Implementations relate to determining multilingual content to render at an interface in response to a user-submitted query. Those implementations further relate to determining a first language response and a second language response to a query that is submitted to an automated assistant. Some of those implementations relate to determining multilingual content that includes a response to the query in both the first and second languages. Other implementations relate to determining multilingual content that includes a query suggestion in the first language and a query suggestion in a second language. Some of those implementations relate to pre-fetching results for the query suggestions prior to rendering the multilingual content.
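A rough Python sketch under stated assumptions: given a query and a language pair, assemble multilingual content containing a response and a query suggestion in each language, pre-fetching suggestion results before rendering. The helper functions are hypothetical stand-ins, not a real assistant API.

```python
# Hypothetical flow only: answer_query, suggest, and fetch_results are placeholders.
from typing import Dict

def answer_query(query: str, lang: str) -> str:
    return f"[{lang}] answer to: {query}"            # placeholder assistant response

def suggest(query: str, lang: str) -> str:
    return f"[{lang}] related to: {query}"           # placeholder query suggestion

def fetch_results(suggestion: str) -> str:
    return f"prefetched results for '{suggestion}'"  # placeholder search call

def build_multilingual_content(query: str, first_lang: str, second_lang: str) -> Dict:
    langs = (first_lang, second_lang)
    suggestions = {lang: suggest(query, lang) for lang in langs}
    return {
        "responses": {lang: answer_query(query, lang) for lang in langs},
        "suggestions": suggestions,
        # pre-fetch results for the query suggestions prior to rendering the content
        "prefetched": {lang: fetch_results(s) for lang, s in suggestions.items()},
    }

if __name__ == "__main__":
    print(build_multilingual_content("weather tomorrow", "en", "es"))
```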
-
Publication No.: US12118978B2
Publication Date: 2024-10-15
Application No.: US18387211
Application Date: 2023-11-06
Applicant: ROVI GUIDES, INC.
IPC Classification: G10L13/06, G10L13/00, G10L13/02, G10L13/033, G10L13/08, G10L15/00, G10L15/10, G10L15/16, G10L15/18, G10L15/22, G10L15/26, G10L25/63
CPC Classification: G10L13/0335, G10L25/63, G10L13/00, G10L13/02, G10L13/06, G10L13/08, G10L15/00, G10L15/10, G10L15/16, G10L15/18, G10L15/22, G10L15/26
Abstract: The system provides a synthesized speech response to a voice input, based on the prosodic character of the voice input. The system receives the voice input and calculates at least one prosodic metric of the voice input. The at least one prosodic metric can be associated with a word, a phrase, a grouping thereof, or the entire voice input. The system also determines a response to the voice input, which may include the sequence of words that form the response. The system generates the synthesized speech response by determining prosodic characteristics based on the response and on the prosodic character of the voice input. The system outputs the synthesized speech response, which provides a more natural answer, a more relevant answer, or both, to the call of the voice input. The prosodic character of the voice input and/or response may include pitch, note, duration, prominence, timbre, rate, and rhythm, for example.
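A sketch of the idea only, with simplified metrics: compute coarse prosodic measurements of the voice input (a voiced-frame rate and a zero-crossing pitch proxy), then map them to prosodic settings for the synthesized response. Both the metrics and the mapping are assumptions, not the system's actual prosody model.

```python
# Simplified prosody sketch: crude activity and pitch proxies drive the prosodic
# characteristics chosen for the synthesized speech response.
import numpy as np

def prosodic_metrics(signal: np.ndarray, sr: int = 16000) -> dict:
    frame = sr // 50                                   # 20 ms frames
    n = len(signal) // frame
    frames = signal[: n * frame].reshape(n, frame)
    energy = (frames ** 2).mean(axis=1)
    voiced = energy > 2.0 * np.median(energy)          # crude activity proxy
    zc = (np.diff(np.sign(frames), axis=1) != 0).sum(axis=1) * (sr / (2 * frame))
    return {
        "rate": float(voiced.mean()),                  # proportion of voiced frames
        "pitch_proxy_hz": float(zc[voiced].mean()) if voiced.any() else 0.0,
    }

def response_prosody(input_metrics: dict) -> dict:
    """Map input prosody to synthesis controls (speaking rate, pitch target)."""
    return {
        "speaking_rate": 0.8 + 0.4 * input_metrics["rate"],
        "pitch_target_hz": max(90.0, 0.9 * input_metrics["pitch_proxy_hz"]),
    }

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    silence = 0.005 * rng.standard_normal(8000)
    tone = 0.2 * np.sin(2 * np.pi * 180 * np.arange(8000) / 16000)
    voice_input = np.concatenate([silence, tone, silence])
    m = prosodic_metrics(voice_input)
    print(m, response_prosody(m))
```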
-
Publication No.: US12118976B1
Publication Date: 2024-10-15
Application No.: US18622365
Application Date: 2024-03-29
Inventors: Boyu Chen, Peike Li, Yao Yao, Yijun Wang
IPC Classification: G10L13/027, G10L15/00
CPC Classification: G10L13/027
Abstract: The method involves configuring a pretrained text-to-music AI model that includes a neural network implementing a diffusion model. The process includes receiving audio sample data corresponding to a specific audio concept, generating a concept identifier token based on the audio sample data, adapting a loss function of the diffusion model based on the concept identifier token, selecting pivotal parameters in weight matrices in a self-attention layer of the neural network of the AI model based on the audio sample data, and further training the pivotal parameters of the AI model to optimize the AI model for the specific audio concept.
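A highly simplified, hypothetical Python sketch of the listed steps: introduce a learnable concept-identifier token, select "pivotal" entries of a self-attention weight matrix (here by magnitude, a selection rule assumed purely for illustration), and fine-tune only those entries by masking gradients. The toy linear layer and reconstruction-style loss stand in for the diffusion model and its adapted loss.

```python
# Toy sketch: concept token + pivotal-parameter fine-tuning via a gradient mask.
import torch
import torch.nn as nn

torch.manual_seed(0)
attn_q = nn.Linear(64, 64, bias=False)                 # stand-in self-attention W_q
concept_token = nn.Parameter(torch.randn(64) * 0.01)   # concept identifier embedding

# Select pivotal parameters: here, the top 1% of |W| entries (assumed selection rule).
with torch.no_grad():
    k = int(0.01 * attn_q.weight.numel())
    thresh = attn_q.weight.abs().flatten().topk(k).values.min()
    pivotal_mask = (attn_q.weight.abs() >= thresh).float()

opt = torch.optim.Adam([attn_q.weight, concept_token], lr=1e-3)
audio_feats = torch.randn(32, 64)                      # assumed features of the audio samples

for step in range(100):
    opt.zero_grad()
    pred = attn_q(audio_feats + concept_token)         # condition on the concept token
    loss = ((pred - audio_feats) ** 2).mean()          # toy stand-in for the adapted loss
    loss.backward()
    attn_q.weight.grad *= pivotal_mask                 # update only pivotal entries
    opt.step()
print(float(loss))
```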
-
Publication No.: US12106747B2
Publication Date: 2024-10-01
Application No.: US18095804
Application Date: 2023-01-11
IPC Classification: G10L15/00, G06F40/263, H04H20/59, H04H20/86, H04H60/58, H04N21/233, H04N21/2362
CPC Classification: G10L15/005, G06F40/263, H04H20/59, H04H20/86, H04H60/58, H04N21/233, H04N21/2362
Abstract: A device may be configured to parse a syntax element specifying the number of available languages within a presentation associated with an audio stream. A device may be configured to parse one or more syntax elements identifying each of the available languages and parse an accessibility syntax element for each language within the presentation.
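Since the abstract does not give the actual bitstream syntax, the sketch below assumes a made-up layout purely for illustration: one count byte for the number of available languages, followed by a three-character language identifier and one accessibility byte per language.

```python
# Hypothetical syntax layout (not the real, codec-specific one): parse the language
# count, then a language identifier and an accessibility element for each language.
import struct
from typing import List, Tuple

def parse_presentation_languages(buf: bytes) -> List[Tuple[str, int]]:
    (num_languages,) = struct.unpack_from("B", buf, 0)       # number of available languages
    offset, out = 1, []
    for _ in range(num_languages):
        code = buf[offset: offset + 3].decode("ascii")        # language identifier
        (accessibility,) = struct.unpack_from("B", buf, offset + 3)
        out.append((code, accessibility))
        offset += 4
    return out

if __name__ == "__main__":
    payload = bytes([2]) + b"eng" + bytes([0]) + b"spa" + bytes([1])
    print(parse_presentation_languages(payload))   # [('eng', 0), ('spa', 1)]
```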
-
Publication No.: US12100385B2
Publication Date: 2024-09-24
Application No.: US17237258
Application Date: 2021-04-22
Inventor: David Peace Hung
CPC Classification: G10L15/005, G06F40/58, G10L15/26, G10L15/32
Abstract: Systems are provided for multilingual speech data processing. A language identification module is configured to analyze spoken language utterances in an audio stream and to detect at least one language corresponding to the spoken language utterances. The language identification module detects that a first language corresponds to a first portion of the audio stream. A first transcription of the first portion of the audio stream in the first language is generated and stored in a cache. A second transcription of a second portion of the audio stream in the first language is also generated and stored. When the second portion of the audio stream corresponds to a second language, a third transcription is generated in the second language using a second speech recognition engine configured to transcribe spoken language utterances in the second language. Then, the second transcription is replaced with the third transcription in the cache and in any displayed instances.
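A simplified Python sketch with hypothetical engine stubs: successive portions of the stream are transcribed in an assumed language and cached; when the language identification step indicates a portion was actually in a second language, the cached and displayed transcription for that portion is replaced.

```python
# Cache-replacement sketch; transcribe() is a placeholder for per-language engines.
from typing import Dict

def transcribe(portion: str, lang: str) -> str:
    return f"<{lang} transcript of {portion}>"      # stand-in for a speech recognition engine

cache: Dict[int, str] = {}
display: Dict[int, str] = {}

def process_portion(index: int, portion: str, detected_lang: str, assumed_lang: str) -> None:
    text = transcribe(portion, assumed_lang)
    cache[index] = display[index] = text
    if detected_lang != assumed_lang:               # portion turns out to be in another language
        corrected = transcribe(portion, detected_lang)
        cache[index] = display[index] = corrected   # replace cached and displayed instances

if __name__ == "__main__":
    process_portion(0, "portion-1", detected_lang="en", assumed_lang="en")
    process_portion(1, "portion-2", detected_lang="fr", assumed_lang="en")
    print(cache)
```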
-
Publication No.: US12087298B2
Publication Date: 2024-09-10
Application No.: US17979078
Application Date: 2022-11-02
Inventors: Donghyeon Lee, Seonghan Ryu, Yubin Seo, Eunji Lee, Sungja Choi, Jiyeon Hong, Sechun Kang, Yongjin Cho, Seungchul Lee
Abstract: Disclosed is an electronic device. The electronic device may execute an application for transmitting and receiving at least one of text data or voice data with another electronic device using the communication module in response to the occurrence of at least one event; based on receiving at least one of text data or voice data from the other electronic device, identify, using a digital assistant, that a confirmation is necessary based on the at least one of text data or voice data being generated based on a characteristic of an utterance; generate a notification to request confirmation using the digital assistant when confirmation is necessary; and output the notification using the application.
A method for identifying that a confirmation is necessary may include analyzing the voice data or text data received from the other electronic device using a rule-based or AI algorithm. When whether a confirmation is necessary is determined using the AI algorithm, the method may use a machine learning, neural network, or deep learning algorithm.
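A minimal rule-based sketch of such a check follows; the trigger keywords and the assistant-origin condition are assumptions, and an ML classifier could replace the keyword rule as described above.

```python
# Hypothetical rule-based check for whether an incoming assistant-generated message
# needs a confirmation notification.
CONFIRMATION_TRIGGERS = ("transfer", "payment", "schedule", "delete", "purchase")

def confirmation_necessary(message_text: str, from_digital_assistant: bool) -> bool:
    """Rule-based decision; a trained classifier could stand in for the keyword rule."""
    if not from_digital_assistant:
        return False
    text = message_text.lower()
    return any(trigger in text for trigger in CONFIRMATION_TRIGGERS)

if __name__ == "__main__":
    print(confirmation_necessary("I will schedule the payment for Friday.", True))   # True
    print(confirmation_necessary("See you at lunch!", True))                          # False
```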
-