-
Publication No.: US20240347053A1
Publication Date: 2024-10-17
Application No.: US18629200
Application Date: 2024-04-08
Applicant: TiVo Corporation
IPC Classification: G10L15/183, G10L15/197
CPC Classification: G10L15/183, G10L15/197
Abstract: Speech recognition may be improved by generating and using a topic specific language model. A topic specific language model may be created by performing an initial pass on an audio signal using a generic or basis language model. A speech recognition device may then determine topics relating to the audio signal based on the words identified in the initial pass and retrieve a corpus of text relating to those topics. Using the retrieved corpus of text, the speech recognition device may create a topic specific language model. In one example, the speech recognition device may adapt or otherwise modify the generic language model based on the retrieved corpus of text.
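The two-pass idea in this abstract can be illustrated with a toy sketch. It is not the patented method: it assumes unigram language models, keyword overlap for topic detection, and linear interpolation as the adaptation step. All names (`unigram_model`, `pick_topics`, `adapt`) and the sample corpora are hypothetical.

```python
from collections import Counter

def unigram_model(text):
    """Maximum-likelihood unigram probabilities from whitespace-tokenized text."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

def pick_topics(first_pass_words, topic_keywords):
    """Return the topics whose keyword set overlaps the first-pass words (assumed heuristic)."""
    words = {w.lower() for w in first_pass_words}
    return [topic for topic, keywords in topic_keywords.items() if words & keywords]

def adapt(generic_lm, topic_lm, weight=0.5):
    """Adapt the generic model by linear interpolation with the topic model (assumed technique)."""
    vocab = set(generic_lm) | set(topic_lm)
    return {
        w: (1 - weight) * generic_lm.get(w, 0.0) + weight * topic_lm.get(w, 0.0)
        for w in vocab
    }

# Hypothetical data: a generic model, topic keywords, and per-topic text corpora.
generic_lm = unigram_model("the game was on the air the show was long")
topic_keywords = {"baseball": {"pitcher", "inning", "game"}, "cooking": {"oven", "recipe"}}
topic_corpora = {
    "baseball": "the pitcher threw in the ninth inning of the game",
    "cooking": "preheat the oven for the recipe",
}

first_pass = ["the", "game", "was", "long"]       # words from the initial recognition pass
topics = pick_topics(first_pass, topic_keywords)  # topics inferred from those words
corpus = " ".join(topic_corpora[t] for t in topics)
topic_specific_lm = adapt(generic_lm, unigram_model(corpus), weight=0.5)
```

Because both input models sum to one, the interpolated model is still a probability distribution, and words that occur only in the topic corpus (here `pitcher`) gain non-zero probability for a second recognition pass.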
-
Publication No.: US20240347049A1
Publication Date: 2024-10-17
Application No.: US18630399
Application Date: 2024-04-09
Applicant: 42dot Inc.
Inventors: Cheoneum Park, Byeongyeol Kim, Juae Kim, Seohyoeng Jeong
IPC Classification: G10L15/16, G10L15/02, G10L15/06, G10L15/183
CPC Classification: G10L15/16, G10L15/02, G10L15/063, G10L15/183
Abstract: An electronic device includes a memory configured to store instructions and a processor electrically connected to the memory and configured to execute the instructions. When the instructions are executed, the processor performs a plurality of operations, including deriving a frequently-asked-questions (FAQ) pair from speech data based on a neural network model trained in an end-to-end manner. The neural network model is based on a multi-modal language model (LM) capable of using text data and speech data simultaneously, and contrastive learning is performed on the neural network model based on a symmetric loss to shift speech data, which is original data, to text data, which is augmented data.
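The abstract does not define its symmetric loss. As an illustration only, the sketch below uses a common symmetric contrastive formulation: cross-entropy over a speech-to-text similarity matrix, averaged with the same loss in the text-to-speech direction. The cosine similarity, the `temperature` value, and all function names are assumptions, not details from the patent.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cross_entropy_diag(sim):
    """Mean cross-entropy where the correct column for row i is column i."""
    loss = 0.0
    for i, row in enumerate(sim):
        peak = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
        loss += log_z - row[i]
    return loss / len(sim)

def symmetric_contrastive_loss(speech, text, temperature=0.1):
    """Average of the speech-to-text and text-to-speech contrastive losses."""
    sim = [[cosine(s, t) / temperature for t in text] for s in speech]
    sim_transposed = [list(col) for col in zip(*sim)]
    return 0.5 * (cross_entropy_diag(sim) + cross_entropy_diag(sim_transposed))

# Hypothetical embeddings: row i of `speech` is paired with row i of `text`.
speech = [[1.0, 0.0], [0.0, 1.0]]
text = [[0.9, 0.1], [0.1, 0.9]]

aligned_loss = symmetric_contrastive_loss(speech, text)
mismatched_loss = symmetric_contrastive_loss(speech, text[::-1])
```

With correctly paired rows the loss is close to zero, and reversing the text rows makes it much larger. Minimizing such a loss pulls each speech embedding toward its paired text embedding.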
-
Publication No.: US12118371B2
Publication Date: 2024-10-15
Application No.: US17557790
Application Date: 2021-12-21
Applicant: Meta Platforms, Inc.
Inventor: Scott Martin
IPC Classification: G06F9/451, G06F3/01, G06F3/16, G06F7/14, G06F16/176, G06F16/22, G06F16/23, G06F16/242, G06F16/2455, G06F16/2457, G06F16/248, G06F16/28, G06F16/33, G06F16/332, G06F16/338, G06F16/438, G06F16/903, G06F16/9032, G06F16/9038, G06F16/904, G06F16/951, G06F16/9535, G06F18/2411, G06F40/205, G06F40/295, G06F40/30, G06F40/40, G06N3/006, G06N3/08, G06N20/00, G06Q50/00, G06V20/10, G06V40/16, G06V40/20, G10L13/00, G10L13/04, G10L15/02, G10L15/06, G10L15/07, G10L15/16, G10L15/18, G10L15/183, G10L15/187, G10L15/22, G10L15/26, G10L17/06, G10L17/22, H04L12/28, H04L41/00, H04L41/22, H04L43/0882, H04L43/0894, H04L51/02, H04L51/046, H04L51/216, H04L51/52, H04L67/10, H04L67/306, H04L67/50, H04L67/53, H04L67/5651, H04L67/75, H04W12/08
CPC Classification: G06F9/453, G06F3/011, G06F3/013, G06F3/017, G06F3/167, G06F7/14, G06F16/176, G06F16/2255, G06F16/2365, G06F16/243, G06F16/24552, G06F16/24575, G06F16/24578, G06F16/248, G06F16/285, G06F16/3323, G06F16/3329, G06F16/3344, G06F16/338, G06F16/4393, G06F16/90332, G06F16/90335, G06F16/9038, G06F16/904, G06F16/951, G06F16/9535, G06F18/2411, G06F40/205, G06F40/295, G06F40/30, G06F40/40, G06N3/006, G06N3/08, G06N20/00, G06Q50/01, G06V20/10, G06V40/172, G06V40/28, G10L15/02, G10L15/063, G10L15/07, G10L15/16, G10L15/1815, G10L15/1822, G10L15/183, G10L15/187, G10L15/22, G10L15/26, G10L17/06, G10L17/22, H04L12/2816, H04L41/20, H04L41/22, H04L43/0882, H04L43/0894, H04L51/02, H04L51/216, H04L51/52, H04L67/306, H04L67/535, H04L67/5651, H04L67/75, H04W12/08, G06F2216/13, G10L13/00, G10L13/04, G10L2015/223, G10L2015/225, H04L51/046, H04L67/10, H04L67/53
Abstract: In one embodiment, a method includes receiving one or more voice inputs from a first user, determining a first language register associated with the first user based on the one or more voice inputs, selecting a second language register for a voice response based on the one or more voice inputs, generating the voice response based on the second language register, and providing the voice response in response to the one or more voice inputs.
-
Publication No.: US20240331697A1
Publication Date: 2024-10-03
Application No.: US18739011
Application Date: 2024-06-10
Inventors: Utku Yabas, Philipp Hubert, Karl Stahl
IPC Classification: G10L15/22, G06F40/211, G06F40/284, G10L15/183, G10L15/26
CPC Classification: G10L15/22, G06F40/211, G06F40/284, G10L15/183, G10L15/26, G10L2015/223
Abstract: A user specifies a natural language command to a device. Software on the device generates contextual metadata about the user interface of the device, such as data about all visible elements of the user interface, and sends the contextual metadata along with the natural language command to a natural language understanding engine. The natural language understanding engine parses the natural language command using a stored grammar (e.g., a grammar provided by a maker of the device), identifies information about the command as a result of the parsing (e.g., the user interface elements referenced by the command), and provides that information to the device. The device uses the provided information to respond to the command.
-
Publication No.: US12100391B2
Publication Date: 2024-09-24
Application No.: US17450235
Application Date: 2021-10-07
Applicant: Google LLC
Inventors: William Chan, Navdeep Jaitly, Quoc V. Le, Oriol Vinyals, Noam M. Shazeer
IPC Classification: G10L15/16, G06F40/12, G06F40/197, G06N3/044, G06N3/045, G10L15/183, G10L15/26, G10L25/30
CPC Classification: G10L15/16, G06F40/12, G06F40/197, G06N3/044, G06N3/045, G10L15/183, G10L15/26, G10L25/30
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for speech recognition. One method includes obtaining an input acoustic sequence, the input acoustic sequence representing an utterance, and the input acoustic sequence comprising a respective acoustic feature representation at each of a first number of time steps; processing the input acoustic sequence using a first neural network to convert the input acoustic sequence into an alternative representation for the input acoustic sequence; processing the alternative representation for the input acoustic sequence using an attention-based Recurrent Neural Network (RNN) to generate, for each position in an output sequence order, a set of substring scores that includes a respective substring score for each substring in a set of substrings; and generating a sequence of substrings that represent a transcription of the utterance.
-
Publication No.: US20240311576A1
Publication Date: 2024-09-19
Application No.: US18296309
Application Date: 2023-04-05
IPC Classification: G06F40/35, G06F3/0482, G06F3/0484, G10L15/183, G10L15/22, H04L51/02
CPC Classification: G06F40/35, G06F3/0482, G06F3/0484, G10L15/183, G10L15/22, H04L51/02, G10L2015/223
Abstract: A system and method for providing a collaboration template including a brainstorming canvas to a display device of each of a plurality of participants coupled to the system, wherein the template includes a selection element configured to activate an artificial intelligence (AI) chat interface to receive a natural language command from at least one of the participants. The natural language command is received from the participant, combined with context prompts generated by a context prompt generator system to form a combined AI request, and transmitted to an AI system. In response to the natural language command and context prompts transmitted to the AI system, a response is received from the AI system, and instructions are provided to the client device of the participant to display the AI response on the brainstorming canvas of the template.
-
Publication No.: US12094458B2
Publication Date: 2024-09-17
Application No.: US17668968
Application Date: 2022-02-10
Inventor: Vivek Kumar
IPC Classification: G10L15/16, G06F40/30, G10L15/18, H04L51/212, G06F40/289, G06F40/35, G10L15/183, G10L15/22
CPC Classification: G10L15/1815, G06F40/30, G10L15/16, G10L15/1822, H04L51/212, G06F40/289, G06F40/35, G10L15/183, G10L2015/225
Abstract: Techniques for extracting data from conversations across different types of communication channels are disclosed. A system applies a set of rules to extract data from conversations based, at least in part, on a type of communication channel used for conducting the conversation. The system applies a machine learning model to recognize semantic content in conversations. The system divides conversations into conversation segments and classifies the conversation segments based on the semantic content. The system selects conversation segments to be extracted based on the semantic content and the type of communication channel over which a conversation is conducted. The system maps conversation segments from different conversations conducted on different types of communication channels to a same set of transactions.
-
Publication No.: US12087304B2
Publication Date: 2024-09-10
Application No.: US17497668
Application Date: 2021-10-08
Inventor: Youngdae Kim
IPC Classification: G10L15/26, G10L15/02, G10L15/183, G10L15/22, G10L21/10, H04N21/488
CPC Classification: G10L15/26, G10L15/02, G10L15/183, G10L15/22, G10L21/10, H04N21/488, H04N21/4884
Abstract: An electronic device for providing content including an image and a voice is disclosed. The electronic device comprises: a display configured to display an image; a memory in which a voice recognition module including various executable instructions is stored; and a processor configured to acquire, based on information about the content, expected words that may be included in the voice, use the expected words to perform voice recognition on the voice through the voice recognition module, and display, on the display, text converted from the voice by the voice recognition.
-
Publication No.: US12087281B2
Publication Date: 2024-09-10
Application No.: US17589693
Application Date: 2022-01-31
Applicant: Salesforce, Inc.
Inventors: Liang Qiu, Chien-Sheng Wu, Wenhao Liu, Caiming Xiong
IPC Classification: G10L15/183, G10L15/05, G10L15/06, G06N20/00
CPC Classification: G10L15/063, G10L15/05, G10L15/183, G06N20/00, G10L2015/0631
Abstract: Embodiments described herein propose an approach for unsupervised structure extraction in task-oriented dialogues. Specifically, a Slot Boundary Detection (SBD) module is adopted, for which utterances from training domains are tagged with the conventional BIO schema but without the slot names. A transformer-based classifier is trained to detect the boundaries of potential slot tokens in the test domain. Next, while the number of states is usually unknown, it is more reasonable to assume that the number of slots is given when analyzing a dialogue system. The detected tokens are clustered into as many groups as there are slots. Finally, the dialogue state is represented with a vector recording the number of times each slot has been modified. The slot values are then tracked through each dialogue session in the corpus, and utterances are labeled with their dialogue states accordingly. The semantic structure is portrayed by computing the transition frequencies among the unique states.
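The last two steps of this abstract, representing each dialogue state as a vector of per-slot modification counts and computing transition frequencies among unique states, can be sketched as follows. The sketch assumes slot detection and clustering have already been done, so each turn is simply a list of modified slot indices. Normalizing counts per source state is also an assumption, and the function names and sample dialogues are hypothetical.

```python
from collections import Counter

def dialogue_states(turn_slots, num_slots):
    """Map each turn to a state: a tuple counting how often each slot has been modified so far."""
    state = [0] * num_slots
    states = []
    for modified in turn_slots:
        for slot in modified:
            state[slot] += 1
        states.append(tuple(state))
    return states

def transition_frequencies(dialogues, num_slots):
    """Relative frequency of each state-to-state transition, normalized per source state."""
    counts = Counter()
    for turns in dialogues:
        states = dialogue_states(turns, num_slots)
        counts.update(zip(states, states[1:]))  # consecutive state pairs within one dialogue
    totals = Counter()
    for (src, _dst), count in counts.items():
        totals[src] += count
    return {(src, dst): count / totals[src] for (src, dst), count in counts.items()}

# Hypothetical corpus: two dialogues over two slots.
# Each turn lists the indices of the slots whose values changed in that turn.
dialogues = [
    [[0], [1], []],  # states: (1, 0) -> (1, 1) -> (1, 1)
    [[0], [0]],      # states: (1, 0) -> (2, 0)
]
freqs = transition_frequencies(dialogues, num_slots=2)
```

In this toy corpus the state `(1, 0)` moves to `(1, 1)` in one dialogue and to `(2, 0)` in the other, so each of those transitions has frequency 0.5. The resulting table is a small state-transition graph of the dialogue structure.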
-
Publication No.: US12067097B2
Publication Date: 2024-08-20
Application No.: US17966252
Application Date: 2022-10-14
Applicant: EMTEQ LIMITED
Inventors: Charles Nduka, Mahyar Hamedi, Graeme Cox
IPC Classification: G06V40/16, G02B27/01, G06F21/32, G06V10/145, G06V10/147, G06V40/20, G06V40/70, G10L13/02, G10L15/183, G10L15/22
CPC Classification: G06F21/32, G02B27/017, G06V10/145, G06V10/147, G06V40/166, G06V40/167, G06V40/176, G06V40/20, G06V40/70, G10L13/02, G10L15/183, G10L15/22, G02B2027/0178, G10L2015/223
Abstract: A biometric authentication system comprising headwear comprising a plurality of biosensors each configured to sample muscle activity so as to obtain a respective time-varying signal; a data store for storing a data set representing characteristic muscle activity for one or more users; and a processor configured to process the time-varying signals from the biosensors in dependence on the stored data set so as to determine a correspondence between a time-varying signal and characteristic muscle activity of one of the one or more users, and in dependence on the determined correspondence, authenticate the time-varying signals as being associated with that user.