-
Publication No.: US12131523B2
Publication date: 2024-10-29
Application No.: US17182951
Filing date: 2021-02-23
Applicant: Meta Platforms, Inc.
Inventors: Xiaohu Liu, Baiyang Liu, Rajen Subba
IPC classes: G06V10/82, G06F3/01, G06F3/16, G06F7/14, G06F9/451, G06F16/176, G06F16/22, G06F16/23, G06F16/242, G06F16/2455, G06F16/2457, G06F16/248, G06F16/33, G06F16/332, G06F16/338, G06F16/903, G06F16/9032, G06F16/9038, G06F16/904, G06F16/951, G06F16/9535, G06F18/2411, G06F40/205, G06F40/295, G06F40/30, G06F40/40, G06N3/006, G06N3/08, G06N7/01, G06N20/00, G06Q50/00, G06V10/764, G06V20/10, G06V40/20, G10L15/02, G10L15/06, G10L15/07, G10L15/16, G10L15/18, G10L15/183, G10L15/187, G10L15/22, G10L15/26, G10L17/06, G10L17/22, H04L5/02, H04L12/28, H04L41/00, H04L41/22, H04L43/0882, H04L43/0894, H04L51/02, H04L51/18, H04L51/216, H04L51/52, H04L67/306, H04L67/50, H04L67/5651, H04L67/75, H04W12/08, G10L13/00, G10L13/04, H04L51/046, H04L67/10, H04L67/53
CPC classes: G06V10/82, G06F3/011, G06F3/013, G06F3/017, G06F3/167, G06F7/14, G06F9/453, G06F16/176, G06F16/2255, G06F16/2365, G06F16/243, G06F16/24552, G06F16/24575, G06F16/24578, G06F16/248, G06F16/3323, G06F16/3329, G06F16/3344, G06F16/338, G06F16/90332, G06F16/90335, G06F16/9038, G06F16/904, G06F16/951, G06F16/9535, G06F18/2411, G06F40/205, G06F40/295, G06F40/30, G06F40/40, G06N3/006, G06N3/08, G06N7/01, G06N20/00, G06Q50/01, G06V10/764, G06V20/10, G06V40/28, G10L15/02, G10L15/063, G10L15/07, G10L15/16, G10L15/1815, G10L15/1822, G10L15/183, G10L15/187, G10L15/22, G10L15/26, G10L17/06, G10L17/22, H04L5/02, H04L12/2816, H04L41/20, H04L41/22, H04L43/0882, H04L43/0894, H04L51/02, H04L51/18, H04L51/216, H04L51/52, H04L67/306, H04L67/535, H04L67/5651, H04L67/75, H04W12/08, G06F2216/13, G10L13/00, G10L13/04, G10L2015/223, G10L2015/225, H04L51/046, H04L67/10, H04L67/53
Abstract: In one embodiment, a method includes, by a client system associated with a user: receiving, at the client system, a user input from the user; parsing, by the client system, the user input to identify a request to execute a function to be performed by an assistant system of several assistant systems associated with the client system; determining whether the user is authorized to access the assistant system by comparing a voiceprint of the user to several voiceprints stored on the client system; sending, from the client system to the assistant system in response to determining the user is authorized to access the assistant system, a request to set an assistant xbot of the assistant system into a listening mode; and receiving, at the client system from the assistant system, an indication that the assistant xbot is in the listening mode.
-
Publication No.: US20240347036A1
Publication date: 2024-10-17
Application No.: US18638155
Filing date: 2024-04-17
Inventors: Nicholas S. WITHAM, Juan Pablo BOTERO TORRES, Colleen CHEMERKA, Tanner KRONE, Rami SHORTI, Thomas ODELL
Abstract: A system for enabling conversion of speech pantomimes of a user into synthesized speech includes a headset connected to an artificial intelligence network hosted on a computing device. The headset can include an array of distance measurement devices distributed adjacent to facial regions of the user associated with speech. The system can further include a microphone and a speaker. Sensor data captured by the distance measurement devices and audio data captured by the microphone are used to train the artificial intelligence network to correlate speech pantomimes of the user with phonemes. The system can output synthesized speech generated from the phonemes through the speaker.
-
Publication No.: US20240312450A1
Publication date: 2024-09-19
Application No.: US18606270
Filing date: 2024-03-15
Inventors: Xiaoyi ZHANG
IPC classes: G10L13/08, G06F18/241, G10L13/04, G10L13/07
CPC classes: G10L13/08, G06F18/241, G10L13/04, G10L13/07
Abstract: A system, computer-readable storage medium, and computer-implemented method for automatically generating listening test audio from an audioscript. The audioscript is managed by sections; each section is parsed and a section-configuration is generated accordingly. All the section-configurations are composed into a configuration of the audioscript, referred to simply as a configuration. The configuration can be transmitted to a client device such that the configuration is viewable and the parameter values in the configuration can be set or reviewed through a graphical user interface. Responsive to receiving the completed configuration, each section-configuration in the configuration can be applied to the corresponding section to generate the audio and/or audio-generation-script of that section. The complete audio and/or audio-generation-script of the audioscript is generated by concatenating the audio and/or audio-generation-script of each section.
-
Publication No.: US20240169972A1
Publication date: 2024-05-23
Application No.: US18283433
Filing date: 2022-02-15
Inventors: Jiaxin XIONG, Hong FENG, Hao ZENG, Tongxin ZHANG
IPC classes: G10L13/04, G10L21/055
CPC classes: G10L13/04, G10L21/055
Abstract: Provided are a synchronization method and apparatus for audio and text, a device, and a medium. The method includes: determining a plurality of first text segments for audio conversion and a second text for reading display, in which the plurality of first text segments and the second text are from an initial text; converting the plurality of first text segments into audio segments, to obtain a first mapping relationship between the first text segments and the audio segments; performing matching on the first text segments and the second text, to obtain a second mapping relationship between the first text segments and second text segments in the second text; and determining the second text segment synchronized with each of the audio segments based on the first mapping relationship and the second mapping relationship.
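The final step of the abstract above, deriving the displayed segment for each audio segment from the two mapping relationships, can be read as a composition of two lookups. The following is a minimal sketch under that reading, not the method claimed in the application: the function name `compose_sync`, the use of plain dictionaries, and the segment identifiers are all assumptions made for illustration.

```python
def compose_sync(first_to_audio, first_to_second):
    """Compose two mappings keyed by first text segment.

    first_to_audio:  first text segment -> audio segment (first mapping)
    first_to_second: first text segment -> second text segment (second mapping)
    Returns: audio segment -> second text segment shown while it plays.
    """
    audio_to_second = {}
    for first_seg, audio_seg in first_to_audio.items():
        # Skip first segments that found no match in the display text.
        if first_seg in first_to_second:
            audio_to_second[audio_seg] = first_to_second[first_seg]
    return audio_to_second


# Hypothetical identifiers, for illustration only.
first_to_audio = {"t1": "a1", "t2": "a2", "t3": "a3"}
first_to_second = {"t1": "d1", "t2": "d2"}
sync = compose_sync(first_to_audio, first_to_second)
```

In this sketch a first text segment without a counterpart in the display text simply yields no entry; how such unmatched segments are handled in practice is not stated in the abstract.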
-
Publication No.: US20240161731A1
Publication date: 2024-05-16
Application No.: US18415166
Filing date: 2024-01-17
Inventors: Ji-sang YU, Sang-ha Kim, Jong-youb Ryu, Yoon-jung Choi, Eun-kyoung Kim, Jae-won Lee
CPC classes: G10L15/005, G06F40/58, G10L13/04, G10L15/26
Abstract: A method and an electronic device for translating a speech signal between a first language and a second language with minimized translation delay by translating fewer than all words of the speech signal according to a level of understanding of the second language by a user that receives the translation.
-
Publication No.: US11984112B2
Publication date: 2024-05-14
Application No.: US17244659
Filing date: 2021-04-29
Applicant: Rovi Guides, Inc.
IPC classes: G10L15/22, G10L13/027, G10L13/04
CPC classes: G10L13/027, G10L13/04
Abstract: Systems and methods are disclosed for providing voice interactions based on user context. Data is received that causes a voice interaction to be generated for output at a user device. In response, current user contextual data of the user device is retrieved. A user availability level for consuming the voice interaction is determined based on the current user contextual data. The voice interaction is altered based on the user availability level. Content of the voice interaction may be altered to be suitable for consumption. The altered voice interaction is outputted at the user device.
-
Publication No.: US11954403B1
Publication date: 2024-04-09
Application No.: US18162735
Filing date: 2023-02-01
IPC classes: G06F3/0484, G06F3/0488, G06F3/16, G10L13/04
CPC classes: G06F3/165, G06F3/0484, G06F3/0488, G10L13/04
Abstract: Embodiments are provided for communicating notifications and other textual data associated with applications installed on an electronic device. According to certain aspects, a user can interface with an input device to send (218) a wake up trigger to the electronic device. The electronic device retrieves (222) application notifications and converts (288) the application notifications to audio data. The electronic device also sends (230) the audio data to an audio output device for annunciation (232). The user may also use the input device to send (242) a request to the electronic device to activate the display screen. The electronic device identifies (248) an application corresponding to an annunciated notification, and activates (254) the display screen and initiates the application.
-
Publication No.: US20240112667A1
Publication date: 2024-04-04
Application No.: US18525475
Filing date: 2023-11-30
Applicant: Google LLC
Inventors: Ye Jia, Zhifeng Chen, Yonghui Wu, Jonathan Shen, Ruoming Pang, Ron J. Weiss, Ignacio Lopez Moreno, Fei Ren, Yu Zhang, Quan Wang, Patrick An Phu Nguyen
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech synthesis. The methods, systems, and apparatus include actions of obtaining an audio representation of speech of a target speaker, obtaining input text for which speech is to be synthesized in a voice of the target speaker, generating a speaker vector by providing the audio representation to a speaker encoder engine that is trained to distinguish speakers from one another, generating an audio representation of the input text spoken in the voice of the target speaker by providing the input text and the speaker vector to a spectrogram generation engine that is trained using voices of reference speakers to generate audio representations, and providing the audio representation of the input text spoken in the voice of the target speaker for output.
-
Publication No.: US11922946B2
Publication date: 2024-03-05
Application No.: US18181787
Filing date: 2023-03-10
Applicant: Q (CUE) LTD.
Inventors: Aviad Maizels, Avi Barliya, Yonatan Wexler
IPC classes: G10L15/25, G02B27/00, G06F40/58, G06V10/141, G06V10/60, G06V10/82, G06V40/16, G10L13/02, G10L13/027, G10L13/04, G10L15/16, G10L15/26, G06F3/01
CPC classes: G10L15/26, G02B27/0093, G06F40/58, G06V10/141, G06V10/60, G06V10/82, G06V40/171, G06V40/174, G06V40/176, G10L13/02, G10L13/027, G10L13/04, G10L15/16, G10L15/25, G06F3/015
Abstract: Systems and methods are disclosed for determining textual transcription from minute facial skin movements. In one implementation, a system may include at least one coherent light source, at least one sensor configured to receive light reflections from the at least one coherent light source, and a processor configured to control the at least one coherent light source to illuminate a region of a face of a user. The processor may receive, from the at least one sensor, reflection signals indicative of coherent light reflected from the face in a time interval. The reflection signals may be analyzed to determine minute facial skin movements in the time interval. Then, based on the determined minute facial skin movements in the time interval, the processor may determine a sequence of words associated with the minute facial skin movements, and output a textual transcription corresponding to the determined sequence of words.
-
Publication No.: US20240066396A1
Publication date: 2024-02-29
Application No.: US18500338
Filing date: 2023-11-02
Applicant: STEELSERIES ApS
Inventors: Michael Aronzon
IPC classes: A63F13/355, A63F13/215, A63F13/25, A63F13/335, A63F13/35, A63F13/40, A63F13/87, G06F3/0489, G06F9/451, G06F40/263, G06F40/58, G10L13/00, G10L13/04
CPC classes: A63F13/355, A63F13/215, A63F13/25, A63F13/335, A63F13/35, A63F13/40, A63F13/87, G06F3/0489, G06F9/454, G06F40/263, G06F40/58, G10L13/00, G10L13/04, A63F13/54
Abstract: A system that incorporates teachings of the present disclosure may include, for example, a computing device having a controller to obtain a user input that was inputted into a first accessory operably coupled with the computing device where the first accessory provides a user interface for user interaction with a video game, determine a language of an intended recipient of the user input based on an identity of the intended recipient, access a multi-lingual library comprising a plurality of words associated with the video game, match the user input to one or more words of the plurality of words of the multi-lingual library to generate a translated message in the determined language of the intended recipient, and provide the translated message to a second accessory for presentation to the intended recipient in real-time. Additional embodiments are disclosed.
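The lookup flow in the abstract above (determine the recipient's language from their identity, match the input against a game-specific multi-lingual library, emit the translated message) might be sketched as follows. This is an illustration only, not the disclosed implementation: the library contents, the recipient table, and the names `LIBRARY`, `RECIPIENT_LANGUAGE`, and `translate_for` are hypothetical, and exact-phrase matching is an assumed simplification of the matching step.

```python
# Hypothetical multi-lingual library: game phrase -> translations by language code.
LIBRARY = {
    "enemy spotted": {"es": "enemigo a la vista", "de": "Feind gesichtet"},
    "need backup": {"es": "necesito refuerzos", "de": "brauche Verstärkung"},
}

# Hypothetical table mapping a recipient's identity to their language.
RECIPIENT_LANGUAGE = {"player_2": "es", "player_3": "de"}


def translate_for(recipient_id, user_input):
    """Return the library translation for the recipient, or None if no match."""
    lang = RECIPIENT_LANGUAGE.get(recipient_id)
    # Normalize the input before matching it against the library keys.
    entry = LIBRARY.get(user_input.strip().lower())
    if lang is None or entry is None:
        return None
    return entry.get(lang)


message = translate_for("player_2", "Enemy spotted")
```

Restricting translation to a fixed phrase library keeps the lookup constant-time, which fits the real-time presentation the abstract mentions; inputs outside the library return `None` in this sketch.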