-
Publication No.: US11605380B1
Publication date: 2023-03-14
Application No.: US16983668
Filing date: 2020-08-03
Abstract: This disclosure describes, in part, techniques and systems for generating and outputting immersive, multi-device content items in user environments, such as connected homes, offices, and the like. For example, the techniques and systems may output different portions of content on different devices within a user environment based on information such as respective capabilities of the devices, a current location of the user within the environment, a time of day, which user(s) are present in the environment, and/or the like.
-
Publication No.: US20230031145A1
Publication date: 2023-02-02
Application No.: US17388673
Filing date: 2021-07-29
Inventor: Ganesh Narayanan
Abstract: Methods and systems for processing voice commands are disclosed. A voice controlled device may receive audio data comprising a voice command. Location information indicative of the source of the audio data may be determined. One or more devices may be caused to determine signals based on the location information. The one or more devices may receive thermal data in response to the signals. The thermal data may be analyzed to determine if the thermal data indicates the presence of a person at the expected location. If a person is detected, then the audio data may be processed to cause the voice command to be executed.
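The gating step in this abstract (execute the command only if thermal data shows a person at the expected location) can be sketched as follows. This is a minimal illustration, not the patented method: the `ThermalReading` type, the bearing-based location model, and the tolerance and temperature bounds are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ThermalReading:
    # Hypothetical reading: direction of the heat source and its temperature.
    bearing_deg: float
    temperature_c: float


def person_present(readings, expected_bearing_deg,
                   tolerance_deg=15.0, min_temp_c=30.0, max_temp_c=40.0):
    """Return True if any reading looks like a person near the expected bearing.

    The tolerance and temperature bounds are illustrative assumptions.
    """
    for r in readings:
        # Smallest angular difference, wrapped into the range [0, 180].
        diff = abs((r.bearing_deg - expected_bearing_deg + 180.0) % 360.0 - 180.0)
        if diff <= tolerance_deg and min_temp_c <= r.temperature_c <= max_temp_c:
            return True
    return False


def handle_voice_command(command, readings, expected_bearing_deg):
    """Process the command only when a person is detected at the audio source."""
    if person_present(readings, expected_bearing_deg):
        return f"execute: {command}"
    return "ignored: no person detected at source location"
```

For instance, a reading of 34.5 °C at a bearing of 92° would allow a command localized at 90° to run, while the same reading at 270° would cause it to be ignored.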
-
Publication No.: US20230017212A1
Publication date: 2023-01-19
Application No.: US17953029
Filing date: 2022-09-26
Applicant: GoPro, Inc.
Abstract: After a command to stop recording a video is received, an image capture device may buffer footage in a buffer memory. The buffer memory may be used as a post-capture cache. The footage buffered in the buffer memory may be appended to the end of previously captured footage, appended to the beginning of subsequently captured footage, and/or used to bridge two separately captured pieces of footage.
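The post-capture cache described here can be illustrated with a bounded buffer that keeps collecting frames after recording stops. The sketch below is a simplified model under stated assumptions: the class `PostCaptureCache`, its method names, and the frame-count capacity are hypothetical and are not taken from the application.

```python
from collections import deque


class PostCaptureCache:
    """Toy model: frames arriving while not recording go into a bounded buffer."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest frame when full.
        self._buffer = deque(maxlen=capacity)
        self.recording = False
        self.clips = []

    def start(self):
        # Buffered footage is prepended to the subsequently captured clip.
        self.clips.append(list(self._buffer))
        self._buffer.clear()
        self.recording = True

    def stop(self):
        self.recording = False

    def on_frame(self, frame):
        if self.recording:
            self.clips[-1].append(frame)
        else:
            self._buffer.append(frame)

    def flush_to_previous_clip(self):
        # Buffered footage is appended to the end of the previous clip.
        if self.clips:
            self.clips[-1].extend(self._buffer)
            self._buffer.clear()
```

Calling `flush_to_previous_clip()` and then `start()` without intervening frames would make the two clips contiguous, which is one simple way to model the bridging case.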
-
Publication No.: US11532299B2
Publication date: 2022-12-20
Application No.: US16896779
Filing date: 2020-06-09
Applicant: Google LLC
IPC classification: G10L15/00, G10L15/07, G10L15/197, G10L15/183, G10L15/24
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for modulating language model biasing. In some implementations, context data is received. A likely context associated with a user is determined based on at least a portion of the context data. One or more language model biasing parameters are selected based at least on the likely context associated with the user. A context confidence score associated with the likely context is determined based on at least a portion of the context data. One or more language model biasing parameters are adjusted based at least on the context confidence score. A baseline language model is biased based at least on one or more of the adjusted language model biasing parameters. The baseline language model is provided for use by an automated speech recognizer (ASR).
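One way to picture confidence-modulated biasing is with a toy unigram model whose context-specific boosts are scaled by the context confidence score before renormalization. The sketch below assumes a linear scaling rule and a dictionary-based model; neither is stated in the abstract, and the function name `bias_language_model` is hypothetical.

```python
def bias_language_model(baseline, boosts, confidence):
    """Return a biased copy of a unigram model.

    baseline:   dict mapping word -> probability (assumed to sum to 1)
    boosts:     dict mapping word -> non-negative boost for the likely context
    confidence: context confidence score in [0, 1]; 0 leaves the model unchanged

    Assumption: each boost is scaled linearly by the confidence score.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    scaled = {
        word: prob * (1.0 + confidence * boosts.get(word, 0.0))
        for word, prob in baseline.items()
    }
    total = sum(scaled.values())
    # Renormalize so the biased model is still a probability distribution.
    return {word: value / total for word, value in scaled.items()}
```

With a low confidence score the biased model stays close to the baseline, so a wrongly guessed context does less damage to recognition.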
-
Publication No.: US20220358919A1
Publication date: 2022-11-10
Application No.: US17619055
Filing date: 2020-06-12
Inventor: Wenhua Xu
Abstract: A speech interaction method includes receiving, by a server, a first play message, where the first play message includes an identifier of first audio content corresponding to a first non-speech instruction. The server determines a first intent and first slot information that correspond to the first non-speech instruction. In response to the first play message, the server instructs a playback device to play the first audio content. The server receives a first speech instruction input by a user into the playback device, where a second intent or second slot information or both in the first speech instruction are incomplete. The server determines, based on the first intent and the first slot information, the second intent and the second slot information that correspond to the first speech instruction, and the server, based on the second intent and the second slot information, instructs the playback device to play second audio content.
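The completion of an incomplete speech instruction from earlier context can be sketched as a merge of intents and slots. This is an illustrative simplification: the function `resolve_instruction`, the use of `None` to mark missing values, and the rule that all context slots carry over unless overridden are assumptions, not details from the application.

```python
def resolve_instruction(context_intent, context_slots, intent, slots):
    """Fill in a missing intent and missing slots from the previous context.

    context_intent / context_slots: derived from the earlier non-speech instruction
    intent / slots: parsed from the speech instruction; None marks a missing value
    """
    resolved_intent = intent if intent is not None else context_intent
    # Start from the context slots, then let values the user actually spoke win.
    resolved_slots = dict(context_slots)
    resolved_slots.update({k: v for k, v in slots.items() if v is not None})
    return resolved_intent, resolved_slots
```

For example, if the context is a play intent for a given artist and the user then says only a song title, the resolved instruction keeps the play intent and artist and replaces the song.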
-
Publication No.: US20220351729A1
Publication date: 2022-11-03
Application No.: US17813367
Filing date: 2022-07-19
Applicant: RingCentral, Inc.
IPC classification: G10L15/22, G10L15/04, G10L15/16, G10L25/90, G10L15/30, G06N3/08, G10L15/24, G10L13/00
Abstract: A method for recognizing speech within a received audio signal includes separating, using a computer-based neural network model, speech from an audio signal based on a speaker's audio profile, determining a command from the speech, determining, from the audio signal, a first score reflecting a percentage of confidence in determining the command based on a frequency of using the command by the speaker, determining, from the audio signal, a second score reflecting a percentage of importance of the command, and causing the command to be executed if the first score is above a first threshold value and the second score is below a second threshold value.
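The final decision rule in this abstract is a two-threshold gate: execute when confidence is high enough and importance is low enough. A minimal sketch follows; the function name and the default threshold values are placeholders chosen for illustration, since the abstract does not give concrete numbers.

```python
def should_execute(confidence_pct, importance_pct,
                   min_confidence_pct=80.0, max_importance_pct=50.0):
    """Return True when the command may run automatically.

    confidence_pct: first score, confidence in the recognized command (0-100)
    importance_pct: second score, importance of the command (0-100)
    The default thresholds are illustrative assumptions.
    """
    return confidence_pct > min_confidence_pct and importance_pct < max_importance_pct
```

Under this rule a frequently used, low-stakes command runs immediately, while an important command is held back even when recognition confidence is high.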
-
Publication No.: US20220343911A1
Publication date: 2022-10-27
Application No.: US17811868
Filing date: 2022-07-11
Applicant: BetterUp, Inc.
Inventors: Andrew Reece, Peter Bull, Gus Cooney, Casey Fitzpatrick, Gabriella Rosen Kellerman, Ryan Sonnek
Abstract: Technology is provided for conversation analysis. The technology includes receiving multiple utterance representations, where each utterance representation represents a portion of a conversation performed by at least two users, and each utterance representation is associated with video data, acoustic data, and text data. The technology further includes generating a first utterance output by applying video data, acoustic data, and text data of the first utterance representation to a respective video processing part of the machine learning system to generate video, text, and acoustic-based outputs. A second utterance output is further generated for a second user. Conversation analysis indicators are generated by applying, to a sequential machine learning system, the combined speaker features and a previous state of the sequential machine learning system.
-
Publication No.: US11482134B2
Publication date: 2022-10-25
Application No.: US16536151
Filing date: 2019-08-08
Inventors: Hye Dong Jung, Sang Ki Ko, Han Mu Park, Chang Jo Kim
IPC classification: G09B21/00, G06T11/00, G10L15/22, G10L15/24, G10L25/63, G10L25/90, G06V40/20, G06V40/16
Abstract: Disclosed is a method of providing a sign language video reflecting an appearance of a conversation partner. The method includes recognizing a speech language sentence from speech information, and recognizing an appearance image and a background image from video information. The method further comprises acquiring multiple pieces of word-joint information corresponding to the speech language sentence from a joint information database, sequentially inputting the word-joint information to a deep learning neural network to generate sentence-joint information, generating a motion model on the basis of the sentence-joint information, and generating a sign language video in which the background image and the appearance image are synthesized with the motion model. The method provides a natural communication environment between a sign language user and a speech language user.
-
Publication No.: US20220246147A1
Publication date: 2022-08-04
Application No.: US17546838
Filing date: 2021-12-09
Applicant: Rovi Guides, Inc.
IPC classification: G10L15/22, G10L15/30, G10L15/24, G06F16/587, G10L15/18
Abstract: A method of detecting establishment of a voice communication between a first voice communication equipment and a second voice communication equipment and automating requests for content. The method includes analyzing the voice communication to identify a request for content, analyzing the voice communication to identify an affirmative response to the request for content, and correlating the request for content with a first user account and correlating the affirmative response with a second user account. In response to identifying the affirmative response, and based upon at least one of the first user account or the second user account, the method identifies the requested content from a data storage and causes the transmission of the requested content.
-
Publication No.: US20220208193A1
Publication date: 2022-06-30
Application No.: US17608386
Filing date: 2020-04-30
Applicant: AUDI AG
IPC classification: G10L15/24, G02B6/42, G10L15/22, G10L25/51, G06K9/62, G01S7/4865, G01S17/931
Abstract: A detection device provides a voice signal of a person and includes a light source, a first planar carrier medium, a second planar carrier medium, a sensor device, and an evaluation device. The first planar carrier medium includes an input coupling region and an output coupling region. The second planar carrier medium includes a light input coupling region and a light output coupling region. Light from the light source is emitted into the second planar carrier medium and output in the direction of a front neck region of the person, and the light reflected on the neck region is input into the first planar carrier medium and output out of the first planar carrier medium. The output light is detected by the sensor device, which provides sensor data to the evaluation device, which converts the sensor data into vibrational data and provides the voice signal.