-
Publication No.: US20190130628A1
Publication Date: 2019-05-02
Application No.: US15858992
Filing Date: 2017-12-29
Applicant: Snap Inc.
Abstract: The present invention relates to a joint automatic audio visual driven facial animation system that in some example embodiments includes a full scale state of the art Large Vocabulary Continuous Speech Recognition (LVCSR) with a strong language model for speech recognition and obtained phoneme alignment from the word lattice.
-
Publication No.: US12093607B2
Publication Date: 2024-09-17
Application No.: US17876842
Filing Date: 2022-07-29
Applicant: Snap Inc.
Inventor: Xin Chen, Yurii Monastyrshyn, Fedir Poliakov, Shubham Vij
CPC classification number: G06F3/167, G06F3/0482, G06N3/044, G06N3/08, G06T11/001, G10L15/08, G10L2015/088, G10L15/16
Abstract: An audio control system can control interactions with an application or device using keywords spoken by a user of the device. The audio control system can use machine learning models (e.g., a neural network model) trained to recognize one or more keywords. Which machine learning model is activated can depend on the active location in the application or device. Responsive to detecting keywords, different actions are performed by the device, such as navigation to a pre-specified area of the application.
-
Publication No.: US11620001B2
Publication Date: 2023-04-04
Application No.: US16948018
Filing Date: 2020-08-27
Applicant: Snap Inc.
Inventor: William Brendel, Francesco Barbieri, Xin Chen, Wei Chu, Venkata Satya Pradeep Karuturi, Luis Carlos Dos Santos Marujo, Leonardo Ribas Machado das Neves
IPC: G06F40/166, G06F3/023, G06N3/084, G06K9/62, G06F3/04817, H04L51/04, G06F40/274
Abstract: Symbol prediction can be implemented using a multi-task system trained for different tasks. The tasks may include a single symbol prediction, symbol category prediction, and symbol subcategory prediction. Categories of symbols can be generated by clustering sets of training data using a clustering scheme.
-
Publication No.: US11611525B2
Publication Date: 2023-03-21
Application No.: US17493111
Filing Date: 2021-10-04
Applicant: Snap Inc.
Inventor: Theresa Barton, Yanping Chen, Lucas Ou-Yang, Emre Yamangil, Keyang Zhang, Jiwoon Jeon, Jaewook Chung, Wisam Dakka, Xin Chen
IPC: H04L51/226, G06F11/34, G06F11/30, H04L51/52, H04L51/224
Abstract: Disclosed are methods and systems for ranking content. In one aspect, a method of ranking content for display includes identifying, via hardware processing circuitry, interactions by a single account with content pairs, each of the content in the content pairs included in a plurality of content, aggregating, via the hardware processing circuitry, the identified interactions across a plurality of accounts, associating, via the hardware processing circuitry, probabilities with each content in the plurality of content based on the aggregated interactions, ranking, via the hardware processing circuitry, the plurality of content based on the associated probabilities; and selecting, via the hardware processing circuitry, content ranked above a threshold for display.
-
Publication No.: US11120597B2
Publication Date: 2021-09-14
Application No.: US16749753
Filing Date: 2020-01-22
Applicant: Snap Inc.
Abstract: The present invention relates to a joint automatic audio visual driven facial animation system that in some example embodiments includes a full scale state of the art Large Vocabulary Continuous Speech Recognition (LVCSR) with a strong language model for speech recognition and obtained phoneme alignment from the word lattice.
-
Publication No.: US11610354B2
Publication Date: 2023-03-21
Application No.: US17349015
Filing Date: 2021-06-16
Applicant: Snap Inc.
Abstract: The present invention relates to a joint automatic audio visual driven facial animation system that in some example embodiments includes a full scale state of the art Large Vocabulary Continuous Speech Recognition (LVCSR) with a strong language model for speech recognition and obtained phoneme alignment from the word lattice.
-
Publication No.: US11487501B2
Publication Date: 2022-11-01
Application No.: US15981295
Filing Date: 2018-05-16
Applicant: Snap Inc.
Inventor: Xin Chen, Yurii Monastyrshyn, Fedir Poliakov, Shubham Vij
Abstract: An audio control system can control interactions with an application or device using keywords spoken by a user of the device. The audio control system can use machine learning models (e.g., a neural network model) trained to recognize one or more keywords. Which machine learning model is activated can depend on the active location in the application or device. Responsive to detecting keywords, different actions are performed by the device, such as navigation to a pre-specified area of the application.
-
Publication No.: US10788900B1
Publication Date: 2020-09-29
Application No.: US16023912
Filing Date: 2018-06-29
Applicant: Snap Inc.
Inventor: William Brendel, Francesco Barbieri, Xin Chen, Wei Chu, Venkata Satya Pradeep Karuturi, Luis Carlos Dos Santos Marujo, Leonardo Ribas Machado das Neves
IPC: G06F17/27, G06F3/023, G06N3/08, G06K9/62, G06F3/0481, H04L12/58, G06F40/166, G06F40/274
Abstract: Symbol prediction can be implemented using a multi-task system trained for different tasks. The tasks may include a single symbol prediction, symbol category prediction, and symbol subcategory prediction. Categories of symbols can be generated by clustering sets of training data using a clustering scheme.
-
Publication No.: US20200160580A1
Publication Date: 2020-05-21
Application No.: US16749753
Filing Date: 2020-01-22
Applicant: Snap Inc.
Abstract: The present invention relates to a joint automatic audio visual driven facial animation system that in some example embodiments includes a full scale state of the art Large Vocabulary Continuous Speech Recognition (LVCSR) with a strong language model for speech recognition and obtained phoneme alignment from the word lattice.
-
Publication No.: US10296638B1
Publication Date: 2019-05-21
Application No.: US15839454
Filing Date: 2017-12-12
Applicant: Snap Inc.
Inventor: Xin Chen, Jaewook Chung, Yu Hu, Jinhua Jiang, Xing Mei, Kirk Ouimet, Ning Xu
IPC: G06F17/30
Abstract: Systems and methods provide for capturing a plurality of segments of an audio stream and, for each segment of the plurality of segments of the audio stream: performing feature extraction on an audio signal of the segment using a feature extraction machine learning model that analyzes the audio signal to generate a feature vector for the segment and generating a prediction value for the segment for whether there is music in the segment using the extracted feature vector and a music detector machine learning model. The systems and methods further provide for generating a probability value that there is music in the audio stream based on the prediction value for each of the plurality of segments and causing the audio stream to be identified based on determining that the probability value that there is music in the audio stream meets a predetermined threshold.
-