-
Publication No.: US20250166633A1
Publication Date: 2025-05-22
Application No.: US18511095
Application Date: 2023-11-16
Applicant: Google LLC
Inventor: Dimitri Kanevsky, Sharlene Yuan, Artem Dementyev, Sagar Savla, Vinton Gray Cerf
IPC: G10L17/06, G06F3/0482, G06F3/0484, G09B21/00, H04H20/86
Abstract: Systems and methods for generating transcriptions of audio data for presentation at a client device are provided. One or more audio streams, provided by one or more audio sources of one or more client devices of a plurality of users, are received by a broadcasting system. Sensory modality information comprising one or more of auditory, visual, or haptic characteristics of a first user is determined. First audio data from the one or more audio streams corresponding to the first user and additional audio data from the one or more audio streams corresponding to other users are determined using one or more machine learning models. At least one of a first transcription of the first audio data or one or more additional transcriptions of one or more of the additional audio data are provided for presentation at a first client device according to the auditory, visual, or haptic characteristics.
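The final step of the abstract above (presenting transcriptions according to a user's sensory characteristics) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the application: the `SensoryProfile` flags, the `Transcription` type, the channel names, and the `plan_presentation` policy are assumptions chosen only to show the shape of the idea. The machine learning step that attributes audio to speakers is assumed to have already run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SensoryProfile:
    # Hypothetical flags standing in for the "auditory, visual, or haptic
    # characteristics" of the first user.
    auditory: bool
    visual: bool
    haptic: bool


@dataclass(frozen=True)
class Transcription:
    speaker_id: str  # assumed to come from an upstream speaker-attribution step
    text: str


def plan_presentation(profile, first_user_id, transcriptions):
    """Split transcriptions by speaker and pick output channels (illustrative policy)."""
    own = [t for t in transcriptions if t.speaker_id == first_user_id]
    others = [t for t in transcriptions if t.speaker_id != first_user_id]

    channels = []
    if profile.visual:
        channels.append("on_screen_text")
    if profile.auditory:
        channels.append("synthesized_speech")
    if profile.haptic:
        channels.append("haptic_cue")

    return {"channels": channels, "own": own, "others": others}
```

For a user whose profile enables only the visual and haptic flags, this sketch would return the on-screen text and haptic cue channels, with the user's own transcription kept separate from those of other speakers.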
-
Publication No.: US20230342108A1
Publication Date: 2023-10-26
Application No.: US18044831
Application Date: 2021-08-31
Applicant: Google LLC
Inventor: Dimitri Kanevsky, Sagar Savla, Ausmus Chang, Chiawei Liu, Daniel P W Ellis, Jinho Kim, Justin Stuart Paul, Sharlene Yuan, Alex Huang, Yun Che Chung, Chelsey Fleming
Abstract: An example method includes receiving, by one or more processors of a computing device, audio data recorded by one or more microphones of the computing device; and generating, based on the audio data and by the one or more processors, one or more structured sound records, a first structured sound record of the one or more structured sound records including: a description of a first sound, the description including a descriptive label of the first sound, the descriptive label different than a text transcription of the first sound, and a time stamp indicating a time at which the first sound occurred; and outputting a graphical user interface including a timeline representation of the one or more structured sound records.
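The structured sound record described in the abstract above pairs a descriptive label with a time stamp, and the records are then laid out on a timeline. A minimal sketch of that data structure follows. The names `SoundRecord` and `timeline`, the example labels, and the plain-text rendering are assumptions for illustration; the application describes a graphical user interface, not this text output.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SoundRecord:
    # Descriptive label of the sound (e.g. "dog barking"); per the abstract
    # this is distinct from a text transcription of the sound.
    label: str
    # Time at which the sound occurred.
    timestamp: datetime


def timeline(records):
    """Render records in chronological order as simple text lines (hypothetical view)."""
    ordered = sorted(records, key=lambda r: r.timestamp)
    return [f"{r.timestamp:%H:%M:%S}  {r.label}" for r in ordered]
```

Sorting by `timestamp` before rendering keeps the view chronological even when records arrive out of order.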
-