-
Publication No.: US11622218B2
Publication Date: 2023-04-04
Application No.: US17367629
Filing Date: 2021-07-06
Inventors: Hyunoh Oh, Taegyu Lee
IPC Classification: H04S5/00, H04S3/00, H03H17/02, G10L19/02, G10L25/48, G10L19/008, H04R5/04, H04S7/00, H03H21/00, H04S3/02
Abstract: The present invention relates to a method and an apparatus for processing a signal, which are used for effectively reproducing a multimedia signal, and more particularly, to a method and an apparatus for processing a signal, which are used for implementing filtering for multimedia signal having a plurality of subbands with a low calculation amount.
To this end, provided are a method for processing a multimedia signal including: receiving a multimedia signal having a plurality of subbands; receiving at least one proto-type filter coefficients for filtering each subband signal of the multimedia signal; converting the proto-type filter coefficients into a plurality of subband filter coefficients; truncating each subband filter coefficients based on filter order information obtained by at least partially using characteristic information extracted from the corresponding subband filter coefficients, the length of at least one truncated subband filter coefficients being different from the length of truncated subband filter coefficients of another subband; and filtering the multimedia signal by using the truncated subband filter coefficients corresponding to each subband signal and an apparatus for processing a multimedia signal using the same.
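Illustrative note (not part of the patent record): the per-subband truncation described in this abstract can be sketched roughly as follows. This is a minimal sketch, not the patented method. The cumulative-energy criterion, the 0.99 threshold, and the function names `truncation_length`, `truncate_subband_filters`, and `filter_subband` are all assumptions made for illustration; the abstract only says that the filter order is derived from characteristic information of each subband's coefficients.

```python
def truncation_length(coeffs, energy_fraction=0.99):
    """Smallest prefix length that keeps `energy_fraction` of the filter energy.

    Assumption: cumulative energy stands in for the "characteristic
    information" mentioned in the abstract.
    """
    total = sum(abs(c) ** 2 for c in coeffs)
    if total == 0.0:
        return 1
    running = 0.0
    for n, c in enumerate(coeffs, start=1):
        running += abs(c) ** 2
        if running >= energy_fraction * total:
            return n
    return len(coeffs)


def truncate_subband_filters(subband_filters, energy_fraction=0.99):
    """Truncate each subband filter independently, so lengths may differ."""
    return [f[:truncation_length(f, energy_fraction)] for f in subband_filters]


def filter_subband(signal, coeffs):
    """Direct-form FIR filtering of one subband signal (same length as input)."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            if n - k < 0:
                break  # no input samples before the start of the signal
            acc += c * signal[n - k]
        out.append(acc)
    return out


# Hypothetical example: a slowly decaying low subband and a quickly decaying
# high subband end up with different truncated lengths.
low_band = [0.9 ** k for k in range(64)]
high_band = [0.5 ** k for k in range(64)]
truncated = truncate_subband_filters([low_band, high_band])
```

With these made-up filters, the faster-decaying subband keeps fewer taps, which is where the reduced calculation amount would come from.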
-
Publication No.: US11593063B2
Publication Date: 2023-02-28
Application No.: US17240515
Filing Date: 2021-04-26
Applicant: SUPER HI FI, LLC
IPC Classification: G06F3/16, G11B27/036, G11B27/28, G05B15/02, G10L25/48
Abstract: Some embodiments include a production content server system with a computing device processing operations include causing a content reader server to couple to a content source with content using a wired or wireless link, and downloading at least one content file associated with content retrieved from the content source, where content file includes audio and/or a video. The operations include transcoding at least a portion of the at least one content file with a dynamic range compression to a specified dynamic range, equalization and duration, and processing at least one content audio file from the at least one content file. The operations further include storing the at least one content audio file to a production content database. Some embodiments include processing a production break audio file including blending the at least one production break audio file with at least one other content file.
-
Publication No.: US20230026085A1
Publication Date: 2023-01-26
Application No.: US17956303
Filing Date: 2022-09-29
Inventors: Toshitaka MURATA, Yasuo YAMADA, Keita HAYASHI
Abstract: A recording device includes an imaging data acquisition unit configured to acquire imaging data including video data and audio data imaging an inside of a vehicle or an outside of the vehicle to which the recording device is mounted, an event detection unit configured to detect occurrence of an event for the vehicle, a recording control unit configured to record first imaging data in a recording unit when recording of the imaging data in the recording unit is caused by the event detected by the event detection unit, and record second imaging data in the recording unit when recording of the imaging data in the recording unit is not caused by the event detected by the event detection unit, and a reproduction control unit configured to reproduce the video data and the audio data included in the first imaging data when reproducing the first imaging data.
-
Publication No.: US11557280B2
Publication Date: 2023-01-17
Application No.: US17101946
Filing Date: 2020-11-23
Applicant: Google LLC
Inventors: Jason Sanders, Gabriel Taubman, John J. Lee
IPC Classification: G10L15/22, G10L15/08, H04M3/493, G06F16/683, G10L15/18, G10L25/48, G10L21/0272, G10L15/26, G10L21/0208
Abstract: Implementations relate to techniques for providing context-dependent search results. A computer-implemented method includes receiving an audio stream at a computing device during a time interval, the audio stream comprising user speech data and background audio, separating the audio stream into a first substream that includes the user speech data and a second substream that includes the background audio, identifying concepts related to the background audio, generating a set of terms related to the identified concepts, influencing a speech recognizer based on at least one of the terms related to the background audio, and obtaining a recognized version of the user speech data using the speech recognizer.
-
5.
Publication No.: US20220358926A1
Publication Date: 2022-11-10
Application No.: US17870815
Filing Date: 2022-07-21
Applicant: Staton Techiya LLC
Inventors: Charles Cella, John Keady
IPC Classification: G10L15/22, G10L15/16, G10L15/30, G10L15/18, G10L25/48, H04R1/10, G10L15/02, G06F40/284
Abstract: According to some embodiments of the disclosure, a method is disclosed. The method includes receiving, by a processing device of an in-ear device, an audio signal from one or more microphones of the in-ear device. The method further includes extracting, by the processing device, one or more features of the audio signal and generating, by the processing device, an in-ear data object based on the one or more features. The method also includes publishing, by the processing device, the in-ear data object to an external system via a network.
-
Publication No.: US11490057B2
Publication Date: 2022-11-01
Application No.: US17326137
Filing Date: 2021-05-20
Inventors: Toshitaka Murata, Yasuo Yamada, Keita Hayashi
IPC Classification: H04N7/18, H04N5/765, G10L25/48, H04N9/802, H04N5/91, G07C5/00, H04N5/77, G07C5/08, G10L25/78
Abstract: A recording device according to an embodiment includes an imaging data acquisition unit configured to acquire imaging data including video data and audio data, an event detection unit configured to detect occurrence of an event; and a recording control unit configured to record first imaging data including the video data and the audio data in a recording unit when recording of the imaging data in the recording unit is caused by the event detected by the event detection unit, and record second imaging data including the video data and not including the audio data in the recording unit when recording of the imaging data in the recording unit is not caused by the event.
-
Publication No.: US11380348B2
Publication Date: 2022-07-05
Application No.: US17004015
Filing Date: 2020-08-27
Inventors: Chuan-Yu Chang, Jun-Ying Li
Abstract: A method for correcting infant crying identification includes the following steps: a detecting step provides an audio unit to detect a sound around an infant to generate a plurality of audio samples. A converting step provides a processing unit to convert the audio samples to generate a plurality of audio spectrograms. An extracting step provides a common model to extract the audio spectrograms to generate a plurality of infant crying features. An incremental training step provides an incremental model to train the infant crying features to generate an identification result. A judging step provides the processing unit to judge whether the identification result is correct according to a real result of the infant. When the identification result is different from the real result, an incorrect result is generated. A correcting step provides the processing unit to correct the incremental model according to the incorrect result.
-
Publication No.: US11355117B2
Publication Date: 2022-06-07
Application No.: US16990525
Filing Date: 2020-08-11
Applicant: Google LLC
IPC Classification: G10L15/22, G10L25/48, G10L15/08, G10L15/18, G10L15/26, G06F16/332, G06F3/16, G10L15/30
Abstract: Embodiments of the disclosure generally relate to a dialog system allowing for automatically reactivating a speech acquiring mode after the dialog system delivers a response to a user request. The reactivation parameters, such as a delay, depend on a number of predetermined factors and conversation scenarios. The embodiments further provide for a method of operating of the dialog system. An exemplary method comprises the steps of: activating a speech acquiring mode, receiving a first input of a user, deactivating the speech acquiring mode, obtaining a first response associated with the first input, delivering the first response to the user, determining that a conversation mode is activated, and, based on the determination, automatically re-activating the speech acquiring mode within a first predetermined time period after delivery of the first response to the user.
-
9.
Publication No.: US11355033B2
Publication Date: 2022-06-07
Application No.: US15949425
Filing Date: 2018-04-10
Applicant: Meta Platforms, Inc.
IPC Classification: G09B21/00, G01L5/00, G06N20/00, G06N3/04, G06N3/08, G10L13/00, G10L21/02, G08B6/00, G09B21/04, G10L15/02, G10L15/22, G10L21/0272, G06F3/01, G06F3/16, G10L25/18, G10L25/48, G10L19/00, G10L21/06, G10L15/16
Abstract: A method comprises inputting an audio signal into a machine learning circuit to compress the audio signal into a sequence of actuator signals. The machine learning circuit being trained by: receiving a training set of acoustic signals and pre-processing the training set of acoustic signals into pre-processed audio data. The pre-processed audio data including at least a spectrogram. The training further includes training the machine learning circuit using the pre-processed audio data. The neural network has a cost function based on a reconstruction error and a plurality of constraints. The machine learning circuit generates a sequence of haptic cues corresponding to the audio input. The sequence of haptic cues is transmitted to a plurality of cutaneous actuators to generate a sequence of haptic outputs.
-
10.
Publication No.: US11327710B2
Publication Date: 2022-05-10
Application No.: US16832883
Filing Date: 2020-03-27
Applicant: Adobe Inc.
Inventors: Nico Becherer, Sven Duwenhorst
Abstract: A computer-implemented method for audio signal processing includes analyzing a foreground audio signal to determine metrics corresponding to audio slices of the foreground audio signal. Each such metric indicates a value for an audio property of a respective audio slice. The method further includes computing a total metric for an audio slice as a function of a set of the metrics corresponding to a set of the audio slices including the audio slice. The method further includes adding a key frame to a track based on the total metric. The track includes the foreground audio signal and a background audio signal, and a location of the key frame corresponds to a location of the audio slice on the track. The key frame indicates a change to the audio property of the background audio signal at the location on the track, and the key frame is utilizable for audio ducking.
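Illustrative note (not part of the patent record): the slice-metric and key-frame steps in this abstract could be sketched along the following lines. This is only a rough sketch under stated assumptions, not Adobe's implementation. RMS level as the slice metric, a windowed mean as the "total metric", the threshold and gain values, and the function names `slice_metrics`, `total_metric`, and `ducking_key_frames` are all hypothetical choices.

```python
import math


def slice_metrics(samples, slice_len):
    """RMS level of each fixed-length slice of the foreground signal (assumed metric)."""
    metrics = []
    for start in range(0, len(samples), slice_len):
        chunk = samples[start:start + slice_len]
        metrics.append(math.sqrt(sum(x * x for x in chunk) / len(chunk)))
    return metrics


def total_metric(metrics, index, radius=1):
    """Mean of the metrics in a window of slices that includes slice `index`."""
    window = metrics[max(0, index - radius): index + radius + 1]
    return sum(window) / len(window)


def ducking_key_frames(samples, slice_len, threshold=0.1,
                       ducked_gain=0.3, radius=1):
    """Return (sample_position, background_gain) key frames.

    A key frame is added whenever the total metric crosses `threshold`:
    the background is lowered while the foreground is active and restored
    to full gain afterwards.
    """
    metrics = slice_metrics(samples, slice_len)
    key_frames = []
    ducked = False
    for i in range(len(metrics)):
        active = total_metric(metrics, i, radius) >= threshold
        if active != ducked:
            key_frames.append((i * slice_len, ducked_gain if active else 1.0))
            ducked = active
    return key_frames


# Hypothetical foreground: silence, a burst of speech-like level, silence.
foreground = [0.0] * 8 + [0.5] * 8 + [0.0] * 8
frames = ducking_key_frames(foreground, slice_len=4, radius=0)
```

Averaging over neighbouring slices (a `radius` above 0) smooths the decision so that short pauses in the foreground do not toggle the background gain.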