Method and apparatus for processing multimedia signals

    Publication No.: US11622218B2

    Publication Date: 2023-04-04

    Application No.: US17367629

    Filing Date: 2021-07-06

    Inventors: Hyunoh Oh, Taegyu Lee

    Abstract: The present invention relates to a method and an apparatus for processing a signal, which are used for effectively reproducing a multimedia signal, and more particularly, to a method and an apparatus for processing a signal, which are used for implementing filtering for a multimedia signal having a plurality of subbands with a low calculation amount.
    To this end, provided are a method for processing a multimedia signal including: receiving a multimedia signal having a plurality of subbands; receiving at least one set of proto-type filter coefficients for filtering each subband signal of the multimedia signal; converting the proto-type filter coefficients into a plurality of subband filter coefficients; truncating each set of subband filter coefficients based on filter order information obtained by at least partially using characteristic information extracted from the corresponding subband filter coefficients, the length of at least one set of truncated subband filter coefficients being different from the length of the truncated subband filter coefficients of another subband; and filtering the multimedia signal by using the truncated subband filter coefficients corresponding to each subband signal, and an apparatus for processing a multimedia signal using the same.
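
The per-subband truncation described in this abstract can be illustrated with a minimal sketch. The abstract does not say which "characteristic information" determines the filter order, so the sketch assumes cumulative energy: each subband filter is cut at the shortest length that retains a chosen fraction of its energy. The function names and the `energy_fraction` parameter are hypothetical, and the conversion from proto-type filter to subband filters is not shown.

```python
from typing import List, Sequence


def filter_order(coeffs: Sequence[float], energy_fraction: float = 0.95) -> int:
    """Shortest prefix length that holds `energy_fraction` of the filter energy.

    Assumption: cumulative energy stands in for the unspecified
    "characteristic information" of the abstract.
    """
    total = sum(c * c for c in coeffs)
    if total == 0.0:
        return 0
    running = 0.0
    for n, c in enumerate(coeffs, start=1):
        running += c * c
        if running >= energy_fraction * total:
            return n
    return len(coeffs)


def truncate_subband_filters(
    filters: Sequence[Sequence[float]], energy_fraction: float = 0.95
) -> List[List[float]]:
    """Truncate every subband filter to its own order (lengths may differ)."""
    return [list(f[: filter_order(f, energy_fraction)]) for f in filters]


def filter_subband(signal: Sequence[float], coeffs: Sequence[float]) -> List[float]:
    """Direct-form FIR filtering; the output has the same length as `signal`."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(coeffs):
            if n - k < 0:
                break  # no input samples before the start of the signal
            acc += h * signal[n - k]
        out.append(acc)
    return out


# A slowly decaying low subband keeps more taps than a fast decaying high one.
subband_filters = [[1.0, 0.5, 0.25, 0.125], [1.0, 0.01, 0.001, 0.0001]]
truncated = truncate_subband_filters(subband_filters)
print([len(t) for t in truncated])
```

Because the cost of FIR filtering grows with the number of taps, shortening the filters of fast decaying subbands is where the reduced calculation amount would come from in this sketch.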

    Audio content production, audio sequencing, and audio blending system and method

    Publication No.: US11593063B2

    Publication Date: 2023-02-28

    Application No.: US17240515

    Filing Date: 2021-04-26

    Applicant: SUPER HI FI, LLC

    Abstract: Some embodiments include a production content server system with a computing device whose processing operations include causing a content reader server to couple to a content source with content using a wired or wireless link, and downloading at least one content file associated with content retrieved from the content source, where the content file includes audio and/or video. The operations include transcoding at least a portion of the at least one content file with a dynamic range compression to a specified dynamic range, equalization and duration, and processing at least one content audio file from the at least one content file. The operations further include storing the at least one content audio file to a production content database. Some embodiments include processing a production break audio file including blending the at least one production break audio file with at least one other content file.

    RECORDING DEVICE, RECORDING METHOD, REPRODUCTION METHOD, AND PROGRAM

    Publication No.: US20230026085A1

    Publication Date: 2023-01-26

    Application No.: US17956303

    Filing Date: 2022-09-29

    Abstract: A recording device includes an imaging data acquisition unit configured to acquire imaging data including video data and audio data imaging an inside of a vehicle or an outside of the vehicle to which the recording device is mounted, an event detection unit configured to detect occurrence of an event for the vehicle, a recording control unit configured to record first imaging data in a recording unit when recording of the imaging data in the recording unit is caused by the event detected by the event detection unit, and record second imaging data in the recording unit when recording of the imaging data in the recording unit is not caused by the event detected by the event detection unit, and a reproduction control unit configured to reproduce the video data and the audio data included in the first imaging data when reproducing the first imaging data.

    Background audio identification for speech disambiguation

    Publication No.: US11557280B2

    Publication Date: 2023-01-17

    Application No.: US17101946

    Filing Date: 2020-11-23

    Applicant: Google LLC

    Abstract: Implementations relate to techniques for providing context-dependent search results. A computer-implemented method includes receiving an audio stream at a computing device during a time interval, the audio stream comprising user speech data and background audio, separating the audio stream into a first substream that includes the user speech data and a second substream that includes the background audio, identifying concepts related to the background audio, generating a set of terms related to the identified concepts, influencing a speech recognizer based on at least one of the terms related to the background audio, and obtaining a recognized version of the user speech data using the speech recognizer.

    Method and system for correcting infant crying identification

    Publication No.: US11380348B2

    Publication Date: 2022-07-05

    Application No.: US17004015

    Filing Date: 2020-08-27

    Abstract: A method for correcting infant crying identification includes the following steps. A detecting step provides an audio unit to detect a sound around an infant to generate a plurality of audio samples. A converting step provides a processing unit to convert the audio samples to generate a plurality of audio spectrograms. An extracting step provides a common model to extract the audio spectrograms to generate a plurality of infant crying features. An incremental training step provides an incremental model to train the infant crying features to generate an identification result. A judging step provides the processing unit to judge whether the identification result is correct according to a real result of the infant. When the identification result is different from the real result, an incorrect result is generated. A correcting step provides the processing unit to correct the incremental model according to the incorrect result.

    Dialog system with automatic reactivation of speech acquiring mode

    Publication No.: US11355117B2

    Publication Date: 2022-06-07

    Application No.: US16990525

    Filing Date: 2020-08-11

    Applicant: Google LLC

    Abstract: Embodiments of the disclosure generally relate to a dialog system allowing for automatically reactivating a speech acquiring mode after the dialog system delivers a response to a user request. The reactivation parameters, such as a delay, depend on a number of predetermined factors and conversation scenarios. The embodiments further provide for a method of operating the dialog system. An exemplary method comprises the steps of: activating a speech acquiring mode, receiving a first input of a user, deactivating the speech acquiring mode, obtaining a first response associated with the first input, delivering the first response to the user, determining that a conversation mode is activated, and, based on the determination, automatically re-activating the speech acquiring mode within a first predetermined time period after delivery of the first response to the user.
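
The sequence of steps in this abstract maps naturally onto a small state machine. The sketch below is only an illustration: the class, its fields, the `respond` callback and the 0.5 second delay are hypothetical, time is passed in explicitly instead of using real timers, and delivery of the response is treated as instantaneous.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class DialogSystem:
    respond: Callable[[str], str]        # hypothetical response backend
    conversation_mode: bool = True       # whether follow-up input is expected
    reactivation_delay: float = 0.5      # seconds after delivery (assumed value)
    listening: bool = False              # True while speech acquiring mode is on
    reactivate_at: Optional[float] = None
    log: List[str] = field(default_factory=list)

    def activate(self) -> None:
        """Turn the speech acquiring mode on and clear any pending timer."""
        self.listening = True
        self.reactivate_at = None
        self.log.append("listening on")

    def handle_input(self, text: str, now: float) -> str:
        """Take one user input, stop listening, deliver a response."""
        if not self.listening:
            raise RuntimeError("speech acquiring mode is not active")
        self.listening = False
        self.log.append("listening off")
        response = self.respond(text)
        self.log.append(f"delivered: {response}")
        if self.conversation_mode:
            # Schedule automatic re-activation relative to delivery time.
            self.reactivate_at = now + self.reactivation_delay
        return response

    def tick(self, now: float) -> None:
        """Advance the simulated clock; re-activate when the delay has elapsed."""
        if self.reactivate_at is not None and now >= self.reactivate_at:
            self.activate()


system = DialogSystem(respond=lambda text: f"echo: {text}")
system.activate()
system.handle_input("what's the weather", now=10.0)
system.tick(now=10.5)
print(system.listening)
```

With `conversation_mode=False` no timer is scheduled, so the system stays silent until `activate()` is called again explicitly.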

    Automatic audio ducking with real time feedback based on fast integration of signal levels

    Publication No.: US11327710B2

    Publication Date: 2022-05-10

    Application No.: US16832883

    Filing Date: 2020-03-27

    Applicant: Adobe Inc.

    Abstract: A computer-implemented method for audio signal processing includes analyzing a foreground audio signal to determine metrics corresponding to audio slices of the foreground audio signal. Each such metric indicates a value for an audio property of a respective audio slice. The method further includes computing a total metric for an audio slice as a function of a set of the metrics corresponding to a set of the audio slices including the audio slice. The method further includes adding a key frame to a track based on the total metric. The track includes the foreground audio signal and a background audio signal, and a location of the key frame corresponds to a location of the audio slice on the track. The key frame indicates a change to the audio property of the background audio signal at the location on the track, and the key frame is utilizable for audio ducking.
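
The pipeline in this abstract (per-slice metrics, a total metric over a set of slices, key frames on the track) can be sketched in a few lines. Everything specific here is an assumption made for illustration: mean-square level as the audio property, a trailing window averaged with prefix sums as the "function of a set of the metrics", a fixed threshold, and a background gain of 0.3 while ducked. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def slice_metrics(samples: Sequence[float], slice_len: int) -> List[float]:
    """Mean-square level of each consecutive slice of the foreground signal."""
    metrics = []
    for start in range(0, len(samples), slice_len):
        chunk = samples[start:start + slice_len]
        metrics.append(sum(x * x for x in chunk) / len(chunk))
    return metrics


def total_metrics(metrics: Sequence[float], window: int) -> List[float]:
    """Average of the metrics over a trailing window ending at each slice.

    Prefix sums make each window cost O(1) regardless of its size.
    """
    prefix = [0.0]
    for m in metrics:
        prefix.append(prefix[-1] + m)
    totals = []
    for i in range(len(metrics)):
        lo = max(0, i - window + 1)
        totals.append((prefix[i + 1] - prefix[lo]) / (i + 1 - lo))
    return totals


def ducking_keyframes(
    totals: Sequence[float],
    slice_len: int,
    threshold: float,
    ducked_gain: float = 0.3,
) -> List[Tuple[int, float]]:
    """Return (sample position, background gain) pairs at threshold crossings."""
    keyframes = []
    active = False
    for i, total in enumerate(totals):
        now_active = total >= threshold
        if now_active != active:
            gain = ducked_gain if now_active else 1.0
            keyframes.append((i * slice_len, gain))
            active = now_active
    return keyframes


# Foreground: silence, a burst of full-scale samples, then silence again.
foreground = [0.0] * 4 + [1.0] * 4 + [0.0] * 6
totals = total_metrics(slice_metrics(foreground, slice_len=2), window=2)
print(ducking_keyframes(totals, slice_len=2, threshold=0.4))
```

For this input the sketch yields a key frame lowering the background at sample 4 and one restoring it at sample 10; the trailing window holds the duck for one extra slice after the foreground falls silent.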