ELECTRONIC DEVICE FOR RECORDING CONTENTS DATA AND METHOD OF THE SAME

    Publication No.: US20230050178A1

    Publication date: 2023-02-16

    Application No.: US17887178

    Application date: 2022-08-12

    IPC classification: G10L21/055 G10L19/022

    Abstract: An electronic device according to various embodiments of the disclosure includes: a display configured to output image data of content based on execution of an application, a sound output module comprising circuitry configured to output audio data of the content, and a processor adaptively connected to the display and the sound output module, wherein the processor is configured to: identify a schedule for sequentially receiving read tasks (RTs) at a specified time interval to encode audio segments sequentially input in a specified size into an audio buffer from the audio data; control time points at which the RTs are called, based on at least one of a situation in which the RTs are received according to the schedule and an audio buffer state; and encode the audio segments corresponding to the RTs received at the controlled time points.
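
The read-task control described in this abstract can be illustrated with a minimal sketch. It is not the patented method: the function name `plan_read_calls`, the assumption that one audio segment is written into the buffer every `interval_ms`, and the two policies (drain every pending segment when an RT arrives late, defer the call when the buffer is empty) are all hypothetical choices made for illustration.

```python
def plan_read_calls(arrivals_ms, interval_ms):
    """Return a list of (call_time_ms, segments_read) pairs, one per read task (RT).

    Assumption (not from the source): segment k is fully written into the
    audio buffer at time k * interval_ms, for k = 1, 2, 3, ...
    """
    calls = []
    read_so_far = 0  # segments already handed to the encoder
    for arrival in arrivals_ms:
        produced = arrival // interval_ms   # segments written by the arrival time
        pending = produced - read_so_far    # audio buffer state at arrival
        if pending <= 0:
            # Buffer is empty: defer the call until the next segment is written.
            call_time = (read_so_far + 1) * interval_ms
            pending = 1
        else:
            # Buffer holds data: call now and drain everything that is pending,
            # which lets a late RT catch up with the schedule.
            call_time = arrival
        calls.append((call_time, pending))
        read_so_far += pending
    return calls


# RTs are scheduled every 20 ms; the third arrives late and the fifth arrives early.
plan = plan_read_calls([20, 40, 95, 100, 110], interval_ms=20)
```

In this run the late RT at 95 ms reads two segments at once, and the early RT at 110 ms is held until 120 ms, when the next segment is available.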

    Hypothesis stitcher for speech recognition of long-form audio

    Publication No.: US11574639B2

    Publication date: 2023-02-07

    Application No.: US17127938

    Application date: 2020-12-18

    Abstract: A hypothesis stitcher for speech recognition of long-form audio provides superior performance, such as higher accuracy and reduced computational cost. An example disclosed operation includes: segmenting an audio stream into a plurality of audio segments; identifying a plurality of speakers within each of the plurality of audio segments; performing automatic speech recognition (ASR) on each of the plurality of audio segments to generate a plurality of short-segment hypotheses; merging at least a portion of the short-segment hypotheses into a first merged hypothesis set; inserting stitching symbols into the first merged hypothesis set, the stitching symbols including a window change (WC) symbol; and consolidating, with a network-based hypothesis stitcher, the first merged hypothesis set into a first consolidated hypothesis. Multiple variations are disclosed, including alignment-based stitchers and serialized stitchers, which may operate as speaker-specific stitchers or multi-speaker stitchers, and may further support multiple options for differing hypothesis configurations.
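
The segmenting, merging, and symbol-insertion steps can be sketched as follows. This is a simplified illustration under stated assumptions, not the disclosed implementation: the `<WC>` spelling, the overlapping-window scheme with `window` and `hop`, and the function names `segment` and `merge_with_wc` are hypothetical. Speaker identification, ASR, and the network-based stitcher that consolidates the merged set are left out.

```python
WC = "<WC>"  # hypothetical spelling of the window change (WC) stitching symbol


def segment(samples, window, hop):
    """Split a sequence into windows of length `window`, advancing by `hop`.

    Windows overlap when hop < window; the final window may be shorter.
    """
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")
    segments = []
    start = 0
    while True:
        segments.append(samples[start:start + window])
        if start + window >= len(samples):
            break
        start += hop
    return segments


def merge_with_wc(hypotheses):
    """Concatenate per-segment token lists, placing WC at every window boundary."""
    merged = []
    for index, tokens in enumerate(hypotheses):
        if index > 0:
            merged.append(WC)
        merged.extend(tokens)
    return merged


windows = segment(list(range(10)), window=4, hop=3)
merged = merge_with_wc([["hello", "world"], ["world", "again"]])
```

Here `merged` keeps the duplicated word from the overlapping region on both sides of the `<WC>` marker; resolving that duplication is the job the abstract assigns to the stitcher.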

    Grouping and transport of audio objects

    Publication No.: US11570564B2

    Publication date: 2023-01-31

    Application No.: US16753698

    Application date: 2018-09-24

    Abstract: An apparatus for audio signal processing of audio objects within at least one audio scene, the apparatus comprising at least one processor configured to: define for at least one time period at least one contextual grouping comprising at least two of a plurality of audio objects and at least one further audio object of the plurality of audio objects outside of the at least one contextual grouping, the plurality of audio objects within at least one audio scene; and define with respect to the at least one contextual grouping at least one first parameter and/or parameter rule type which is configured to be applied with respect to a common element associated with the at least two of the plurality of audio objects and wherein the at least one first parameter and/or parameter rule type is configured to be applied with respect to an individual element associated with the at least one further audio object outside of the at least one contextual grouping, the at least one first parameter and/or parameter rule type being applied in audio rendering of both the at least two of the plurality of audio objects and the at least one further audio object.
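
One possible reading of this grouping rule is sketched below. The interpretation is an assumption, not the claimed apparatus: the "parameter rule" is taken to be a distance-based gain, the "common element" is taken to be the centroid of the grouped objects, and the "individual element" is taken to be each ungrouped object's own position. The names `AudioObject`, `distance_gain`, and `render_gains` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class AudioObject:
    name: str
    x: float
    y: float


def distance_gain(px, py, lx, ly):
    """Illustrative rule: gain falls off as 1/distance, capped at 1.0."""
    return 1.0 / max(1.0, math.hypot(px - lx, py - ly))


def render_gains(objects, group_names, listener=(0.0, 0.0)):
    """Apply one gain rule to grouped objects via their shared centroid
    and to ungrouped objects via their own positions."""
    members = [obj for obj in objects if obj.name in group_names]
    if len(members) < 2:
        raise ValueError("a contextual grouping needs at least two audio objects")
    # Common element of the grouping (assumed here to be the centroid).
    cx = sum(obj.x for obj in members) / len(members)
    cy = sum(obj.y for obj in members) / len(members)
    gains = {}
    for obj in objects:
        if obj.name in group_names:
            gains[obj.name] = distance_gain(cx, cy, *listener)
        else:
            gains[obj.name] = distance_gain(obj.x, obj.y, *listener)
    return gains


scene = [AudioObject("a", 2.0, 0.0), AudioObject("b", 6.0, 0.0), AudioObject("c", 0.0, 2.0)]
gains = render_gains(scene, group_names={"a", "b"})
```

Objects `a` and `b` receive the same gain because the rule is evaluated once at their shared centroid, while `c` is evaluated at its own position.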