Audio signal encoding for device control or content access in telecommunications

    Publication No.: US12192863B1

    Publication date: 2025-01-07

    Application No.: US18651756

    Filing date: 2024-05-01

    Applicant: Bandwidth Inc.

    Inventor: Steve McKinnon

    Abstract: Embodiments of the present disclosure provide techniques for improving interaction dynamics between users and telecommunications devices through the use of encoded audio signals. An interaction system can obtain user inputs during a telecommunications session, obtain encoded audio signals based on these inputs, and transmit the encoded signals. A telecommunications device can receive these encoded audio signals, decode them to identify embedded instructions, and execute the corresponding operations, thereby facilitating enhanced interaction dynamics.
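
The encode, transmit, decode flow described in this abstract can be illustrated with a minimal sketch. The abstract does not say how instructions are embedded in the audio, so the binary frequency-shift keying used here, the tone pair, the bit duration, and the names `encode`, `decode`, and `tone_power` are all assumptions made for illustration, not the patented method.

```python
import math

SAMPLE_RATE = 8000            # Hz, a common narrowband telephony rate (assumed)
BIT_SAMPLES = 160             # 20 ms of audio per bit (assumed)
FREQ_ZERO = 1200.0            # hypothetical tone for a 0 bit
FREQ_ONE = 2200.0             # hypothetical tone for a 1 bit


def encode(payload: bytes) -> list[float]:
    """Turn each bit of the payload into a short sine burst (binary FSK)."""
    samples = []
    for byte in payload:
        for bit_index in range(8):
            bit = (byte >> (7 - bit_index)) & 1   # most significant bit first
            freq = FREQ_ONE if bit else FREQ_ZERO
            samples.extend(
                math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
                for n in range(BIT_SAMPLES)
            )
    return samples


def tone_power(block: list[float], freq: float) -> float:
    """Estimate the power of one frequency in a block (Goertzel algorithm)."""
    coeff = 2 * math.cos(2 * math.pi * freq / SAMPLE_RATE)
    s_prev = s_prev2 = 0.0
    for x in block:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def decode(samples: list[float]) -> bytes:
    """Recover the payload by picking the stronger tone in each bit slot."""
    bits = []
    for start in range(0, len(samples) - BIT_SAMPLES + 1, BIT_SAMPLES):
        block = samples[start:start + BIT_SAMPLES]
        is_one = tone_power(block, FREQ_ONE) > tone_power(block, FREQ_ZERO)
        bits.append(1 if is_one else 0)
    out = bytearray()
    for i in range(0, len(bits) - 7, 8):
        value = 0
        for b in bits[i:i + 8]:
            value = (value << 1) | b
        out.append(value)
    return bytes(out)


# Hypothetical instruction table on the receiving device.
INSTRUCTIONS = {b"\x01": "unlock_content", b"\x02": "mute_microphone"}
received = decode(encode(b"\x01"))
operation = INSTRUCTIONS.get(received, "ignore")
```

In this sketch the receiving device looks up the decoded byte in a table and would then run the matching operation; a real system would also need framing, error detection, and robustness to codec distortion, none of which is shown.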

    Electronic apparatus for audio signal processing and operating method thereof

    Publication No.: US12192738B2

    Publication date: 2025-01-07

    Application No.: US17727202

    Filing date: 2022-04-22

    Abstract: A method of an electronic apparatus for audio signal processing includes obtaining a parameter related to spatialization of an audio object, obtaining rendering information based on the parameter related to spatialization, and rendering the audio object based on the rendering information. The parameter related to spatialization includes at least one of an object parameter of a feature of at least one of the audio object or a video object associated with the audio object, an electronic apparatus parameter of a feature of the electronic apparatus, or a user parameter of a feature of a user.

    Deep learning system for real time maximum sound pressure level prediction

    Publication No.: US12192714B2

    Publication date: 2025-01-07

    Application No.: US17805403

    Filing date: 2022-06-03

    Abstract: A method, apparatus, system, and computer program product for predicting sequential maximum sound pressure levels generated by an aircraft are provided. A first set of sequential maximum sound pressure levels, recorded by a first consecutive set of microphones along a flight path over a location during a flight of the aircraft, is identified. A second set of sequential maximum sound pressure levels, which will be recorded by a second consecutive set of the microphones along the flight path during the same flight, is then predicted. The prediction uses a set of deep learning models that have been trained on a dataset comprising historical aircraft sensor data for selected parameters, historical atmospheric data, and historical sound data recorded by microphones in a microphone system for flight paths over the location.

    Radiology report editing method and system

    Publication No.: US12191013B2

    Publication date: 2025-01-07

    Application No.: US17901228

    Filing date: 2022-09-01

    Abstract: The invention provides a radiology report editing method and system. The method comprises providing a radiology report; recording a speech command from a user to generate a speech file; processing the speech file using a speech recognition model, trained on a speech dataset of Vietnamese speech labeled with ground-truth text transcriptions, to generate a text command; processing the text command using a natural language understanding model trained to perform a classification task and a sequence tagging task, wherein the classification task classifies the text command into an intent, and the sequence tagging task tags each word in the text command to indicate whether the word expresses an intent, a content, or a position; extracting, from the text command, a content to be edited and a position of a sentence to be edited, based on the output of the sequence tagging task; and editing the radiology report based on the extracted content, the extracted position, and the extracted intent of the text command.
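
The last two steps of this abstract (slot extraction from the tagging output, then editing the report) can be sketched as follows. The tag names, the English example command, the ordinal lookup table, and the functions `extract_slots`, `resolve_position`, and `apply_edit` are hypothetical; the abstract does not define a tag set or an editing API, and the speech recognition and language understanding models are not modeled here.

```python
# Hypothetical mapping from a spoken ordinal to a zero-based sentence index.
ORDINALS = {"one": 0, "two": 1, "three": 2, "four": 3}


def extract_slots(tokens, tags):
    """Group words by their tag; words tagged 'O' belong to no slot."""
    slots = {"intent": [], "content": [], "position": []}
    for token, tag in zip(tokens, tags):
        if tag in slots:
            slots[tag].append(token)
    return {name: " ".join(words) for name, words in slots.items()}


def resolve_position(position_text):
    """Read the sentence index from a phrase such as 'sentence two'."""
    return ORDINALS[position_text.split()[-1]]


def apply_edit(sentences, intent, index, content):
    """Return a new list of report sentences with the edit applied."""
    edited = list(sentences)
    if intent == "replace":
        edited[index] = content
    elif intent == "insert":
        edited.insert(index, content)
    elif intent == "delete":
        del edited[index]
    else:
        raise ValueError(f"unsupported intent: {intent}")
    return edited


# Illustrative output of the two model tasks for one command.
tokens = ["replace", "sentence", "two", "with", "no", "acute", "findings"]
tags = ["intent", "position", "position", "O", "content", "content", "content"]
classified_intent = "replace"   # would come from the classification task

report = ["Lungs are clear.", "Possible nodule.", "Heart size normal."]
slots = extract_slots(tokens, tags)
new_report = apply_edit(
    report,
    classified_intent,
    resolve_position(slots["position"]),
    slots["content"],
)
```

Keeping extraction and editing separate means the same `apply_edit` step could be reused whatever model produces the tags.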

    Method and system for optimizing healthcare delivery

    Publication No.: US12191012B2

    Publication date: 2025-01-07

    Application No.: US16840015

    Filing date: 2020-04-03

    Applicant: OpticSurg Inc.

    Inventor: Tran Tu Huynh

    Abstract: A healthcare delivery system, a wearable computing device, and a method are disclosed. The method includes capturing, by the wearable computing device worn by a medical practitioner, medical multimedia data of a patient during a medical procedure. The method further includes displaying one or more selectable options associated with the medical multimedia data on a display screen of the wearable computing device to the medical practitioner. Further, the method includes receiving a selection of a selectable option from among the one or more selectable options. Thereafter, at least one action is performed on the medical multimedia data based on the selection of the selectable option. An example of the at least one action includes sharing at least a part of the medical multimedia data with a third party device.

    Methods and apparatus to perform speed-enhanced playback of recorded media

    Publication No.: US12190911B2

    Publication date: 2025-01-07

    Application No.: US18544650

    Filing date: 2023-12-19

    Abstract: Methods, apparatus, systems, and articles of manufacture to perform speed-enhanced playback of recorded media are disclosed. Example apparatus to play back media disclosed herein comprise at least one memory, machine-readable instructions, and processor circuitry to execute the machine-readable instructions to parse an audio frame included in the media to determine a number of skip bytes included in the audio frame, compare the number of skip bytes to a threshold, associate the audio frame with a plurality of candidate frames identified in the media when the number of skip bytes satisfies the threshold, and calculate a speed-enhanced playback rate for the media based on the plurality of candidate frames identified in the media.

    Speaker recognition with quality indicators

    Publication No.: US12190905B2

    Publication date: 2025-01-07

    Application No.: US17408281

    Filing date: 2021-08-20

    Abstract: Embodiments described herein provide for a machine-learning architecture for modeling quality measures for enrollment signals. Modeling these enrollment signals enables the machine-learning architecture to identify deviations from expected or ideal enrollment signals in future test-phase calls. These differences can be used to generate quality measures for the various audio descriptors or characteristics of audio signals. The quality measures can then be fused at the score level with the speaker recognition system's embedding comparisons to verify the speaker. Fusing the quality measures with the similarity scoring calibrates the speaker recognition outputs against what is expected for the enrolled caller and what was actually observed for the current inbound caller.
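
Score-level fusion of a similarity score with quality measures can be sketched as a simple logistic combination. The abstract does not state the fusion function, so the logistic form, the weights, the bias, and the quality measure names (`snr_deviation`, `duration_deviation`) below are assumptions chosen only to show the idea: the same embedding similarity yields a lower verification score when the test call deviates more from the enrollment conditions.

```python
import math


def fuse_scores(similarity, quality, weights, bias):
    """Combine an embedding similarity with quality measures into one score.

    `quality` maps a measure name to its deviation from the enrollment
    signal; each measure has a matching entry in `weights`.
    """
    z = bias + weights["similarity"] * similarity
    for name, value in quality.items():
        z += weights[name] * value
    return 1.0 / (1.0 + math.exp(-z))   # squash to a score between 0 and 1


# Hypothetical weights; in practice these would be learned from data.
weights = {"similarity": 6.0, "snr_deviation": -0.2, "duration_deviation": -0.1}
bias = -3.0

clean_call = fuse_scores(
    0.8, {"snr_deviation": 1.0, "duration_deviation": 0.5}, weights, bias
)
noisy_call = fuse_scores(
    0.8, {"snr_deviation": 12.0, "duration_deviation": 4.0}, weights, bias
)
```

Here `clean_call` comes out higher than `noisy_call` even though both use a similarity of 0.8, which is the calibration effect the abstract describes.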

    Signal processing

    Publication No.: US12190901B2

    Publication date: 2025-01-07

    Application No.: US17437835

    Filing date: 2020-02-28

    Inventor: Emmanuel Deruty

    Abstract: A signal processing method comprises comparing a first frequency domain representation of a sequence of power values for respective windows of source input samples of a source input signal with a second frequency domain representation of a sequence of power values for respective windows of target input samples of a target input signal so as to generate a frequency domain difference representation; inverse-frequency-transforming the frequency domain difference representation to generate a modification indication; and applying the modification indication to the source input samples to generate respective output samples of an output signal.
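
The steps in this abstract can be traced in a small sketch. Several choices here are assumptions that the abstract does not specify: power is measured in decibels over non-overlapping windows, the comparison is a plain subtraction of the two spectra, the modification indication is applied as a per-window gain, and the source and target have the same number of windows. The function names are hypothetical, and a direct DFT is used only to keep the example dependency-free.

```python
import cmath
import math


def window_power_db(samples, size):
    """Mean-square power in dB for each non-overlapping window."""
    powers = []
    for start in range(0, len(samples) - size + 1, size):
        window = samples[start:start + size]
        mean_square = sum(x * x for x in window) / size
        powers.append(10 * math.log10(mean_square + 1e-12))  # avoid log(0)
    return powers


def dft(values, inverse=False):
    """Direct discrete Fourier transform; fine for short sequences."""
    n = len(values)
    sign = 1 if inverse else -1
    out = [
        sum(v * cmath.exp(sign * 2j * cmath.pi * k * i / n)
            for i, v in enumerate(values))
        for k in range(n)
    ]
    return [x / n for x in out] if inverse else out


def match_power_envelope(source, target, size):
    """Reshape the source so its window powers follow the target's."""
    source_spectrum = dft(window_power_db(source, size))
    target_spectrum = dft(window_power_db(target, size))
    # Frequency domain difference representation (assumed: subtraction).
    difference = [t - s for s, t in zip(source_spectrum, target_spectrum)]
    # Inverse transform gives one gain value, in dB, per window.
    gain_db = [x.real for x in dft(difference, inverse=True)]
    output = []
    for w, g in enumerate(gain_db):
        gain = 10 ** (g / 20)                     # dB to amplitude ratio
        output.extend(gain * x for x in source[w * size:(w + 1) * size])
    return output


source = [0.5, -0.5] * 8                          # flat envelope, 4 windows
envelope = [0.1, 0.2, 0.4, 0.8]                   # target amplitude per window
target = [envelope[i // 4] * (1 if i % 2 == 0 else -1) for i in range(16)]
output = match_power_envelope(source, target, 4)
```

Working in the frequency domain of the power sequence also allows the difference to be filtered before the inverse transform, for example to transfer only slow loudness changes; this sketch leaves the difference unmodified.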
