-
Publication No.: US20240420709A1
Publication date: 2024-12-19
Application No.: US18813149
Application date: 2024-08-23
Applicant: DEKA Products Limited Partnership
Inventor: Dean Kamen , Derek G. Kane
IPC: G10L19/02 , G10L15/04 , G10L15/08 , G10L15/187 , G10L21/003 , G10L21/038 , H04M3/493
Abstract: A method for phoneme identification. The method includes receiving an audio signal from a speaker, performing initial processing comprising filtering the audio signal to remove audio features, the initial processing resulting in a modified audio signal, transmitting the modified audio signal to a phoneme identification method and a phoneme replacement method to further process the modified audio signal, and transmitting the modified audio signal to a speaker. Also, a system for identifying and processing audio signals. The system includes at least one speaker, at least one microphone, and at least one processor, wherein the processor processes audio signals received using a method for phoneme replacement.
-
Publication No.: US11948572B2
Publication date: 2024-04-02
Application No.: US17971997
Application date: 2022-10-24
Applicant: GOOGLE LLC
Inventor: Gaurav Bhaya , Robert Stets
IPC: G10L15/22 , G06F40/205 , G10L13/027 , G10L15/08 , G10L15/18 , G10L15/30 , G10L21/003 , G10L21/0316 , H04L65/1069
CPC classification number: G10L15/22 , G06F40/205 , G10L13/027 , G10L15/1815 , G10L15/1822 , G10L15/30 , G10L21/003 , G10L21/0316 , H04L65/1069 , G10L2015/088 , G10L2015/223
Abstract: Modulating packetized audio signals in a voice activated data packet based computer network environment is provided. A system can receive audio signals detected by a microphone of a device. The system can parse the audio signal to identify a trigger keyword and a request, and generate a first action data structure. The system can identify a content item object based on the trigger keyword, and generate an output signal comprising a first portion corresponding to the first action data structure and a second portion corresponding to the content item object. The system can apply a modulation to the first or second portion of the output signal, and transmit the modulated output signal to the device.
-
Publication No.: US11894007B2
Publication date: 2024-02-06
Application No.: US17667891
Application date: 2022-02-09
Applicant: Huawei Technologies Co., Ltd.
Inventor: Yang Gao , Fengyan Qi
Abstract: A method includes detecting whether there is a very short pitch lag in a speech or audio signal that is shorter than a conventional minimum pitch limitation using a combination of time domain and frequency domain pitch detection techniques. The pitch detection techniques include using pitch correlations in a time domain and detecting a lack of low frequency energy in the speech or audio signal in a frequency domain. The detected very short pitch lag is coded using a pitch range from a predetermined minimum very short pitch limitation that is smaller than the conventional minimum pitch limitation.
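The two-stage check in this abstract (a time-domain pitch correlation, confirmed by a lack of low-frequency energy) can be illustrated with a minimal sketch. It is not the patented method: the function names, the lag limits (10 and 34 samples), the thresholds, and the naive DFT are all assumptions chosen for illustration.

```python
import math


def normalized_correlation(x, lag):
    """Normalized correlation between x[n] and x[n + lag]."""
    a, b = x[:-lag], x[lag:]
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den if den > 0 else 0.0


def band_energy(x, sample_rate, f_lo, f_hi):
    """Energy of the DFT bins with frequency in [f_lo, f_hi) (naive DFT)."""
    n = len(x)
    energy = 0.0
    for k in range(n // 2 + 1):
        if f_lo <= k * sample_rate / n < f_hi:
            re = sum(x[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
            im = sum(x[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
            energy += re * re + im * im
    return energy


def detect_very_short_pitch(x, sample_rate, conventional_min_lag=34,
                            very_short_min_lag=10, corr_threshold=0.8,
                            low_energy_ratio=0.05):
    """Return a lag below the conventional minimum, or None (all limits hypothetical)."""
    # Time domain: strongest correlation among lags shorter than the conventional minimum.
    lags = range(very_short_min_lag, conventional_min_lag)
    best_lag = max(lags, key=lambda lag: normalized_correlation(x, lag))
    if normalized_correlation(x, best_lag) < corr_threshold:
        return None
    # Frequency domain: a genuine very short lag implies little energy below
    # the highest pitch frequency the conventional range can represent.
    f_cut = sample_rate / conventional_min_lag
    low = band_energy(x, sample_rate, 0.0, f_cut)
    total = band_energy(x, sample_rate, 0.0, sample_rate)
    if total == 0 or low / total > low_energy_ratio:
        return None
    return best_lag


# Hypothetical test signals, 240 samples each.
SAMPLE_RATE = 16000
short_pitch = [math.sin(2 * math.pi * n / 20) for n in range(240)]
# Period-40 signal with a strong second harmonic: lag 20 may correlate well,
# but the low-frequency energy at the true fundamental rejects it.
harmonic = [0.3 * math.sin(2 * math.pi * n / 40) + math.sin(2 * math.pi * n / 20)
            for n in range(240)]

short_result = detect_very_short_pitch(short_pitch, SAMPLE_RATE)
harmonic_result = detect_very_short_pitch(harmonic, SAMPLE_RATE)
```

In this sketch the frequency-domain test is what separates a true very short pitch from a longer pitch whose harmonic happens to correlate well at a short lag.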
-
Publication No.: US20230351059A1
Publication date: 2023-11-02
Application No.: US17732708
Application date: 2022-04-29
Applicant: Zoom Video Communications, Inc.
Inventor: Shane P. SPRINGER , Alexander Waibel
IPC: G06F21/84 , H04L12/18 , G10L21/003 , G10L15/02
CPC classification number: G06F21/84 , H04L12/1831 , G10L21/003 , G10L15/02 , H04L12/1822
Abstract: Systems and methods for providing automated personal privacy during virtual meetings are provided herein. The method may include establishing, by a video conference provider, a video conference having a plurality of participants. The method may also include receiving, from a first client device associated with one of the plurality of participants, a first audio stream and a first video stream, and recording, responsive to an indication from one of the plurality of participants, one or more audio or video streams within a recording. The method may include receiving, from the first client device, a personal privacy request. In response to the personal privacy request, the method may include modifying, by the video conference provider, at least one of the first audio stream or the first video stream in the recording and storing the at least one of the first audio stream or the first video stream as modified to the recording.
-
Publication No.: US11792577B2
Publication date: 2023-10-17
Application No.: US17230750
Application date: 2021-04-14
Applicant: OrCam Technologies Ltd.
Inventor: Yonatan Wexler , Amnon Shashua
IPC: H04R25/00 , G06F3/16 , H04N5/38 , G06K9/00 , H04N5/225 , H04R1/08 , G10L17/00 , G10L25/51 , G10L17/04 , G10L17/06 , G10L17/18 , G10L21/003 , G10L21/034 , G03B31/00 , G06F1/16 , G10L21/0272 , H04N7/18 , G10L15/26 , G06V20/10 , G06V40/10 , G06V40/16 , G06V40/20 , G06F18/21 , G06F18/25 , H04N23/51 , G06V10/80 , G06F18/00
CPC classification number: H04R25/407 , G03B31/00 , G06F1/163 , G06F1/1686 , G06F3/165 , G06F3/167 , G06F18/21 , G06F18/251 , G06V10/803 , G06V20/10 , G06V40/10 , G06V40/16 , G06V40/165 , G06V40/171 , G06V40/172 , G06V40/20 , G10L15/26 , G10L17/00 , G10L17/04 , G10L17/06 , G10L17/18 , G10L21/003 , G10L21/0272 , G10L21/034 , G10L25/51 , H04N5/38 , H04N7/185 , H04N23/51 , H04R1/08 , H04R25/405 , H04R25/45 , H04R25/505 , H04R25/554 , H04R25/558 , H04R25/60 , H04R25/606 , H04R25/65 , G06F18/00 , H04R2225/025 , H04R2225/41 , H04R2225/43 , H04R2225/55 , H04R2460/01 , H04R2460/13
Abstract: A system may include a wearable camera configured to capture images and a microphone configured to capture sounds. The system may also include a processor programmed to receive the images; identify a representation of one or more individuals in the images; receive from the microphone a first audio signal associated with a voice; determine, based on analysis of the images, that the first audio signal is not associated with a voice of any of the one or more individuals; receive from the microphone a second audio signal associated with a voice; determine, based on analysis of the images, that the second audio signal is associated with a voice of one of the one or more individuals; and cause a first amplification of the first audio signal and a second amplification of the second audio signal. The first amplification may differ from the second amplification in one aspect.
-
Publication No.: US11763831B2
Publication date: 2023-09-19
Application No.: US17198125
Application date: 2021-03-10
Applicant: Yahoo Japan Corporation
Inventor: Kota Tsubouchi , Teruhiko Teraoka , Hidehito Gomi , Junichi Sato
IPC: G10L21/003 , G10L15/06 , G10L15/22
CPC classification number: G10L21/003 , G10L15/06 , G10L15/22
Abstract: An output apparatus according to the present application includes a prediction unit and an output unit. The prediction unit predicts whether or not waveform information having a predetermined context is generated on the basis of detection information detected by a predetermined detection device. The output unit outputs waveform information having an opposite phase to the waveform information having the predetermined context in a case where it has been predicted that the waveform information having the predetermined context is generated.
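The opposite-phase output described in this abstract can be illustrated with a minimal sketch. The names `opposite_phase` and `output_unit`, the boolean prediction flag, and the 440 Hz test tone are assumptions for illustration; the abstract does not specify how the prediction or the waveform is obtained.

```python
import math


def opposite_phase(waveform):
    """Return the sample-wise negation, i.e. a 180-degree phase inversion."""
    return [-sample for sample in waveform]


def output_unit(predicted, waveform):
    """Emit the anti-phase waveform only when generation was predicted."""
    return opposite_phase(waveform) if predicted else None


# Hypothetical predicted waveform: a 440 Hz tone sampled at 8 kHz.
SAMPLE_RATE = 8000
expected = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(80)]

cancelling = output_unit(True, expected)
# Superposing the two signals leaves no residual in this idealized case.
residual = [a + b for a, b in zip(expected, cancelling)]
```

In practice the cancellation is only as good as the prediction and the timing alignment; the sketch assumes both are perfect.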
-
Publication No.: US11736656B1
Publication date: 2023-08-22
Application No.: US17880100
Application date: 2022-08-03
Applicant: Lemon Inc.
Inventor: Long Jiang , Keting Pan , Ryan Northway
IPC: H04N5/262 , G10L21/003 , H04N5/272 , G10L25/57 , H04N23/63
CPC classification number: H04N5/2621 , G10L21/003 , G10L25/57 , H04N5/272 , H04N23/632
Abstract: Provided are an effect video determination method and apparatus, an electronic device and a storage medium. The method includes: acquiring effect operation information in a process of shooting a video, where the effect operation information includes at least one of a speech effect operation, a touch effect operation or a gesture effect operation; retrieving a target to-be-added effect corresponding to the effect operation information from an effect repository; fusing the target to-be-added effect and a to-be-processed video frame to determine a target effect video frame; and determining a target effect video based on a plurality of target effect video frames.
-
Publication No.: US20230111040A1
Publication date: 2023-04-13
Application No.: US17971997
Application date: 2022-10-24
Applicant: GOOGLE LLC
Inventor: Gaurav Bhaya , Robert Stets
IPC: G10L15/22 , H04L65/1069 , G10L15/18 , G10L15/30 , G10L21/003 , G10L21/0316 , G10L13/027 , G06F40/205
Abstract: Modulating packetized audio signals in a voice activated data packet based computer network environment is provided. A system can receive audio signals detected by a microphone of a device. The system can parse the audio signal to identify a trigger keyword and a request, and generate a first action data structure. The system can identify a content item object based on the trigger keyword, and generate an output signal comprising a first portion corresponding to the first action data structure and a second portion corresponding to the content item object. The system can apply a modulation to the first or second portion of the output signal, and transmit the modulated output signal to the device.
-
Publication No.: US20230080418A1
Publication date: 2023-03-16
Application No.: US18056932
Application date: 2022-11-18
Applicant: OrCam Technologies Ltd.
Inventor: Yonatan WEXLER , Amnon SHASHUA
IPC: H04R25/00 , G06F3/16 , G06K9/62 , G10L17/04 , G10L17/06 , G10L17/18 , G10L21/003 , G10L21/034 , G10L25/51 , H04R1/08 , G03B31/00 , G06F1/16 , G10L21/0272 , H04N7/18 , G10L17/00 , H04N5/225 , H04N5/38 , G10L15/26 , G06V20/10 , G06V40/10 , G06V40/16 , G06V40/20
Abstract: A system for selectively amplifying audio signals may include a microphone configured to capture sounds from an environment of a user. The system may also include a processor programmed to: receive audio signals representative of the sounds captured by the microphone; cause selective conditioning of at least one audio signal received by the microphone from a region associated with the recognized individual; and cause transmission of the at least one conditioned audio signal to a hearing interface device configured to provide sound to an ear of the user.
-
Publication No.: US11605369B2
Publication date: 2023-03-14
Application No.: US17197323
Application date: 2021-03-10
Applicant: Spotify AB
Inventor: Marco Marchini
IPC: G10L13/033 , G10L21/003 , G06N20/00 , G10L25/45 , G10L25/75 , G10L15/06
Abstract: An audio translation system includes a feature extractor and a style transfer machine learning model. The feature extractor generates for each of a plurality of source voice files one or more source voice parameters encoded as a collection of source feature vectors, and generates for each of a plurality of target voice files one or more target voice parameters encoded as a collection of target feature vectors. The style transfer machine learning model is trained on the collection of source feature vectors for the plurality of source voice files and the collection of target feature vectors for the plurality of target voice files to generate a style transformed feature vector.