Patent search ipc:"G10L25/54" Page 1

1.

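Invention application
Wake-up method and apparatus for smart device, storage medium, and electronic apparatus
Status: Under examination - Published

Publication (announcement) number: WO2023273747A1

Publication (announcement) date: 2023-01-05

Application number: PCT/CN2022/095732

Application date: 2022-05-27

Applicant: 青岛海尔科技有限公司 , 海尔智家股份有限公司

Inventor: 郝斌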

IPC: G10L21/0208 , G10L15/22 , G10L25/54 , G10L2015/223 , G10L2021/02082

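Abstract: A wake-up method and apparatus for a smart device, a storage medium, and an electronic apparatus. The method includes: obtaining, from a plurality of smart devices, the smart devices that are allowed to be woken up by a wake-up signal as candidate devices; when there are multiple candidate devices, determining a target wake-up angle and a target wake-up energy for each of the candidate devices; and determining a target device from the candidate devices according to the target wake-up angles and target wake-up energies, where the target device is used to respond to the wake-up signal. This addresses problems in the related art such as the low accuracy of determining which smart device responds to a wake-up instruction.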
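
To make the selection step in this abstract concrete, here is a minimal sketch. The abstract does not say how the wake-up angle and energy are weighed against each other, so the ranking rule below (smallest absolute angle first, highest energy as tie-breaker) is purely an assumption, as are the `Device` fields and the sample values.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    wake_allowed: bool       # whether this device may respond to the wake-up signal
    wake_angle_deg: float    # assumed: angle between the device's front and the speaker
    wake_energy: float       # assumed: energy of the wake-up signal as received


def pick_target(devices):
    """Return the device that should respond, or None if no device is eligible."""
    candidates = [d for d in devices if d.wake_allowed]
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]
    # Assumed rule: prefer the smallest absolute angle, then the highest energy.
    return min(candidates, key=lambda d: (abs(d.wake_angle_deg), -d.wake_energy))


devices = [
    Device("speaker", True, 40.0, 0.9),
    Device("fridge", True, 10.0, 0.6),
    Device("tv", False, 5.0, 0.8),
]
target = pick_target(devices)
```

In this made-up example the TV is excluded because it is not allowed to be woken up, and the remaining two candidates are ranked by angle.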

2.

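Invention application
KEYWORD VARIATION FOR QUERYING FOREIGN LANGUAGE AUDIO RECORDINGS
Status: Under examination - Published

Publication (announcement) number: WO2022272281A1

Publication (announcement) date: 2022-12-29

Application number: PCT/US2022/073113

Application date: 2022-06-23

Applicant: SRI INTERNATIONAL

Inventor: KATHOL, Andreas , RICHEY, Colleen , ABRASH, Victor , KWON, Homin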

IPC: G06F16/632 , G10L15/08 , G10L15/187 , G10L15/26 , G10L21/06 , G10L25/54 , G06F16/638 , G10L21/12

Abstract: Techniques are disclosed for searching audio recordings in a second language with a key phrase in a first language. For example, a system as described herein receives a first key phrase in the first language and an audio recording in the second language. The system converts the first key phrase into a second key phrase in the second language. The system processes the second key phrase to produce a second key phrase variant. The system identifies, from a graph of words in the second language generated from the audio recording, instances of the second key phrase or the second key phrase variant within the audio recording. The system displays the identified instances of the second key phrase or the second key phrase variant within the audio recording to enhance searchability of the audio recording in the second language.

3.

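Invention application
INTERACTIVE AUDIO ENTERTAINMENT SYSTEM FOR VEHICLES
Status: Under examination - Published

Publication (announcement) number: WO2022178122A1

Publication (announcement) date: 2022-08-25

Application number: PCT/US2022/016788

Application date: 2022-02-17

Applicant: CERENCE OPERATING COMPANY

Inventor: BEN GIGI, Yitshak Lior , VACHON, Caitlin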

IPC: G10L15/22 , B60W50/08 , G06F3/16 , G10L13/00 , G10L25/54

Abstract: A system for interacting with an audio stream to obtain lyric information, control playback of the audio stream, and control aspects of the audio stream. In some instances, end users can request that the audio stream play with a lead vocal track or without a lead vocal track. Obtaining lyric information includes receiving via a text to speech module an audio playback of the lyric information.

4.

Invention application
CLASSIFICATION OF AUDITORY AND VISUAL MEETING DATA TO INFER IMPORTANCE OF USER UTTERANCES
Status: Under examination - Published

Publication (announcement) number: WO2021247156A1

Publication (announcement) date: 2021-12-09

Application number: PCT/US2021/028278

Application date: 2021-04-21

Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventor: KIKIN-GIL, Erez , PARISH, Daniel Yancy

IPC: G10L25/54 , G06F16/34 , G10L15/16 , G10L15/26 , G06T7/00 , G06F16/345 , G06K9/6256 , G06K9/6293 , G06N3/0445 , G06N3/08 , G06V10/82 , G06V20/41 , G06V20/47 , G06V2201/10 , G06V40/16 , G06V40/172 , G10L17/00 , G10L17/18 , H04N7/155

Abstract: In non-limiting examples of the present disclosure, systems, methods and devices for generating summary content are presented. Voice audio data and video data for an electronic meeting may be received. A language processing model may be applied to a transcript of the audio data and textual importance scores may be calculated. A video/image model may be applied to the video data and visual importance scores may be calculated. A combined importance score may be calculated for sections of the electronic meeting based on the textual importance scores and the visual importance scores. A meeting summary that includes summary content from sections for which combined importance scores exceed a threshold value may be generated.
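
As a rough illustration of the scoring step described in this abstract, the sketch below combines per-section textual and visual importance scores and keeps the sections whose combined score exceeds a threshold. The weighted average, the `text_weight` parameter, the threshold value, and all names are assumptions for illustration; the abstract does not specify how the two scores are combined.

```python
from dataclasses import dataclass


@dataclass
class Section:
    """One section of a meeting with hypothetical, pre-computed scores in [0, 1]."""
    label: str
    textual_score: float
    visual_score: float


def combined_score(section, text_weight=0.5):
    """Weighted average of the two scores (an assumed combination rule)."""
    return text_weight * section.textual_score + (1.0 - text_weight) * section.visual_score


def select_summary_sections(sections, threshold=0.6, text_weight=0.5):
    """Return labels of sections whose combined score exceeds the threshold."""
    return [s.label for s in sections if combined_score(s, text_weight) > threshold]


sections = [
    Section("intro", textual_score=0.25, visual_score=0.25),
    Section("decision", textual_score=1.0, visual_score=0.5),
    Section("wrap-up", textual_score=0.5, visual_score=0.75),
]
summary = select_summary_sections(sections)
```

With these made-up scores, only the sections scoring above 0.6 would feed the generated summary; raising the threshold makes the summary shorter.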

5.

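Invention application
Interaction method and apparatus, earphone, and earphone storage apparatus
Status: Under examination - Published

Publication (announcement) number: WO2021244057A1

Publication (announcement) date: 2021-12-09

Application number: PCT/CN2021/074912

Application date: 2021-02-02

Applicant: 北京搜狗智能科技有限公司

Inventor: 崔文华 , 赵楠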

IPC: G10L15/22 , G10L15/26 , G10L15/18 , G10L15/30 , G10L25/54 , G10L15/1822 , G10L2015/223

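Abstract: Embodiments of the present invention provide an interaction method, an interaction apparatus, an earphone, and an earphone storage apparatus. The earphone is communicatively connected to the earphone storage apparatus and has an interaction assistant. The method includes: the earphone obtaining, from the earphone storage apparatus, a speech recognition result for the user's speech; and invoking the interaction assistant to perform an interaction operation according to the speech recognition result. The user does not need to operate the earphone by hand, and multiple interaction functions of the earphone are realized.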

6.

Invention application
INPUT SPEECH QUALITY MATCHING
Status: Under examination - Published

Publication (announcement) number: WO2016209924A1

Publication (announcement) date: 2016-12-29

Application number: PCT/US2016/038708

Application date: 2016-06-22

Applicant: AMAZON TECHNOLOGIES, INC.

Inventor: BASYE, Kenneth John , TOTH, Arthur Richard , BARTON, William Folwell

IPC: G10L15/22 , G10L25/54 , G06F17/30 , G10L15/26 , G10L13/033 , G10L17/26

CPC classification number: G10L15/26 , G10L13/02 , G10L13/033 , G10L15/18 , G10L15/22 , G10L17/26 , G10L25/54 , G10L2015/223 , G10L2015/225

Abstract: A system matches text-to-speech (TTS) or other output to a quality of an input spoken utterance. The system uses trained models to detect a speech quality and generates an indicator of the speech quality. The speech quality may be determined from audio or non-audio data. The indicator is sent to downstream components of the system such as a command processor or TTS system. The output of the system is then determined using the indicator of speech quality, thus customizing an output of the system to the manner in which the utterance was spoken.


7.

Invention application
PERCEPTION BASED MULTIMEDIA PROCESSING
Status: Under examination - Published

Publication (announcement) number: WO2016003735A1

Publication (announcement) date: 2016-01-07

Application number: PCT/US2015/037484

Application date: 2015-06-24

Applicant: DOLBY LABORATORIES LICENSING CORPORATION

Inventor: BAUER, Claus , LU, Lie , HU, Mingqing , WANG, Jun , CRUM, Poppy , WILSON, Rhonda , RADHAKRISHNAN, Regunathan

IPC: G06K9/62 , G10L25/03 , G10L25/54

CPC classification number: G10L25/54 , G06K9/6259 , G06K9/6261 , G10L25/03

Abstract: Example embodiments disclosed herein relate to perception based multimedia processing. There is provided a method for processing multimedia data, the method includes automatically determining user perception on a segment of the multimedia data based on a plurality of clusters, the plurality of clusters obtained in association with predefined user perceptions and processing the segment of the multimedia data at least in part based on determined user perception on the segment. Corresponding system and computer program products are disclosed as well.


8.

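Invention application
Information processing device and information processing method
Status: Under examination - Published

Publication (announcement) number: WO2014155526A1

Publication (announcement) date: 2014-10-02

Application number: PCT/JP2013/058791

Application date: 2013-03-26

Applicant: 株式会社東芝

Inventor: 舘野 剛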

IPC: G10L25/54 , G06F17/30 , G11B27/10

CPC classification number: G10H1/0041 , G06F17/30743 , G10H2240/151 , G10L25/54 , G11B27/28

Abstract: According to an embodiment, an information processing device includes a search means (28c) and an analysis means (28d). The search means (28c) performs a song search at predetermined time intervals on the content to be analyzed. The analysis means (28d) analyzes the playback state of the songs contained in the content, based on the song search results obtained by the search means (28c) at the predetermined time intervals.

9.

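Invention application
APPARATUS, METHOD AND COMPUTER PROGRAM CODE FOR PROCESSING AUDIO STREAM
Status: Under examination - Published

Publication (announcement) number: WO2023285425A1

Publication (announcement) date: 2023-01-19

Application number: PCT/EP2022/069393

Application date: 2022-07-12

Applicant: UTOPIA MUSIC AG

Inventor: WAHLGREN, Linus , FLACH, Max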

IPC: G10L25/54 , G10L25/18

Abstract: Apparatus, method, and computer program code for processing audio stream. The method includes: obtaining (202) first peaks of an audio stream, wherein the first peak comprises a first peak amplitude at a first frequency and at a first time offset from a beginning of the audio stream; for each first peak, detecting (216, 218) a second peak in a window with a predetermined offset from the first peak, wherein the second peak comprises a second peak amplitude at a second frequency and at a second time offset from the beginning of the audio stream; and for each first peak, generating (216, 222) a fingerprint hash based on the first frequency, a time difference between the first time offset and the second time offset, a frequency difference between the first frequency and the second frequency, and an amplitude difference between the first amplitude and the second amplitude.
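
The peak-pairing idea in this abstract can be sketched as follows. This is only an illustration under stated assumptions: peaks are taken as already extracted `(time, frequency, amplitude)` tuples, the window offset and length are invented values, the strongest peak in the window is chosen as the second peak, and the quantisation and SHA-1 truncation are arbitrary choices rather than anything stated in the abstract.

```python
import hashlib
from typing import NamedTuple


class Peak(NamedTuple):
    time_s: float      # time offset from the beginning of the stream, in seconds
    freq_hz: float     # frequency of the peak
    amplitude: float   # peak amplitude (arbitrary units)


def find_second_peak(first, peaks, window_offset_s=0.5, window_length_s=2.0):
    """Pick the strongest peak inside a window that starts a fixed offset after `first`."""
    start = first.time_s + window_offset_s
    end = start + window_length_s
    candidates = [p for p in peaks if start <= p.time_s <= end]
    return max(candidates, key=lambda p: p.amplitude) if candidates else None


def fingerprint_hash(first, second):
    """Hash the first frequency plus the time, frequency and amplitude differences."""
    fields = (
        round(first.freq_hz),
        round((second.time_s - first.time_s) * 1000),   # time difference in milliseconds
        round(second.freq_hz - first.freq_hz),
        round(second.amplitude - first.amplitude),
    )
    text = "|".join(str(f) for f in fields)
    return hashlib.sha1(text.encode("ascii")).hexdigest()[:16]


def fingerprints(peaks):
    """Return (hash, first-peak time) pairs for every first peak that has a partner."""
    result = []
    for first in peaks:
        second = find_second_peak(first, peaks)
        if second is not None:
            result.append((fingerprint_hash(first, second), first.time_s))
    return result


peaks = [
    Peak(0.0, 440.0, 10.0),
    Peak(1.0, 660.0, 8.0),
    Peak(2.0, 880.0, 12.0),
    Peak(9.0, 300.0, 5.0),
]
prints = fingerprints(peaks)
```

Because the hash uses only differences in time (not absolute time offsets), the same passage yields the same hashes wherever it occurs in a stream; the absolute offset is kept alongside each hash so that matches can be aligned.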

10.

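Invention application
SYSTEM AND METHOD FOR SMART BROADCAST MANAGEMENT
Status: Under examination - Published

Publication (announcement) number: WO2022243778A1

Publication (announcement) date: 2022-11-24

Application number: PCT/IB2022/054124

Application date: 2022-05-04

Applicant: COCHLEAR LIMITED

Inventor: WINDEYER, Jamon , CHEN, Henry, Hu , FRIEDING, Jan, Patrick , FUNG, Stephen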

IPC: G10L25/78 , G10L15/04 , G10L25/54 , H04R1/10 , H04R25/00

Abstract: An apparatus includes voice activity detection (VAD) circuitry configured to analyze one or more audio broadcast streams and to identify first segments of the one or more broadcast streams in which the audio data includes speech data. The apparatus further includes derivation circuitry configured to receive the first segments and, for each first segment, to derive one or more words from the speech data of the first segment. The apparatus further includes keyword detection circuitry configured to, for each first segment, receive the one or more words and to generate keyword information indicative of whether at least one word of the one or more words is among a set of stored keywords. The apparatus further includes decision circuitry configured to receive the first segments, the one or more words of each of the first segments, and the keyword information for each of the first segments and, for each first segment, to select, based at least in part on the keyword information, among a plurality of options regarding communication of information indicative of the first segment to a recipient.
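
The keyword-detection and decision stages of this abstract might look like the sketch below. It assumes that voice activity detection and word derivation have already produced a list of words per speech segment; the keyword set, the two `Action` options, and the rule "notify if any keyword is present" are hypothetical stand-ins for the "plurality of options" the abstract mentions.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical options for handling a speech segment."""
    NOTIFY_RECIPIENT = "notify"
    IGNORE = "ignore"


STORED_KEYWORDS = {"platform", "delayed", "evacuate"}  # assumed keyword set


def detect_keywords(words, stored_keywords=STORED_KEYWORDS):
    """Return the words (lower-cased) that appear among the stored keywords."""
    return {w.lower() for w in words} & stored_keywords


def decide(words, stored_keywords=STORED_KEYWORDS):
    """Choose an action for one speech segment based on the keyword information."""
    hits = detect_keywords(words, stored_keywords)
    return Action.NOTIFY_RECIPIENT if hits else Action.IGNORE


# Each inner list stands for the words derived from one speech segment.
segments = [
    ["The", "train", "is", "delayed"],
    ["Thank", "you", "for", "travelling"],
]
decisions = [decide(words) for words in segments]
```

A real system would likely weigh more than keyword presence (the abstract says the selection is based "at least in part" on the keyword information), so the decision function here is deliberately minimal.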
