KEYWORD VARIATION FOR QUERYING FOREIGN LANGUAGE AUDIO RECORDINGS

    Publication No.: WO2022272281A1

    Publication date: 2022-12-29

    Application No.: PCT/US2022/073113

    Application date: 2022-06-23

    Abstract: Techniques are disclosed for searching audio recordings in a second language with a key phrase in a first language. For example, a system as described herein receives a first key phrase in the first language and an audio recording in the second language. The system converts the first key phrase into a second key phrase in the second language. The system processes the second key phrase to produce a second key phrase variant. The system identifies, from a graph of words in the second language generated from the audio recording, instances of the second key phrase or the second key phrase variant within the audio recording. The system displays the identified instances of the second key phrase or the second key phrase variant within the audio recording to enhance searchability of the audio recording in the second language.
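    The search step in this abstract can be sketched as follows. This is a minimal illustration, not the applicant's implementation: it assumes the word graph is a list of timed word hypotheses, and that translation and variant generation are supplied by the caller. The names `Edge`, `find_phrase` and `search`, and the toy Spanish data, are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Edge(NamedTuple):
    """One word hypothesis in the graph, spanning start..end in seconds."""
    start: float
    end: float
    word: str


def find_phrase(edges, phrase):
    """Return sorted (start, end) spans of every graph path spelling `phrase`."""
    if not phrase:
        return []
    by_start = defaultdict(list)
    for e in edges:
        by_start[e.start].append(e)
    hits = set()

    def extend(t_begin, t_now, i):
        # All words matched: record the span covered by the path.
        if i == len(phrase):
            hits.add((t_begin, t_now))
            return
        # Follow only edges that begin where the previous word ended.
        for e in by_start.get(t_now, []):
            if e.word == phrase[i]:
                extend(t_begin, e.end, i + 1)

    for e in edges:
        if e.word == phrase[0]:
            extend(e.start, e.end, 1)
    return sorted(hits)


def search(edges, key_phrase, translate, make_variants):
    """Translate the first-language phrase, then look up it and its variants."""
    second = translate(key_phrase)
    found = {}
    for candidate in [second, *make_variants(second)]:
        spans = find_phrase(edges, candidate.split())
        if spans:
            found[candidate] = spans
    return found


# Toy data (assumed): a Spanish word graph with one competing hypothesis.
GRAPH = [
    Edge(0.0, 0.4, "el"),
    Edge(0.4, 0.9, "coche"),
    Edge(0.4, 0.9, "noche"),
    Edge(0.9, 1.3, "rojo"),
    Edge(1.3, 1.7, "y"),
    Edge(1.7, 2.2, "carro"),
    Edge(2.2, 2.6, "rojo"),
]
TRANSLATIONS = {"red car": "coche rojo"}
VARIANTS = {"coche rojo": ["carro rojo", "auto rojo"]}

hits = search(GRAPH, "red car", TRANSLATIONS.__getitem__,
              lambda p: VARIANTS.get(p, []))
```

    Here `hits` maps each matched phrase or variant to the time spans where it occurs, which is the information a display layer would need to highlight instances in the recording.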

    INPUT SPEECH QUALITY MATCHING
    Invention application, under examination (published)

    Publication No.: WO2016209924A1

    Publication date: 2016-12-29

    Application No.: PCT/US2016/038708

    Application date: 2016-06-22

    Abstract: A system matches text-to-speech (TTS) or other output to a quality of an input spoken utterance. The system uses trained models to detect a speech quality and generates an indicator of the speech quality. The speech quality may be determined from audio or non-audio data. The indicator is sent to downstream components of the system such as a command processor or TTS system. The output of the system is then determined using the indicator of speech quality, thus customizing an output of the system to the manner in which the utterance was spoken.
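    The flow from detected speech quality to customized output might look like the sketch below. The abstract specifies trained models; the RMS-energy rule here is only a stand-in so the example runs on its own, and the quality labels, thresholds and TTS settings are all assumptions.

```python
import math

# Hypothetical TTS settings per speech-quality indicator.
TTS_STYLE = {
    "whisper": {"voice": "whispered", "volume": 0.3},
    "normal": {"voice": "neutral", "volume": 0.7},
    "shout": {"voice": "emphatic", "volume": 1.0},
}


def detect_speech_quality(samples, whisper_rms=0.05, shout_rms=0.5):
    """Stand-in for the trained model: classify by RMS energy alone."""
    if not samples:
        return "normal"
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms < whisper_rms:
        return "whisper"
    if rms > shout_rms:
        return "shout"
    return "normal"


def respond(text, samples):
    """Pass the indicator downstream and shape the output to match it."""
    indicator = detect_speech_quality(samples)
    return {"text": text, "indicator": indicator, **TTS_STYLE[indicator]}


quiet_reply = respond("The time is 9 pm.", [0.01, -0.02, 0.01])
plain_reply = respond("The time is 9 pm.", [0.2, -0.3, 0.25])
```

    The point of the design is that the indicator is computed once and travels with the request, so any downstream component can adapt to it.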

    PERCEPTION BASED MULTIMEDIA PROCESSING
    Invention application, under examination (published)

    Publication No.: WO2016003735A1

    Publication date: 2016-01-07

    Application No.: PCT/US2015/037484

    Application date: 2015-06-24

    CPC classification number: G10L25/54 G06K9/6259 G06K9/6261 G10L25/03

    Abstract: Example embodiments disclosed herein relate to perception based multimedia processing. A method for processing multimedia data is provided. The method includes automatically determining user perception of a segment of the multimedia data based on a plurality of clusters, the clusters being obtained in association with predefined user perceptions, and processing the segment at least in part based on the determined user perception of the segment. A corresponding system and computer program products are also disclosed.
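    One way to read the cluster-based step is as a nearest-centroid lookup followed by a perception-specific processing choice. The sketch below assumes exactly that. The two-dimensional feature space, the perception labels and the presets are invented for illustration and do not come from the application.

```python
import math

# Hypothetical clusters: perception label -> centroids in a made-up
# 2-D feature space (loudness 0..1, tempo 0..1).
CLUSTERS = {
    "calm": [(0.2, 0.2), (0.3, 0.1)],
    "exciting": [(0.8, 0.9), (0.9, 0.7)],
}

# Hypothetical processing presets selected per perception.
PRESETS = {
    "calm": {"dynamic_range_compression": "light"},
    "exciting": {"dynamic_range_compression": "heavy"},
}


def determine_perception(features, clusters):
    """Return the label of the cluster whose centroid is closest."""
    best_label, best_dist = None, math.inf
    for label, centroids in clusters.items():
        for centroid in centroids:
            d = math.dist(features, centroid)
            if d < best_dist:
                best_label, best_dist = label, d
    return best_label


def process_segment(features, clusters=CLUSTERS, presets=PRESETS):
    """Pick the processing preset that matches the determined perception."""
    perception = determine_perception(features, clusters)
    return perception, presets[perception]


calm_result = process_segment((0.25, 0.15))
lively_result = process_segment((0.85, 0.8))
```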

    INFORMATION PROCESSING DEVICE AND INFORMATION PROCESSING METHOD (情報処理装置及び情報処理方法)
    Invention application, under examination (published)

    Publication No.: WO2014155526A1

    Publication date: 2014-10-02

    Application No.: PCT/JP2013/058791

    Application date: 2013-03-26

    Inventor: 舘野 剛

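    Abstract: According to an embodiment, an information processing device includes search means 28c and analysis means 28d. The search means 28c performs a song search at predetermined time intervals on the content to be analyzed. The analysis means 28d analyzes the playback state of the songs contained in the content, based on the song search results that the search means 28c obtains at the predetermined time intervals.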
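    The abstract does not say what form the playback-state analysis takes. One plausible reading, sketched here as an assumption, is that consecutive intervals returning the same song are merged into a single playback segment. The function name, the result format and the sample data are hypothetical.

```python
def playback_segments(results, interval):
    """Merge per-interval song search hits into (song, start, end) segments.

    `results` is a time-ordered list of (time, song_id) pairs, one per
    search, where song_id is None when the search found no song.
    """
    segments = []
    for t, song in results:
        if song is None:
            continue
        last = segments[-1] if segments else None
        # Extend the current segment only if the same song continues
        # without a gap; otherwise start a new segment.
        if last and last[0] == song and last[2] == t:
            segments[-1] = (song, last[1], t + interval)
        else:
            segments.append((song, t, t + interval))
    return segments


# Assumed sample: one search every 10 seconds.
RESULTS = [(0, "A"), (10, "A"), (20, None), (30, "B"), (40, "B"), (50, "B")]
segments = playback_segments(RESULTS, interval=10)
```

    With this sample, song A is reported as playing from 0 to 20 seconds and song B from 30 to 60 seconds, with the unmatched interval left as a gap.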

    APPARATUS, METHOD AND COMPUTER PROGRAM CODE FOR PROCESSING AUDIO STREAM

    Publication No.: WO2023285425A1

    Publication date: 2023-01-19

    Application No.: PCT/EP2022/069393

    Application date: 2022-07-12

    Abstract: Apparatus, method, and computer program code for processing audio stream. The method includes: obtaining (202) first peaks of an audio stream, wherein the first peak comprises a first peak amplitude at a first frequency and at a first time offset from a beginning of the audio stream; for each first peak, detecting (216, 218) a second peak in a window with a predetermined offset from the first peak, wherein the second peak comprises a second peak amplitude at a second frequency and at a second time offset from the beginning of the audio stream; and for each first peak, generating (216, 222) a fingerprint hash based on the first frequency, a time difference between the first time offset and the second time offset, a frequency difference between the first frequency and the second frequency, and an amplitude difference between the first amplitude and the second amplitude.
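    The hash construction can be sketched as below, assuming peak detection has already produced quantized (time, frequency, amplitude) triples. Several details are assumptions that the abstract leaves open: the window is taken along the time axis only, the strongest peak in the window is chosen as the second peak, and SHA-1 is used as the hash function. The names `Peak` and `fingerprint` are hypothetical.

```python
import hashlib
from typing import NamedTuple


class Peak(NamedTuple):
    time: int  # frame index counted from the start of the stream
    freq: int  # frequency bin
    amp: int   # quantized amplitude


def fingerprint(peaks, offset=1, width=5):
    """Return (hash, first_peak_time) pairs, one per first peak with a partner.

    The second peak is searched in frames [t + offset, t + offset + width).
    """
    out = []
    for p in peaks:
        lo, hi = p.time + offset, p.time + offset + width
        window = [q for q in peaks if lo <= q.time < hi]
        if not window:
            continue
        q = max(window, key=lambda c: c.amp)  # assumed selection rule
        # The four values named in the abstract: first frequency, time
        # difference, frequency difference and amplitude difference.
        key = f"{p.freq}|{q.time - p.time}|{q.freq - p.freq}|{q.amp - p.amp}"
        digest = hashlib.sha1(key.encode()).hexdigest()[:16]
        out.append((digest, p.time))
    return out


PEAKS = [Peak(0, 100, 50), Peak(2, 120, 40), Peak(9, 90, 60)]
prints = fingerprint(PEAKS)
# The same audio starting 100 frames later should yield the same hashes.
shifted = fingerprint([Peak(p.time + 100, p.freq, p.amp) for p in PEAKS])
```

    Because the hash uses only differences in time, it does not depend on where in the stream the peak pair occurs, which is what lets a short excerpt be matched against a longer recording.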

    SYSTEM AND METHOD FOR SMART BROADCAST MANAGEMENT

    Publication No.: WO2022243778A1

    Publication date: 2022-11-24

    Application No.: PCT/IB2022/054124

    Application date: 2022-05-04

    Abstract: An apparatus includes voice activity detection (VAD) circuitry configured to analyze one or more audio broadcast streams and to identify first segments of the one or more broadcast streams in which the audio data includes speech data. The apparatus further includes derivation circuitry configured to receive the first segments and, for each first segment, to derive one or more words from the speech data of the first segment. The apparatus further includes keyword detection circuitry configured to, for each first segment, receive the one or more words and to generate keyword information indicative of whether at least one word of the one or more words is among a set of stored keywords. The apparatus further includes decision circuitry configured to receive the first segments, the one or more words of each of the first segments, and the keyword information for each of the first segments and, for each first segment, to select, based at least in part on the keyword information, among a plurality of options regarding communication of information indicative of the first segment to a recipient.
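    The last two stages of this pipeline, keyword detection and decision, can be sketched in software as follows. The sketch assumes that voice activity detection and word derivation have already run upstream and supplied the words for each segment. The keyword sets and the three decision options are invented for illustration; the abstract only says that a selection is made among a plurality of options.

```python
# Hypothetical stored keywords and the subset treated as urgent.
KEYWORDS = frozenset({"evacuate", "fire", "delay", "closure"})
URGENT = frozenset({"evacuate", "fire"})


def keyword_info(words, keywords=KEYWORDS):
    """Return the stored keywords present in the segment, sorted."""
    return sorted({w.lower() for w in words} & keywords)


def decide(words, keywords=KEYWORDS, urgent=URGENT):
    """Choose how to communicate the segment to the recipient."""
    found = keyword_info(words, keywords)
    if any(k in urgent for k in found):
        return "notify_now"
    if found:
        return "log_for_later"
    return "ignore"


# Assumed output of the upstream stages: one word list per speech segment.
SEGMENTS = [
    ["Please", "evacuate", "the", "platform"],
    ["Expect", "a", "short", "delay"],
    ["Next", "stop", "Central"],
]
decisions = [decide(words) for words in SEGMENTS]
```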
