Attention based sequential image processing

    Publication No.: US11205123B2

    Publication date: 2021-12-21

    Application No.: US16773456

    Filing date: 2020-01-27

    Abstract: Techniques facilitating attention based sequential image processing are provided. A system can comprise a memory that stores computer executable components and a processor that executes the computer executable components stored in the memory. The computer executable components can comprise an initialization component that can perform self-attention based training on a model that comprises context information associated with a sequence of images. Images of the sequence of images can be selected during the self-attention based training. The computer executable components can also comprise a localization component that can extract local information from the images selected during the self-attention based training based on the context information. In addition, the computer executable components can also comprise an integration component that can update the model based on an end-to-end integrated attention training framework comprising the context information and the local information.

    Attention based sequential image processing
    Granted patent

    Publication No.: US10671918B2

    Publication date: 2020-06-02

    Application No.: US15792051

    Filing date: 2017-10-24

    Abstract: Techniques facilitating attention based sequential image processing are provided. A system can comprise a memory that stores computer executable components and a processor that executes the computer executable components stored in the memory. The computer executable components can comprise an initialization component that can perform self-attention based training on a model that comprises context information associated with a sequence of images. Images of the sequence of images can be selected during the self-attention based training. The computer executable components can also comprise a localization component that can extract local information from the images selected during the self-attention based training based on the context information. In addition, the computer executable components can also comprise an integration component that can update the model based on an end-to-end integrated attention training framework comprising the context information and the local information.

    Pattern based audio searching method and system

    Publication No.: US10671666B2

    Publication date: 2020-06-02

    Application No.: US15185316

    Filing date: 2016-06-17

    Abstract: A pattern based audio searching method includes labeling a plurality of source audio data based on patterns to obtain audio label sequences of the source audio data; obtaining, with a processing device, an audio label sequence of target audio data; determining a matching degree between the target audio data and the source audio data according to a predetermined matching rule based on the audio label sequence of the target audio data and the audio label sequences of the source audio data; and outputting source audio data having a matching degree higher than a predetermined matching threshold as a search result.
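The matching step described in this abstract can be illustrated with a minimal sketch. It assumes the audio has already been labeled, and it uses the similarity ratio from Python's `difflib.SequenceMatcher` as a stand-in for the "predetermined matching rule", which the abstract does not specify. The function names, label values, and the 0.6 threshold are all hypothetical.

```python
from difflib import SequenceMatcher


def matching_degree(target_labels, source_labels):
    """Similarity in [0, 1] between two audio label sequences.

    Assumption: SequenceMatcher.ratio() stands in for the patent's
    unspecified matching rule.
    """
    return SequenceMatcher(None, target_labels, source_labels).ratio()


def search(target_labels, sources, threshold=0.6):
    """Return (name, degree) pairs whose degree exceeds the threshold, best first."""
    hits = []
    for name, labels in sources.items():
        degree = matching_degree(target_labels, labels)
        if degree > threshold:
            hits.append((name, degree))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical label sequences produced by an earlier labeling step.
sources = {
    "clip_a": ["speech", "music", "speech", "applause"],
    "clip_b": ["music", "music", "silence", "music"],
    "clip_c": ["speech", "music", "speech", "silence"],
}
target = ["speech", "music", "speech", "applause"]

results = search(target, sources)
for name, degree in results:
    print(f"{name}: {degree:.2f}")
```

Only sources scoring above the threshold are returned, so raising the threshold trades recall for precision.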

    EMOTION CLASSIFICATION BASED ON EXPRESSION VARIATIONS ASSOCIATED WITH SAME OR SIMILAR EMOTIONS

    Publication No.: US20200026957A1

    Publication date: 2020-01-23

    Application No.: US16587701

    Filing date: 2019-09-30

    Abstract: Techniques are described that facilitate automatically distinguishing between different expressions of a same or similar emotion. In one embodiment, a computer-implemented method is provided that comprises partitioning, by a device operatively coupled to a processor, a data set comprising facial expression data into different clusters of the facial expression data based on one or more distinguishing features respectively associated with the different clusters, wherein the facial expression data reflects facial expressions respectively expressed by people. The computer-implemented method can further comprise performing, by the device, a multi-task learning process to determine a final number of the different clusters for the data set, wherein the multi-task learning process is dependent on an output of an emotion classification model that classifies emotion types respectively associated with the facial expressions.

    Method and system for achieving emotional text to speech utilizing emotion tags expressed as a set of emotion vectors
    Granted patent (in force)

    Publication No.: US09570063B2

    Publication date: 2017-02-14

    Application No.: US14807052

    Filing date: 2015-07-23

    CPC classification number: G10L13/10 G10L13/02 G10L13/08

    Abstract: A method and system for achieving emotional text to speech (TTS). The method includes: receiving text data; generating an emotion tag for the text data by rhythm piece; and achieving TTS for the text data corresponding to the emotion tag, where the emotion tag is expressed as a set of emotion vectors and each emotion vector includes a plurality of emotion scores given based on a plurality of emotion categories. A system for the same includes: a text data receiving module; an emotion tag generating module; and a TTS module for achieving TTS, wherein the emotion tag is expressed as a set of emotion vectors, and wherein each emotion vector includes a plurality of emotion scores given based on a plurality of emotion categories.
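A minimal sketch of the data structure this abstract describes: an emotion tag held as a set of emotion vectors, one per rhythm piece, where each vector carries one score per emotion category. The category list, the keyword lexicon, and the scoring function are hypothetical placeholders. The abstract does not say how scores are computed, and the synthesis step itself is omitted here.

```python
from dataclasses import dataclass

# Hypothetical emotion categories; the abstract does not enumerate them.
CATEGORIES = ("neutral", "happy", "sad", "angry")


@dataclass
class EmotionVector:
    scores: tuple  # one score per entry in CATEGORIES, in the same order

    def dominant(self):
        """Name of the category with the highest score."""
        best = max(range(len(CATEGORIES)), key=lambda i: self.scores[i])
        return CATEGORIES[best]


def toy_scores(piece):
    """Hypothetical keyword-based scorer, used for illustration only."""
    lexicon = {"great": "happy", "terrible": "sad", "hate": "angry"}
    scores = {category: 0.0 for category in CATEGORIES}
    scores["neutral"] = 0.1  # small baseline so plain text reads as neutral
    for word in piece.lower().split():
        category = lexicon.get(word.strip(".,!?"))
        if category:
            scores[category] += 1.0
    return [scores[category] for category in CATEGORIES]


def tag_text(pieces, score_fn):
    """Build the emotion tag: one (rhythm piece, EmotionVector) pair per piece."""
    return [(piece, EmotionVector(tuple(score_fn(piece)))) for piece in pieces]


tag = tag_text(["What a great day", "I hate rain", "It is Tuesday"], toy_scores)
for piece, vector in tag:
    print(piece, vector.scores, vector.dominant())
```

Keeping a full score vector per piece, rather than a single label, lets a downstream TTS stage blend or smooth emotions across neighbouring pieces.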

    Pattern based audio searching method and system
    Granted patent (in force)

    Publication No.: US09396256B2

    Publication date: 2016-07-19

    Application No.: US14105874

    Filing date: 2013-12-13

    CPC classification number: G06F17/30743 G06F17/30752 G06N99/005

    Abstract: A pattern based audio searching method includes labeling a plurality of source audio data based on patterns to obtain audio label sequences of the source audio data; obtaining, with a processing device, an audio label sequence of target audio data; determining a matching degree between the target audio data and the source audio data according to a predetermined matching rule based on the audio label sequence of the target audio data and the audio label sequences of the source audio data; and outputting source audio data having a matching degree higher than a predetermined matching threshold as a search result.

    Data processing method, presentation method, and corresponding apparatuses
    Granted patent (in force)

    Publication No.: US09158753B2

    Publication date: 2015-10-13

    Application No.: US13943308

    Filing date: 2013-07-16

    CPC classification number: G06F17/27 G06F17/2765 G10L15/18 G10L15/187 G10L15/22

    Abstract: A data processing method includes obtaining text information corresponding to a presented content, the presented content comprising a plurality of areas; performing text analysis on the text information to obtain a first keyword sequence, the first keyword sequence including area keywords associated with at least one area of the plurality of areas; obtaining speech information related to the presented content, the speech information at least comprising a current speech segment; and using a first model network to perform analysis on the current speech segment to determine the area corresponding to the current speech segment, wherein the first model network comprises the first keyword sequence.
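The area-selection idea in this abstract can be sketched as follows. This is a simplification under stated assumptions: the patent's "first model network" analyzes the speech segment itself, whereas this sketch assumes the segment has already been transcribed to text and picks the area by keyword overlap. The function names, stopword list, and slide content are hypothetical.

```python
# Hypothetical stopword list used to keep only content words as area keywords.
STOPWORDS = {"the", "a", "of", "and", "to", "in", "is", "was", "by"}


def tokenize(text):
    """Lowercase words with surrounding punctuation stripped, minus stopwords."""
    words = {word.strip(".,;:!?").lower() for word in text.split()}
    return words - STOPWORDS - {""}


def build_keyword_sequence(area_texts):
    """Map each area of the presented content to its set of area keywords."""
    return {area: tokenize(text) for area, text in area_texts.items()}


def locate_area(segment_transcript, keyword_sequence):
    """Return the area sharing the most keywords with the segment, or None."""
    words = tokenize(segment_transcript)
    overlap = {area: len(words & keywords) for area, keywords in keyword_sequence.items()}
    best = max(overlap, key=overlap.get)
    return best if overlap[best] > 0 else None


# Hypothetical slide with three areas.
areas = {
    "title": "Quarterly revenue overview",
    "chart": "Revenue growth by region",
    "notes": "Risks and next steps",
}
keywords = build_keyword_sequence(areas)

print(locate_area("Growth was strongest in the Asia region", keywords))
print(locate_area("Hello everyone", keywords))
```

Returning `None` when no keyword matches avoids highlighting an arbitrary area while the speaker is off-script.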

    VOICE BASED BIOMETRIC AUTHENTICATION METHOD AND APPARATUS
    Patent application (in force)

    Publication No.: US20140359739A1

    Publication date: 2014-12-04

    Application No.: US14291059

    Filing date: 2014-05-30

    CPC classification number: G06F21/32 G06F21/46 G10L17/02 G10L17/24

    Abstract: Voice based biometric authentication method, apparatus (system), and computer program product. Provided is a voice verification solution with a high accuracy rate that can prevent cheating via recording. The method includes: transmitting to the user a question prompt requiring the user to speak out a voice segment and an answer to a dynamic question, the voice segment having a corresponding text dependent speaker verification model enrolled before the authentication; segmenting, in response to receiving the voice answer, the voice segment part and the dynamic question answer part out from the voice answer; and verifying boundary smoothness between the voice segment and the answer to the dynamic question within the voice answer. With this method, whether a voice answer involves cheating via recording is determined according to the degree of smoothness at a detected boundary. The apparatus and computer program product carry out the steps of the above-mentioned method.
