System and method for combining frame and segment level processing, via temporal pooling, for phonetic classification
    11.
    Granted invention patent (in force)

    Publication number: US09208778B2

    Publication date: 2015-12-08

    Application number: US14537400

    Filing date: 2014-11-10

    CPC classification number: G10L15/02 G10L15/08 G10L15/16

    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for combining frame and segment level processing, via temporal pooling, for phonetic classification. A frame processor unit receives an input and extracts the time-dependent features from the input. A plurality of pooling interface units generates a plurality of feature vectors based on pooling the time-dependent features and selecting a plurality of time-dependent features according to a plurality of selection strategies. Next, a plurality of segmental classification units generates scores for the feature vectors. Each segmental classification unit (SCU) can be dedicated to a specific pooling interface unit (PIU) to form a PIU-SCU combination. Multiple PIU-SCU combinations can be further combined to form an ensemble of combinations, and the ensemble can be diversified by varying the pooling operations used by the PIU-SCU combinations. Based on the scores, the plurality of segmental classification units selects a class label and returns a result.

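The pipeline described in the abstract above (frame-level features, pooling interface units, segmental classification units, and an ensemble of PIU-SCU combinations) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the split of a segment into thirds, the `max`/`mean` pooling operations, the linear scorer, the toy weights, and the summing of scores across combinations are assumptions chosen only to make the idea concrete.

```python
from statistics import mean


def pool(frames, op, start, end):
    """Pool each feature dimension over frames[start:end] with `op` (e.g. max or mean)."""
    window = frames[start:end]
    return [op(column) for column in zip(*window)]


def pooling_interface(frames, op):
    """Hypothetical PIU: split the segment into thirds and pool each third.

    Returns a fixed-length vector (3 * feature_dim) regardless of segment length.
    """
    n = len(frames)
    bounds = [0, n // 3, 2 * n // 3, n]
    vector = []
    for a, b in zip(bounds, bounds[1:]):
        # Guarantee at least one frame per window for very short segments.
        vector.extend(pool(frames, op, a, max(b, a + 1)))
    return vector


def linear_scu(weights):
    """Hypothetical SCU: one weight vector per class label, score = dot product."""
    def score(vector):
        return {label: sum(w * x for w, x in zip(ws, vector))
                for label, ws in weights.items()}
    return score


def classify(frames, combos):
    """Sum the scores of every (pooling op, SCU) combination; return the best label."""
    totals = {}
    for op, scu in combos:
        for label, s in scu(pooling_interface(frames, op)).items():
            totals[label] = totals.get(label, 0.0) + s
    return max(totals, key=totals.get)


# Toy segment: three frames, two time-dependent features per frame.
frames = [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7]]
# Illustrative, untrained weights for two made-up phone labels.
weights = {"aa": [1, 0] * 3, "iy": [0, 1] * 3}
# The ensemble is diversified by varying the pooling operation per combination.
combos = [(max, linear_scu(weights)), (mean, linear_scu(weights))]
label = classify(frames, combos)
```

Varying only the pooling operation between combinations mirrors the abstract's remark that the ensemble can be diversified through the pooling operations; a real system would also train a separate classifier for each pooling interface.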

    AUGMENTED MULTI-TIER CLASSIFIER FOR MULTI-MODAL VOICE ACTIVITY DETECTION
    12.
    Invention patent application (in force)

    Publication number: US20150058004A1

    Publication date: 2015-02-26

    Application number: US13974453

    Filing date: 2013-08-23

    CPC classification number: G10L25/78 G06K9/00335 G10L15/24 G10L25/84

    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for detecting voice activity in a media signal in an augmented, multi-tier classifier architecture. A system configured to practice the method can receive, from a first classifier, a first voice activity indicator detected in a first modality for a human subject. Then, the system can receive, from a second classifier, a second voice activity indicator detected in a second modality for the human subject, wherein the first voice activity indicator and the second voice activity indicator are based on the human subject at a same time, and wherein the first modality and the second modality are different. The system can concatenate, via a third classifier, the first voice activity indicator and the second voice activity indicator with original features of the human subject, to yield a classifier output, and determine voice activity based on the classifier output.

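The augmented multi-tier arrangement in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the two modalities (audio energy and lip motion), the threshold-based first-tier classifiers, the logistic third classifier, and all weights and thresholds are hypothetical and are not taken from the application.

```python
import math


def audio_indicator(energy, threshold=0.5):
    """Hypothetical first classifier: 1.0 if frame energy exceeds a threshold."""
    return 1.0 if energy > threshold else 0.0


def visual_indicator(lip_motion, threshold=0.3):
    """Hypothetical second classifier: 1.0 if lip motion exceeds a threshold."""
    return 1.0 if lip_motion > threshold else 0.0


def augmented_classifier(energy, lip_motion, weights, bias):
    """Third classifier: concatenate both indicators with the original
    features and apply a single logistic unit to the combined vector."""
    x = [audio_indicator(energy), visual_indicator(lip_motion), energy, lip_motion]
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


WEIGHTS = [1.5, 1.5, 1.0, 1.0]  # illustrative values, not trained
BIAS = -2.0


def voice_active(energy, lip_motion):
    """Decide voice activity from the third classifier's output."""
    return augmented_classifier(energy, lip_motion, WEIGHTS, BIAS) >= 0.5


speaking = voice_active(0.9, 0.8)   # loud audio and moving lips
silent = voice_active(0.1, 0.05)    # quiet audio and still lips
```

Both inputs are measured for the same subject at the same time, as the abstract requires. The point of the sketch is that the final classifier sees the raw features alongside the first-tier decisions rather than the decisions alone.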

    Navigation Route Updates
    13.
    Invention patent application (in force)

    Publication number: US20140372021A1

    Publication date: 2014-12-18

    Application number: US14470160

    Filing date: 2014-08-27

    Abstract: Concepts and technologies are disclosed herein for providing navigation routes and/or providing navigation route updates. According to various embodiments of the concepts and technologies disclosed herein, a navigation application can be configured to obtain route data from a routing service. The routing service can be configured to use navigation data locally stored and/or obtained from a number of sources to generate navigation routes and/or to update navigation routes. The generated and/or updated navigation routes can be provided to the user device as route data that can be used to provide navigation directions to a user.


    EXPLOITING VISUAL INFORMATION FOR ENHANCING AUDIO SIGNALS VIA SOURCE SEPARATION AND BEAMFORMING

    Publication number: US20210049362A1

    Publication date: 2021-02-18

    Application number: US17086561

    Filing date: 2020-11-02

    Abstract: A system for exploiting visual information for enhancing audio signals via source separation and beamforming is disclosed. The system may obtain visual content associated with an environment of a user, and may extract, from the visual content, metadata associated with the environment. The system may determine a location of the user based on the extracted metadata. Additionally, the system may load, based on the location, an audio profile corresponding to the location of the user. The system may also load a user profile of the user that includes audio data associated with the user. Furthermore, the system may cancel, based on the audio profile and user profile, noise from the environment of the user. Moreover, the system may include adjusting, based on the audio profile and user profile, an audio signal generated by the user so as to enhance the audio signal during a communications session of the user.

    Exploiting visual information for enhancing audio signals via source separation and beamforming

    Publication number: US10853653B2

    Publication date: 2020-12-01

    Application number: US16556476

    Filing date: 2019-08-30

    Abstract: A system for exploiting visual information for enhancing audio signals via source separation and beamforming is disclosed. The system may obtain visual content associated with an environment of a user, and may extract, from the visual content, metadata associated with the environment. The system may determine a location of the user based on the extracted metadata. Additionally, the system may load, based on the location, an audio profile corresponding to the location of the user. The system may also load a user profile of the user that includes audio data associated with the user. Furthermore, the system may cancel, based on the audio profile and user profile, noise from the environment of the user. Moreover, the system may include adjusting, based on the audio profile and user profile, an audio signal generated by the user so as to enhance the audio signal during a communications session of the user.

    PRE-DISTORTION SYSTEM FOR CANCELLATION OF NONLINEAR DISTORTION IN MOBILE DEVICES

    Publication number: US20200028970A1

    Publication date: 2020-01-23

    Application number: US16586269

    Filing date: 2019-09-27

    Abstract: A pre-distortion system for improved mobile device communications via cancellation of nonlinear distortion is disclosed. The pre-distortion system may transmit an acoustic signal from a network to a device, wherein the acoustic signal includes a linear signal and a nonlinear cancellation signal that cancels at least a portion of nonlinear distortions created once a loudspeaker in the device emits the linear signal. Thus, when a loudspeaker of a mobile device is operating and nonlinear distortions are generated by the loudspeaker or adjacent components of the mobile device in close proximity to the loudspeaker, the pre-distortion system may create one or more nonlinear cancellation signals in the network. The nonlinear cancellation signal may be combined with the linear signal sent to the loudspeaker to cancel the nonlinear distortion signal created by the loudspeaker emitting acoustic sounds from the linear signal. Thus, the nonlinear cancellation signal becomes a pre-distortion signal.
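The cancellation idea in the abstract above can be shown with a small numeric sketch. It assumes, purely for illustration, that the loudspeaker behaves as a memoryless quadratic nonlinearity `y = x + a*x*x` with a known coefficient `a`; the abstract does not specify a distortion model, and the function names here are hypothetical.

```python
def loudspeaker(x, a=0.1):
    """Assumed loudspeaker model: linear output plus a quadratic distortion term."""
    return x + a * x * x


def predistort(x, a=0.1):
    """Combine the linear signal with a nonlinear cancellation signal (-a*x*x)
    before it reaches the loudspeaker."""
    return x - a * x * x


def residual(x, a=0.1, use_predistortion=True):
    """Absolute error between the emitted sample and the intended linear sample."""
    drive = predistort(x, a) if use_predistortion else x
    return abs(loudspeaker(drive, a) - x)


plain_error = residual(0.5, use_predistortion=False)
cancelled_error = residual(0.5, use_predistortion=True)
```

Under this model the pre-distorted output is `x - 2*a**2*x**3 + a**3*x**4`, so the quadratic term is removed and only smaller higher-order terms remain. That matches the abstract's wording that the cancellation signal cancels at least a portion of the distortion rather than all of it.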

    Exploiting Visual Information For Enhancing Audio Signals Via Source Separation And Beamforming

    Publication number: US20190384979A1

    Publication date: 2019-12-19

    Application number: US16556476

    Filing date: 2019-08-30

    Abstract: A system for exploiting visual information for enhancing audio signals via source separation and beamforming is disclosed. The system may obtain visual content associated with an environment of a user, and may extract, from the visual content, metadata associated with the environment. The system may determine a location of the user based on the extracted metadata. Additionally, the system may load, based on the location, an audio profile corresponding to the location of the user. The system may also load a user profile of the user that includes audio data associated with the user. Furthermore, the system may cancel, based on the audio profile and user profile, noise from the environment of the user. Moreover, the system may include adjusting, based on the audio profile and user profile, an audio signal generated by the user so as to enhance the audio signal during a communications session of the user.

    Pre-distortion system for cancellation of nonlinear distortion in mobile devices

    Publication number: US10432797B2

    Publication date: 2019-10-01

    Application number: US15978592

    Filing date: 2018-05-14

    Abstract: A pre-distortion system for improved mobile device communications via cancellation of nonlinear distortion is disclosed. The pre-distortion system may transmit an acoustic signal from a network to a device, wherein the acoustic signal includes a linear signal and a nonlinear cancellation signal that cancels at least a portion of nonlinear distortions created once a loudspeaker in the device emits the linear signal. Thus, when a loudspeaker of a mobile device is operating and nonlinear distortions are generated by the loudspeaker or adjacent components of the mobile device in close proximity to the loudspeaker, the pre-distortion system may create one or more nonlinear cancellation signals in the network. The nonlinear cancellation signal may be combined with the linear signal sent to the loudspeaker to cancel the nonlinear distortion signal created by the loudspeaker emitting acoustic sounds from the linear signal. Thus, the nonlinear cancellation signal becomes a pre-distortion signal.
