Speech recognition analysis via identification information
    2.
    Type: Granted patent
    Legal status: Active

    Publication number: US08676581B2

    Publication date: 2014-03-18

    Application number: US12692538

    Filing date: 2010-01-22

    IPC classification: G10L15/00

    Abstract: Embodiments are disclosed that relate to the use of identity information to help avoid the occurrence of false positive speech recognition events in a speech recognition system. One embodiment provides a method comprising receiving speech recognition data comprising a recognized speech segment, acoustic locational data related to a location of origin of the recognized speech segment as determined via signals from a microphone array, and confidence data comprising a recognition confidence value, and also receiving image data comprising visual locational information related to a location of each person in an image. The acoustic locational data is compared to the visual locational data to determine whether the recognized speech segment originated from a person in the field of view of the image sensor, and the confidence data is adjusted depending on this determination.

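    The comparison described in the abstract can be sketched as follows. This is a minimal illustration, not the patented method: the angle-based location model, the `SpeechEvent` type, the `adjust_confidence` function, and the tolerance, boost, and penalty values are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class SpeechEvent:
    text: str          # recognized speech segment
    angle_deg: float   # acoustic direction of origin (assumed to be a single angle)
    confidence: float  # recognition confidence in [0, 1]


def adjust_confidence(event, person_angles_deg, tolerance_deg=10.0,
                      boost=1.2, penalty=0.5):
    """Scale confidence up if the sound direction matches a visible person,
    and down otherwise. All numeric defaults are illustrative assumptions."""
    matched = any(abs(event.angle_deg - angle) <= tolerance_deg
                  for angle in person_angles_deg)
    factor = boost if matched else penalty
    return min(1.0, event.confidence * factor)


# Usage: one person is seen at 10 degrees, another at -30 degrees.
event = SpeechEvent(text="play", angle_deg=12.0, confidence=0.6)
raised = adjust_confidence(event, [10.0, -30.0])   # direction matches a person
lowered = adjust_confidence(event, [-30.0])        # no person in that direction
```

    A multiplicative adjustment is only one option; a system could equally apply an additive offset or discard unmatched events outright.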

    SPEECH RECOGNITION ANALYSIS VIA IDENTIFICATION INFORMATION
    4.
    Type: Patent application
    Legal status: Active

    Publication number: US20110184735A1

    Publication date: 2011-07-28

    Application number: US12692538

    Filing date: 2010-01-22

    Abstract: Embodiments are disclosed that relate to the use of identity information to help avoid the occurrence of false positive speech recognition events in a speech recognition system. One embodiment provides a method comprising receiving speech recognition data comprising a recognized speech segment, acoustic locational data related to a location of origin of the recognized speech segment as determined via signals from a microphone array, and confidence data comprising a recognition confidence value, and also receiving image data comprising visual locational information related to a location of each person in an image. The acoustic locational data is compared to the visual locational data to determine whether the recognized speech segment originated from a person in the field of view of the image sensor, and the confidence data is adjusted depending on this determination.

    Routing of resource information in a network
    5.
    Type: Granted patent
    Legal status: Active

    Publication number: US07668939B2

    Publication date: 2010-02-23

    Application number: US10742588

    Filing date: 2003-12-19

    IPC classification: G06F15/177

    Abstract: A media server in a Universal Plug and Play (UPnP) network includes a resource sharing service to govern the distribution of resource information regarding resources to rendering devices. In one case, the resource sharing service consults a criterion to determine whether an identified network device is authorized to receive resource information. In another case, the resource sharing service consults another criterion to determine whether a specified individual associated with the media server must consent to the transfer of the resource information in order for the transfer to occur. The resource information may include resource metadata that describes high level information regarding resources, as well as resource content. The media server includes various user interface presentations that allow the media server user to specify shared resources and distribution criteria.

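    The two distribution criteria named in the abstract can be illustrated with a small check. This is a sketch under stated assumptions: the `SharingPolicy` type, the `may_share` function, the device identifiers, and the decision to evaluate both criteria in a single call are hypothetical and are not taken from the patent or from any UPnP API.

```python
from dataclasses import dataclass, field


@dataclass
class SharingPolicy:
    # Device IDs authorized to receive resource information (first criterion).
    authorized_devices: set = field(default_factory=set)
    # Device IDs whose transfers also need the owner's consent (second criterion).
    consent_required: set = field(default_factory=set)


def may_share(policy, device_id, owner_consents=False):
    """Return True if resource information may be sent to device_id."""
    if device_id not in policy.authorized_devices:
        return False
    if device_id in policy.consent_required and not owner_consents:
        return False
    return True


# Usage with hypothetical device identifiers.
policy = SharingPolicy(authorized_devices={"tv-livingroom", "stereo-den"},
                       consent_required={"stereo-den"})
allowed = may_share(policy, "tv-livingroom")
blocked = may_share(policy, "stereo-den")
```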

    Server architecture for network resource information routing
    6.
    Type: Granted patent
    Legal status: Active

    Publication number: US07555543B2

    Publication date: 2009-06-30

    Application number: US10742570

    Filing date: 2003-12-19

    IPC classification: G06F15/173

    Abstract: A media server in a Universal Plug and Play (UPnP) network includes a resource sharing service to govern the distribution of media resource information to rendering devices. The media server includes: a media service module operating in a clamped down user context (e.g., a local service user context) and configured to share resource information over the network; a supplemental module operating in a local system user context and configured to assist the media service module in sharing resource information over the network; and a control panel module operating in a logged on user context and configured to interact with a user via a user interface display. The local system user context provides a higher level of access to media server resources compared to the clamped down user context. The media server also provides fast user switching (FUS) functionality that allows multiple users to have respective instances of the control panel module pending at the same time. Further, the media server includes a mechanism to prevent rogue applications from masquerading as the control panel module and thereby gaining unauthorized access to the media service module.

    Using parameterized URLs for retrieving resource content items
    7.
    Type: Patent application
    Legal status: Under examination (published)

    Publication number: US20050138137A1

    Publication date: 2005-06-23

    Application number: US10742635

    Filing date: 2003-12-19

    Abstract: A UPnP network provides a flexible technique for retrieving a resource content item from a media server using a parameterized uniform resource locator (URL). In operation, the media server sends a control point a parameterized URL in response to a consumer's browse or search request. The URL includes at least one parameter that specifies a characteristic attribute of the resource content item, which determines the manner in which the resource content item can be presented. For example, the parameter can describe a format type of the resource content item, a format resolution of the resource content item, and/or other property of the resource content item. The control point can modify a value associated with the parameter to produce a modified URL. This modified URL is submitted to the media server, whereupon the media server locates the resource content item and converts it to the characteristic state specified by the modified URL (if conversion is needed). The media server then provides the located (and potentially converted) resource content item to a rendering device for presentation thereat.

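    The control point's step of modifying a parameter value can be sketched with Python's standard `urllib.parse` module. The host name, path, and parameter names (`id`, `format`, `width`) below are hypothetical; the abstract does not specify how the parameters are encoded in the URL, so a conventional query string is assumed.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def modify_url_parameter(url, name, value):
    """Return a copy of url with query parameter `name` set to `value`.
    An existing parameter keeps its position; a new one is appended."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query))
    params[name] = value
    return urlunsplit(parts._replace(query=urlencode(params)))


# Usage: the control point asks for MP3 instead of the advertised WMA format.
original = "http://mediaserver.example/item?id=42&format=wma&width=640"
modified = modify_url_parameter(original, "format", "mp3")
```

    The media server would then parse the modified URL, locate the item, and convert it if the stored format differs from the requested one.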

    ADAPTIVE AMBIENT SOUND SUPPRESSION AND SPEECH TRACKING
    9.
    Type: Patent application
    Legal status: Under examination (published)

    Publication number: US20120245933A1

    Publication date: 2012-09-27

    Application number: US13491952

    Filing date: 2012-06-08

    Abstract: A device for suppressing ambient sounds from speech received by a microphone array is provided. One embodiment of the device comprises a microphone array, a processor, an analog-to-digital converter, and memory comprising instructions stored therein that are executable by the processor. The instructions stored in the memory are configured to receive a plurality of digital sound signals, each digital sound signal based on an analog sound signal originating at the microphone array, receive a multi-channel speaker signal, generate a monophonic approximation signal of the multi-channel speaker signal, apply a linear acoustic echo canceller to suppress a first ambient sound portion of each digital sound signal, generate a combined directionally-adaptive sound signal from a combination of each digital sound signal by a combination of time-invariant and adaptive beamforming techniques, and apply one or more nonlinear noise suppression techniques to suppress a second ambient sound portion of the combined directionally-adaptive sound signal.

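    Two of the stages listed in the abstract can be sketched in plain Python. The abstract does not say how the monophonic approximation is computed or which beamformer is used, so the channel average and the fixed delay-and-sum beamformer below are stand-in assumptions; the echo canceller, the adaptive beamformer, and the nonlinear suppression stages are omitted.

```python
def mono_approximation(channels):
    """Average equal-length speaker channels sample by sample
    (an assumed, simple way to form a mono signal)."""
    count = len(channels)
    return [sum(samples) / count for samples in zip(*channels)]


def delay_and_sum(mic_signals, delays):
    """Time-invariant beamformer sketch: shift each microphone signal by an
    integer number of samples, then average across microphones."""
    length = len(mic_signals[0])
    out = [0.0] * length
    for signal, delay in zip(mic_signals, delays):
        for i in range(length):
            j = i - delay  # index of the source sample that lands at i
            if 0 <= j < length:
                out[i] += signal[j]
    return [value / len(mic_signals) for value in out]


# Usage: a pulse reaches microphone 2 one sample before microphone 1,
# so delaying microphone 2 by one sample aligns the two signals.
mono = mono_approximation([[1.0, 2.0], [3.0, 4.0]])
aligned = delay_and_sum([[0, 1, 0, 0], [1, 0, 0, 0]], delays=[0, 1])
```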

    INTERACTIVE CONTENT CREATION
    10.
    Type: Patent application
    Legal status: Active

    Publication number: US20120165964A1

    Publication date: 2012-06-28

    Application number: US12978799

    Filing date: 2010-12-27

    IPC classification: G06F17/00

    Abstract: An audio/visual system (e.g., an entertainment console or other computing device) plays a base audio track, such as a portion of a pre-recorded song or notes from one or more instruments. Using a depth camera or other sensor, the system automatically detects that a user (or a portion of the user) enters a first collision volume of a plurality of collision volumes. Each collision volume of the plurality of collision volumes is associated with a different audio stem. In one example, an audio stem is a sound from a subset of instruments playing a song, a portion of a vocal track for a song, or notes from one or more instruments. In response to automatically detecting that the user (or a portion of the user) entered the first collision volume, the appropriate audio stem associated with the first collision volume is added to the base audio track or removed from the base audio track.

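    The collision-volume behavior can be sketched as follows. The axis-aligned box shape, the `CollisionVolume` and `StemMixer` names, and the rule that each entry toggles the stem on or off are assumptions for illustration; the abstract says only that the stem is added to or removed from the base track when the user enters the volume.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CollisionVolume:
    stem: str   # name of the audio stem tied to this volume
    lo: tuple   # (x, y, z) minimum corner of an axis-aligned box
    hi: tuple   # (x, y, z) maximum corner

    def contains(self, point):
        return all(l <= c <= h for l, c, h in zip(self.lo, point, self.hi))


class StemMixer:
    def __init__(self, volumes):
        self.volumes = volumes
        self.active = set()    # stems currently mixed into the base track
        self._inside = set()   # stems whose volume held the user last update

    def update(self, position):
        """Toggle the stem of every volume the tracked position just entered."""
        now_inside = {v.stem for v in self.volumes if v.contains(position)}
        for stem in now_inside - self._inside:  # react to entry only
            self.active ^= {stem}
        self._inside = now_inside
        return set(self.active)


# Usage: a hand position reported by a depth sensor enters the drums volume.
mixer = StemMixer([CollisionVolume("drums", (0, 0, 0), (1, 1, 1)),
                   CollisionVolume("vocals", (2, 0, 0), (3, 1, 1))])
stems = mixer.update((0.5, 0.5, 0.5))
```

    Tracking which volumes were occupied on the previous update keeps a user who lingers inside a volume from toggling the stem on every sensor frame.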