Method and system for producing multi-view 3D visual contents
    1.
    Type: Granted patent
    Legal status: In force

    Publication No.: US09225965B2

    Publication date: 2015-12-29

    Application No.: US13128199

    Filing date: 2008-11-07

    Abstract: A method for producing 3D multi-view visual contents includes capturing a visual scene from at least one first point of view to generate a first bidimensional image of the scene and a corresponding first depth map indicative of the distance of different parts of the scene from the first point of view. The method further includes capturing the visual scene from at least one second point of view to generate a second bidimensional image; processing the first bidimensional image to derive at least one predicted second bidimensional image predicting the visual scene as captured from the at least one second point of view; and deriving at least one predicted second depth map, predictive of the distance of different parts of the scene from the at least one second point of view, by processing the first depth map, the at least one predicted second bidimensional image and the second bidimensional image.


    Speech synthesis with incremental databases of speech waveforms on user terminals over a communications network
    2.
    Type: Granted patent
    Legal status: In force

    Publication No.: US08583437B2

    Publication date: 2013-11-12

    Application No.: US11921403

    Filing date: 2005-05-31

    IPC classification: G10L13/00 G10L13/08

    CPC classification: G10L13/047 G10L15/30

    Abstract: A service architecture for providing textual information and the related speech synthesis to a user terminal of a communications network, the user terminal being provided with a speech synthesis engine and a basic database of speech waveforms, includes: a content server for downloading textual information requested by means of a browser application on the user terminal; a context manager for extracting context information from the textual information requested by the user terminal; a context selector for selecting an incremental database of speech waveforms associated with the extracted context information and for downloading the incremental database into the user terminal; and a database manager on the user terminal for managing the composition of an enlarged database of speech waveforms for the speech synthesis engine, including the basic and the incremental databases of speech waveforms.
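
    The division of roles in this abstract (context manager, context selector, database manager) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the keyword-based context extraction, the dictionary-backed databases and all names are hypothetical and are not taken from the patent.

```python
# Hypothetical data: waveform databases are modelled as word -> waveform-id maps.
BASIC_DB = {"hello": "wave_hello", "world": "wave_world"}

INCREMENTAL_DBS = {
    "finance": {"dividend": "wave_dividend", "equity": "wave_equity"},
    "sport": {"offside": "wave_offside"},
}

# Assumed keyword lists used to guess the context of a requested text.
CONTEXT_KEYWORDS = {
    "finance": {"stock", "dividend", "market"},
    "sport": {"match", "goal", "offside"},
}


def extract_context(text):
    """Context manager: pick the context whose keywords occur most often."""
    words = set(text.lower().split())
    scores = {ctx: len(words & kws) for ctx, kws in CONTEXT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def select_incremental_db(context):
    """Context selector: incremental database to download for a context."""
    return INCREMENTAL_DBS.get(context, {})


def enlarge_database(basic, incremental):
    """Database manager: compose the enlarged database on the terminal."""
    enlarged = dict(basic)
    enlarged.update(incremental)
    return enlarged


context = extract_context("Stock market rallies on dividend news")
enlarged_db = enlarge_database(BASIC_DB, select_incremental_db(context))
```

    In this sketch the terminal keeps its basic database untouched and only merges in the waveforms for the detected context, which mirrors the "basic plus incremental" composition the abstract describes.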


    Method and circuit for noise estimation, related filter, terminal and communication network using same, and computer program product therefor
    3.
    Type: Granted patent
    Legal status: In force

    Publication No.: US07613608B2

    Publication date: 2009-11-03

    Application No.: US10579058

    Filing date: 2003-11-12

    IPC classification: G10L21/02 G10L19/14 G10L15/20

    CPC classification: G10L21/0208

    Abstract: A filter, such as a Wiener filter, for noise reduction in a signal, such as a speech signal, affected by background noise includes a circuit for determining values of an update function relating a new value of estimated noise power to a previous value of estimated noise power, the update function being a function of said previous estimated noise power and a mean input power spectral density. The circuit includes a look-up table having values for the update function stored therein, with the previous value of estimated noise power and the mean input power spectral density as a first and a second search entry, respectively. These search entries are entered via an input module and exploited by search circuitry associated with the look-up table for selectively searching values for the update function in the look-up table. The search is preferably carried out on the basis of an index computed starting from said first and second search entries.
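
    The look-up-table idea in this abstract can be sketched as follows. The abstract does not give the update function or the quantization of the two search entries, so the update rule, the dB grid and all function names below are assumptions chosen only to make the table mechanics concrete.

```python
# Hypothetical quantization grid (in dB) shared by both search entries:
# 33 levels from -20 dB to 60 dB in 2.5 dB steps.
LEVELS_DB = [-20.0 + 2.5 * i for i in range(33)]


def update_function(prev_noise_db, mean_input_db):
    """Illustrative update rule (an assumption, not the patented one):
    move the noise estimate toward the input level, faster when the
    input falls below the current estimate."""
    rate = 0.5 if mean_input_db < prev_noise_db else 0.05
    return prev_noise_db + rate * (mean_input_db - prev_noise_db)


# Precompute the table once, stored flat in row-major order:
# rows follow the previous noise power, columns the mean input PSD.
TABLE = [update_function(p, m) for p in LEVELS_DB for m in LEVELS_DB]


def quantize(value_db):
    """Index of the grid level nearest to value_db."""
    return min(range(len(LEVELS_DB)), key=lambda i: abs(LEVELS_DB[i] - value_db))


def lookup_index(prev_noise_db, mean_input_db):
    """Single index computed from the first and second search entries."""
    return quantize(prev_noise_db) * len(LEVELS_DB) + quantize(mean_input_db)


def next_noise_estimate(prev_noise_db, mean_input_db):
    """Replace run-time evaluation of the update function by a table read."""
    return TABLE[lookup_index(prev_noise_db, mean_input_db)]
```

    The point of the design is that the per-frame cost becomes two quantizations and one memory read, regardless of how expensive the update function itself is to evaluate.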


    Method for generating a vector codebook, method and device for compressing data, and distributed speech recognition system
    4.
    Type: Patent application
    Legal status: In force

    Publication No.: US20090037172A1

    Publication date: 2009-02-05

    Application No.: US11658090

    Filing date: 2004-07-23

    CPC classification: G10L15/30

    Abstract: A method for compressing data, the data being represented by an input vector having Q features, wherein Q is an integer greater than 1, includes the steps of: 1) providing a vector codebook of sub-sets of indexed Q-feature reference vectors and threshold values associated with the sub-sets for a prefixed feature; 2) identifying a sub-set of reference vectors among the sub-sets by progressively comparing the value of the feature of the input vector which corresponds to the prefixed feature with the threshold values associated with the sub-sets; and 3) identifying the reference vector which, within the sub-set identified in step 2), provides the lowest distortion with respect to the input vector.
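
    Steps 1) to 3) can be sketched as a two-stage search. The sketch assumes that sub-sets are formed by sorting the reference vectors on the prefixed feature, that thresholds sit midway between neighbouring sub-sets, and that distortion is squared Euclidean distance; none of these choices is stated in the abstract, and the function names are hypothetical.

```python
def build_codebook(reference_vectors, feature, subset_size):
    """Step 1: sort the reference vectors by the prefixed feature, cut them
    into consecutive sub-sets, and place one threshold between neighbours."""
    ordered = sorted(reference_vectors, key=lambda v: v[feature])
    subsets = [ordered[i:i + subset_size]
               for i in range(0, len(ordered), subset_size)]
    thresholds = [(a[-1][feature] + b[0][feature]) / 2.0
                  for a, b in zip(subsets, subsets[1:])]
    return subsets, thresholds


def squared_error(u, v):
    """Assumed distortion measure: squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def compress(x, subsets, thresholds, feature):
    """Return (sub-set index, index within the sub-set) for input vector x."""
    # Step 2: progressively compare one feature of x with the thresholds.
    k = 0
    while k < len(thresholds) and x[feature] > thresholds[k]:
        k += 1
    # Step 3: exhaustive lowest-distortion search, but only inside sub-set k.
    best = min(range(len(subsets[k])),
               key=lambda j: squared_error(x, subsets[k][j]))
    return k, best


refs = [(0, 0), (1, 5), (2, 1), (9, 9), (10, 0), (11, 4)]
subsets, thresholds = build_codebook(refs, feature=0, subset_size=3)
code = compress((10, 1), subsets, thresholds, feature=0)
```

    Restricting the full-distortion search to one sub-set trades a small risk of missing the globally nearest reference vector for far fewer distance computations per input vector.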


    Customizable method and system for emotional recognition
    5.
    Type: Granted patent
    Legal status: In force

    Publication No.: US08538755B2

    Publication date: 2013-09-17

    Application No.: US12449259

    Filing date: 2007-01-31

    IPC classification: G10L11/00 G10L11/04 G06F15/00

    Abstract: An automated emotional recognition system is adapted to determine emotional states of a speaker based on the analysis of a speech signal. The emotional recognition system includes at least one server function and at least one client function in communication with the at least one server function for receiving assistance in determining the emotional states of the speaker. The at least one client function includes an emotional features calculator adapted to receive the speech signal and to extract therefrom a set of speech features indicative of the emotional state of the speaker. The emotional state recognition system further includes at least one emotional state decider adapted to determine the emotional state of the speaker exploiting the set of speech features based on a decision model. The server function includes at least a decision model trainer adapted to update the selected decision model according to the speech signal. The decision model to be used by the emotional state decider for determining the emotional state of the speaker is selectable based on a context of use of the recognition system.


    Method and apparatus for transmitting speech data to a remote device in a distributed speech recognition system
    6.
    Type: Granted patent
    Legal status: In force

    Publication No.: US08494849B2

    Publication date: 2013-07-23

    Application No.: US11922500

    Filing date: 2005-06-20

    IPC classification: G10L15/20

    CPC classification: G10L25/78 G10L15/30

    Abstract: A method of transmitting speech data to a remote device in a distributed speech recognition system includes the steps of: dividing an input speech signal into frames; calculating, for each frame, a voice activity value representative of the presence of speech activity in the frame; grouping the frames into multiframes, each multiframe including a predetermined number of frames; calculating, for each multiframe, a voice activity marker representative of the number of frames in the multiframe representing speech activity; and selectively transmitting, on the basis of the voice activity marker associated with each multiframe, the multiframes to the remote device.
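
    The steps listed in this abstract can be sketched as follows. The abstract does not say how the voice activity value is computed or what rule decides transmission, so the energy-threshold detector, the `min_active` rule and every parameter value below are assumptions for illustration only.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def split_frames(signal, frame_len):
    """Divide the signal into consecutive full frames (any tail is dropped)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, frame_len)]


def select_multiframes(signal, frame_len=4, frames_per_multiframe=3,
                       energy_threshold=0.1, min_active=1):
    """Return (multiframe index, voice activity marker) for each multiframe
    that would be transmitted to the remote device."""
    frames = split_frames(signal, frame_len)
    # Per-frame voice activity value: 1 if the frame looks like speech.
    # A plain energy threshold is an assumed stand-in for a real detector.
    activity = [1 if frame_energy(f) > energy_threshold else 0 for f in frames]
    sent = []
    last_start = len(frames) - frames_per_multiframe
    for start in range(0, last_start + 1, frames_per_multiframe):
        # Marker: number of active frames inside this multiframe.
        marker = sum(activity[start:start + frames_per_multiframe])
        if marker >= min_active:
            sent.append((start // frames_per_multiframe, marker))
    return sent


# 9 frames of 4 samples: silence, then two loud frames, then silence again.
signal = [0.0] * 12 + [1.0] * 8 + [0.0] * 16
transmitted = select_multiframes(signal)
```

    With these assumed parameters only the middle multiframe, which contains two active frames, is selected; the silent multiframes on either side are not sent.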


    Method for generating a vector codebook, method and device for compressing data, and distributed speech recognition system
    7.
    Type: Granted patent
    Legal status: In force

    Publication No.: US08214204B2

    Publication date: 2012-07-03

    Application No.: US11658090

    Filing date: 2004-07-23

    IPC classification: G10L19/12 G10L19/00

    CPC classification: G10L15/30

    Abstract: A method for compressing data, the data being represented by an input vector having Q features, wherein Q is an integer greater than 1, includes the steps of: 1) providing a vector codebook of sub-sets of indexed Q-feature reference vectors and threshold values associated with the sub-sets for a prefixed feature; 2) identifying a sub-set of reference vectors among the sub-sets by progressively comparing the value of the feature of the input vector which corresponds to the prefixed feature with the threshold values associated with the sub-sets; and 3) identifying the reference vector which, within the sub-set identified in step 2), provides the lowest distortion with respect to the input vector.


    Early-late synchronizer having reduced timing jitter
    8.
    Type: Patent application
    Legal status: In force

    Publication No.: US20060251154A1

    Publication date: 2006-11-09

    Application No.: US10534992

    Filing date: 2002-11-15

    IPC classification: H04B1/707 H04B1/00

    CPC classification: H04B1/7085 H04B1/709

    Abstract: A device for maintaining fine alignment between an incoming spread spectrum signal and a locally generated code in a digital communication receiver comprises: a delay line (56) for storing a plurality of consecutive samples (E−1, E, M, L, L+1) of the incoming spread spectrum signal; three digitally controlled interpolators (24, 26, 28) for determining, by interpolation between consecutive samples, an interpolated early sample (e), an interpolated middle sample (m) and an interpolated late sample (l); two correlators (30, 32) for calculating an error signal (ξ) as the difference between the energy of the symbols computed from the interpolated early (e) and late (l) samples; a circuit for generating a control signal (SOUT) for controlling the interpolation phase of the digitally controlled interpolator (24) for the early sample (e); and a digital non-linear filter (68) for smoothing the control signal (SOUT) of the interpolator (24) for the early sample (e), enabling the update operation of the control signal only when the absolute value (|ξ(n)|) of the error signal at a time instant n is smaller than the absolute value (|ξ(n−1)|) of the same error signal at a time instant n−1.
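
    The gating rule of the non-linear filter can be sketched in a few lines. The sketch assumes the loop proposes a candidate control value at every time instant and that the error is the difference of squared symbol magnitudes; the function names and the handling of the first instant (no update, since there is no earlier error to compare with) are assumptions, not details from the abstract.

```python
def early_late_error(early_symbol, late_symbol):
    """Error signal: energy of the early symbol minus energy of the late one."""
    return abs(early_symbol) ** 2 - abs(late_symbol) ** 2


def smooth_control(errors, candidates, initial=0):
    """Non-linear smoothing of the interpolator control signal.

    errors[n]     -- error signal xi(n)
    candidates[n] -- control value the loop proposes at instant n (assumed)
    The output is updated only when |xi(n)| < |xi(n-1)|; otherwise the
    previous control value is held.
    """
    output = []
    control = initial
    prev_abs = None
    for xi, candidate in zip(errors, candidates):
        if prev_abs is not None and abs(xi) < prev_abs:
            control = candidate
        output.append(control)
        prev_abs = abs(xi)
    return output


held = smooth_control([0.5, 0.3, 0.4, 0.1], [1, 2, 3, 4])
```

    Holding the control value whenever the error magnitude has not decreased suppresses back-and-forth updates around the lock point, which is the timing-jitter reduction named in the title.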


    CUSTOMIZABLE METHOD AND SYSTEM FOR EMOTIONAL RECOGNITION
    9.
    Type: Patent application
    Legal status: In force

    Publication No.: US20100088088A1

    Publication date: 2010-04-08

    Application No.: US12449259

    Filing date: 2007-01-31

    IPC classification: G10L19/00 G10L15/06

    Abstract: An automated emotional recognition system is adapted to determine emotional states of a speaker based on the analysis of a speech signal. The emotional recognition system includes at least one server function and at least one client function in communication with the at least one server function for receiving assistance in determining the emotional states of the speaker. The at least one client function includes an emotional features calculator adapted to receive the speech signal and to extract therefrom a set of speech features indicative of the emotional state of the speaker. The emotional state recognition system further includes at least one emotional state decider adapted to determine the emotional state of the speaker exploiting the set of speech features based on a decision model. The server function includes at least a decision model trainer adapted to update the selected decision model according to the speech signal. The decision model to be used by the emotional state decider for determining the emotional state of the speaker is selectable based on a context of use of the recognition system.


    Method and system for providing speech synthesis on user terminals over a communications network
    10.
    Type: Patent application
    Legal status: In force

    Publication No.: US20090306986A1

    Publication date: 2009-12-10

    Application No.: US11921403

    Filing date: 2005-05-31

    IPC classification: G10L13/08 G10L13/00 G06F17/30

    CPC classification: G10L13/047 G10L15/30

    Abstract: A service architecture for providing textual information and the related speech synthesis to a user terminal of a communications network, the user terminal being provided with a speech synthesis engine and a basic database of speech waveforms, includes: a content server for downloading textual information requested by means of a browser application on the user terminal; a context manager for extracting context information from the textual information requested by the user terminal; a context selector for selecting an incremental database of speech waveforms associated with the extracted context information and for downloading the incremental database into the user terminal; and a database manager on the user terminal for managing the composition of an enlarged database of speech waveforms for the speech synthesis engine, including the basic and the incremental databases of speech waveforms.
