Method of setting optimum-partitioned classified neural network and method and apparatus for automatic labeling using optimum-partitioned classified neural network
    3.
    Granted patent
    Method of setting optimum-partitioned classified neural network and method and apparatus for automatic labeling using optimum-partitioned classified neural network (status: in force)

    Publication No.: EP1453037B1

    Publication date: 2010-06-09

    Application No.: EP04251145.1

    Filing date: 2004-02-27

    IPC classification: G10L15/04

    CPC classification: G10L15/04

    Abstract: A method of setting an optimum-partitioned classified neural network, and a method and apparatus for automatic labeling using an optimum-partitioned classified neural network, are provided. The method of automatic labeling comprises: (a) searching, from K neural network combinations generated at an initial stage or updated, for the neural networks having minimum errors with respect to L phoneme combinations; updating weights during learning of the K neural networks by the K phoneme combination groups searched with the same neural networks; and composing an optimum-partitioned classified neural network combination using the K neural networks for which the total error sum has converged; and (b) tuning a phoneme boundary of a first label file by using the phoneme combination group classification result and the optimum-partitioned classified neural network combination, and generating a final label file reflecting the tuning result.
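
The search-and-update loop in step (a) can be illustrated with a minimal Python sketch. It is not the patented implementation: each "neural network" is replaced by a one-weight linear model y = w * x so the example stays short, and the names `optimum_partition`, `model_error`, and `fit` are hypothetical. It only shows the alternation between assigning each of the L phoneme combinations to the model with the smallest error and retraining each of the K models on its own group until the total error stops decreasing.

```python
import random


def model_error(w, samples):
    """Sum of squared errors of the stand-in model y = w * x on the samples."""
    return sum((y - w * x) ** 2 for x, y in samples)


def fit(samples, w_old):
    """Least-squares weight for y = w * x; keeps the old weight if the group is empty."""
    den = sum(x * x for x, _ in samples)
    if den == 0:
        return w_old
    return sum(x * y for x, y in samples) / den


def optimum_partition(data, k, max_iter=50, tol=1e-9, seed=0):
    """Partition phoneme combinations among k stand-in models.

    data maps a phoneme-combination name to a list of (x, y) samples.
    Returns (assignment, weights, total_error).
    """
    rng = random.Random(seed)
    weights = [rng.uniform(-1.0, 1.0) for _ in range(k)]  # initial models
    assignment = {}
    total = prev_total = float("inf")
    for _ in range(max_iter):
        # Search step: give each combination to the model with the smallest error.
        assignment = {
            name: min(range(k), key=lambda j: model_error(weights[j], samples))
            for name, samples in data.items()
        }
        # Update step: refit each model on the samples of its own group only.
        for j in range(k):
            group = [s for name, samples in data.items()
                     if assignment[name] == j for s in samples]
            weights[j] = fit(group, weights[j])
        # Convergence check on the total error sum over all combinations.
        total = sum(model_error(weights[assignment[name]], samples)
                    for name, samples in data.items())
        if prev_total - total < tol:
            break
        prev_total = total
    return assignment, weights, total
```

In a real system the stand-in model would be a trained network scoring candidate phoneme boundaries, and the resulting groups would select which network tunes each boundary in step (b).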

    System and method for providing information using spoken dialogue interface
    4.
    Published patent application
    System and method for providing information using spoken dialogue interface (status: in force)

    Publication No.: EP1349145A2

    Publication date: 2003-10-01

    Application No.: EP03251975.3

    Filing date: 2003-03-28

    IPC classification: G10L15/22

    CPC classification: G10L15/22

    Abstract: A system and method for providing information using a spoken dialogue interface are provided. The system includes a speech recognizer for transforming voice signals into sentences; a sentence analyzer for analyzing the sentences by their structural elements; a dialogue manager for extracting information on speech acts or intentions from the structural elements, and for generating information on the system's speech acts or intentions for a response to the extracted information; a sentence generator for generating sentences based on the information on the system's speech acts or intentions for the response; a speech synthesizer for synthesizing the generated sentences into voice; an information extractor for extracting information required for the response from the Internet in real time; and a user modeling means for analyzing and classifying users' tendencies. Information requested by a user can be retrieved in real time and provided through a voice interface, using versatile and familiar dialogue based on the user's tendencies.

    Audio apparatus and method of converting audio signal thereof
    5.
    Published patent application
    Audio apparatus and method of converting audio signal thereof (status: under examination, published)
    Audiovorrichtung und Verfahren zur Umwandlung eines Audiosignals davon

    Publication No.: EP2645749A2

    Publication date: 2013-10-02

    Application No.: EP13161624.5

    Filing date: 2013-03-28

    IPC classification: H04S3/00

    Abstract: An audio apparatus and a method of converting an audio signal are provided. The method includes: receiving a first audio signal including a plurality of channels (S810); comparing audio signals of the plurality of channels to estimate a source position of the first audio signal (S830); localizing a source of the first audio signal toward a three-dimensional (3D) position having an elevation component based on the estimated source position (S840); converting the first audio signal into a second audio signal including the plurality of channels and at least one channel having, based on the localized source, a different elevation from the plurality of channels (S850); and outputting the second audio signal (S860).

    Positioning and reproducing screen sound source with high resolution
    6.
    Published patent application
    Positioning and reproducing screen sound source with high resolution (status: in force)
    Positionierung und Wiedergabe einer Bildschirmtonquelle mit hoher Auflösung

    Publication No.: EP2187658A2

    Publication date: 2010-05-19

    Application No.: EP09165810.4

    Filing date: 2009-07-17

    IPC classification: H04R3/12, H04S1/00

    Abstract: A virtual screen sound source is spatially synchronized with a visual object displayed on a display. A plurality of loudspeaker sets, each including at least three of a plurality of loudspeakers installed at the periphery of the display, are selected; individual sound sources corresponding to the respective selected loudspeaker sets are generated; and a multi-sound source is generated by overlapping the generated individual sound sources and is output through the loudspeakers included in the loudspeaker sets.

    Audio apparatus and method of converting audio signal thereof
    9.
    Published patent application
    Audio apparatus and method of converting audio signal thereof (status: under examination, published)

    Publication No.: EP2645749A3

    Publication date: 2015-10-21

    Application No.: EP13161624.5

    Filing date: 2013-03-28

    IPC classification: H04S3/00, H04S7/00

    Abstract: An audio apparatus and a method of converting an audio signal are provided. The method includes: receiving a first audio signal including a plurality of channels (S810); comparing audio signals of the plurality of channels to estimate a source position of the first audio signal (S830); localizing a source of the first audio signal toward a three-dimensional (3D) position having an elevation component based on the estimated source position (S840); converting the first audio signal into a second audio signal including the plurality of channels and at least one channel having, based on the localized source, a different elevation from the plurality of channels (S850); and outputting the second audio signal (S860).

    System and method for speech synthesis using a smoothing filter
    10.
    Published patent application
    System and method for speech synthesis using a smoothing filter (status: in force)

    Publication No.: EP1308928A3

    Publication date: 2005-03-09

    Application No.: EP02257456.0

    Filing date: 2002-10-28

    IPC classification: G10L13/06

    CPC classification: G10L13/07

    Abstract: Disclosed are a speech synthesis system and method using a smoothing filter. The speech synthesis system uses a smoothing technique to control the discontinuous distortion occurring at the transition portion between concatenated phonemes, which are the speech units of synthesized speech. It comprises a discontinuous distortion processing means adapted to predict, through a predetermined learning process, the discontinuity occurring at the transition portion between concatenated samples of phonemes used for speech synthesis, and to control the discontinuity occurring at the transition portion between the concatenated phonemes of the synthesized speech so that it is smoothed adaptively to correspond to the degree of the predicted discontinuity. The smoothing filter smooths the synthesized speech so that its discontinuity degree follows the predicted discontinuity degree, according to a filter coefficient (a) that changes adaptively to correspond to the ratio of the predicted discontinuity degree to the real discontinuity degree. Because a discontinuity occurring at a transition portion between concatenated phonemes of the synthesized speech (IN) is adaptively smoothed to follow the one occurring in actually spoken sound, the synthesized speech (IN) can more closely approximate a real human voice.
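
The role of the adaptive coefficient can be illustrated with a minimal Python sketch. The abstract states only that the coefficient tracks the ratio of the predicted to the real discontinuity degree; everything else here is an assumption made for illustration: the function name `smooth_boundary`, the use of a single scalar feature per frame, the cap of the coefficient at 1, and the linear ramp that spreads the correction over `window` frames on each side of the joint. Both input sequences are assumed to be non-empty.

```python
def smooth_boundary(left, right, predicted_jump, window=3):
    """Shrink the jump at a concatenation point toward a predicted size.

    left, right: per-frame feature values of the two concatenated units.
    predicted_jump: discontinuity degree predicted by some learned model
                    (assumed to be supplied by the caller).
    Returns (smoothed_left, smoothed_right, a), where a is the coefficient.
    """
    jump = right[0] - left[-1]           # real discontinuity at the joint
    real = abs(jump)
    if real == 0:
        return list(left), list(right), 1.0
    # Coefficient: ratio of predicted to real discontinuity, capped at 1
    # so that smoothing never enlarges the jump (an assumption of this sketch).
    a = min(1.0, abs(predicted_jump) / real)
    excess = (1.0 - a) * jump            # part of the jump to remove
    left_out, right_out = list(left), list(right)
    for i in range(min(window, len(left))):
        ramp = 1.0 - i / window          # 1 at the joint, fading with distance
        left_out[-1 - i] += 0.5 * excess * ramp
    for i in range(min(window, len(right))):
        ramp = 1.0 - i / window
        right_out[i] -= 0.5 * excess * ramp
    return left_out, right_out, a
```

With this scheme the jump remaining at the joint equals `a` times the original jump, so when the prediction is smaller than the real discontinuity the smoothed joint matches the predicted degree, and when it is larger the signal is left unchanged.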