Patent search ap:("International Business Machines Corporation") AND inv:"Slava Shechtman" Page 1

1.

发明授权
Pitch marking in speech processing 有权

公开(公告)号：US09685170B2

公开(公告)日：2017-06-20

申请号：US14918601

申请日：2015-10-21

Applicant: International Business Machines Corporation

Inventor： Slava Shechtman

IPC: G10L21/00 , G10L15/00 , G10L25/00 , G10L21/01 , G10L25/09 , G10L25/06 , G10L25/90

CPC classification number: G10L21/01 , G10L21/013 , G10L25/06 , G10L25/90

Abstract: According to some embodiments of the present invention, there is provided a computerized method for selecting and correcting pitch marks in speech processing and modification. The method comprises an action of receiving a continuous speech signal representing audible speech recorded by a microphone, where a sequence of pitch values and two or more pitch mark temporal values are computed from the continuous speech signal. The method comprises an action of computing for each of the pitch mark temporal values a lower limit temporal value and an upper limit temporal value by a cross-correlation function of the continuous speech signal around the pitch mark temporal values associated with pairs of elements in the sequence and replacing one or more of the pitch mark temporal values with one or more new temporal value between the lower limit temporal value and the upper limit temporal value.

2.

发明申请
PITCH MARKING IN SPEECH PROCESSING 有权

公开(公告)号：US20170117001A1

公开(公告)日：2017-04-27

申请号：US14918601

申请日：2015-10-21

Applicant: International Business Machines Corporation

Inventor： Slava Shechtman

IPC: G10L21/01 , G10L25/06 , G10L25/90 , G10L25/09

CPC classification number: G10L21/01 , G10L21/013 , G10L25/06 , G10L25/90

Abstract: According to some embodiments of the present invention, there is provided a computerized method for selecting and correcting pitch marks in speech processing and modification. The method comprises an action of receiving a continuous speech signal representing audible speech recorded by a microphone, where a sequence of pitch values and two or more pitch mark temporal values are computed from the continuous speech signal. The method comprises an action of computing for each of the pitch mark temporal values a lower limit temporal value and an upper limit temporal value by a cross-correlation function of the continuous speech signal around the pitch mark temporal values associated with pairs of elements in the sequence and replacing one or more of the pitch mark temporal values with one or more new temporal value between the lower limit temporal value and the upper limit temporal value.

3.

发明申请
System and method for generating expressive prosody for speech synthesis 审中-公开

公开(公告)号：US20190172443A1

公开(公告)日：2019-06-06

申请号：US15832793

申请日：2017-12-06

Applicant: International Business Machines Corporation

Inventor： Slava Shechtman , Zvi Kons

IPC: G10L13/10 , G10L13/047 , G10L13/027

CPC classification number: G10L13/10 , G10L13/027 , G10L13/033 , G10L13/047 , G10L25/30

Abstract: A method for producing speech comprises: accessing an expressive prosody model, wherein the model is generated by: receiving a plurality of non-neutral prosody vector sequences, each vector associated with one of a plurality of time-instances; receiving a plurality of expression labels, each having a time-instance selected from a plurality of non-neutral time-instances of the plurality of time-instances; producing a plurality of neutral prosody vector sequences equivalent to the plurality of non-neutral sequences by applying a linear combination of a plurality of statistical measures to a plurality of sub-sequences selected according to an identified proximity test applied to a plurality of neutral time-instances of the plurality of time-instances; and training at least one machine learning module using the plurality of non-neutral sequences and the plurality of neutral sequences to produce an expressive prosodic model; and using the model within a Text-To-Speech-System to produce an audio waveform from an input text.

4.

发明授权
Real-time system for determining current video scale 有权

公开(公告)号：US09892335B2

公开(公告)日：2018-02-13

申请号：US15173656

申请日：2016-06-05

Applicant: International Business Machines Corporation

Inventor： Ophir Azulai , Udi Barzelay , Mattias Marder , Dror Porat , Slava Shechtman

IPC: G06K9/20 , G06K9/62 , G06T3/40 , G06K9/00

CPC classification number: G06K9/2081 , G06K9/00664 , G06K9/6202

Abstract: Embodiments of the present invention may provide the capability to identify a specific object being interacted with that may be cheaply and easily included in mass-produced objects. In an embodiment, a computer-implemented method for object identification may comprise receiving a signal produced by a physical interaction with an object to be identified, the signal produced by an identification structure coupled to the object during physical interaction with the object, processing the signal to form digital data representing the signal, and accessing a database using the digital data to retrieve information identifying the object.

5.

发明授权
System and method for generating expressive prosody for speech synthesis 有权

公开(公告)号：US10418025B2

公开(公告)日：2019-09-17

申请号：US15832793

申请日：2017-12-06

Applicant: International Business Machines Corporation

Inventor： Slava Shechtman , Zvi Kons

IPC: G10L13/10 , G10L25/30 , G10L13/027 , G10L13/033 , G10L13/047

Abstract: A method for producing speech comprises: accessing an expressive prosody model, wherein the model is generated by: receiving a plurality of non-neutral prosody vector sequences, each vector associated with one of a plurality of time-instances; receiving a plurality of expression labels, each having a time-instance selected from a plurality of non-neutral time-instances of the plurality of time-instances; producing a plurality of neutral prosody vector sequences equivalent to the plurality of non-neutral sequences by applying a linear combination of a plurality of statistical measures to a plurality of sub-sequences selected according to an identified proximity test applied to a plurality of neutral time-instances of the plurality of time-instances; and training at least one machine learning module using the plurality of non-neutral sequences and the plurality of neutral sequences to produce an expressive prosodic model; and using the model within a Text-To-Speech-System to produce an audio waveform from an input text.

6.

发明授权
Vehicle entertainment system 有权

公开(公告)号：US10226702B2

公开(公告)日：2019-03-12

申请号：US14720849

申请日：2015-05-25

Applicant: International Business Machines Corporation

Inventor： Ron Hoory , Mattias Marder , Slava Shechtman

IPC: A63F13/00 , A63F9/24 , A63F13/65 , G06K9/00 , A63F13/80 , A63F13/46 , A63F13/54 , A63F13/52 , A63F13/215 , A63F13/213

Abstract: A computer-implemented method, computerized apparatus and computer program product. The method comprises capturing one or more images of a scene in which a driver is driving a vehicle; analyzing the images to retrieve an event or detail; conveying to the driver the a question or a challenge related to the event or detail; receiving a response from the driver; analyzing the response; and determining a score related to the driver.

7.

发明申请
Real-Time system for determining current video scale 有权

公开(公告)号：US20170351930A1

公开(公告)日：2017-12-07

申请号：US15173656

申请日：2016-06-05

Applicant: International Business Machines Corporation

Inventor： Ophir Azulai , Udi Barzelay , Mattias Marder , Dror Porat , Slava Shechtman

IPC: G06K9/20 , G06K9/00 , G06K9/62 , G06T3/40

CPC classification number: G06K9/2081 , G06K9/00664 , G06K9/6202

Abstract: Embodiments of the present invention may provide the capability to identify a specific object being interacted with that may be cheaply and easily included in mass-produced objects. In an embodiment, a computer-implemented method for object identification may comprise receiving a signal produced by a physical interaction with an object to be identified, the signal produced by an identification structure coupled to the object during physical interaction with the object, processing the signal to form digital data representing the signal, and accessing a database using the digital data to retrieve information identifying the object.

8.

发明申请
WIDEBAND SPEECH PARAMETERIZATION FOR HIGH QUALITY SYNTHESIS, TRANSFORMATION AND QUANTIZATION 有权
Title translation: 用于高质量合成，转换和量化的宽带语音参数

公开(公告)号：US20150095035A1

公开(公告)日：2015-04-02

申请号：US14040765

申请日：2013-09-30

Applicant: International Business Machines Corporation

Inventor： Slava Shechtman

IPC: G10L19/038 , G10L19/02

CPC classification number: G10L19/038 , G10L19/02 , G10L19/093

Abstract: A method for speech parameterization and coding of a continuous speech signal. The method comprises dividing said speech signal into a plurality of speech frames, and for each one of the plurality of speech frames, modeling said speech frame by a first harmonic modeling to produce a plurality of harmonic model parameters, reconstructing an estimated frame signal from the plurality of harmonic model parameters, subtracting the estimated frame signal from the speech frame to produce a harmonic model residual, performing at least one second harmonic modeling analysis on the first harmonic model residual to determine at least one set of second harmonic model components, removing the at least one set of second harmonic model components from the first harmonic model residual to produce a harmonically-filtered residual signal, and processing the harmonically-filtered residual signal with analysis by synthesis techniques to produce vectors of codebook indices and corresponding gains.

Abstract translation: 一种用于连续语音信号的语音参数化和编码的方法。该方法包括将所述语音信号分成多个语音帧，并且对于多个语音帧中的每一个，通过一次谐波建模对所述语音帧进行建模以产生多个谐波模型参数，从多个谐波模型参数，从所述语音帧中减去所估计的帧信号以产生谐波模型残差，对所述一次谐波模型残差执行至少一次二次谐波建模分析以确定至少一组二次谐波模型分量，来自第一谐波模型残差的至少一组二次谐波模型分量，以产生谐波滤波的残余信号，以及通过合成技术的分析来处理谐波滤波的残余信号以产生码本索引和相应增益的向量。

9.

发明申请
VEHICLE ENTERTAINMENT SYSTEM 审中-公开
Title translation: 车辆娱乐系统

公开(公告)号：US20160346695A1

公开(公告)日：2016-12-01

申请号：US14720849

申请日：2015-05-25

Applicant: International Business Machines Corporation

Inventor： Ron Hoory , Mattias Marder , Slava Shechtman

IPC: A63F13/65 , G06K9/00 , B60R1/00 , A63F13/213 , A63F13/46 , A63F13/54 , A63F13/52 , A63F13/215 , B60R11/04 , A63F13/80

CPC classification number: A63F13/65 , A63F13/213 , A63F13/215 , A63F13/46 , A63F13/52 , A63F13/54 , A63F13/80 , G06K9/00832

Abstract: A computer-implemented method, computerized apparatus and computer program product. The method comprises capturing one or more images of a scene in which a driver is driving a vehicle; analyzing the images to retrieve an event or detail; conveying to the driver the a question or a challenge related to the event or detail; receiving a response from the driver; analyzing the response; and determining a score related to the driver.

Abstract translation: 计算机实现的方法，计算机化设备和计算机程序产品。该方法包括捕获驾驶员驾驶车辆的场景的一个或多个图像; 分析图像以检索事件或细节; 向驾驶员传达与事件或细节相关的问题或挑战; 收到司机的回应; 分析响应; 并确定与驾驶员有关的分数。

10.

发明授权
Wideband speech parameterization for high quality synthesis, transformation and quantization 有权
Title translation: 宽带语音参数化，用于高质量的合成，变换和量化

公开(公告)号：US09224402B2

公开(公告)日：2015-12-29

申请号：US14040765

申请日：2013-09-30

Applicant: International Business Machines Corporation

Inventor： Slava Shechtman

IPC: G10L13/06 , G10L19/00 , G10L13/08 , G10L19/06 , G10L19/038 , G10L19/02

CPC classification number: G10L19/038 , G10L19/02 , G10L19/093

Abstract: A method for speech parameterization and coding of a continuous speech signal. The method comprises dividing said speech signal into a plurality of speech frames, and for each one of the plurality of speech frames, modeling said speech frame by a first harmonic modeling to produce a plurality of harmonic model parameters, reconstructing an estimated frame signal from the plurality of harmonic model parameters, subtracting the estimated frame signal from the speech frame to produce a harmonic model residual, performing at least one second harmonic modeling analysis on the first harmonic model residual to determine at least one set of second harmonic model components, removing the at least one set of second harmonic model components from the first harmonic model residual to produce a harmonically-filtered residual signal, and processing the harmonically-filtered residual signal with analysis by synthesis techniques to produce vectors of codebook indices and corresponding gains.

Abstract translation: 一种用于连续语音信号的语音参数化和编码的方法。该方法包括将所述语音信号分成多个语音帧，并且对于多个语音帧中的每一个，通过一次谐波建模对所述语音帧进行建模以产生多个谐波模型参数，从多个谐波模型参数，从所述语音帧中减去所估计的帧信号以产生谐波模型残差，对所述一次谐波模型残差执行至少一次二次谐波建模分析以确定至少一组二次谐波模型分量，来自第一谐波模型残差的至少一组二次谐波模型分量，以产生谐波滤波的残余信号，以及通过合成技术的分析来处理谐波滤波的残余信号以产生码本索引和相应增益的向量。

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification