Patent search: ap:("International Business Machines Corporation") AND inv:"Samuel Thomas" (page 4)

31.

Granted invention patent
Denoising a signal (Status: Active)

Publication No.: US10657980B2

Publication date: 2020-05-19

Application No.: US15793884

Filing date: 2017-10-25

Applicant: International Business Machines Corporation

Inventors: Dimitrios B. Dimitriadis , Samuel Thomas , Colin C. Vaz

IPC classification: G10L15/00 , G10L21/0208 , G10L15/20

Abstract: A computer-implemented method according to one embodiment includes creating a clean dictionary, utilizing a clean signal, creating a noisy dictionary, utilizing a first noisy signal, determining a time varying projection, utilizing the clean dictionary and the noisy dictionary, and denoising a second noisy signal, utilizing the time varying projection.

32.

Invention application
CONSTRUCTING, EVALUATING, AND IMPROVING A SEARCH STRING FOR RETRIEVING IMAGES INDICATING ITEM USE (Status: Pending, published)

Publication No.: US20190205435A1

Publication date: 2019-07-04

Application No.: US15856511

Filing date: 2017-12-28

Applicant: International Business Machines Corporation

Inventors: Sujatha Kashyap , Anne E. Gattiker , Kaipeng Li , Samuel Thomas , Minh Ngoc Binh Nguyen , Thomas Hubregtsen

IPC classification: G06F17/30 , G06T1/00

Abstract: Examples of techniques for constructing, evaluating, and improving a search string for retrieving images are disclosed. In one example implementation according to aspects of the present disclosure, a computer-implemented method includes constructing, by a processing device, a search string based at least in part on a tuple including an item class, an action, and an actor. The method further includes retrieving, by the processing device, a plurality of images based at least in part on the search string for an item. The method further includes evaluating, by the processing device, the retrieved plurality of images based on a similarity to determine whether the search string is effective at indicating a common item use. The method further includes, based at least in part on determining that the search string is ineffective at indicating the item use, generating, by the processing device, an alternative search string.
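
Illustrative note (not part of the patent record): the abstract above describes building a query from an (item class, action, actor) tuple, judging the query by how similar the retrieved images are, and generating an alternative query when it is judged ineffective. The following is a minimal sketch of that loop under loud assumptions: images are stood in for by sets of tags, similarity is a Jaccard score, and all function names, the threshold, and the synonym table are hypothetical and do not come from the patent.

```python
from itertools import combinations


def build_search_string(item_class, action, actor):
    """Join the (item class, action, actor) tuple into one query string."""
    return f"{actor} {action} {item_class}"


def jaccard(a, b):
    """Similarity of two tag sets (a hypothetical stand-in for image similarity)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def is_effective(image_tags, threshold=0.5):
    """Call the query effective when mean pairwise similarity reaches the threshold."""
    pairs = list(combinations(image_tags, 2))
    if not pairs:
        return False
    mean_similarity = sum(jaccard(a, b) for a, b in pairs) / len(pairs)
    return mean_similarity >= threshold


def alternative_search_string(item_class, action, actor, synonyms):
    """Swap the action for the first entry of a hypothetical synonym table, if any."""
    alternatives = synonyms.get(action, [])
    if not alternatives:
        return None
    return build_search_string(item_class, alternatives[0], actor)


query = build_search_string("knife", "cutting", "chef")
retrieved = [{"knife", "kitchen"}, {"car", "road"}]  # dissimilar results
if not is_effective(retrieved):
    query = alternative_search_string(
        "knife", "cutting", "chef", {"cutting": ["slicing"]}
    )
print(query)
```

A real system would compare image features rather than tags; the sketch only shows the control flow of construct, evaluate, and regenerate.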

33.

Granted invention patent
Multi-pass speech activity detection strategy to improve automatic speech recognition (Status: Active)

Publication No.: US09959887B2

Publication date: 2018-05-01

Application No.: US15064441

Filing date: 2016-03-08

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Hong-Kwang J. Kuo , Lidia L. Mangu , Samuel Thomas

IPC classification: G01L15/00 , G10L25/87 , G10L15/22 , G10L15/14 , G10L25/30 , G10L15/30

CPC classification: G10L25/87 , G10L15/142 , G10L15/22 , G10L15/30 , G10L15/32 , G10L25/30 , G10L25/78 , G10L2015/225

Abstract: An automatic speech recognition system and a method performed by an automatic speech recognition system are provided. The method includes performing at least two passes of speech activity detection on an acoustic utterance uttered by a speaker. The at least two passes include an initial pass and a subsequent pass. The method further includes estimating at least one of feature statistics and transforms for acoustic feature extraction and acoustic modeling based on an output of an initial pass. The method further includes performing automatic speech recognition using an output of the subsequent pass while bypassing an output of the initial pass to recognize the acoustic utterance.
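
Illustrative note (not part of the patent record): the abstract above separates two roles. The initial detection pass only feeds the estimation of feature statistics, while the frames passed on to recognition come from the subsequent pass. The sketch below shows that division of labor on a toy list of per-frame energies. The energy-threshold detector, the two threshold values, and the mean/variance normalization are all assumptions chosen for brevity, not the detectors or transforms claimed in the patent.

```python
from statistics import mean, pstdev


def two_pass_sad(energies, loose_threshold=0.5, strict_threshold=1.0):
    """Toy two-pass speech activity detection over per-frame energies.

    Pass 1 (loose) is used only to estimate normalization statistics.
    Pass 2 (strict) alone decides which frames go on to recognition.
    """
    # Initial pass: a permissive decision, used only for statistics.
    initial = [e > loose_threshold for e in energies]
    pool = [e for e, is_speech in zip(energies, initial) if is_speech]
    if not pool:  # fall back to all frames if pass 1 found nothing
        pool = list(energies)
    mu = mean(pool)
    sigma = pstdev(pool) or 1.0  # avoid dividing by zero

    # Subsequent pass: the decision actually handed to the recognizer.
    final = [e > strict_threshold for e in energies]

    # Normalized features for the frames kept by the subsequent pass only.
    features = [(e - mu) / sigma for e, keep in zip(energies, final) if keep]
    return final, features


frames = [0.1, 0.2, 0.8, 2.0, 2.2, 1.8, 0.1]
mask, feats = two_pass_sad(frames)
print(mask)
print(len(feats))
```

The frame with energy 0.8 contributes to the statistics through the initial pass but is not sent to recognition, which is the "bypassing" behavior the abstract refers to.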

34.

Invention application
ACOUSTIC MODEL TRAINING (Status: Pending, published)

Publication No.: US20170287469A1

Publication date: 2017-10-05

Application No.: US15479304

Filing date: 2017-04-05

Applicant: International Business Machines Corporation

Inventors: Hong-Kwang J. Kuo , Lidia L. Mangu , Samuel Thomas

IPC classification: G10L15/06 , G10L15/05 , G10L15/197

CPC classification: G10L15/063 , G10L15/02 , G10L15/04 , G10L15/05 , G10L15/197 , G10L15/26

Abstract: A method, executed by a computer, includes receiving a channel recording corresponding to a conversation, receiving a transcription for the conversation, generating a conversation-specific language model for the conversation using the transcription, and conducting speech recognition on the channel recording using the conversation-specific language model to provide time boundaries and written language corresponding to utterances within the channel recording. The method further includes determining sentence or phrase boundaries for the transcription, aligning written language within the one or more transcriptions with the written language corresponding to the utterances with the channel recording to provide sentence or phrase boundaries for the channel recording, and training a speech recognizer according to the sentence or phrase boundaries for the transcription and the sentence or phrase boundaries for the channel recording. A computer system and computer program product corresponding to the method are also disclosed herein.

35.

Granted invention patent
Acoustic model training (Status: Active)

Publication No.: US09697835B1

Publication date: 2017-07-04

Application No.: US15086949

Filing date: 2016-03-31

Applicant: International Business Machines Corporation

Inventors: Hong-Kwang J. Kuo , Lidia L. Mangu , Samuel Thomas

IPC classification: G10L15/02 , G10L15/04 , G10L15/06 , G10L15/26

CPC classification: G10L15/063 , G10L15/02 , G10L15/04 , G10L15/05 , G10L15/197 , G10L15/26

Abstract: A method, executed by a computer, includes receiving a channel recording corresponding to a conversation, receiving a transcription for the conversation, generating a conversation-specific language model for the conversation using the transcription, and conducting speech recognition on the channel recording using the conversation-specific language model to provide time boundaries and written language corresponding to utterances within the channel recording. The method further includes determining sentence or phrase boundaries for the transcription, aligning written language within the one or more transcriptions with the written language corresponding to the utterances with the channel recording to provide sentence or phrase boundaries for the channel recording, and training a speech recognizer according to the sentence or phrase boundaries for the transcription and the sentence or phrase boundaries for the channel recording. A computer system and computer program product corresponding to the method are also disclosed herein.

36.

Granted invention patent
Combining installed audio-visual sensors with ad-hoc mobile audio-visual sensors for smart meeting rooms (Status: Active)

Publication No.: US09584758B1

Publication date: 2017-02-28

Application No.: US14952751

Filing date: 2015-11-25

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Stanley Chen , Kenneth W. Church , Vaibhava Goel , Lidia L. Mangu , Etienne Marcheret , Bhuvana Ramabhadran , Laurence P. Sansone , Abhinav Sethy , Samuel Thomas

IPC classification: H04N7/14 , H04N7/15 , H04L29/06

CPC classification: H04N7/15 , G10L25/60 , H04L63/08 , H04N7/147 , H04N2007/145 , H04W12/06

Abstract: A method of combining data streams from fixed audio-visual sensors with data streams from personal mobile devices including, forming a communication link with at least one of one or more personal mobile devices; receiving at least one of an audio data stream and/or a video data stream from the at least one of the one or more personal mobile devices; determining the quality of the at least one of the audio data stream and/or the video data stream, wherein the audio data stream and/or the video data stream having a quality above a threshold quality is retained; and combining the retained audio data stream and/or the video data stream with the data streams from the fixed audio-visual sensors.
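
Illustrative note (not part of the patent record): the retention step in the abstract above amounts to a threshold filter on mobile-device streams followed by a merge with the fixed-sensor streams. The sketch below assumes a hypothetical `Stream` record with a precomputed quality score in [0, 1]; how quality is actually measured, and the threshold value, are not specified here and are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Stream:
    source: str     # device or sensor name (hypothetical)
    kind: str       # "audio" or "video"
    quality: float  # assumed precomputed score in [0, 1]


def combine_streams(fixed, mobile, threshold=0.6):
    """Keep mobile streams strictly above the threshold, then merge with fixed ones."""
    retained = [s for s in mobile if s.quality > threshold]
    return list(fixed) + retained


fixed = [Stream("room-mic", "audio", 0.9)]
mobile = [
    Stream("phone-a", "audio", 0.8),
    Stream("phone-b", "video", 0.3),
]
combined = combine_streams(fixed, mobile)
print([s.source for s in combined])
```

Fixed-sensor streams are passed through unfiltered in this sketch, since the abstract applies the quality test only to the mobile-device streams.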

37.

Granted invention patent
Multi-modal lung capacity measurement for respiratory illness prediction (Status: Active)

Publication No.: US12023146B2

Publication date: 2024-07-02

Application No.: US17065936

Filing date: 2020-10-08

Applicant: International Business Machines Corporation

Inventors: Samuel Thomas , Nalini K. Ratha , Jonathan Hudson Connell, II

IPC classification: A61B5/091 , A61B5/00 , G06N3/08

CPC classification: A61B5/091 , A61B5/0022 , A61B5/7275 , A61B5/749 , G06N3/08

Abstract: Determining lung capacity of a user includes capturing an audio waveform of the user performing an utterance presented to the user. A video of the user performing the utterance can be captured. The captured audio waveform and the video are analyzed for compliance. Based on the audio waveform, an indicator of respiratory function is determined. The indicator is compared with a reference indicator to determine health of the user. A machine learning model such as a neural network can be trained to predict the indicator of the respiratory function based on input features comprising audio spectral and temporal characteristics of utterances. Determining the indicator of respiratory function can include running the trained machine learning model.

38.

Granted invention patent
Training teacher machine learning models using lossless and lossy branches (Status: Active)

Publication No.: US11907845B2

Publication date: 2024-02-20

Application No.: US16994656

Filing date: 2020-08-17

Applicant: International Business Machines Corporation

Inventors: Takashi Fukuda , Samuel Thomas

IPC classification: G06N3/084 , G10L15/16 , G06N3/045

CPC classification: G06N3/084 , G06N3/045 , G10L15/16

Abstract: Some embodiments of the present invention are directed to techniques for training teacher neural networks (TNNs) and student neural networks (SNNs). A training data set is received with a lossless set of data and a corresponding lossy set of data. Two branches of a TNN are established, with one branch trained using the lossless data (a lossless branch) and one trained using the lossy data (a lossy branch). Weights for the two branches are tied together. The lossy branch, now isolated from the lossless branch, generates a set of soft targets for initializing an SNN. These generated soft targets benefit from the training of the lossless branch through the weights that were tied together between each branch, despite isolating the lossless branch from the lossy branch during soft-target generation.

39.

Invention publication
GLOBAL NEURAL TRANSDUCER MODELS LEVERAGING SUB-TASK NETWORKS (Status: Pending, published)

Publication No.: US20230153601A1

Publication date: 2023-05-18

Application No.: US17526350

Filing date: 2021-11-15

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Takashi Fukuda , Samuel Thomas

IPC classification: G06N3/08 , G06N3/04 , G10L15/00

CPC classification: G06N3/08 , G06N3/0454 , G10L15/00

Abstract: A computer-implemented method for training a neural transducer for speech recognition is provided. The method includes initializing the neural transducer having a prediction network and an encoder network and a joint network. The method further includes expanding the prediction network by changing the prediction network to a plurality of prediction-net branches. Each of the prediction-net branches is a prediction network for a respective specific sub-task from among a plurality of specific sub-tasks. The method also includes training, by a hardware processor, an entirety of the neural transducer by using training data sets for all of the plurality of specific sub-tasks. The method additionally includes obtaining a trained neural transducer by fusing the plurality of prediction-net branches.

40.

Granted invention patent
Constructing, evaluating, and improving a search string for retrieving images indicating item use (Status: Active)

Publication No.: US11645329B2

Publication date: 2023-05-09

Application No.: US15856505

Filing date: 2017-12-28

Applicant: International Business Machines Corporation

Inventors: Anne E. Gattiker , Sujatha Kashyap , Minh Ngoc Binh Nguyen , Samuel Thomas , Kaipeng Li , Thomas Hubregtsen

IPC classification: G06F16/242 , G06F16/583 , G06F16/9535 , G06F16/2457

CPC classification: G06F16/583 , G06F16/2425 , G06F16/24578 , G06F16/9535

Abstract: Examples of techniques for constructing, evaluating, and improving a search string for retrieving images are disclosed. In one example implementation according to aspects of the present disclosure, a computer-implemented method includes receiving, by a processing device, a plurality of images as search results returned based at least in part on a search string for an item in the form of a tuple including an item class, an action and an actor. The method further includes determining, by the processing device, whether the search string is effective at indicating a common item use based on image similarity. The method further includes, based at least in part on determining that the search string is ineffective at indicating the item use, generating, by the processing device, an alternative search string.
