Patent search: ap:("Yaxin Zhang" OR "Jianming Song" OR "Anton Madievski") AND inv:"Jianming Song" (page 1)

1.

Granted invention patent
Voiced/unvoiced speech classifier (status: in force)

Publication No.: US06640208B1

Publication date: 2003-10-28

Application No.: US09659318

Filing date: 2000-09-12

Applicants: Yaxin Zhang, Jianming Song, Anton Madievski

Inventors: Yaxin Zhang, Jianming Song, Anton Madievski

IPC classification: G10L11/06

CPC classification: G10L25/93

Abstract: A voiced/unvoiced speech classifier (30) includes a speech segmentor (34) which segments an input digitized speech waveform into frames of speech and a band-pass filter (36) which filters the frames of speech. A relative energy generator (38) generates a relative energy value for each filtered frame of speech, and a decision parameter generator (52) including an autocorrelation calculator (54) and a pitch calculator (56) generates a decision parameter based on an autocorrelation function and a pitch frequency index for the filtered frames of speech. A normalized energy calculator (46) adjusts the threshold and then normalizes the relative energy. A comparator (60) provides a signal indicative of whether a frame of speech is voiced speech or unvoiced speech depending on a comparison of the decision parameter and the normalized relative energy value for each filtered frame of speech.
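The frame-by-frame flow described in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the band-pass filter is omitted, and the comparison of the decision parameter with the normalized relative energy is replaced by two fixed thresholds. The function names, frame size, lag range, and threshold values are all assumptions chosen for illustration.

```python
import math

def split_frames(signal, size, hop):
    """Segment a sample sequence into fixed-length frames."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]

def frame_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(x * x for x in frame)

def autocorr_peak(frame, min_lag, max_lag):
    """Largest autocorrelation over candidate pitch lags, normalized by frame energy."""
    e0 = frame_energy(frame)
    if e0 == 0.0:
        return 0.0
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        r = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        best = max(best, r / e0)
    return best

def classify_voiced(signal, size=240, hop=240, min_lag=20, max_lag=160,
                    energy_floor=0.05, peak_threshold=0.5):
    """Return one boolean per frame: True for voiced, False for unvoiced.

    Simplified stand-in rule (an assumption, not the patent's comparator):
    a frame is voiced when its energy relative to the loudest frame exceeds
    energy_floor AND its normalized autocorrelation peak exceeds peak_threshold.
    """
    frames = split_frames(signal, size, hop)
    if not frames:
        return []
    energies = [frame_energy(f) for f in frames]
    e_max = max(energies) or 1.0  # avoid division by zero on all-silent input
    labels = []
    for frame, e in zip(frames, energies):
        relative = e / e_max
        peak = autocorr_peak(frame, min_lag, max_lag)
        labels.append(relative > energy_floor and peak > peak_threshold)
    return labels
```

With the default values and 8 kHz sampling (an assumption), a 100 Hz sine has a period of 80 samples, which falls inside the lag range, so sine frames are labelled voiced and silent frames unvoiced.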

2.

Granted invention patent
Tone based speech recognition (status: in force)

Publication No.: US06553342B1

Publication date: 2003-04-22

Application No.: US09496868

Filing date: 2000-02-02

Applicants: Yaxin Zhang, Jianming Song, Anton Madievski

Inventors: Yaxin Zhang, Jianming Song, Anton Madievski

IPC classification: G10L15/02

CPC classification: G10L15/02, G10L25/15

Abstract: A method and apparatus for speech recognition involves classifying (38) a digitized speech segment according to whether the speech segment comprises voiced or unvoiced speech and utilizing that classification to generate tonal feature vectors (41) of the speech segment when the speech is voiced. The tonal feature vectors are then combined (42) with other non-tonal feature vectors (40) to provide speech feature vectors. The speech feature vectors are compared (35) with previously stored models of speech feature vectors (37) for different segments of speech to determine which previously stored model is a most likely match for the segment to be recognized.

3.

Invention publication
SMART WEARABLE DEVICE FOR VISION ENHANCEMENT AND METHOD FOR REALIZING STEREOSCOPIC VISION TRANSPOSITION (status: under examination, published)

Publication No.: US20230239447A1

Publication date: 2023-07-27

Application No.: US17667527

Filing date: 2022-02-08

Applicants: Jianming Song, Tieshan Zhang, Jie Hu

Inventors: Jianming Song, Tieshan Zhang, Jie Hu

IPC classification: H04N13/122, H04N5/225, H04N5/243, H04N13/344, G02B27/01

CPC classification: H04N13/122, G02B27/0172, G02B27/0176, H04N5/243, H04N5/2254, H04N5/2257, H04N13/344, G02B30/34, G02B2027/014, G02B2027/0134, G02B2027/0138, G02B2027/0159

Abstract: The invention discloses a smart wearable device for vision enhancement and a method for realizing stereoscopic vision transposition, comprising a wearable device body, wherein the wearable device body is provided with camera lenses, image sensors, an image information receiving and transmitting unit, image enhancement units, and near-to-eye optical systems; the optical axis and field angle of the near-to-eye optical system are matched with the optical axis and field angle of the camera lens; the image sensor is arranged behind the camera lens; the real scene enters the image sensor through an image imaging device for image acquisition, and through the image enhancement unit, the low-light environment image collected by the smart wearable device in the low-light environment is enhanced and displayed clearly. The invention can ensure the enhancement of the real stereoscopic vision in the dark environment and the interchange of the remote and barrier-free stereoscopic real vision.

4.

Invention application
Method and apparatus of increasing speech intelligibility in noisy environments (status: in force)

Publication No.: US20060270467A1

Publication date: 2006-11-30

Application No.: US11137182

Filing date: 2005-05-25

Applicants: Jianming Song, John Johnson

Inventors: Jianming Song, John Johnson

IPC classification: H04B1/38, H04M1/00

CPC classification: H03G3/3089, G10L21/0208, G10L21/0232, G10L25/15, H04M1/6025

Abstract: A method (400, 600, 700) and apparatus (220) for enhancing the intelligibility of speech emitted into a noisy environment. After filtering (408) ambient noise with a filter (304) that simulates the physical blocking of noise by at least a part of a voice communication device (102), a frequency-dependent SNR of received voice audio relative to ambient noise is computed (424) on a perceptual (e.g. Bark) frequency scale. Formants are identified (426, 600, 700) and the SNR in bands including certain formants is modified (508, 510) with formant enhancement gain factors in order to improve intelligibility. A set of high pass filter gains (338) is combined (516) with the formant enhancement gain factors, yielding combined gains which are clipped (518), scaled (520) according to a total SNR, normalized (526), smoothed across time (530) and frequency (532) and used to reconstruct (532, 534) an audio signal.

5.

Invention application
APPARATUS AND METHOD FOR NOISE REMOVAL (status: in force)

Publication No.: US20130158989A1

Publication date: 2013-06-20

Application No.: US13330235

Filing date: 2011-12-19

Applicants: Jianming Song, David Barron

Inventors: Jianming Song, David Barron

IPC classification: G10L21/02

CPC classification: G10L21/0232, G10L2021/02165

Abstract: A continuous stream of noise is created from a plurality of input signals. A smoothing spectrum estimate is continuously calculated from the continuous stream of noise. Noise is responsively removed from a selected one of the plurality of input signals using the smoothing spectrum estimate. The removal of the noise from the selected input signal is performed substantially synchronously and in time alignment with the creating of the continuous stream of noise and the calculating of the smoothing spectrum estimate.
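The idea of updating a smoothed noise spectrum and removing noise in the same time step can be sketched as follows. The abstract does not say how the estimate is smoothed or how the noise is removed, so this sketch substitutes two common techniques: recursive (exponential) smoothing per frequency bin and magnitude spectral subtraction with a floor. It also assumes the noise stream already exists as per-frame magnitude spectra. All names and parameter values are hypothetical.

```python
def smooth_noise(estimate, noise_frame, alpha=0.9):
    """Recursively smooth the noise magnitude spectrum, one value per bin."""
    if estimate is None:
        return list(noise_frame)  # first frame initializes the estimate
    return [alpha * e + (1.0 - alpha) * n for e, n in zip(estimate, noise_frame)]

def subtract_noise(signal_frame, estimate, floor=0.05):
    """Spectral subtraction; the floor keeps every bin at a small positive level."""
    return [max(s - e, floor * s) for s, e in zip(signal_frame, estimate)]

def denoise_stream(signal_frames, noise_frames, alpha=0.9, floor=0.05):
    """Process signal and noise frames in lockstep.

    For each time step the noise estimate is updated first and then applied
    to the signal frame from the same step, so estimation and removal stay
    time-aligned.
    """
    estimate = None
    cleaned = []
    for signal_frame, noise_frame in zip(signal_frames, noise_frames):
        estimate = smooth_noise(estimate, noise_frame, alpha)
        cleaned.append(subtract_noise(signal_frame, estimate, floor))
    return cleaned
```

A larger `alpha` makes the estimate react more slowly to changes in the noise, which is the usual trade-off between a stable estimate and fast tracking.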

6.

Invention application
Method of refining statistical pattern recognition models and statistical pattern recognizers (status: in force)

Publication No.: US20060136205A1

Publication date: 2006-06-22

Application No.: US11018271

Filing date: 2004-12-21

Applicant: Jianming Song

Inventor: Jianming Song

IPC classification: G10L15/06

CPC classification: G10L15/063, G06K9/6277, G10L2015/0635

Abstract: A device (800) performs statistical pattern recognition using model parameters that are refined by optimizing an objective function that includes a term for many items of training data for which recognition errors occur, wherein each term depends on a relative magnitude of a first score for a recognition result for an item of training data and a second score calculated by evaluating a statistical pattern recognition model identified by a transcribed identity of the training data item with feature vectors extracted from the item of training data. The objective function does not include terms for items of training data for which there is a gross discrepancy between a transcribed identity and a recognized identity. Gross discrepancies can be detected by probability score or pattern identity comparisons. Terms of the objective function are weighted based on the type of recognition error, and weights can be increased for high priority patterns.
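An objective of the kind this abstract describes can be sketched in a few lines of Python. The abstract does not give the functional form of each term, so the sigmoid of the score difference used here is an assumption (it is common in minimum classification error training), as are the score-gap test for gross discrepancies, the field names, and every parameter value.

```python
import math
from dataclasses import dataclass

@dataclass
class TrainingItem:
    recognized: str            # identity produced by the recognizer
    transcribed: str           # identity given by the transcription
    score_recognized: float    # first score: model of the recognition result
    score_transcribed: float   # second score: model of the transcribed identity
    error_type: str            # e.g. "substitution" (hypothetical label)

def objective(items, type_weights, priority=(), gross_gap=50.0,
              slope=1.0, boost=2.0):
    """Weighted sum of sigmoid terms over misrecognized training items."""
    total = 0.0
    for item in items:
        if item.recognized == item.transcribed:
            continue  # no recognition error, so no term
        gap = item.score_recognized - item.score_transcribed
        if gap > gross_gap:
            continue  # gross discrepancy: excluded from the objective
        weight = type_weights.get(item.error_type, 1.0)
        if item.transcribed in priority:
            weight *= boost  # higher weight for high priority patterns
        z = max(-60.0, min(60.0, slope * gap))  # clamp to keep exp() in range
        total += weight / (1.0 + math.exp(-z))
    return total
```

Lowering this value through parameter updates pushes the score of the transcribed model up relative to the score of the wrongly recognized one, while items that look mistranscribed do not influence training.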

7.

Granted invention patent
Cohort model selection apparatus and method (status: lapsed)

Publication No.: US06393397B1

Publication date: 2002-05-21

Application No.: US09332927

Filing date: 1999-06-14

Applicants: Ho Chuen Choi, Xiaoyuan Zhu, Jianming Song

Inventors: Ho Chuen Choi, Xiaoyuan Zhu, Jianming Song

IPC classification: G10L15/06

CPC classification: G10L17/04, G10L17/12

Abstract: An apparatus for selecting a cohort model for use in a speaker verification system includes a model generator (108) for determining a target speaker model (114) from a speech sample collected from the target speaker (106). A cohort selector (110) determines a similarity value between each of a number of predetermined existing speaker models from a model pool (112) and the target speaker model (114) and a dissimilarity value between each of the existing speaker models and any previously selected cohort models (116). An existing speaker model which is most similar to the target speaker model, but most dissimilar to previously chosen cohort models, is then chosen as another cohort model for the target speaker.
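The selection rule in this abstract lends itself to a greedy loop, sketched below in Python. The abstract does not define the similarity or dissimilarity measures or how they are combined, so this sketch assumes that each speaker model is a plain numeric vector, that Euclidean distance stands in for both measures, and that the two are combined by a weighted sum controlled by a hypothetical `spread` parameter.

```python
import math

def distance(a, b):
    """Euclidean distance between two model vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def select_cohort(target, pool, size, spread=1.0):
    """Greedily pick cohort models from pool (a dict of name -> vector).

    Each round favours the model closest to the target while rewarding
    distance from the cohort models chosen in earlier rounds.
    """
    remaining = dict(pool)
    chosen = []
    while remaining and len(chosen) < size:
        def score(name):
            closeness = -distance(remaining[name], target)
            if not chosen:
                return closeness  # first pick: similarity to target only
            separation = min(distance(remaining[name], pool[c]) for c in chosen)
            return closeness + spread * separation
        best = max(remaining, key=score)
        chosen.append(best)
        del remaining[best]
    return chosen
```

With `spread=0.0` the loop reduces to picking the nearest models to the target, which can yield a cohort of near-duplicates; a positive `spread` trades some closeness for coverage around the target.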

8.

Granted invention patent
Cotalker nulling based on multi super directional beamformer (status: in force)

Publication No.: US09497528B2

Publication date: 2016-11-15

Application No.: US14074645

Filing date: 2013-11-07

Applicants: Jianming Song, Mike Reuter

Inventors: Jianming Song, Mike Reuter

IPC classification: A61F11/06, H04R1/08, G10L17/00, H04R3/00, G10L21/0272, H04R1/40, G10L21/0208

CPC classification: H04R1/08, G10L17/00, G10L21/0272, G10L2021/02087, H04R1/406, H04R3/005, H04R2430/20, H04R2499/13

Abstract: Speech from a driver and speech from a passenger in a vehicle are selected directionally using a plurality of directional microphones. Sounds detected by a plurality of directional microphones as coming from a passenger are suppressed from sounds detected by a second plurality of directional microphones as coming from a driver.

9.

Invention application
COTALKER NULLING BASED ON MULTI SUPER DIRECTIONAL BEAMFORMER (status: in force)

Publication No.: US20150124988A1

Publication date: 2015-05-07

Application No.: US14074645

Filing date: 2013-11-07

Applicants: Jianming Song, Mike Reuter

Inventors: Jianming Song, Mike Reuter

IPC classification: H04R1/08, G10L17/00

CPC classification: H04R1/08, G10L17/00, G10L21/0272, G10L2021/02087, H04R1/406, H04R3/005, H04R2430/20, H04R2499/13

Abstract: Speech from a driver and speech from a passenger in a vehicle are selected directionally using a plurality of directional microphones. Sounds detected by a plurality of directional microphones as coming from a passenger are suppressed from sounds detected by a second plurality of directional microphones as coming from a driver.

10.

Granted invention patent
Apparatus and method for noise removal by spectral smoothing (status: in force)

Publication No.: US08712769B2

Publication date: 2014-04-29

Application No.: US13330235

Filing date: 2011-12-19

Applicants: Jianming Song, David Barron

Inventors: Jianming Song, David Barron

IPC classification: G10L21/0232

CPC classification: G10L21/0232, G10L2021/02165

Abstract: A continuous stream of noise is created from a plurality of input signals. A smoothing spectrum estimate is continuously calculated from the continuous stream of noise. Noise is responsively removed from a selected one of the plurality of input signals using the smoothing spectrum estimate. The removal of the noise from the selected input signal is performed substantially synchronously and in time alignment with the creating of the continuous stream of noise and the calculating of the smoothing spectrum estimate.
