-
Publication No.: US20200082809A1
Publication date: 2020-03-12
Application No.: US16684970
Filing date: 2019-11-15
Abstract: Audio features, such as perceptual linear prediction (PLP) features and time derivatives thereof, are extracted from frames of training audio data including speech by multiple speakers, and silence, such as by using linear discriminant analysis (LDA). The frames are clustered into k-means clusters using distance measures, such as Mahalanobis distance measures, of means and variances of the extracted audio features of the frames. A recurrent neural network (RNN) is trained on the extracted audio features of the frames and cluster identifiers of the k-means clusters into which the frames have been clustered. The RNN is applied to audio data to segment the audio data into segments that each correspond to one of the cluster identifiers. Each segment can be assigned a label corresponding to one of the cluster identifiers. Speech recognition can be performed on the segments.
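The clustering step in this abstract can be illustrated with a minimal sketch. It assumes a diagonal covariance per cluster (the abstract only says "means and variances") and takes the cluster means and variances as given rather than fitting them; the names `mahalanobis_sq` and `assign_clusters` and all numeric values are hypothetical. The resulting per-frame cluster IDs are the kind of targets an RNN could then be trained on; the RNN itself is not shown.

```python
def mahalanobis_sq(frame, mean, var):
    """Squared Mahalanobis distance, assuming a diagonal covariance."""
    return sum((x - m) ** 2 / v for x, m, v in zip(frame, mean, var))


def assign_clusters(frames, means, variances):
    """Return, for each frame, the ID of the nearest cluster."""
    return [
        min(
            range(len(means)),
            key=lambda c: mahalanobis_sq(f, means[c], variances[c]),
        )
        for f in frames
    ]


# Toy 2-dimensional "audio features"; real PLP features would have more dimensions.
means = [[0.0, 0.0], [5.0, 5.0]]
variances = [[1.0, 1.0], [4.0, 4.0]]
frames = [[0.1, -0.2], [0.3, 0.1], [4.8, 5.5], [5.2, 4.9], [0.0, 0.2]]

labels = assign_clusters(frames, means, variances)
print(labels)
```

Dividing by the per-dimension variance is what distinguishes this from plain Euclidean k-means assignment: a frame can sit farther from a wide cluster's mean and still be assigned to it.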
-
Publication No.: US10902843B2
Publication date: 2021-01-26
Application No.: US16684970
Filing date: 2019-11-15
Abstract: Audio features, such as perceptual linear prediction (PLP) features and time derivatives thereof, are extracted from frames of training audio data including speech by multiple speakers, and silence, such as by using linear discriminant analysis (LDA). The frames are clustered into k-means clusters using distance measures, such as Mahalanobis distance measures, of means and variances of the extracted audio features of the frames. A recurrent neural network (RNN) is trained on the extracted audio features of the frames and cluster identifiers of the k-means clusters into which the frames have been clustered. The RNN is applied to audio data to segment the audio data into segments that each correspond to one of the cluster identifiers. Each segment can be assigned a label corresponding to one of the cluster identifiers. Speech recognition can be performed on the segments.
-
3.
Publication No.: US09075748B2
Publication date: 2015-07-07
Application No.: US14049830
Filing date: 2013-10-09
Inventors: David C. Haws, Laxmi P. Parida
CPC classification: G06F19/14, H03M7/30, H03M7/40, H03M7/4006
Abstract: Various embodiments provide lossless compression of an enumeration space for genetic founder lines. In one embodiment, an input comprising a set of genetic founder lines and a maximum number of generations G is obtained. A set of genetic crossing templates of a height h is generated. A determination is made as to whether at least a first genetic crossing template in the set of genetic crossing templates is redundant with respect to a second genetic crossing template in the set of genetic crossing templates. Based on the at least first genetic crossing template being redundant with respect to the second genetic crossing template, the at least first genetic crossing template is removed from the set of genetic crossing templates. Removing the redundant at least first genetic crossing template from the set of genetic crossing templates creates an updated set of genetic crossing templates.
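The redundancy-removal step can be sketched as follows. The abstract does not define when one crossing template is redundant with respect to another, so this sketch makes a loud assumption: templates are unordered binary trees (a cross of A with B equals a cross of B with A), and two templates are redundant when they are mirror-image equivalents. The nested-tuple representation and the names `canonical` and `remove_redundant` are hypothetical.

```python
def canonical(template):
    """Canonical string for an unordered binary tree.

    A template is either None (a leaf slot for a founder line) or a pair
    (left, right) of sub-templates. Sorting the child encodings makes
    mirror-image trees produce the same string.
    """
    if template is None:
        return "L"
    left, right = (canonical(child) for child in template)
    return "(" + "".join(sorted((left, right))) + ")"


def remove_redundant(templates):
    """Keep the first template of each equivalence class, in input order."""
    seen, kept = set(), []
    for t in templates:
        key = canonical(t)
        if key not in seen:
            seen.add(key)
            kept.append(t)
    return kept


leaf = None
templates = [
    ((leaf, leaf), leaf),          # height 2
    (leaf, (leaf, leaf)),          # mirror image of the first: redundant here
    ((leaf, leaf), (leaf, leaf)),  # height 2, balanced
]

updated = remove_redundant(templates)
print(len(templates), "->", len(updated))
```

The compression is lossless under this assumption because every removed template can be recovered from a kept one by swapping children.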
-
Publication No.: US11335433B2
Publication date: 2022-05-17
Application No.: US16131175
Filing date: 2018-09-14
Inventors: David C. Haws, Dan He, Laxmi P. Parida
Abstract: Various embodiments select markers for modeling epistasis effects. In one embodiment, a processor receives a set of genetic markers and a phenotype. A relevance score is determined with respect to the phenotype for each of the set of genetic markers. A threshold is set based on the relevance score of the genetic marker with the highest relevance score. For at least one genetic marker in the set of genetic markers, a relevance score is determined for at least one interaction between the at least one genetic marker and at least one other genetic marker in the set of genetic markers. The at least one interaction is added to a top-k feature set based on the relevance score of the at least one interaction satisfying the threshold.
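The selection flow in this abstract can be sketched under several assumptions that the abstract does not confirm: the relevance score is taken to be the absolute Pearson correlation with the phenotype, an interaction is modeled as the element-wise product of two markers, and the threshold is set equal to the best single-marker score. The function names and the toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def relevance(feature, phenotype):
    """Absolute Pearson correlation; an assumed stand-in for the relevance score."""
    n = len(feature)
    mf, mp = sum(feature) / n, sum(phenotype) / n
    cov = sum((f - mf) * (p - mp) for f, p in zip(feature, phenotype))
    var_f = sum((f - mf) ** 2 for f in feature)
    var_p = sum((p - mp) ** 2 for p in phenotype)
    if var_f == 0 or var_p == 0:
        return 0.0  # a constant feature carries no information
    return abs(cov) / sqrt(var_f * var_p)


def select_interactions(markers, phenotype, k):
    """Return the threshold and up to k pairwise interactions that satisfy it."""
    threshold = max(relevance(m, phenotype) for m in markers)
    scored = []
    for i, j in combinations(range(len(markers)), 2):
        product = [a * b for a, b in zip(markers[i], markers[j])]
        score = relevance(product, phenotype)
        if score >= threshold:
            scored.append((score, (i, j)))
    scored.sort(reverse=True)
    return threshold, scored[:k]


# Toy data: the phenotype equals marker 0 times marker 1 (pure epistasis).
markers = [
    [1, 1, -1, -1],
    [1, -1, 1, -1],
    [2, 0, 1, 1],
]
phenotype = [1, -1, -1, 1]

threshold, top_k = select_interactions(markers, phenotype, k=2)
print(threshold, top_k)
```

In the toy data neither marker 0 nor marker 1 correlates with the phenotype on its own, yet their interaction does, which is the situation a threshold on interaction scores is meant to catch.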
-
Publication No.: US11335434B2
Publication date: 2022-05-17
Application No.: US16131229
Filing date: 2018-09-14
Inventors: David C. Haws, Dan He, Laxmi P. Parida
Abstract: Various embodiments select markers for modeling epistasis effects. In one embodiment, a processor receives a set of genetic markers and a phenotype. A relevance score is determined with respect to the phenotype for each of the set of genetic markers. A threshold is set based on the relevance score of the genetic marker with the highest relevance score. For at least one genetic marker in the set of genetic markers, a relevance score is determined for at least one interaction between the at least one genetic marker and at least one other genetic marker in the set of genetic markers. The at least one interaction is added to a top-k feature set based on the relevance score of the at least one interaction satisfying the threshold.
-
Publication No.: US10546575B2
Publication date: 2020-01-28
Application No.: US15379038
Filing date: 2016-12-14
Abstract: Audio features, such as perceptual linear prediction (PLP) features and time derivatives thereof, are extracted from frames of training audio data including speech by multiple speakers, and silence, such as by using linear discriminant analysis (LDA). The frames are clustered into k-means clusters using distance measures, such as Mahalanobis distance measures, of means and variances of the extracted audio features of the frames. A recurrent neural network (RNN) is trained on the extracted audio features of the frames and cluster identifiers of the k-means clusters into which the frames have been clustered. The RNN is applied to audio data to segment the audio data into segments that each correspond to one of the cluster identifiers. Each segment can be assigned a label corresponding to one of the cluster identifiers. Speech recognition can be performed on the segments.
-
7.
Publication No.: US10249292B2
Publication date: 2019-04-02
Application No.: US15379010
Filing date: 2016-12-14
Abstract: Speaker diarization is performed on audio data including speech by a first speaker, speech by a second speaker, and silence. The speaker diarization includes segmenting the audio data using a long short-term memory (LSTM) recurrent neural network (RNN) to identify change points of the audio data that divide the audio data into segments. The speaker diarization includes assigning a label selected from a group of labels to each segment of the audio data using the LSTM RNN. The group of labels includes labels corresponding to the first speaker, the second speaker, and the silence. Each change point is a transition from one of the first speaker, the second speaker, and the silence to a different one of the first speaker, the second speaker, and the silence. Speech recognition can be performed on the segments that each correspond to one of the first speaker and the second speaker.
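The notion of change points and labeled segments in this abstract can be shown with a small sketch. It assumes that some upstream model (the LSTM RNN in the abstract, not shown here) has already produced one label per frame; the label strings `"sil"`, `"spk1"`, `"spk2"` and the function names are hypothetical.

```python
def change_points(labels):
    """Frame indices where the label differs from the previous frame's label."""
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


def segments_from(labels):
    """Split per-frame labels into (label, start, end) runs; end is exclusive."""
    if not labels:
        return []
    bounds = [0] + change_points(labels) + [len(labels)]
    return [(labels[s], s, e) for s, e in zip(bounds, bounds[1:])]


# Hypothetical per-frame output of a diarization model.
frame_labels = ["sil", "spk1", "spk1", "spk2", "spk2", "sil", "spk1"]

points = change_points(frame_labels)
segments = segments_from(frame_labels)

# Only speaker segments would be passed on to speech recognition.
speech_segments = [seg for seg in segments if seg[0] != "sil"]
print(points)
print(speech_segments)
```

Each change point marks a transition between two different labels, so consecutive segments never share a label.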
-
Publication No.: US20180166067A1
Publication date: 2018-06-14
Application No.: US15379038
Filing date: 2016-12-14
Abstract: Audio features, such as perceptual linear prediction (PLP) features and time derivatives thereof, are extracted from frames of training audio data including speech by multiple speakers, and silence, such as by using linear discriminant analysis (LDA). The frames are clustered into k-means clusters using distance measures, such as Mahalanobis distance measures, of means and variances of the extracted audio features of the frames. A recurrent neural network (RNN) is trained on the extracted audio features of the frames and cluster identifiers of the k-means clusters into which the frames have been clustered. The RNN is applied to audio data to segment the audio data into segments that each correspond to one of the cluster identifiers. Each segment can be assigned a label corresponding to one of the cluster identifiers. Speech recognition can be performed on the segments.
-
9.
Publication No.: US20180166066A1
Publication date: 2018-06-14
Application No.: US15379010
Filing date: 2016-12-14
Abstract: Speaker diarization is performed on audio data including speech by a first speaker, speech by a second speaker, and silence. The speaker diarization includes segmenting the audio data using a long short-term memory (LSTM) recurrent neural network (RNN) to identify change points of the audio data that divide the audio data into segments. The speaker diarization includes assigning a label selected from a group of labels to each segment of the audio data using the LSTM RNN. The group of labels includes labels corresponding to the first speaker, the second speaker, and the silence. Each change point is a transition from one of the first speaker, the second speaker, and the silence to a different one of the first speaker, the second speaker, and the silence. Speech recognition can be performed on the segments that each correspond to one of the first speaker and the second speaker.
-
Publication No.: US09041566B2
Publication date: 2015-05-26
Application No.: US14014635
Filing date: 2013-08-30
Inventors: David C. Haws, Laxmi P. Parida
CPC classification: G06F19/14, H03M7/30, H03M7/40, H03M7/4006
Abstract: Various embodiments provide lossless compression of an enumeration space for genetic founder lines. In one embodiment, an input comprising a set of genetic founder lines and a maximum number of generations G is obtained. A set of genetic crossing templates of a height h is generated. A determination is made as to whether at least a first genetic crossing template in the set of genetic crossing templates is redundant with respect to a second genetic crossing template in the set of genetic crossing templates. Based on the at least first genetic crossing template being redundant with respect to the second genetic crossing template, the at least first genetic crossing template is removed from the set of genetic crossing templates. Removing the redundant at least first genetic crossing template from the set of genetic crossing templates creates an updated set of genetic crossing templates.