-
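Publication No.: WO2022149662A1
Publication date: 2022-07-14
Application No.: PCT/KR2021/005233
Filing date: 2021-04-26
Applicant: 주식회사 헤이스타즈
Inventor: 송진주
Abstract: The present invention relates to a pronunciation evaluation apparatus and method that evaluate a user's pronunciation using video and audio of the user's pronunciation process. It can provide feedback on the user's pronunciation without an instructor's intervention and, together with the evaluation result, indicates specifically at which point in time the pronunciation was wrong, so that the user can easily correct it.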
-
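Publication No.: WO2022048354A1
Publication date: 2022-03-10
Application No.: PCT/CN2021/108899
Filing date: 2021-07-28
Applicant: 北京世纪好未来教育科技有限公司
Abstract: A speech forced-alignment model evaluation method and apparatus, an electronic device, and a storage medium. The evaluation method comprises: using the speech forced-alignment model to be evaluated, obtaining, from each audio segment in a test set and the text corresponding to that segment, the phoneme sequence of each audio segment and the predicted start and end times of each phoneme in that sequence (S10); for each phoneme, obtaining a time accuracy score for the phoneme from its predicted start and end times and its predetermined reference start and end times (S11); and determining the time accuracy score of the model to be evaluated from the time accuracy scores of the individual phonemes (S12).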
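Steps S11 and S12 of the abstract above can be illustrated with a minimal Python sketch. The abstract does not state how the per-phoneme time accuracy score is computed, so the overlap ratio used here (shared duration divided by the total span of the predicted and reference intervals) and the plain average are assumptions, as are the function names and the sample timings.

```python
from statistics import mean

def phoneme_time_score(pred, ref):
    """Overlap ratio of two (start, end) intervals in seconds (assumed metric)."""
    overlap = max(0.0, min(pred[1], ref[1]) - max(pred[0], ref[0]))
    span = max(pred[1], ref[1]) - min(pred[0], ref[0])
    return overlap / span if span > 0 else 0.0

def model_time_score(predicted, reference):
    """Average the per-phoneme scores into one score for the model (S12)."""
    return mean(phoneme_time_score(p, r) for p, r in zip(predicted, reference))

# Hypothetical alignment of three phonemes, times in seconds.
predicted = [(0.00, 0.10), (0.10, 0.25), (0.25, 0.40)]
reference = [(0.00, 0.10), (0.12, 0.25), (0.30, 0.40)]
print(round(model_time_score(predicted, reference), 3))
```

A score of 1.0 means the predicted boundaries coincide with the reference boundaries; disjoint intervals score 0.0.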
-
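Publication No.: WO2021157192A1
Publication date: 2021-08-12
Application No.: PCT/JP2020/046052
Filing date: 2020-12-10
Applicant: ソニーグループ株式会社
IPC classification: G10L21/0208 , G10L15/10 , G10L25/69 , H04N21/435 , H04N21/439
Abstract: Provided is a control device that controls the display of subtitles in a playback device for video and audio content. The control device comprises an evaluation unit that evaluates a property of the audio and a determination unit that determines whether to display subtitles on the basis of the evaluation result of the evaluation unit. The evaluation unit evaluates the clarity of pronunciation of the audio on the basis of the subtitle character string and the audio corresponding to the subtitle segment. The determination unit determines that subtitles corresponding to audio with a low evaluation result are to be displayed and that subtitles corresponding to audio with a high evaluation result are not to be displayed.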
-
Publication No.: WO2014210208A1
Publication date: 2014-12-31
Application No.: PCT/US2014/044168
Filing date: 2014-06-25
Inventor: LU, Wenliang , SEN, Dipanjan
IPC classification: G10L25/69
Abstract: A method for feature extraction by an electronic device is described. The method includes processing speech using a physiological cochlear model. The method also includes analyzing sections of an output of the physiological cochlear model. The method further includes extracting a place-based analysis vector and a time-based analysis vector for each section. The method additionally includes determining one or more features from each analysis vector.
-
Publication No.: WO2013045693A2
Publication date: 2013-04-04
Application No.: PCT/EP2012/069330
Filing date: 2012-10-01
IPC classification: G10L25/69
CPC classification: H04H40/45 , G10L19/008 , G10L25/69 , H04B1/1676
Abstract: The present document relates to audio signal processing, in particular to an apparatus and a corresponding method for improving an audio signal of an FM stereo radio receiver. In particular, the present document relates to a method and system for reliably detecting the quality of a received FM stereo radio signal and for selecting an appropriate processing based on the detected quality. An apparatus (20) configured to estimate the quality of a received multi-channel FM radio signal is described. The received multi-channel FM radio signal is representable as a mid signal and a side signal, and the side signal is indicative of a difference between a left signal and a right signal. The apparatus (20) comprises a power determination unit configured to determine (101) a power of the mid signal, referred to as mid power, and a power of the side signal, referred to as side power; a ratio determination unit configured to determine (102) a ratio of the mid power and the side power, thereby yielding a mid-to-side ratio; and a quality determination unit configured to determine (105) a quality indicator of the received FM radio signal based at least on the mid-to-side ratio.
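The power determination (101) and ratio determination (102) steps in the record above can be sketched as follows. This is an illustrative sketch only: the abstract says the side signal is indicative of the left/right difference but does not give the exact scaling, so the conventional definitions mid = (L + R) / 2 and side = (L - R) / 2, the mean-square power, the decibel scale, and the small `eps` guard are all assumptions. The mapping from the ratio to the quality indicator (105) is not specified in the abstract and is left out.

```python
import math

def mid_side_powers(left, right):
    """Return (mid power, side power) as mean-square values of the two signals."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    mid_power = sum(v * v for v in mid) / len(mid)
    side_power = sum(v * v for v in side) / len(side)
    return mid_power, side_power

def mid_to_side_ratio_db(left, right, eps=1e-12):
    """Mid-to-side ratio in dB; eps (an assumption) avoids division by zero."""
    mid_power, side_power = mid_side_powers(left, right)
    return 10 * math.log10((mid_power + eps) / (side_power + eps))

# Hypothetical frames: identical channels have no side energy,
# fully independent channels split the energy evenly.
print(mid_to_side_ratio_db([1.0, -1.0, 1.0, -1.0], [1.0, -1.0, 1.0, -1.0]))
print(mid_to_side_ratio_db([1.0, 0.0], [0.0, 1.0]))
```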
-
Publication No.: WO2021259842A1
Publication date: 2021-12-30
Application No.: PCT/EP2021/066786
Filing date: 2021-06-21
Inventor: SERRA, Joan , PONS PUIG, Jordi , PASCUAL, Santiago
Abstract: Described is a method of training a neural-network-based system for determining an indication of an audio quality of an audio input. The method includes obtaining, as input, at least one training set comprising audio samples. The audio samples include audio samples of a first type and audio samples of a second type, wherein each of the first type of audio samples is labelled with information indicative of a respective predetermined audio quality metric, and wherein each of the second type of audio samples is labelled with information indicative of a respective audio quality metric relative to that of a reference audio sample. The method further includes: inputting the training set to the neural-network-based system; and iteratively training the system to predict the respective label information of the audio samples in the training set.
-
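Publication No.: WO2021171547A1
Publication date: 2021-09-02
Application No.: PCT/JP2020/008269
Filing date: 2020-02-28
Applicant: 日本電信電話株式会社
Inventor: 金光 卓生
IPC classification: H04M3/22 , G10L19/00 , G10L21/0208 , G10L25/69 , H04M1/24
Abstract: A communication transmission device (1) comprises: an input audio level detection unit (11) that divides audio data (100) of a predetermined duration into unit-time segments and converts it into a bit string (101) according to whether the audio level exceeds a predetermined threshold; an arithmetic processing unit (10) that performs predetermined arithmetic processing on the audio data (100); an output audio level detection unit (12) that divides the processed audio data (100a) of the predetermined duration into unit-time segments and converts it into a bit string (101a) according to whether the audio level exceeds the predetermined threshold; and a comparison and determination unit (13) that determines whether an audio fault has occurred on the basis of predetermined logic that compares the bit string (101) before processing with the bit string (101a) after processing.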
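The level-to-bit conversion and the comparison described in the abstract above can be sketched in a few lines of Python. The abstract leaves both the level measure and the comparison logic unspecified, so the peak absolute amplitude per frame, the mismatch-ratio rule, and the names `to_bits`, `has_fault`, and `max_mismatch_ratio` are assumptions made purely for illustration.

```python
def to_bits(samples, frame_len, threshold):
    """One bit per unit-time frame: 1 if the frame's peak level exceeds the threshold."""
    bits = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        level = max(abs(s) for s in frame)  # assumed level measure
        bits.append(1 if level > threshold else 0)
    return bits

def has_fault(bits_in, bits_out, max_mismatch_ratio=0.1):
    """Assumed logic: report a fault when too many frames differ."""
    mismatches = sum(a != b for a, b in zip(bits_in, bits_out))
    return mismatches / max(len(bits_in), 1) > max_mismatch_ratio

# Hypothetical audio before and after processing; the last frame drops out.
audio_in = [0.0, 0.0, 0.5, 0.6, 0.0, 0.0, 0.7, 0.8]
audio_out = [0.0, 0.0, 0.5, 0.6, 0.0, 0.0, 0.0, 0.0]
bits_in = to_bits(audio_in, frame_len=2, threshold=0.1)
bits_out = to_bits(audio_out, frame_len=2, threshold=0.1)
print(bits_in, bits_out, has_fault(bits_in, bits_out))
```

Comparing bit strings rather than waveforms keeps the check cheap and tolerant of level changes introduced by the processing itself.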
-
Publication No.: WO2021133382A1
Publication date: 2021-07-01
Application No.: PCT/US2019/068391
Filing date: 2019-12-23
Applicant: DTS, INC.
Abstract: A method comprises: obtaining a mixed soundtrack that includes dialogue mixed with non-dialogue sound; converting the mixed soundtrack to comparison text; obtaining reference text for the dialogue as a reference for intelligibility of the dialogue; determining a measure of intelligibility of the dialogue of the mixed soundtrack to a listener based on a comparison of the comparison text against the reference text; and reporting the measure of intelligibility of the dialogue.
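The text-comparison step of the abstract above can be illustrated with a short Python sketch. The abstract does not say how the comparison text is scored against the reference text, so the word-level similarity ratio from the standard library's `difflib.SequenceMatcher` is an assumed stand-in, and the speech-to-text conversion is taken as already done (the comparison text is passed in as a string). The function name `intelligibility` is hypothetical.

```python
from difflib import SequenceMatcher

def intelligibility(comparison_text, reference_text):
    """Word-level similarity in [0, 1] between recognized and reference dialogue."""
    reference_words = reference_text.lower().split()
    comparison_words = comparison_text.lower().split()
    return SequenceMatcher(None, reference_words, comparison_words).ratio()

# Hypothetical example: the recognizer missed one word under loud music.
reference = "the quick brown fox"
recognized = "the quick fox"
print(round(intelligibility(recognized, reference), 3))
```

A word error rate would serve equally well here; the similarity ratio is used only because it needs no extra dependencies.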
-
Publication No.: WO2021133155A1
Publication date: 2021-07-01
Application No.: PCT/MY2020/050125
Filing date: 2020-10-28
Applicant: MIMOS BERHAD
Abstract: The invention relates to a system and method for managing voice query of a presentation. The system includes a client module (101), a moderating module (102), a transcriber module (103), an output display module (104), a controller module (105), a server (106) and a repository (107). The method involves generating a voice query through a client module (201) by recording an audio input; converting the audio input to text format (202); accessing the audio input stored in the server (203); retrieving the converted audio in text format stored in the server (204); and transmitting the converted audio in text format to display the converted audio in text format through at least one computing device (205).
-
Publication No.: WO2018177610A1
Publication date: 2018-10-04
Application No.: PCT/EP2018/025081
Filing date: 2018-03-29
Applicant: FRAUNHOFER-GESELLSCHAFT ZUR FÖRDERUNG DER ANGEWANDTEN FORSCHUNG E.V. , FRIEDRICH-ALEXANDER-UNIVERSITÄT ERLANGEN-NÜRNBERG
Inventor: GAMPP, Patrick , UHLE, Christian , DISCH, Sascha , KARAMPOURNIOTIS, Antonios , HAVENSTEIN, Julia , HELLMUTH, Oliver , HERRE, Jürgen , PROKEIN, Peter
IPC classification: G10L21/038 , G10L25/69
Abstract: An apparatus for determining a predetermined characteristic related to an artificial bandwidth limitation processing of an audio signal comprises a slope evaluator configured for evaluating a slope of a spectrum of the audio signal to obtain a slope evaluation result. The apparatus comprises a frequency evaluator configured for evaluating a cut-off frequency of the spectrum of the audio signal to obtain a frequency evaluation result, and comprises a processor for providing an information indicating that the audio signal comprises the predetermined characteristic dependent on an evaluation of the slope evaluation result and an evaluation of the frequency evaluation result.
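The combination of a slope evaluation and a cut-off frequency evaluation in the abstract above can be sketched roughly as follows. Everything concrete here is an assumption, since the abstract gives no thresholds or procedures: the spectrum is taken as a list of magnitudes in dB per frequency bin, the cut-off is the highest bin above a noise floor, the slope is the drop over the next few bins, and the parameter names and default values are hypothetical.

```python
def detect_band_limit(spectrum_db, bin_hz, floor_db=-80.0,
                      min_slope_db=30.0, max_cutoff_hz=16000.0, width=3):
    """Flag a spectrum that ends in a steep drop at a suspiciously low frequency."""
    above_floor = [i for i, level in enumerate(spectrum_db) if level > floor_db]
    if not above_floor:
        return False
    cut_bin = above_floor[-1]                      # frequency evaluation
    end_bin = min(cut_bin + width, len(spectrum_db) - 1)
    drop_db = spectrum_db[cut_bin] - spectrum_db[end_bin]  # slope evaluation
    cutoff_hz = cut_bin * bin_hz
    # Both evaluations must agree before the characteristic is reported.
    return drop_db >= min_slope_db and cutoff_hz <= max_cutoff_hz

# Hypothetical spectra with 1 kHz bins: one cut sharply at 9 kHz, one full-band.
limited = [-20.0] * 10 + [-100.0] * 10
full_band = [-20.0] * 20
print(detect_band_limit(limited, bin_hz=1000.0))
print(detect_band_limit(full_band, bin_hz=1000.0))
```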