-
Publication number: US20190096432A1
Publication date: 2019-03-28
Application number: US16137596
Application date: 2018-09-21
Applicant: FUJITSU LIMITED
Inventor: Sayuri Nakayama, TARO TOGAWA, Takeshi OTANI
Abstract: A speech processing method for estimating a pitch frequency includes: executing a conversion process that includes calculating a spectrum from a plurality of frames included in an input signal; executing a determination process that includes determining a speech-like frame from the plurality of frames based on characteristics of the spectrum of the frame; executing a learning process that includes specifying a fundamental sound based on a plurality of local maximum values included in the spectrum of the speech frame and learning a learning value based on a magnitude of the fundamental sound; and executing a detection process of detecting a pitch frequency of the frame based on the spectrum of the frame and the learning value.
-
Publication number: US20190066714A1
Publication date: 2019-02-28
Application number: US16113125
Application date: 2018-08-27
Applicant: FUJITSU LIMITED
Inventor: Sayuri Nakayama, TARO TOGAWA, Takeshi OTANI
Abstract: A method for processing speech includes: executing an acquiring process that includes acquiring a speech signal; executing a detection process that includes detecting a first frequency spectrum from the speech signal; executing a calculation process that includes calculating a second spectrum based on an envelope of the first spectrum; executing a correction process that includes correcting the first spectrum based on comparison between a first amplitude of the first spectrum and a second amplitude of the second spectrum; and executing an estimation process that includes estimating a pitch frequency of the speech signal in accordance with correlation between the corrected first frequency spectrum and periodic signals corresponding to frequencies in a certain band.
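The steps in this abstract (envelope, correction, correlation with periodic signals) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the envelope is a plain moving average, the correction zeroes any bin that does not exceed the envelope, and the periodic signal for a candidate frequency is a cosine across the frequency axis. The names `moving_average` and `estimate_pitch` are hypothetical.

```python
import math


def moving_average(values, half_width):
    """Smooth `values` with a centered window; used here as a crude envelope."""
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half_width)
        hi = min(len(values), i + half_width + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def estimate_pitch(spectrum, bin_hz, candidates_hz, half_width=3):
    """Return the candidate frequency whose periodic pattern best matches the spectrum.

    spectrum      -- magnitude per frequency bin (bin k sits at k * bin_hz)
    candidates_hz -- candidate pitch frequencies within the search band
    """
    # Assumed envelope: moving average of the magnitude spectrum.
    envelope = moving_average(spectrum, half_width)
    # Assumed correction: keep only bins that rise above the envelope.
    corrected = [s if s > e else 0.0 for s, e in zip(spectrum, envelope)]

    best_f0, best_score = None, float("-inf")
    for f0 in candidates_hz:
        # A cosine with period f0 along the frequency axis peaks at f0, 2*f0, ...
        score = sum(
            amp * math.cos(2.0 * math.pi * (k * bin_hz) / f0)
            for k, amp in enumerate(corrected)
        )
        if score > best_score:
            best_f0, best_score = f0, score
    return best_f0
```

Restricting `candidates_hz` to a limited band matters in this sketch: a subharmonic such as half the true pitch matches every harmonic equally well, so an unbounded search would be ambiguous.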
-
Publication number: US20170092294A1
Publication date: 2017-03-30
Application number: US15266057
Application date: 2016-09-15
Applicant: FUJITSU LIMITED
Inventor: TARO TOGAWA, Sayuri KOHMURA, Takeshi OTANI
CPC classification number: G10L25/06, G10L17/005, G10L25/51, G10L25/78, G10L2025/783
Abstract: A voice processing apparatus including a memory, and a processor coupled to the memory and the processor configured to acquire a first input signal containing a first voice, and a second input signal containing a second voice, obtain a first signal intensity of the first input signal, and a second signal intensity of the second input signal, specify a correlation coefficient between a time sequence of the first signal intensity and a time sequence of the second signal intensity, determine whether the first voice and the second voice are in the conversation state or not based on the specified correlation coefficient, and output information indicating an association between the first voice and the second voice when it is determined that the first voice and the second voice are in the conversation state.
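As a rough illustration of the correlation step described in this abstract, the sketch below computes a Pearson correlation coefficient between two intensity time sequences and applies a decision rule. The rule itself is an assumption made here, not something stated in the abstract: it treats strongly negative correlation (speakers alternating turns) as the conversation state, and the threshold of -0.3 is arbitrary. The names `pearson` and `in_conversation` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant sequence carries no correlation information.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def in_conversation(intensity_a, intensity_b, threshold=-0.3):
    """Assumed rule: turn-taking makes the two intensity sequences anti-correlated."""
    return pearson(intensity_a, intensity_b) <= threshold
```

With two speakers who strictly alternate, for example intensities `[10, 10, 0, 0]` and `[0, 0, 10, 10]`, the coefficient is -1 and the sketch reports a conversation; the abstract's actual criterion may differ.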
-
Publication number: US20200251129A1
Publication date: 2020-08-06
Application number: US16742493
Application date: 2020-01-14
Applicant: FUJITSU LIMITED
Inventor: TARO TOGAWA, Sayuri Nakayama, JUN TAKAHASHI, Kiyonori Morioka
Abstract: A non-transitory computer-readable recording medium having stored therein a program that causes a computer to execute a procedure, the procedure includes detecting a plurality of voice sections from an input sound that includes voices of a plurality of speakers, calculating a feature amount of each of the plurality of voice sections, determining a plurality of emotions, corresponding to the plurality of voice sections respectively, of a speaker of the plurality of speakers for each of the plurality of voice sections, and clustering a plurality of feature amounts, based on a change vector from the feature amount of the voice section determined as a first emotion of the plurality of emotions of the speaker to the feature amount of the voice section determined as a second emotion of the plurality of emotions different from the first emotion.
-
Publication number: US20180059155A1
Publication date: 2018-03-01
Application number: US15645011
Application date: 2017-07-10
Applicant: FUJITSU LIMITED
Inventor: Sayuri Nakayama, TARO TOGAWA, Takeshi OTANI
CPC classification number: G01R23/16, G10L19/02, G10L21/0232, G10L2021/02087, G10L2021/02165, H03G3/32, H03G5/165, H04H60/04, H04R1/222, H04R3/005, H04R2430/03
Abstract: A sound processing device performs obtaining a first frequency spectrum that corresponds to a first sound signal and a second frequency spectrum that corresponds to a second sound signal, calculating a level difference between a level of each of frequency components in the first frequency spectrum and a level of each of frequency components in the second frequency spectrum, calculating a spread of a distribution of the level difference during a prescribed period for each of the frequency components, and determining a gain to be multiplied to the frequency component in the first frequency spectrum and a gain to be multiplied to the frequency component in the second frequency spectrum in accordance with the spread of the distribution of the level difference.
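The per-frequency computation in this abstract can be sketched as follows. The abstract does not say how spread is measured or how it maps to a gain, so both choices here are assumptions: spread is the population standard deviation of the level difference over the observed frames, and the gain falls linearly from 1.0 to a floor as the spread grows. The sketch also returns a single gain per bin, whereas the abstract determines a gain for each of the two spectra. The names `level_difference_spread` and `gain_from_spread`, and the parameters `lo`, `hi`, and `min_gain`, are hypothetical.

```python
import statistics


def level_difference_spread(frames_a, frames_b):
    """Spread of the per-bin level difference across a period of frames.

    frames_a, frames_b -- lists of frames; each frame is a list of levels (dB),
                          one per frequency bin, with matching shapes.
    Returns one population standard deviation per frequency bin.
    """
    n_bins = len(frames_a[0])
    spreads = []
    for k in range(n_bins):
        diffs = [fa[k] - fb[k] for fa, fb in zip(frames_a, frames_b)]
        spreads.append(statistics.pstdev(diffs))
    return spreads


def gain_from_spread(spread, lo=1.0, hi=6.0, min_gain=0.2):
    """Assumed mapping: full gain for a narrow spread, reduced gain for a wide one."""
    if spread <= lo:
        return 1.0
    if spread >= hi:
        return min_gain
    t = (spread - lo) / (hi - lo)
    return 1.0 + t * (min_gain - 1.0)
```

A bin whose level difference stays constant across frames has zero spread and keeps full gain in this sketch, while a bin whose difference fluctuates widely is attenuated toward `min_gain`.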
-
Publication number: US20170061991A1
Publication date: 2017-03-02
Application number: US15247887
Application date: 2016-08-25
Applicant: FUJITSU LIMITED
Inventor: Sayuri KOHMURA, TARO TOGAWA, Takeshi OTANI
Abstract: An utterance condition determination device includes a memory configured to store a voice signal of a first speaker and a voice signal of a second speaker, and a processor configured to estimate an average backchannel frequency that represents a backchannel frequency of the second speaker in a period of time from a voice start time of the voice signal of the second speaker to a predetermined time based on the voice signal of the first speaker and the voice signal of the second speaker, to calculate the backchannel frequency of the second speaker for each unit of time based on the voice signal of the first speaker and the voice signal of the second speaker, and to determine a satisfaction level of the second speaker based on the estimated average backchannel frequency and the calculated backchannel frequency.
-
Publication number: US20210027796A1
Publication date: 2021-01-28
Application number: US16931526
Application date: 2020-07-17
Applicant: FUJITSU LIMITED
Inventor: TARO TOGAWA, Sayuri Nakayama, Kiyonori Morioka
IPC: G10L21/0208, G06N20/00, G10L17/04, G10L21/028
Abstract: A detection method implemented by a computer, the detection method includes: acquiring voice information containing voices of a plurality of speakers; detecting a first speech segment of a first speaker among the plurality of speakers included in the voice information based on a first acoustic feature of the first speaker, the first acoustic feature being obtained by performing a machine learning; and detecting a second speech segment of a second speaker among the plurality of speakers based on a second acoustic feature, the second acoustic feature being an acoustic feature included in the voice information associated with a predetermined time range, the predetermined time range being a time range outside the first speech segment.
-
Publication number: US20190214039A1
Publication date: 2019-07-11
Application number: US16354260
Application date: 2019-03-15
Applicant: FUJITSU LIMITED
Inventor: Sayuri Nakayama, TARO TOGAWA, Takeshi OTANI
Abstract: A non-transitory computer-readable recording medium storing a program that causes a computer to execute a process for evaluating a voice, the process includes analyzing a voice signal to detect a pitch frequency; selecting an evaluation target region to be evaluated in the detected pitch frequency based on a distribution of a detection rate of the detected pitch frequency; and evaluating a voice based on the distribution of the detection rate and the selected evaluation target region.
-
Publication number: US20130279709A1
Publication date: 2013-10-24
Application number: US13924071
Application date: 2013-06-21
Applicant: FUJITSU LIMITED
Inventor: Masanao SUZUKI, Takeshi OTANI, TARO TOGAWA, CHISATO ISHIKAWA
IPC: H04R25/00
CPC classification number: H04R25/30, A61B5/123, A61B5/749, G10L21/0364, G10L2021/0575, H04R25/70
Abstract: A voice control device includes a hearing estimate section configured to estimate hearing of a user based on a sending/received sound ratio representing a ratio of the volume of a sending sound to the volume of a received sound; a compensation-quantity calculating section configured to calculate a compensation quantity for a received signal of the received sound responsive to the estimated hearing; and a compensation section configured to compensate the received signal based on the calculated compensation quantity.
-
Publication number: US20190096433A1
Publication date: 2019-03-28
Application number: US16139291
Application date: 2018-09-24
Applicant: FUJITSU LIMITED
Inventor: TARO TOGAWA, Sayuri Nakayama, Takeshi OTANI
Abstract: A voice processing method for estimating an impression of speech includes: executing an acquisition process that includes acquiring voice signals; executing a feature acquisition process that includes acquiring acoustic features regarding the voice signals from the voice signals; executing a voice-parameter acquisition process that includes acquiring a voice parameter regarding a frame of the voice signals; executing a relative-value determination process that includes determining a relative value between the determined voice parameter and a statistical value of the voice parameter; executing a weight assignment process that includes assigning a weight to the frame of the voice signals in accordance with the relative value; and executing a distribution determination process that includes determining a distribution of the acoustic features, based on the weight assigned to the frame of the voice signals.
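The weighting and distribution steps in this abstract can be sketched under some loudly stated assumptions. The abstract does not name the voice parameter, the statistical value, or the weighting rule; this sketch assumes the statistical value is the mean of the parameter over all frames, the relative value is the difference from that mean, frames at or above the mean get weight 1.0 and the rest get a lower weight, and the distribution is a fixed-width histogram. The names `frame_weights` and `weighted_histogram` are hypothetical.

```python
def frame_weights(params, low_weight=0.5):
    """Weight each frame by its voice parameter relative to the mean (assumed rule)."""
    mean = sum(params) / len(params)
    # Relative value = parameter minus mean; non-negative frames get full weight.
    return [1.0 if p - mean >= 0 else low_weight for p in params]


def weighted_histogram(features, weights, bin_width, n_bins):
    """Histogram of per-frame acoustic features in which each frame adds its weight."""
    hist = [0.0] * n_bins
    for feature, weight in zip(features, weights):
        # Clamp out-of-range features into the first or last bin.
        idx = max(0, min(int(feature // bin_width), n_bins - 1))
        hist[idx] += weight
    return hist
```

In use, `frame_weights` would be applied to a per-frame parameter such as frame power, and the resulting weights passed to `weighted_histogram` together with a per-frame feature such as pitch, so that frames with a low parameter value contribute less to the distribution.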
-