Abstract:
According to one embodiment, speech in a first language is recognized using a speech recognition dictionary for recognizing both the first language and a second language, and a source sentence in the first language is generated. The source sentence is translated into the second language, and a translation sentence in the second language is generated. An unknown word included in the translation sentence, i.e., a word not stored in the speech recognition dictionary, is detected. A first pronunciation candidate for the unknown word is estimated from the written representation of the unknown word. A second pronunciation candidate for the unknown word is estimated from the pronunciation of the original word in the source sentence that corresponds to the unknown word. The unknown word, the first pronunciation candidate, and the second pronunciation candidate are registered in the speech recognition dictionary in association with one another.
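The registration flow above can be sketched as follows. The dictionary layout, the letter-to-phoneme rule table, and the assumed alignment between the unknown word and its source word are illustrative assumptions, not the embodiment's actual data structures.

```python
# Speech recognition dictionary: surface form -> list of pronunciations.
# The entries and the ARPAbet-like notation are illustrative assumptions.
speech_recognition_dictionary = {
    "hello": ["HH AH L OW"],
}

def estimate_from_representation(word):
    """First candidate: naive letter-by-letter estimate from the written form."""
    letter_to_phone = {"c": "K", "h": "HH", "a": "AH", "t": "T"}
    return " ".join(letter_to_phone.get(ch, ch.upper()) for ch in word)

def estimate_from_original(source_pronunciation):
    """Second candidate: reuse the pronunciation of the aligned source word."""
    return source_pronunciation

def register_unknown_word(word, source_pronunciation):
    """Detect an unknown word and register it with both candidates."""
    if word in speech_recognition_dictionary:
        return  # already known, nothing to register
    first = estimate_from_representation(word)
    second = estimate_from_original(source_pronunciation)
    speech_recognition_dictionary[word] = [first, second]

# "chat" appears in the translation sentence but not in the dictionary;
# its aligned source word is assumed to be pronounced "CH AH T TO".
register_unknown_word("chat", "CH AH T TO")
```

Keeping both candidates lets the recognizer match the word whether the speaker uses a spelling-based or a source-language-influenced pronunciation.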
Abstract:
According to one embodiment, a sound collector includes a housing and a microphone holding member. The housing includes an opening portion configured to receive a sound wave traveling along a path, and the housing is tightly closed except for the opening portion. The microphone holding member is provided in the housing and holds, at a predetermined position, a microphone that receives the sound wave propagating through the path.
Abstract:
A first speech input device captures speech in a first language, and a first speech output device outputs speech in the first language. A second speech input device captures speech in a second language, and a second speech output device outputs speech in the second language. In a speech recognition/translation server, a first speech recognition device receives a first utterance in the first language from the first speech input device and recognizes it. A first machine translation device consecutively translates the recognition result from the first language into the second language without waiting for completion of the first utterance. A first speech synthesis device generates a second speech from the translation result. A first output adjustment device outputs the first utterance and the second speech to the second speech output device, adjusting the volume of the first utterance to be lower than the volume of the second speech.
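The output-adjustment step amounts to mixing the original utterance under the synthesized translation at a reduced volume. A minimal sketch, assuming float sample buffers and an illustrative gain value:

```python
def adjust_and_mix(utterance, synthesized, duck_gain=0.25):
    """Scale the source utterance below the translated speech and mix them.

    duck_gain < 1.0 keeps the original speaker audible but quieter than
    the synthesized translation, as the abstract describes. The gain value
    and the plain-list sample format are illustrative assumptions.
    """
    assert duck_gain < 1.0  # source must stay quieter than the translation
    n = max(len(utterance), len(synthesized))
    utterance = utterance + [0.0] * (n - len(utterance))
    synthesized = synthesized + [0.0] * (n - len(synthesized))
    return [u * duck_gain + s for u, s in zip(utterance, synthesized)]

mixed = adjust_and_mix([1.0, 1.0, 1.0], [0.5, 0.5])
# The source samples are attenuated to 0.25 before mixing.
```

A real implementation would operate on audio frames streamed to the second speech output device rather than whole buffers.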
Abstract:
A first speech processing device includes a first speech input unit and a first speech output unit. A second speech processing device includes a second speech input unit and a second speech output unit. In a server between them, speech in a first language sent from the first speech input unit is recognized. The speech recognition result is translated into a second language, and the translation result is back-translated into the first language. A first speech synthesis signal of the back-translation result is sent to the first speech output unit, and a second speech synthesis signal of the translation result is sent to the second speech output unit. The duration of the second speech synthesis signal or the first speech synthesis signal is measured, and based on that duration, the two signals are output with their start times and end times synchronized.
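The synchronization step can be sketched by measuring the longer signal's duration and padding the shorter one so both start and end together. Zero-padding is an illustrative simplification; a real system might time-stretch one signal instead.

```python
def synchronize(back_translation_signal, translation_signal):
    """Pad the shorter synthesis signal so both share a start and end time.

    Signals are modeled as plain sample lists (an assumption); the measured
    duration is simply the longer signal's length.
    """
    duration = max(len(back_translation_signal), len(translation_signal))

    def pad(signal):
        return signal + [0.0] * (duration - len(signal))

    return pad(back_translation_signal), pad(translation_signal)

first_out, second_out = synchronize([0.1, 0.2], [0.3, 0.4, 0.5, 0.6])
# Both output signals now have the same duration.
```

Synchronizing the two outputs keeps the back-translation playback (heard by the first speaker as confirmation) aligned with the translation playback heard by the second speaker.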
Abstract:
A conversation supporting device of an embodiment of the present disclosure has an information storage unit, a recognition resource constructing unit, and a voice recognition unit. The information storage unit stores information disclosed by a speaker. The recognition resource constructing unit uses the disclosed information to construct a recognition resource, including a voice model and a language model, for recognition of voice data. The voice recognition unit uses the recognition resource to recognize the voice data.
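The three units above can be sketched as follows. A word-frequency table stands in for the embodiment's voice and language models, and the candidate-scoring recognizer is an illustrative stand-in for actual voice recognition; all names are assumptions.

```python
from collections import Counter

class InformationStorageUnit:
    """Stores the information disclosed by a speaker (e.g., talk materials)."""
    def __init__(self):
        self.disclosed = []

    def store(self, text):
        self.disclosed.append(text)

def construct_recognition_resource(storage):
    """Build a simple unigram 'language model' from the disclosed information."""
    counts = Counter(w for text in storage.disclosed for w in text.split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def recognize(candidates, language_model):
    """Pick the candidate transcription the language model scores highest."""
    def score(sentence):
        return sum(language_model.get(w, 0.0) for w in sentence.split())
    return max(candidates, key=score)

storage = InformationStorageUnit()
storage.store("my talk covers neural speech translation")
lm = construct_recognition_resource(storage)
best = recognize(["mural speech trans nation", "neural speech translation"], lm)
```

Biasing recognition toward the speaker's own disclosed vocabulary is what lets domain terms like "neural" win over acoustically similar alternatives.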