Abstract:
A method for providing participants in a multiparty meeting with a transcript of the meeting, comprising the steps of: establishing a meeting among two or more participants; exchanging during said meeting voice data as well as documents; uploading at least a part of said voice data and at least a part of said documents to a remote speech recognition server (1), using an application programming interface of said remote speech recognition server; converting at least a part of said voice data to text with an automatic speech recognition system (13) in said remote speech recognition server, wherein said automatic speech recognition system uses said documents to improve the quality of speech recognition; building in said remote speech recognition server a computer object (120) embedding at least a part of said voice data, at least a part of said documents, and said text; and making said computer object (120) available to at least one of said participants.
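The server-side steps of this method can be sketched roughly as follows. This is only an illustration under stated assumptions: `MeetingRecord`, `vocabulary_from`, `build_record` and the `recognize` callable are hypothetical names, and the abstract does not say how the documents are used to improve recognition. Here they are simply reduced to a set of vocabulary hints.

```python
from dataclasses import dataclass


@dataclass
class MeetingRecord:
    """Hypothetical stand-in for the 'computer object' of the abstract."""
    voice_data: bytes
    documents: list
    transcript: str


def vocabulary_from(documents):
    """Collect lower-cased words from the shared documents (assumed hint format)."""
    words = set()
    for doc in documents:
        words.update(w.lower() for w in doc.split())
    return words


def build_record(voice_data, documents, recognize):
    """Transcribe the audio with document-derived hints and bundle everything.

    `recognize(voice_data, hints)` is an assumed interface for the speech
    recognizer; it is not a real library call.
    """
    hints = vocabulary_from(documents)
    text = recognize(voice_data, hints)
    return MeetingRecord(voice_data, list(documents), text)
```

Any recognizer that accepts audio plus a hint set could be passed in as `recognize`; the record returned is what would be made available to the participants.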
Abstract:
Improved and computationally efficient signal processing is provided to estimate and reduce noise in a sampled signal. To this end, a first filter recursively filters a vector in the signal in one direction along the vector, a second filter recursively filters the vector in the opposite direction, and a combining section combines the results of the first and second filters. The coefficients of the first and second filters depend on the position in the vector.
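A minimal sketch of such a two-pass scheme is given below. It assumes first-order recursive filters and a simple average as the combining step; the abstract specifies neither. The `coeff` callback, which supplies a position-dependent coefficient, is a hypothetical interface.

```python
def forward_backward_smooth(x, coeff):
    """Smooth x with a forward and a backward recursive pass, then average.

    coeff(i) returns the smoothing coefficient (0 <= a < 1) at position i.
    This is an assumed form; the abstract only states that the coefficients
    depend on the position in the vector.
    """
    n = len(x)
    fwd = [0.0] * n
    bwd = [0.0] * n
    # First filter: runs from the start of the vector to the end.
    for i in range(n):
        a = coeff(i)
        prev = fwd[i - 1] if i > 0 else x[0]
        fwd[i] = a * prev + (1 - a) * x[i]
    # Second filter: runs from the end of the vector back to the start.
    for i in range(n - 1, -1, -1):
        a = coeff(i)
        nxt = bwd[i + 1] if i < n - 1 else x[-1]
        bwd[i] = a * nxt + (1 - a) * x[i]
    # Combining section: here simply the mean of the two passes.
    return [(f + b) / 2 for f, b in zip(fwd, bwd)]
```

For example, `forward_backward_smooth(samples, lambda i: 0.5)` applies the same coefficient everywhere, while a callback that returns smaller values near the ends of the vector would smooth less at the edges.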
Abstract:
A system is provided for matching two or more sequences of phonemes, both or all of which may be generated from text or speech. A dynamic programming matching technique is preferably used, with constraints that depend on whether the sequences are generated from text or from speech. The scoring of the dynamic programming paths is weighted by phoneme confusion scores, phoneme insertion scores and phoneme deletion scores where appropriate.
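The scoring idea can be illustrated with a standard dynamic programming alignment. The sketch below is not the system's actual algorithm: it treats all scores as log-scores to be added, uses a single insertion score and a single deletion score, and omits the text/speech-dependent path constraints. The function name and parameters are assumptions.

```python
def align_score(seq_a, seq_b, confusion, ins_score, del_score):
    """Return the best total score for aligning seq_a with seq_b.

    confusion(p, q) is the (log) score for phoneme p being matched to q;
    ins_score and del_score are the (log) scores for an inserted or a
    deleted phoneme. All three are hypothetical inputs.
    """
    n, m = len(seq_a), len(seq_b)
    neg_inf = float("-inf")
    best = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                # Match or substitution, weighted by the confusion score.
                s = confusion(seq_a[i - 1], seq_b[j - 1])
                best[i][j] = max(best[i][j], best[i - 1][j - 1] + s)
            if i > 0:
                # Phoneme of seq_a has no counterpart in seq_b (deletion).
                best[i][j] = max(best[i][j], best[i - 1][j] + del_score)
            if j > 0:
                # Phoneme of seq_b has no counterpart in seq_a (insertion).
                best[i][j] = max(best[i][j], best[i][j - 1] + ins_score)
    return best[n][m]
```

A higher returned score means the two phoneme sequences are more likely to represent the same underlying word or phrase under the given score tables.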
Abstract:
A data structure is provided for annotating data files within a database. The annotation data comprises a phoneme and word lattice that allows quick and efficient searching of data files within the database in response to a user's input query. The structure of the annotation data allows the input query to be made by voice, and it can be used to annotate various kinds of data files, such as audio, video and multimedia data files. The annotation data may be generated from the data files themselves or may be input by the user, either by voice or by typing.
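One way to picture a combined phoneme and word lattice is as a set of time-stamped nodes joined by labelled links. The sketch below is an assumed, simplified layout: the `Node` and `Link` classes, their fields and the `find_phoneme_sequence` helper are illustrative and are not taken from the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    label: str   # a phoneme symbol or a word
    kind: str    # "phoneme" or "word"
    target: int  # index of the node this link leads to


@dataclass
class Node:
    time: float                      # offset of the node within the data file
    links: list = field(default_factory=list)


def find_phoneme_sequence(nodes, query):
    """Return the times of nodes from which the query phonemes can be followed.

    Assumes the lattice is acyclic, with links pointing forward in time.
    """
    def follows(idx, pos):
        if pos == len(query):
            return True
        return any(
            link.kind == "phoneme"
            and link.label == query[pos]
            and follows(link.target, pos + 1)
            for link in nodes[idx].links
        )

    return [node.time for i, node in enumerate(nodes) if follows(i, 0)]
```

Because word links and phoneme links share the same nodes, a query can be matched at the word level when the word is in the vocabulary and at the phoneme level otherwise.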
Abstract:
Signal detection that is robust against various types of background noise is implemented. According to the signal detection apparatus and method of this invention, a feature amount of an input signal sequence and a feature amount of a noise component contained in the signal sequence are extracted. A first likelihood, indicating the probability that the signal sequence is detected, and a second likelihood, indicating the probability that the noise component is detected, are then calculated on the basis of a predetermined signal-to-noise ratio and the extracted feature amount of the signal sequence. A likelihood ratio between the first likelihood and the second likelihood is also calculated, and detection of the signal sequence is determined on the basis of that ratio.
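A likelihood-ratio test of this kind can be sketched as follows. The sketch makes several assumptions that are not in the abstract: the feature amount is a frame energy in dB, both likelihoods are Gaussian with a shared spread `sigma`, and the signal model is the noise estimate shifted by the predetermined SNR. The ratio is computed in the log domain to avoid numerical underflow; thresholding the log ratio at 0 is equivalent to thresholding the ratio at 1.

```python
import math


def log_gaussian(x, mean, sigma):
    """Log of a Gaussian density with the given mean and standard deviation."""
    z = (x - mean) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2 * math.pi))


def log_likelihood_ratio(feature_db, noise_db, snr_db=10.0, sigma=3.0):
    """Log of (first likelihood / second likelihood) for one frame.

    snr_db and sigma are assumed, illustrative parameter values.
    """
    first = log_gaussian(feature_db, noise_db + snr_db, sigma)  # signal model
    second = log_gaussian(feature_db, noise_db, sigma)          # noise model
    return first - second


def is_signal(feature_db, noise_db, log_threshold=0.0, **kwargs):
    """Decide that the signal is present when the log ratio exceeds the threshold."""
    return log_likelihood_ratio(feature_db, noise_db, **kwargs) > log_threshold
```

With these defaults, a frame whose energy sits close to the noise estimate is rejected, and a frame about 10 dB above it is accepted.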
Abstract:
A signal processing apparatus and method for performing robust endpoint detection of a signal are provided. An input signal sequence is divided into frames, each of which has a predetermined time length, and the presence of the signal in each frame is detected. A filter process that smooths the detection result for the current frame, using the detection result for a past frame, is then applied. The filter output is compared with a predetermined threshold value, and the state of the signal sequence in the current frame is determined on the basis of the comparison result.
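The smoothing and thresholding steps can be illustrated with a first-order recursive filter over the per-frame detection results. This is a sketch under assumptions: the filter form, the `alpha` and `threshold` values, and the two-state (signal / no signal) output are illustrative choices, not details given in the abstract.

```python
def smooth_and_classify(decisions, alpha=0.7, threshold=0.5):
    """Smooth per-frame 0/1 detection results and threshold the filter output.

    decisions: one raw detection result (0 or 1) per frame.
    alpha: weight given to the filtered result of the past frame (assumed).
    threshold: value the filter output must exceed for 'signal' (assumed).
    Returns one boolean state per frame.
    """
    states = []
    filtered = 0.0
    for d in decisions:
        # Blend the past filtered result with the current detection result.
        filtered = alpha * filtered + (1 - alpha) * d
        states.append(filtered > threshold)
    return states
```

With these values an isolated one-frame detection is ignored, while a run of detections switches the state on after a short delay and keeps it on for one frame after the run ends, which is the kind of robustness to spurious frames that smoothing is meant to provide.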
Abstract:
Signal detection that is robust against various types of background noise is implemented. According to a signal detection apparatus, a feature amount of an input signal sequence and a feature amount of a noise component contained in the signal sequence are extracted. A first likelihood, indicating the probability that the signal sequence is detected, and a second likelihood, indicating the probability that the noise component is detected, are then calculated on the basis of a predetermined signal-to-noise ratio and the extracted feature amount of the signal sequence. A likelihood ratio between the first likelihood and the second likelihood is also calculated, and detection of the signal sequence is determined on the basis of that ratio.