Abstract:
A tailored speaker-independent voice recognition system has a speech recognition dictionary (360) with at least one word (371). That word (371) has at least two transcriptions (373), each transcription (373) having a probability factor (375) and an indicator (377) of whether the transcription is active. When a speech utterance is received (510), the voice recognition system determines (520, 530) the word signified by the speech utterance, evaluates (540) the speech utterance against the transcriptions of the correct word, updates (550) the probability factors for each transcription, and inactivates (570) any transcription that has an updated probability factor that is less than a threshold.
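The update-and-prune loop described above can be sketched as follows. This is a minimal illustration, not the patented method: the abstract does not say how the probability factors are updated, so the exponential moving average, the `rate` and `threshold` parameters, and the names `Transcription` and `update_word` are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Transcription:
    phonemes: str        # one pronunciation of the word
    probability: float   # probability factor (375)
    active: bool = True  # active indicator (377)


def update_word(transcriptions, scores, threshold=0.1, rate=0.5):
    """Update probability factors from one utterance and prune weak entries.

    `scores` holds one non-negative match score per transcription for the
    received utterance. The moving-average update is an assumed rule.
    """
    total = sum(scores)
    if total <= 0:
        return transcriptions  # nothing matched; leave the entry unchanged
    for t, s in zip(transcriptions, scores):
        # Blend the old factor with this utterance's normalized score.
        t.probability = (1 - rate) * t.probability + rate * (s / total)
        # Inactivate any transcription that falls below the threshold.
        if t.probability < threshold:
            t.active = False
    return transcriptions


tomato = [
    Transcription("t ah m ey t ow", 0.5),
    Transcription("t ah m aa t ow", 0.5),
]
update_word(tomato, scores=[0.9, 0.1], threshold=0.35)
```

With these assumed values, the second transcription drops below the threshold and is marked inactive, so later recognition passes would skip it.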
Abstract:
An electronic device (400) for speech dialog includes functions that receive (405, 205) a speech phrase that includes an instantiated variable (315), generate pitch and voicing characteristics (330) of the instantiated variable, and perform voice recognition (410, 220) of the instantiated variable to determine a most likely set of recognition acoustic states (335). A trained map (358) is established (115) that maps recognition feature vectors derived from training speech (105) to synthesis feature vectors derived from the same training speech (110). Recognition feature vectors that represent the most likely set of recognition acoustic states for the recognized instantiated variable are converted to a most likely set of synthesis acoustic states (420) in accordance with the map. The electronic device may generate (421, 440, 445) a synthesized value of the instantiated variable using the most likely set of synthesis acoustic states and the pitch and voicing characteristics extracted from the instantiated variable.
Abstract:
An electronic device (300) for speech dialog includes functions that receive (305, 105) a speech phrase that comprises a request phrase that includes an instantiated variable (215), generate (335, 115) pitch and voicing characteristics (315) of the instantiated variable, and perform voice recognition (319, 125) of the instantiated variable to determine a most likely set of acoustic states (235). The electronic device may generate (335, 140) a synthesized value of the instantiated variable using the most likely set of acoustic states and the pitch and voicing characteristics of the instantiated variable. To disambiguate (425, 430) a newly received instantiated variable, the electronic device may use a table of previously entered values of variables that have been determined to be unique, in which each value is associated with the most likely set of acoustic states and the pitch and voicing characteristics determined when that value was received.
Abstract:
A method and apparatus for generating a voice tag (140) includes a means (110) for combining (205) a plurality of utterances (106, 107, 108) into a combined utterance (111) and a means (120) for extraction (210) of the voice tag as a sequence of phonemes having a high likelihood of representing the combined utterance, using a set of stored phonemes (115) and the combined utterance.
Abstract:
A method, a system and a computer program product for interpreting a verbal input in a multimodal dialog system are provided. The method includes assigning (302) a confidence value to at least one word generated by a verbal recognition component. The method further includes generating (304) a semantic unit confidence score for the verbal input. The generation of a semantic unit confidence score is based on the confidence value of at least one word and at least one semantic confidence operator.
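As a rough illustration of the scoring step, the sketch below combines per-word confidence values into one semantic unit confidence score. The abstract does not define the semantic confidence operators, so the three shown here (`min`, `mean`, `product`) and the function name `semantic_unit_confidence` are assumptions chosen for the example.

```python
from math import prod
from statistics import mean

# Hypothetical operators; the source only says "at least one semantic
# confidence operator" without naming any.
OPERATORS = {"min": min, "mean": mean, "product": prod}


def semantic_unit_confidence(word_confidences, operator="min"):
    """Combine word-level confidence values into a single score."""
    if not word_confidences:
        raise ValueError("at least one word confidence is required")
    return OPERATORS[operator](word_confidences)


# Confidence values assigned to three recognized words (illustrative).
words = [0.9, 0.6, 0.8]
score_min = semantic_unit_confidence(words, "min")
score_product = semantic_unit_confidence(words, "product")
```

A conservative operator such as `min` lets one poorly recognized word pull down the whole semantic unit, while `mean` tolerates a single weak word; which behavior is appropriate would depend on the dialog system.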
Abstract:
A method and apparatus are provided for reproducing a speech sequence of a user through a communication device of the user. The method includes the steps of detecting a speech sequence from the user through the communication device, recognizing a phoneme sequence within the detected speech sequence and forming a confidence level of each phoneme within the recognized phoneme sequence. The method further includes the steps of audibly reproducing the recognized phoneme sequence for the user through the communication device and gradually highlighting or degrading a voice quality of at least some phonemes of the recognized phoneme sequence based upon the formed confidence level of the at least some phonemes.
Abstract:
Techniques are provided for improving security in a single-sign-on context by providing, to a user's client system, two linked authentication credentials in separate logical communication sessions and requiring that both credentials be presented to a host system. Only after presentation of both credentials is the user authenticated and permitted to access applications on the host system.
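One way to picture "two linked credentials" is sketched below. The abstract does not say how the credentials are linked or transported, so this example assumes the second credential is an HMAC of the first under a host-side key. The names `HOST_KEY`, `issue_credentials`, and `authenticate` are hypothetical, and the separate communication sessions are not modeled.

```python
import hashlib
import hmac
import secrets

# Host-side secret used to link the two credentials (assumed design).
HOST_KEY = secrets.token_bytes(32)


def issue_credentials():
    """Return two linked credentials, intended for separate sessions."""
    first = secrets.token_hex(16)
    second = hmac.new(HOST_KEY, first.encode(), hashlib.sha256).hexdigest()
    return first, second


def authenticate(first, second):
    """Grant access only when both credentials are presented and linked."""
    if not first or not second:
        return False
    expected = hmac.new(HOST_KEY, first.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, second)


cred_a, cred_b = issue_credentials()
```

Under this assumption, a client that intercepts only one of the two sessions holds a credential that `authenticate` rejects on its own.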
Abstract:
Techniques are described for repairing some types of user account problems that interfere with granting a user access to a computer system and doing so during a process to authenticate the user in a way that does not require the user to re-enter authentication information or require the user to restart a communication session with the computer system. In response to a determination that a user's account has a problem during an authentication process, techniques are provided to enable a user to execute an appropriate process or processes to fix the user account, after which the authentication process continues. In this way, the correction to the user account may appear to be seamless to the user.
Abstract:
Pyrazolyl acrylonitrile compounds represented by the structure of formula I, or stereoisomers thereof, are disclosed in the present invention, wherein: R1 is selected from the group of substituents consisting of H, C1-C4 alkoxy C1-C2 alkyl, C3-C5 alkenyloxy C1-C2 alkyl, C3-C5 alkynyloxy C1-C2 alkyl, C1-C4 alkylthio C1-C2 alkyl, C1-C5 alkyl carbonyl, C3-C8 cycloalkyl carbonyl, C1-C5 alkoxy carbonyl, and C1-C5 alkylthio carbonyl; R2 is Cl or methyl; and R3 is H, methyl, CN, NO2, or halogen. The compounds of formula I have high insecticidal or acaricidal activity and can therefore be used as insecticides or acaricides.
Abstract:
A device and method of detecting a mode of inputting speech to a wireless device includes a processor (220) communicatively coupled to a user input (218), an audio input (206), a timer (211), and a speech processor (230). The processor (220) monitors the user input (218) and, upon detection of a first change in state of the user input (218), opens an input channel from the audio input (206) to the speech processor (230), monitors the timer (211) for an elapsed time, and monitors the user input (218) for a second change of state. If the second change of state is detected after a predetermined amount of time has elapsed, the input channel is closed. If the second change of state is detected before the predetermined amount of time has elapsed, the user input is monitored for a third change of state, and the input channel is closed when the third change of state is detected.
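The mode detection above amounts to a small state machine, sketched here under stated assumptions: timestamps are passed in as plain seconds rather than read from a hardware timer, the 0.5 second threshold is arbitrary, and the class name `SpeechInputGate` and its return labels are invented for the example.

```python
class SpeechInputGate:
    """Tracks whether the audio input channel is open."""

    def __init__(self, hold_threshold=0.5):
        self.hold_threshold = hold_threshold  # predetermined time, seconds
        self.channel_open = False
        self._opened_at = None
        self._await_third = False

    def on_state_change(self, now):
        """Handle one change of state of the user input at time `now`."""
        if not self.channel_open:
            # First change of state: open the channel and start timing.
            self.channel_open = True
            self._opened_at = now
            self._await_third = False
            return "opened"
        if self._await_third:
            # Third change of state: close the channel (tap-to-toggle use).
            self.channel_open = False
            return "closed-toggle"
        if now - self._opened_at >= self.hold_threshold:
            # Second change after the threshold: press-and-hold use.
            self.channel_open = False
            return "closed-hold"
        # Second change before the threshold: keep listening.
        self._await_third = True
        return "waiting"


hold = SpeechInputGate()
hold_events = [hold.on_state_change(t) for t in (0.0, 2.0)]

tap = SpeechInputGate()
tap_events = [tap.on_state_change(t) for t in (0.0, 0.2, 3.0)]
```

In the first run the button is held for two seconds and released, which closes the channel on the second change. In the second run a quick tap leaves the channel open until the next change of state.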