Abstract:
Recognizing a stream of speech received as speech vectors over a lossy communications link includes constructing for a speech recognizer a series of speech vectors from packets received over a lossy packetized transmission link, wherein some of the packets associated with each speech vector are lost or corrupted during transmission. Each constructed speech vector is multi-dimensional and includes associated features. After waiting for a predetermined time, speech vectors are generated and potentially corrupted features within the speech vector are indicated to the speech recognizer when present. Speech recognition is attempted at the speech recognizer on the speech vectors when corrupted features are present. This recognition may be based only on certain or valid features within each speech vector. Retransmission of a missing or corrupted packet is requested when corrupted values are indicated by the indicating step and when the attempted recognition step fails.
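The recognize-then-retransmit flow described above can be sketched as follows. This is a minimal illustration, not the claimed method: the diagonal-Gaussian scoring, the `min_margin` confidence test, and the names `masked_log_likelihood` and `recognize` are all assumptions introduced here. The sketch scores a speech vector using only the features flagged as valid, and asks for retransmission only when corrupted features were indicated and the recognition attempt fails.

```python
import math

def masked_log_likelihood(vector, valid, mean, var):
    """Diagonal-Gaussian log-likelihood over the features flagged valid only."""
    total = 0.0
    for x, ok, m, v in zip(vector, valid, mean, var):
        if ok:  # corrupted or missing features are skipped entirely
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total

def recognize(vector, valid, models, min_margin=1.0):
    """Return (label, None) on success, (None, "retransmit") when recognition
    fails on a vector with indicated corrupted features, else (None, None).

    `models` maps a label to (mean, var) lists and needs at least two entries.
    `min_margin` is a hypothetical confidence threshold on the score gap.
    """
    scores = sorted(
        ((masked_log_likelihood(vector, valid, mean, var), label)
         for label, (mean, var) in models.items()),
        reverse=True,
    )
    best, runner_up = scores[0], scores[1]
    if best[0] - runner_up[0] >= min_margin:
        return best[1], None
    if not all(valid):
        return None, "retransmit"
    return None, None

# Hypothetical two-word vocabulary with three features per vector.
models = {
    "yes": ([0.0, 0.0, 0.0], [1.0, 1.0, 1.0]),
    "no": ([3.0, 3.0, 3.0], [1.0, 1.0, 1.0]),
}
# The second feature arrived corrupted, so it is flagged invalid.
print(recognize([0.1, 99.0, -0.2], [True, False, True], models))
print(recognize([1.5, 99.0, 1.5], [True, False, True], models))
```

In the first call the two valid features are enough to separate the models, so no retransmission is needed; in the second the valid features are ambiguous, so the sketch falls back to requesting the lost packet.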
Abstract:
Recognizing a stream of speech received as speech vectors over a lossy communications link includes constructing for a speech recognizer a series of speech vectors from packets received over a lossy packetized transmission link, wherein some of the packets associated with each speech vector are lost or corrupted during transmission. Each constructed speech vector is multi-dimensional and includes associated features. Potentially corrupted features within the speech vector are indicated to the speech recognizer when present. Speech recognition is attempted at the speech recognizer on the speech vectors when corrupted features are present. This recognition may be based only on certain or valid features within each speech vector. Retransmission of a missing or corrupted packet is requested when corrupted values are indicated by the indicating step and when the attempted recognition step fails.
Abstract:
A network resource system includes a first server which can communicate with a client computer. The first server produces a speech signal representing speech from a user at the client computer, and context information which indicates the semantic context of the user's speech and a predefined format in which data are returned to the first server. A network knowledge server is in communication with and separated from the first server. The network knowledge server returns to the first server a text structure having one or more fields corresponding to the predefined format. The first server uses data from the one or more fields to determine a response to the user's speech.
Abstract:
A method and apparatus derive a dynamic grammar composed of a subset of a plurality of data elements that are each associated with one of a plurality of reference identifiers. The present invention generates a set of selection identifiers on the basis of a user-provided first input identifier and determines which of these selection identifiers are present in a set of pre-stored reference identifiers. The present invention creates a dynamic grammar that includes those data elements that are associated with those reference identifiers that are matched to any of the selection identifiers. Based on a user-provided second identifier and on the data elements of the dynamic grammar, the present invention selects one of the reference identifiers in the dynamic grammar.
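A minimal sketch of the dynamic-grammar steps follows. The abstract does not say how the selection identifiers are generated from the first input identifier; this sketch assumes they come from expanding each character into a hypothetical group of confusable characters (`CONFUSABLE`). The function names and the sample data are likewise illustrative only.

```python
from itertools import product

# Hypothetical groups of characters a recognizer might confuse (assumption).
CONFUSABLE = {"A": "AJK", "J": "AJK", "K": "AJK", "5": "59", "9": "59"}

def selection_identifiers(first_input):
    """Expand the first input identifier into every confusable variant."""
    options = [CONFUSABLE.get(ch, ch) for ch in first_input]
    return {"".join(combo) for combo in product(*options)}

def build_dynamic_grammar(first_input, reference_db):
    """Keep the data elements whose reference identifier matches a variant."""
    matched = reference_db.keys() & selection_identifiers(first_input)
    return {ref_id: reference_db[ref_id] for ref_id in matched}

def select_reference(second_input, grammar):
    """Pick the reference identifier whose data element equals the second input."""
    for ref_id, element in grammar.items():
        if element == second_input:
            return ref_id
    return None

# Pre-stored reference identifiers, each associated with one data element.
reference_db = {"A5": "Smith", "K9": "Jones", "B5": "Lee"}
grammar = build_dynamic_grammar("J5", reference_db)
print(sorted(grammar.items()))
print(select_reference("Jones", grammar))
```

The second input only has to be matched against the small dynamic grammar rather than against every stored data element, which is the point of deriving the subset first.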
Abstract:
A method and apparatus for recognizing an identifier entered by a user. A caller enters a predetermined identifier through a voice input device or a touch-tone keypad of a telephone handset. A signal representing the entered identifier is transmitted to a remote recognizer, which responds to the identifier signal by producing a recognized output intended to match the entered identifier. The present invention compares this recognized identifier with a list of valid reference identifiers to determine which one of these reference identifiers most likely matches the entered identifier. In performing this determination, the present invention employs a confusion matrix, which is an arrangement of probabilities that indicate the likelihood that a given character in a particular character position of the reference identifier would be recognized by the recognizer as a character in the corresponding character position of the recognized identifier. This determination yields an identifier recognition probability for every reference identifier, and the present invention selects the reference identifier with the highest identifier recognition probability as most likely corresponding to the entered identifier.
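The confusion-matrix scoring can be illustrated with a short sketch. It assumes the identifier recognition probability is the product of the per-position character probabilities, which treats positions as independent; the abstract does not state how the per-character probabilities are combined. The matrix values and function names are invented for the example.

```python
def identifier_probability(recognized, reference, confusion):
    """Probability that `reference` would be recognized as `recognized`.

    `confusion[ref_ch][rec_ch]` is the probability that the recognizer
    outputs rec_ch when ref_ch was entered. Positions are multiplied
    together, which is an independence assumption made by this sketch.
    """
    if len(recognized) != len(reference):
        return 0.0
    prob = 1.0
    for ref_ch, rec_ch in zip(reference, recognized):
        prob *= confusion.get(ref_ch, {}).get(rec_ch, 0.0)
    return prob

def best_reference(recognized, references, confusion):
    """Select the reference identifier with the highest recognition probability."""
    return max(
        references,
        key=lambda ref: identifier_probability(recognized, ref, confusion),
    )

# Hypothetical confusion matrix: outer key is the entered character.
confusion = {
    "A": {"A": 0.8, "J": 0.2},
    "J": {"J": 0.7, "A": 0.3},
    "1": {"1": 1.0},
}
print(best_reference("J1", ["A1", "J1"], confusion))
```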
Abstract:
A method and apparatus recognize an identifier entered by a user. A caller enters a predetermined identifier through a voice input device or a touch-tone keypad of a telephone handset. A signal representing the entered identifier is transmitted to a remote recognizer, which responds to the identifier signal by producing a recognized output intended to match the entered identifier. The present invention compares this recognized identifier with a list of valid reference identifiers to determine which one of these reference identifiers most likely matches the entered identifier. In performing this determination, the present invention compares each character of the recognized identifier with a character in a corresponding character position of each reference identifier in light of a plurality of confusion sets. On the basis of this comparison, the set of reference identifiers is reduced to a candidate set of reference identifiers, from which a reference identifier that matches the input identifier provided by the user is selected.
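The reduction to a candidate set can be sketched as below. The sketch assumes a reference identifier survives when, at every character position, its character either equals the recognized character or shares a confusion set with it; the abstract does not spell out the exact rule, and the confusion sets shown are hypothetical. The final selection from the candidate set is not specified in the abstract and is left out.

```python
def confusable(a, b, confusion_sets):
    """True if a and b are identical or belong to the same confusion set."""
    return a == b or any(a in s and b in s for s in confusion_sets)

def candidate_set(recognized, references, confusion_sets):
    """Keep references that agree with the recognized identifier position by position."""
    return [
        ref for ref in references
        if len(ref) == len(recognized)
        and all(confusable(rec_ch, ref_ch, confusion_sets)
                for rec_ch, ref_ch in zip(recognized, ref))
    ]

# Hypothetical confusion sets: characters within a set are easily mistaken
# for one another by the recognizer.
confusion_sets = [set("AJK"), set("MN")]
references = ["JN1", "AM2", "KM1", "AB1"]
print(candidate_set("AM1", references, confusion_sets))
```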