摘要:
Systems and methods for improving the performance of a speech recognition system. In some embodiments a tuner module and/or a tester module are configured to cooperate with a speech recognition system. The tester and tuner modules can be configured to cooperate with each other. In one embodiment, the tuner module may include a module for playing back a selected portion of a digital data audio file, a module for creating and/or editing a transcript of the selected portion, and/or a module for displaying information associated with a decoding of the selected portion, the decoding generated by a speech recognition engine. In other embodiments, the tester module can include an editor for creating and/or modifying a grammar, a module for receiving a selected portion of a digital audio file and its corresponding transcript, and a scoring module for producing scoring statistics of the decoding based at least in part on the transcript.
摘要:
Systems and methods for improving the performance of a speech recognition system. In some embodiments a tuner module and/or a tester module are configured to cooperate with a speech recognition system. The tester and tuner modules can be configured to cooperate with each other. In one embodiment, the tuner module may include a module for playing back a selected portion of a digital data audio file, a module for creating and/or editing a transcript of the selected portion, and/or a module for displaying information associated with a decoding of the selected portion, the decoding generated by a speech recognition engine. In other embodiments, the tester module can include an editor for creating and/or modifying a grammar, a module for receiving a selected portion of a digital audio file and its corresponding transcript, and a scoring module for producing scoring statistics of the decoding based at least in part on the transcript.
摘要:
Systems and methods for improving the performance of a speech recognition system. In some embodiments a tuner module and/or a tester module are configured to cooperate with a speech recognition system. The tester and tuner modules can be configured to cooperate with each other. In one embodiment, the tuner module may include a module for playing back a selected portion of a digital data audio file, a module for creating and/or editing a transcript of the selected portion, and/or a module for displaying information associated with a decoding of the selected portion, the decoding generated by a speech recognition engine. In other embodiments, the tester module can include an editor for creating and/or modifying a grammar, a module for receiving a selected portion of a digital audio file and its corresponding transcript, and a scoring module for producing scoring statistics of the decoding based at least in part on the transcript.
摘要:
Systems and methods for determining a confidence score associated with a decoding output of a speech recognition engine. In one embodiment, a method of determining the confidence score comprises arranging time frame and acoustic score data into an array, determining a phoneme sequence in the array that yields the highest sum of acoustic scores under certain constraints, e.g., minimum number of time frames and order of phonemes in a phoneme string. A relative score is derived by applying a functional relationship between the acoustic score and different sums comprising acoustic scores from the array. The confidence score, in some embodiments, depends at least in part on the relative score and a measure of ambiguity associated with similar sounding phrases being included in different concepts of a specified grammar.
摘要:
Textual transcription of speech is generated and formatted according to user-specified transformation and behavior requirements for a speech recognition system having input grammars and transformations. An apparatus may include a speech recognition platform configured to receive a user-specified transformation requirement, recognize speech in speech data into recognized speech according to a set of recognition grammars; and apply transformations to the recognized speech according to the user-specified transformation requirement. The apparatus may further be configured to receive a user-specified behavior requirement and transform the recognized speech according to the behavior requirement. Other embodiments are described and claimed.
摘要:
Architecture for integrating application-dependent information into a constraints component at deployment time or when available. In terms of a general grammar, the constraints component can include or be a general grammar that comprises application-independent information and is structured in such a way that application-dependent information can be integrated into the general grammar without loss of fidelity. The general grammar includes a probability space and reserves a section of the probability space for the integration of application-dependent information. An integration component integrates the application-dependent information into the reserved section of the probability space for recognition processing. The application-dependent information is integrated into the reserved section of the probability space at deployment time or when available. The general grammar is structured to support the integration and improve the overall system.
摘要:
Architecture for integrating application-dependent information into a constraints component at deployment time or when available. In terms of a general grammar, the constraints component can include or be a general grammar that comprises application-independent information and is structured in such a way that application-dependent information can be integrated into the general grammar without loss of fidelity. The general grammar includes a probability space and reserves a section of the probability space for the integration of application-dependent information. An integration component integrates the application-dependent information into the reserved section of the probability space for recognition processing. The application-dependent information is integrated into the reserved section of the probability space at deployment time or when available. The general grammar is structured to support the integration and improve the overall system.
摘要:
Textual transcription of speech is generated and formatted according to user-specified transformation and behavior requirements for a speech recognition system having input grammars and transformations. An apparatus may include a speech recognition platform configured to receive a user-specified transformation requirement, recognize speech in speech data into recognized speech according to a set of recognition grammars; and apply transformations to the recognized speech according to the user-specified transformation requirement. The apparatus may further be configured to receive a user-specified behavior requirement and transform the recognized speech according to the behavior requirement. Other embodiments are described and claimed.