摘要:
Systems and methods for improving the performance of a speech recognition system. In some embodiments a tuner module and/or a tester module are configured to cooperate with a speech recognition system. The tester and tuner modules can be configured to cooperate with each other. In one embodiment, the tuner module may include a module for playing back a selected portion of a digital data audio file, a module for creating and/or editing a transcript of the selected portion, and/or a module for displaying information associated with a decoding of the selected portion, the decoding generated by a speech recognition engine. In other embodiments, the tester module can include an editor for creating and/or modifying a grammar, a module for receiving a selected portion of a digital audio file and its corresponding transcript, and a scoring module for producing scoring statistics of the decoding based at least in part on the transcript.
摘要:
Systems and methods for improving the performance of a speech recognition system. In some embodiments a tuner module and/or a tester module are configured to cooperate with a speech recognition system. The tester and tuner modules can be configured to cooperate with each other. In one embodiment, the tuner module may include a module for playing back a selected portion of a digital data audio file, a module for creating and/or editing a transcript of the selected portion, and/or a module for displaying information associated with a decoding of the selected portion, the decoding generated by a speech recognition engine. In other embodiments, the tester module can include an editor for creating and/or modifying a grammar, a module for receiving a selected portion of a digital audio file and its corresponding transcript, and a scoring module for producing scoring statistics of the decoding based at least in part on the transcript.
摘要:
Systems and methods for improving the performance of a speech recognition system. In some embodiments a tuner module and/or a tester module are configured to cooperate with a speech recognition system. The tester and tuner modules can be configured to cooperate with each other. In one embodiment, the tuner module may include a module for playing back a selected portion of a digital data audio file, a module for creating and/or editing a transcript of the selected portion, and/or a module for displaying information associated with a decoding of the selected portion, the decoding generated by a speech recognition engine. In other embodiments, the tester module can include an editor for creating and/or modifying a grammar, a module for receiving a selected portion of a digital audio file and its corresponding transcript, and a scoring module for producing scoring statistics of the decoding based at least in part on the transcript.
摘要:
Systems and methods for determining a confidence score associated with a decoding output of a speech recognition engine. In one embodiment, a method of determining the confidence score comprises arranging time frame and acoustic score data into an array, determining a phoneme sequence in the array that yields the highest sum of acoustic scores under certain constraints, e.g., minimum number of time frames and order of phonemes in a phoneme string. A relative score is derived by applying a functional relationship between the acoustic score and different sums comprising acoustic scores from the array. The confidence score, in some embodiments, depends at least in part on the relative score and a measure of ambiguity associated with similar sounding phrases being included in different concepts of a specified grammar.
摘要:
Systems and methods are provided for embodiments of a speech recognition system call flow object model. The systems and methods organize and execute, for example, multiple question directed dialogs, overview dialogs, or natural language directed dialogs. In certain embodiments, the organizing and executing of the natural language directed dialogs uses primary and secondary concepts without requiring a structured response. The systems and methods enable the call flow designers to define particular call flows without requiring the designers to perform any programming or coding.
摘要:
Systems and methods are provided for embodiments of a speech recognition system call flow object model. The systems and methods organize and execute, for example, multiple question directed dialogs, overview dialogs, or natural language directed dialogs. In certain embodiments, the organizing and executing of the natural language directed dialogs uses primary and secondary concepts without requiring a structured response. The systems and methods enable the call flow designers to define particular call flows without requiring the designers to perform any programming or coding.