Abstract:
Systems and methods are provided for training an Automatic Speech Recognition (ASR) model during runtime of a transcription system. The system includes a background processor configured to operate with the transcription system to display a speech-to-text sample of an audio segment of a cockpit communication, converted using an ASR model, together with an identifier. During runtime of the transcription system and display of the speech-to-text sample, the background processor receives a response from a user who has reviewed the displayed content and determined whether the conversion of the speech-to-text sample by the ASR model is correct; the background processor changes the identifier to either a positive or a negative attribute accordingly and trains the ASR model based on information associated with the content of the speech-to-text sample in accordance with the user's response.
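
A minimal Python sketch of the runtime feedback loop summarized above follows; the class names, the attribute values, and the fine_tune() hook are hypothetical placeholders, not the patented implementation.

    from dataclasses import dataclass, field

    @dataclass
    class Sample:
        audio_id: str
        transcript: str                 # speech-to-text produced by the ASR model
        attribute: str = "unreviewed"   # changed to "positive" or "negative" on review

    @dataclass
    class BackgroundProcessor:
        training_queue: list = field(default_factory=list)

        def display(self, sample: Sample) -> None:
            # In the described system this renders on the transcription display;
            # printing stands in for that here.
            print(f"[{sample.audio_id}] {sample.transcript} ({sample.attribute})")

        def record_response(self, sample: Sample, user_says_correct: bool) -> None:
            # The user's runtime review flips the identifier to a positive or
            # negative attribute and queues the reviewed sample for training.
            sample.attribute = "positive" if user_says_correct else "negative"
            self.training_queue.append(sample)

        def train(self, asr_model) -> None:
            # Hypothetical fine-tune hook: the ASR model is updated from the
            # samples reviewed during runtime, then the queue is cleared.
            asr_model.fine_tune(self.training_queue)
            self.training_queue.clear()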
Abstract:
A method for controlling an interactive display is provided. The method receives a set of voice input data, via a voice input device communicatively coupled to the interactive display; interprets, by at least one processor, the set of voice input data to produce an interpreted result, wherein the at least one processor is communicatively coupled to the voice input device and the interactive display; presents, by the interactive display, a text representation of the interpreted result coupled to a user-controlled cursor; receives, by a user interface, a user input selection of a textual or graphical element presented by the interactive display, wherein the user interface is communicatively coupled to the at least one processor and the interactive display; and performs, by the at least one processor, an operation associated with the interpreted result and the user input selection.
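
The interaction flow above can be pictured with the short Python sketch below; interpret_voice(), Display, and the OPERATIONS table are illustrative assumptions rather than the claimed processor, display, or user interface.

    def interpret_voice(voice_samples: bytes) -> str:
        # Placeholder interpretation step; a real system would run speech
        # recognition on the voice input data here.
        return "SET HEADING"

    class Display:
        def attach_to_cursor(self, text: str) -> None:
            # The text representation of the interpreted result follows the
            # user-controlled cursor.
            self.cursor_label = text

        def element_at_cursor(self) -> str:
            # The user's selection of a textual or graphical element.
            return "HDG_FIELD"

    # Operation performed from the combination of interpreted result and selection.
    OPERATIONS = {("SET HEADING", "HDG_FIELD"): lambda: print("heading entry armed")}

    def handle_voice_command(voice_samples: bytes, display: Display) -> None:
        interpreted = interpret_voice(voice_samples)     # interpret the voice input
        display.attach_to_cursor(interpreted)            # present text at the cursor
        selection = display.element_at_cursor()          # user selects an element
        OPERATIONS.get((interpreted, selection), lambda: None)()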
Abstract:
A system and method are provided for recognizing speech on board an aircraft while compensating for different regional dialects over an area comprising at least first and second distinct geographical regions. The method comprises analyzing speech in the first distinct geographical region using speech data characteristics representative of speech in that region, detecting a change in position from the first distinct geographical region to the second distinct geographical region, and, upon detecting that the aircraft has transitioned from the first to the second distinct geographical region, analyzing speech in the second distinct geographical region using speech data characteristics representative of speech in that region.
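
As a rough illustration, the Python sketch below swaps the speech data characteristics when a region transition is detected; the region lookup and the two profiles are invented placeholders.

    REGION_MODELS = {
        "region_1": "speech characteristics for region 1",
        "region_2": "speech characteristics for region 2",
    }

    def current_region(latitude: float, longitude: float) -> str:
        # Placeholder geofence: a real system would test the aircraft position
        # against stored boundaries of the distinct geographical regions.
        return "region_1" if longitude < 0 else "region_2"

    class DialectAwareRecognizer:
        def __init__(self) -> None:
            self.active_region = None
            self.active_profile = None

        def update_position(self, latitude: float, longitude: float) -> None:
            region = current_region(latitude, longitude)
            if region != self.active_region:              # transition detected
                self.active_region = region
                self.active_profile = REGION_MODELS[region]

        def analyze(self, audio: bytes) -> str:
            # Speech is analyzed with the characteristics of the active region.
            return f"decoded using {self.active_profile}"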
Abstract:
Methods and systems are provided for assisting operation of a vehicle using speech recognition. One method involves analyzing a transcription of an audio communication with respect to the vehicle to characterize a nonstandard pattern within the transcription, obtaining a ground truth for the transcription, determining one or more performance metrics associated with the nonstandard pattern based on a relationship between the transcription and the ground truth, updating a speech recognition vocabulary for the vehicle to include the nonstandard pattern based at least in part on the one or more performance metrics, and determining an updated speech recognition model for the vehicle using the updated speech recognition vocabulary and the audio communication.
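
A compact Python sketch of the vocabulary-update decision is shown below; the token-mismatch count standing in for the performance metrics and the max_errors threshold are assumptions made for illustration.

    def word_errors(transcript: list, ground_truth: list) -> int:
        # Crude token-level mismatch count used here as a stand-in for the
        # performance metrics relating the transcription to its ground truth.
        mismatches = sum(1 for t, g in zip(transcript, ground_truth) if t != g)
        return mismatches + abs(len(transcript) - len(ground_truth))

    def maybe_add_pattern(pattern: str, transcript: list, ground_truth: list,
                          vocabulary: set, max_errors: int = 1) -> set:
        # Admit the nonstandard pattern into the speech recognition vocabulary
        # only when the transcription tracks the ground truth closely enough;
        # the updated vocabulary would then feed the updated recognition model.
        if word_errors(transcript, ground_truth) <= max_errors:
            vocabulary.add(pattern)
        return vocabulary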
Abstract:
A system for extracting speaker information in an ATC transcription and displaying the speaker information on a graphical display unit is provided. The system is configured to: segment a stream of audio received from an ATC and other aircraft into a plurality of chunks; determine, for each chunk, whether the speaker is enrolled in an enrolled speaker database; when the speaker is enrolled, decode the chunk using a speaker-dependent automatic speech recognition (ASR) model and tag the chunk with a permanent name for the speaker; when the speaker is not enrolled, assign a temporary name for the speaker, tag the chunk with the temporary name, and decode the chunk using a speaker-independent ASR model; format the decoded chunk as text; and signal the graphical display unit to display the formatted text along with an identity for the speaker.
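
The per-chunk branching can be sketched in a few lines of Python; the enrollment lookup and both decode helpers below are stubbed placeholders rather than the system's actual ASR models.

    import itertools
    from typing import Optional

    _temp_ids = itertools.count(1)

    def identify_speaker(chunk: bytes, enrolled: dict) -> Optional[str]:
        # Placeholder: a real system matches voice characteristics against the
        # enrolled speaker database; here the first enrolled key is returned.
        return next(iter(enrolled), None)

    def decode_speaker_dependent(chunk: bytes, speaker: str) -> str:
        return f"<text from {speaker}'s speaker-dependent model>"

    def decode_speaker_independent(chunk: bytes) -> str:
        return "<text from speaker-independent model>"

    def process_chunk(chunk: bytes, enrolled: dict) -> str:
        speaker = identify_speaker(chunk, enrolled)
        if speaker is not None:
            name = enrolled[speaker]["permanent_name"]        # enrolled speaker
            text = decode_speaker_dependent(chunk, speaker)   # speaker-dependent ASR
        else:
            name = f"Speaker {next(_temp_ids)}"               # temporary name
            text = decode_speaker_independent(chunk)          # speaker-independent ASR
        return f"{name}: {text}"    # formatted text plus identity for display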
Abstract:
Systems and methods are provided for a transcription system with voice activity detection (VAD). The system includes a VAD module to receive incoming audio and generate an audio segment, and a speech decoder with a split predictor to perform, in a first pass, a decode operation that transcribes text from the audio segment into a message. If, based on a content-based analysis performed by the split predictor, the message is determined not to contain a split point, the speech decoder forwards the message for display. If the message is determined to contain a split point, the split predictor configures the split point within the audio domain of the audio segment, and the speech decoder performs, in a second pass, a re-decode operation that transcribes text from the audio segment based on the split point and forwards the resulting message for display.
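
A minimal Python sketch of the two-pass flow follows; decode() and predict_split() are invented stand-ins for the speech decoder and the split predictor's content-based analysis.

    from typing import Optional

    def decode(audio: bytes) -> str:
        # Stand-in for the speech decoder transcribing an audio segment.
        return "<transcribed text>"

    def predict_split(message: str, audio: bytes) -> Optional[int]:
        # Content-based analysis of the first-pass message; returns an offset in
        # the audio domain where the segment should be split, or None.
        return None if "break break" not in message.lower() else len(audio) // 2

    def transcribe(audio: bytes) -> list:
        message = decode(audio)                       # first pass
        split = predict_split(message, audio)
        if split is None:
            return [message]                          # no split point: forward message
        # Second pass: re-decode the segment on each side of the split point.
        return [decode(audio[:split]), decode(audio[split:])]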
Abstract:
Methods and apparatus are provided for prioritizing surrounding air traffic for display onboard an aircraft. The apparatus includes a traffic data source configured to supply surrounding traffic data, and a traffic control module coupled to receive user selection data from a user input device and the surrounding traffic data. The traffic control module can be configured to determine a prioritization zone for prioritizing the surrounding air traffic to identify air traffic preceding the aircraft, based on the user selection data and the range and vertical speed of the surrounding air traffic, and to set first traffic data that includes the surrounding air traffic within the prioritization zone listed by priority and second traffic data that includes the surrounding air traffic outside the prioritization zone listed in received sequence. The apparatus displays a graphical user interface that includes the first traffic data and the second traffic data.
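
The two-list construction can be illustrated with the Python sketch below; the Traffic fields and the in_zone() test are assumptions chosen for the example, not the claimed prioritization logic.

    from dataclasses import dataclass

    @dataclass
    class Traffic:
        callsign: str
        range_nm: float          # range from ownship, nautical miles
        vertical_speed: float    # feet per minute
        received_seq: int        # order in which the traffic report arrived

    def in_zone(t: Traffic, max_range_nm: float) -> bool:
        # Placeholder prioritization-zone test using range and vertical speed.
        return t.range_nm <= max_range_nm and t.vertical_speed > -500

    def build_traffic_lists(traffic: list, max_range_nm: float = 10.0):
        inside = [t for t in traffic if in_zone(t, max_range_nm)]
        outside = [t for t in traffic if not in_zone(t, max_range_nm)]
        first = sorted(inside, key=lambda t: t.range_nm)         # listed by priority
        second = sorted(outside, key=lambda t: t.received_seq)   # received sequence
        return first, second     # both lists feed the graphical user interface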
Abstract:
A method for displaying received radio voice messages onboard an aircraft is provided. The method post-processes, by at least one processor onboard the aircraft, a set of speech recognition (SR) hypothesis data to increase the accuracy of an associated SR system by: obtaining, by the at least one processor, secondary source data from a plurality of secondary sources; comparing, by the at least one processor, the set of SR hypothesis data to the secondary source data; and identifying, by the at least one processor, an aircraft tail number using the set of SR hypothesis data and the secondary source data. The method then identifies, by the at least one processor, a subset of the received radio voice messages that include the tail number, and presents, via a display device onboard the aircraft, the subset using distinguishing visual characteristics.
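
As a rough Python sketch of the post-processing step, the comparison below keeps the first tail number from the secondary sources that appears in an SR hypothesis; the substring match and the highlighting are illustrative assumptions.

    from typing import Optional

    def identify_tail_number(hypotheses: list, secondary_tails: set) -> Optional[str]:
        # Compare the SR hypothesis data against tail numbers obtained from the
        # secondary sources and keep the first agreement found.
        for hypothesis in hypotheses:
            for tail in secondary_tails:
                if tail in hypothesis:
                    return tail
        return None

    def highlight_messages(messages: list, tail: str) -> list:
        # Subset of received radio voice messages containing the tail number,
        # marked here in place of a distinguishing visual characteristic.
        return [f"** {m} **" for m in messages if tail in m]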