Abstract:
An infotainment and connectivity system for a vehicle includes a control module, a first plurality of input devices, and a plurality of output devices. The control module includes a first control logic sequence, a plurality of software based programs, and a memory module. The control logic operates to control operation of the infotainment and connectivity system. The first plurality of input devices is disposed on an interior of the vehicle and includes a camera, a microphone, and an image sensor. The plurality of output devices includes at least a communication synchronization system.
Abstract:
A voice control device includes a microphone module, a voice encoding module, a display and a processing unit. The voice encoding module is electrically connected to the microphone module. The processing unit is electrically connected to the voice encoding module and the display. The microphone module receives a voice signal and transmits the received voice signal to the voice encoding module. One of the voice encoding module and the processing unit analyzes and processes the voice signal to determine a sound source direction of the voice signal and obtains response information according to the voice signal. The processing unit controls the display to rotate to the sound source direction and transmits the response information to the display for displaying the response information.
Abstract:
An apparatus including circuitry configured to determine a position of a mouth of a user that is distinguishable among a plurality of people, and control an acquisition condition for collecting a sound based on the determined position of the user's mouth.
Abstract:
A method of producing output indicative of the content of speech or mouthed speech from movement of speech articulators is described. The method may include: fixing a plurality of magnets respectively to a plurality of speech articulators of a human individual; providing a support; providing a plurality of signal magnetic field sensors; providing at least three reference magnetic field sensors orientated differently from one another with respect to the Earth's magnetic field, the signal and reference magnetic field sensors being fixed to the support, which holds the sensors in fixed spatial relationships to one another; producing, over a period of time, a respective signal from each signal magnetic field sensor and a respective signal from each reference magnetic field sensor; and obtaining, over the period of time, for each said signal magnetic field sensor signal, a respective correction value.
Abstract:
A speech-to-text input method, comprising: receiving a speech input from a user; converting the speech input into text through speech recognition; displaying the recognized text to the user; determining a gaze position of the user on a display by way of tracking the eye movement of the user; displaying an edit cursor at said gaze position when said gaze position is located at the displayed text; receiving a speech edit command from the user; recognizing the speech edit command through speech recognition; and editing said text at said edit cursor according to the recognized speech edit command.
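The gaze-directed editing steps in this abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not specified by the abstract: a single line of fixed-width text, the helper names `cursor_index_from_gaze` and `apply_edit`, and a toy command grammar (`insert <text>` and `delete <n>`) standing in for the output of speech recognition.

```python
def cursor_index_from_gaze(gaze_x, text_left, char_width, text_length):
    """Map a horizontal gaze position to a cursor index in the text.

    Returns None when the gaze is not located on the displayed text,
    mirroring the condition that the cursor appears only on the text.
    Assumes one line of fixed-width characters (hypothetical layout).
    """
    text_right = text_left + char_width * text_length
    if gaze_x < text_left or gaze_x > text_right:
        return None
    # Snap to the nearest boundary between characters.
    return int((gaze_x - text_left) / char_width + 0.5)


def apply_edit(text, cursor, command):
    """Apply an already-recognized edit command at the cursor.

    The grammar is hypothetical: "insert <text>" or "delete <n>".
    """
    verb, _, arg = command.partition(" ")
    if verb == "insert":
        return text[:cursor] + arg + text[cursor:]
    if verb == "delete":
        count = int(arg)
        return text[:cursor] + text[cursor + count:]
    raise ValueError(f"unsupported edit command: {command!r}")
```

For example, with 10-pixel-wide characters, a gaze at x = 52 on the text "hello world" would place the cursor after "hello", and the command "insert ," would then produce "hello, world".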
Abstract:
Systems and methods for contactless speech recognition using lip-reading are provided. In various aspects, a speech recognition unit 112 is configured to receive, via a receiver 108, a Doppler-broadened reflected electromagnetic signal that has been modulated and reflected by the lip and facial movements of a speaking subject 104 and to output recognized speech based on an analysis of the received reflected signal. In one embodiment, the functionality of speech recognition unit 112 is implemented via a preprocessing unit 202, a Neural Network ("NNet") unit 204, and a Hidden Markov Model ("HMM") unit 206.
Abstract:
In general, the subject matter described in this specification can be embodied in methods, systems, and program products for providing search results automatically to a user of a computing device. A spoken input provided by a user to a computing device is received. The spoken input is transmitted to a computer server system that is remote from the computing device. Search result information that is responsive to the spoken input is received by the computing device in response to the transmitted spoken input. An alert is provided to the user that the device will connect the user to a target of the search result information if the user does not intervene to stop the connecting of the user. The user is connected to the target of the search result information based on a determination that the user has not intervened to stop the connecting of the user.
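The final decision step, connecting only if the user has not intervened after the alert, can be sketched in Python as follows. The abstract does not say how intervention is detected or how long the user has to respond; the fixed grace period, the timestamp representation, and the function name `should_connect` are all assumptions made for illustration.

```python
def should_connect(intervention_times, alert_time, grace_period):
    """Decide whether to connect the user to the search result target.

    intervention_times: timestamps (seconds) of user interventions,
        a hypothetical representation of "the user intervened".
    alert_time: timestamp at which the alert was provided.
    grace_period: assumed number of seconds the user has to intervene.

    Returns True only if no intervention occurred within the window
    that starts at the alert and lasts for the grace period.
    """
    window_end = alert_time + grace_period
    return not any(alert_time <= t <= window_end for t in intervention_times)
```

With an alert at t = 10 s and a 3 s grace period, an intervention at t = 11 s would cancel the connection, while one at t = 20 s would arrive too late to affect the decision.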
Abstract:
A pronunciation diagnosis device includes: articulation attribute data having articulation attribute values corresponding to preferable pronunciation concerning one of the tongue state, the lip state, the glottis state, the palatine uvula state, the nasopalatine groove state, the teeth state, and the jaw state or a combination including at least one of the articulation organ states, enforcing of the articulation organ state and a combination of the exhalation states for each of the phonemes constituting each voice language system; means for extracting, from a voice signal pronounced by a speaker, acoustic characteristics such as a frequency characteristic amount, a voice volume, a duration time, their change amount, a change pattern, or a combination of them; attribute value estimation means for estimating an attribute value concerning the articulation attribute according to the extracted acoustic characteristics; and judgment means for comparing the estimated attribute value to the preferable articulation attribute data so as to make a judgment concerning the pronunciation of the speaker.
Abstract:
A face portion detection device, a behavior content classification device, a speech content classification device, a car navigation system, a face direction classification device, a face portion detection device control program, a behavior content classification device control program, a face portion detection device control method, and a behavior content classification device control method are provided for appropriately classifying a behavior content of the object person from a captured image including the face of the object person. A speech section detection device 1 includes an image capturing unit 10, a data storage unit 11, an image processing unit 12, a lip region detection unit 13, a feature extraction unit 14, and a speech section detection unit 15. The lip region detection unit 13 uses a dedicated SVM to detect a lip region from a captured image, and the speech section detection unit 15 uses features of an image of a detected lip region and a dedicated HMM to detect a speech section.
Abstract:
A data input system having a keypad defining a plurality of keys, each key containing at least one symbol of a group of symbols. The group of symbols is divided into subgroups each having at least one of alphabetical symbols, numeric symbols, and command symbols, where each subgroup is associated with at least a portion of a user's finger. A finger recognition system is in communication with at least one key, where the key has at least a first symbol from a first subgroup and at least a second symbol from a second subgroup. The finger recognition system is configured to recognize the portion of the user's finger when the finger interacts with the key so as to select the symbol on the key corresponding to the subgroup associated with the portion of the user's finger.
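The selection rule in this abstract, where the recognized finger portion picks the subgroup and the subgroup picks the symbol on the pressed key, can be illustrated with a minimal Python sketch. The finger-portion names (`tip`, `pad`), the two-key layout, and the function `select_symbol` are hypothetical and introduced only for illustration; the abstract does not specify them.

```python
# Hypothetical association of finger portions with symbol subgroups.
PORTION_TO_SUBGROUP = {
    "tip": "alphabetical",
    "pad": "numeric",
}

# Hypothetical keypad: each key carries one symbol from each subgroup.
KEYPAD = {
    "key_1": {"alphabetical": "a", "numeric": "1"},
    "key_2": {"alphabetical": "b", "numeric": "2"},
}


def select_symbol(key, finger_portion):
    """Return the symbol on `key` whose subgroup matches the finger portion.

    Raises KeyError if the key or the finger portion is unknown.
    """
    subgroup = PORTION_TO_SUBGROUP[finger_portion]
    return KEYPAD[key][subgroup]
```

Pressing `key_1` with the fingertip would select "a" under this layout, while pressing the same key with the finger pad would select "1".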