Abstract:
A speech recognition method, an apparatus, an electronic device, and a computer-readable storage medium are provided. The method includes acquiring a first speech recognition result of a speech; acquiring context information and pronunciation feature information about a target text unit in the first speech recognition result; and acquiring a second speech recognition result of the speech based on the context information and the pronunciation feature information.
Abstract:
A method for providing multimodal translation of content in a source language is provided. The method includes: receiving a user input with respect to a translation request for text included in the content; in response to receiving the user input, acquiring a multimodal input from the content, the multimodal input including location information related to the content and other multimodal inputs; generating scene information representing the multimodal input related to the content by using a fusion layer based on the location information and the other multimodal inputs; identifying a candidate word set in a target language; determining at least one candidate word from the candidate word set based on the scene information; and translating the text included in the content into the target language using a translation model based on the determined at least one candidate word.
Abstract:
A system and a method are disclosed for reducing local oscillator leakage. In some embodiments, a system includes a circuit, including: a mixer; and a bias control circuit, the mixer having a first local oscillator input, the bias control circuit being configured to control a bias at the first local oscillator input.
Abstract:
A data processing method, a device wake-up method, an electronic device, and a storage medium are provided. In the data processing method, speech to be processed is converted into a keyword phone sequence, and a similar pronunciation sequence generator acquires a similar phone sequence corresponding to the keyword phone sequence in a sequence generation manner, thereby acquiring a first data processed result corresponding to the speech to be processed. By replacing the search method of large-scale speech databases with this generation manner, effective coverage of possible real-life sounds can be achieved with a smaller model, thus improving the ability to distinguish confusing pronunciations. The above data processing method performed by the electronic device can be performed by an artificial intelligence (AI) model.
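The idea of producing similar-sounding phone sequences from a keyword phone sequence can be illustrated with a small sketch. Note that this is a rule-based substitution stand-in, not the generator model described in the abstract: the `CONFUSABLE` table, the phone symbols, and the function name `similar_sequences` are all hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical confusion table: phones that are easily mistaken for one another.
CONFUSABLE = {"B": ["P"], "IY": ["IH"], "S": ["Z"]}


def similar_sequences(phones, limit=10):
    """Return up to `limit` variants of `phones` built by swapping confusable phones."""
    # For each position, allow the original phone or any of its confusable phones.
    options = [[p] + CONFUSABLE.get(p, []) for p in phones]
    variants = []
    for combo in product(*options):
        if len(variants) >= limit:
            break
        candidate = list(combo)
        if candidate != list(phones):  # skip the unchanged keyword itself
            variants.append(candidate)
    return variants


if __name__ == "__main__":
    for seq in similar_sequences(["B", "IY"]):
        print(" ".join(seq))
```

A learned generator would replace the fixed table with a model that proposes confusable sequences directly, which is what lets it cover pronunciations that a hand-written table would miss.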
Abstract:
An apparatus for detecting a body part from a user image may include an image acquirer to acquire a depth image, an extractor to extract the user image from a foreground of the acquired depth image, and a body part detector to detect the body part from the user image, using a classifier trained based on at least one of a single-user image sample and a multi-user image sample. The single-user image may be an image representing non-overlapping users, and the multi-user image may be an image representing overlapping users.
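The foreground extraction step can be sketched as a simple depth threshold. This is a minimal illustration under stated assumptions, not the apparatus's actual extractor: the depth image is assumed to be a list of rows of millimetre values, a value of 0 is assumed to mean missing depth, and `extract_foreground` and `max_depth_mm` are hypothetical names. The trained classifier that follows this step is not shown.

```python
def extract_foreground(depth, max_depth_mm):
    """Keep pixels closer than max_depth_mm; everything else becomes 0 (background)."""
    return [
        [d if 0 < d < max_depth_mm else 0 for d in row]
        for row in depth
    ]


if __name__ == "__main__":
    depth_image = [
        [0, 800, 2500],
        [1200, 3000, 900],
    ]
    # Pixels at 2000 mm or farther are treated as background in this sketch.
    print(extract_foreground(depth_image, 2000))
```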
Abstract:
Provided are a device and a method for estimating a head pose, which may obtain an excellent head pose recognition result that is unaffected by illumination changes. The device includes a head area extracting unit to extract a head area from an input depth image, a head pitch angle estimating unit to estimate a pitch angle of a head in the head area, a head yaw angle estimating unit to estimate a yaw angle of the head in the head area, and a head pose displaying unit to display a head pose based on the estimated pitch angle of the head and the estimated yaw angle of the head.
Abstract:
A method performed by an electronic device comprises acquiring information to be translated. The method includes determining, based on the information to be translated, a target domain adapter from a plurality of candidate domain adapters, the target domain adapter corresponding to the information to be translated, each candidate domain adapter from the plurality of candidate domain adapters corresponding to at least one domain. The method includes obtaining, based on the target domain adapter corresponding to the information to be translated, a translation result corresponding to the information to be translated.
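The selection of a target domain adapter from several candidates can be sketched as a routing step. This is an illustrative sketch only: the keyword overlap score stands in for whatever domain determination the method actually uses, and `DOMAIN_KEYWORDS`, `select_domain`, `translate`, and the toy adapters are hypothetical names that do not come from the abstract.

```python
from typing import Callable, Dict

# An adapter is modelled here as a function from source text to a translation result.
Adapter = Callable[[str], str]

# Hypothetical keyword sets standing in for a learned domain classifier.
DOMAIN_KEYWORDS = {
    "medical": {"dose", "patient", "symptom"},
    "legal": {"contract", "plaintiff", "clause"},
}


def select_domain(text: str, default: str = "general") -> str:
    """Pick the domain whose keywords overlap the input the most."""
    tokens = set(text.lower().split())
    best, best_hits = default, 0
    for domain, keywords in DOMAIN_KEYWORDS.items():
        hits = len(tokens & keywords)
        if hits > best_hits:
            best, best_hits = domain, hits
    return best


def translate(text: str, adapters: Dict[str, Adapter]) -> str:
    """Route the text to the adapter of its domain, falling back to 'general'."""
    adapter = adapters.get(select_domain(text), adapters["general"])
    return adapter(text)


if __name__ == "__main__":
    # Toy adapters that only tag the text with the domain that handled it.
    toy_adapters = {
        name: (lambda t, n=name: f"[{n}] {t}")
        for name in ("general", "medical", "legal")
    }
    print(translate("The patient missed a dose", toy_adapters))
```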
Abstract:
An electronic device and method are provided. The electronic device includes a directional coupler, a sense pair connected to the directional coupler, and an analog-to-digital converter (ADC) connected to the sense pair. The ADC directly digitizes a signal current received from the sense pair.