Abstract:
An echo cancellation system performs audio beamforming to separate audio input into multiple directions and determines a target signal and a reference signal from the multiple directions. For example, the system may detect a strong signal associated with a speaker and select the strong signal as a reference signal, selecting another direction as a target signal. The system may determine a speech position and may select the speech position as a target signal and an opposite direction as a reference signal. The system may create pairwise combinations of opposite directions, with an individual direction being selected as a target signal and a reference signal. The system may select a fixed beamformer output for the target signal and an adaptive beamformer output for the reference signal, or vice versa. The system may remove the reference signal (e.g., audio output by the loudspeaker) to isolate speech included in the target signal.
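The two selection strategies described above (strongest beam as reference, and pairwise combinations of opposite directions) can be illustrated with a minimal sketch. It assumes an even number of evenly spaced beams and uses signal energy as the measure of a "strong" signal; the function names `opposite_pairs` and `pick_reference_by_energy` are hypothetical and do not come from the abstract.

```python
def opposite_pairs(num_beams):
    """Return (target, reference) beam-index pairs in which the reference
    points in the direction opposite to the target.

    Assumption: beams are evenly spaced around a circle, so the beam
    opposite to index t is (t + num_beams / 2) modulo num_beams.
    """
    if num_beams <= 0 or num_beams % 2 != 0:
        raise ValueError("this sketch assumes a positive, even number of beams")
    half = num_beams // 2
    return [(t, (t + half) % num_beams) for t in range(num_beams)]


def pick_reference_by_energy(beams):
    """Select the highest-energy beam as the reference (e.g. loudspeaker)
    and treat every other beam as a candidate target.

    `beams` is a list of sample lists, one per direction.
    """
    energies = [sum(s * s for s in beam) for beam in beams]
    ref = max(range(len(beams)), key=energies.__getitem__)
    targets = [i for i in range(len(beams)) if i != ref]
    return ref, targets


if __name__ == "__main__":
    print(opposite_pairs(6))
    print(pick_reference_by_energy([[0.1, 0.1], [1.0, -1.0], [0.2, 0.0]]))
```

In the pairwise scheme every direction appears once as a target and once as a reference, which matches the statement that an individual direction is selected as both.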
Abstract:
There is provided a control device, control method, and computer program capable of executing control such that information can be displayed more appropriately and efficiently according to an environment in which information is displayed or a situation of displayed information, the control device including: an acquisition unit configured to acquire state information obtained by detecting a state of a projection surface; and a generation unit configured to generate control information for controlling an illumination unit that outputs illumination without involvement in display of image information based on the state information.
Abstract:
There is provided a display control device, display control method, and program capable of executing control such that information can be displayed more appropriately and efficiently according to an environment in which information is displayed or a situation of displayed information, the display control device including: a display control unit configured to control a display indicating a range of directivity formed by beamforming executed by a sound input unit or a sound output unit.
Abstract:
A method of estimating a steering vector of a sensor array of M sensors according to one embodiment of the present disclosure includes estimating a steering vector of a noise source located at an angle θ degrees from a look direction of the array using a least squares estimate of the gains of the sensors in the array, defining a steering vector of a desired sound source in the look direction of the array, and estimating the steering vector by performing element-by-element multiplication of the estimated noise steering vector and the complex conjugate of the steering vector of the desired sound source. The sensors may be microphones.
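The final step, the element-by-element multiplication with a complex conjugate, can be sketched as follows. This is only an illustration: the least squares estimation of the sensor gains is not shown, and the noise steering vector is replaced here by an ideal uniform-linear-array model, which is an assumption. The names `ula_steering_vector` and `combine_steering_vectors`, the sensor spacing, and the speed of sound are likewise hypothetical.

```python
import cmath
import math


def ula_steering_vector(num_sensors, spacing_m, freq_hz, angle_deg, speed=343.0):
    """Ideal far-field steering vector of a uniform linear array.

    Assumption: stands in for the least-squares estimate in the abstract.
    angle_deg is measured from the array axis (90 degrees is broadside).
    """
    delay = spacing_m * math.cos(math.radians(angle_deg)) / speed
    return [cmath.exp(-2j * math.pi * freq_hz * m * delay)
            for m in range(num_sensors)]


def combine_steering_vectors(noise_vec, desired_vec):
    """Element-by-element product of the noise steering vector and the
    complex conjugate of the desired-source steering vector."""
    if len(noise_vec) != len(desired_vec):
        raise ValueError("both vectors need one entry per sensor")
    return [n * d.conjugate() for n, d in zip(noise_vec, desired_vec)]


if __name__ == "__main__":
    noise = ula_steering_vector(4, 0.05, 1000.0, 60.0)
    desired = ula_steering_vector(4, 0.05, 1000.0, 90.0)
    for value in combine_steering_vectors(noise, desired):
        print(value)
```

Multiplying by the conjugate removes the phase of the look direction from each sensor's entry, so the result expresses the noise direction relative to the look direction.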
Abstract:
A portable eye tracker device is disclosed which includes a frame, at least one optics holding member, a movement sensor, and a control unit. The frame may be a frame adapted for wearing by a user. The at least one optics holding member may include at least one illuminator configured to selectively illuminate at least a portion of at least one eye of the user, and at least one image sensor configured to capture image data representing images of at least a portion of at least one eye of the user. The movement sensor may be configured to detect movement of the frame. The control unit may be configured to control the at least one illuminator for the selective illumination of at least a portion of at least one eye of the user, receive the image data from the image sensors, and receive information from the movement sensor.
Abstract:
A system and method relate to receiving, by a processing device, a plurality of sound signals captured at a plurality of microphone sensors, wherein the plurality of sound signals are from a sound source, and wherein a number (M) of the plurality of microphone sensors is greater than three, determining a number (K) of layers for a multistage minimum variance distortionless response (MVDR) beamformer based on the number (M) of the plurality of microphone sensors, wherein the number (K) of layers is greater than one, and wherein each layer of the multistage MVDR beamformer comprises one or more mini-length MVDR beamformers, and applying the multistage MVDR beamformer to the plurality of sound signals to calculate an estimate of the sound source.
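For orientation, a single two-channel MVDR beamformer can be sketched with the standard weight formula w = R⁻¹d / (dᴴR⁻¹d). Treating a two-channel beamformer as one "mini-length" building block is an assumption made for illustration; the abstract does not state the length of the mini-length beamformers, how K is derived from M, or how the layers are connected. The names `mvdr_weights_2ch` and `beamform` are hypothetical.

```python
def mvdr_weights_2ch(R, d):
    """MVDR weights for two channels: w = R^-1 d / (d^H R^-1 d).

    R is a 2x2 noise covariance matrix (nested lists), d the steering
    vector of the desired source. The 2x2 inverse is written out by hand.
    """
    (a, b), (c, e) = R
    det = a * e - b * c
    if abs(det) < 1e-12:
        raise ValueError("covariance matrix is singular or nearly singular")
    # u = R^-1 d
    u0 = (e * d[0] - b * d[1]) / det
    u1 = (-c * d[0] + a * d[1]) / det
    # denom = d^H u
    denom = d[0].conjugate() * u0 + d[1].conjugate() * u1
    return [u0 / denom, u1 / denom]


def beamform(w, x):
    """Beamformer output y = w^H x for one frequency-domain snapshot x."""
    return sum(wi.conjugate() * xi for wi, xi in zip(w, x))


if __name__ == "__main__":
    R = [[2.0, 0.5], [0.5, 1.0]]
    d = [1.0, 1j]
    w = mvdr_weights_2ch(R, d)
    # The distortionless constraint means w^H d equals 1.
    print(beamform(w, d))
```

The distortionless property is what would let such blocks be cascaded without scaling the desired signal, under the assumptions stated above.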
Abstract:
An object position estimating apparatus which estimates positions of M objects in real space (M being an integer not less than 2), including: a characteristic vector generating unit operable to generate, for each of M objects, a characteristic vector, the characteristic vector including as its components measurements of the object measured on N scales (N being an integer not less than 3), each of N scales measuring closeness to each of N reference points in the real space; a dissimilarity matrix deriving unit operable to calculate a norm between the characteristic vectors of two objects for every pair from among M objects and to derive a dissimilarity matrix with M rows and M columns, the dissimilarity matrix including as its elements the calculated norms; and an estimation unit operable to estimate positions of M objects in the real space based on the dissimilarity matrix and to output an estimation result.
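The dissimilarity matrix step can be sketched as below. The abstract only says "a norm", so the Euclidean norm used here is an assumption, and the function name `dissimilarity_matrix` is hypothetical. The final position estimation from the matrix is not shown because the abstract does not name the method.

```python
import math


def dissimilarity_matrix(vectors):
    """Build the M x M matrix of pairwise norms between characteristic vectors.

    `vectors` holds one characteristic vector per object, each with N
    components (closeness to each of N reference points).
    Assumption: the norm is Euclidean.
    """
    m = len(vectors)
    if m < 2:
        raise ValueError("the abstract requires at least 2 objects")
    n = len(vectors[0])
    if n < 3 or any(len(v) != n for v in vectors):
        raise ValueError("each vector needs the same N >= 3 components")
    return [[math.dist(vectors[i], vectors[j]) for j in range(m)]
            for i in range(m)]


if __name__ == "__main__":
    features = [[0.0, 0.0, 0.0], [3.0, 4.0, 0.0], [0.0, 0.0, 2.0]]
    for row in dissimilarity_matrix(features):
        print(row)
```

The resulting matrix is symmetric with a zero diagonal, which is the form that distance-based embedding methods typically expect as input.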
Abstract:
A sound processing apparatus (400) is provided with: a directivity synthesis processing unit (410) for generating a first directivity sound pick-up signal by synthesizing a first sound pick-up signal and a relatively delayed second sound pick-up signal and a second directivity sound pick-up signal by synthesizing a relatively delayed first sound pick-up signal and a second sound pick-up signal; a comparison signal calculation unit (440) for generating a non-directivity level signal indicating the level of a sum of the directivity sound pick-up signals and a directivity level signal by adding the levels of the directivity sound pick-up signals; a level comparison unit (451) for acquiring the difference between the levels of the non-directivity level signal and the directivity level signal; and a delay control unit (452) for adjusting the delay amount such that the difference between the levels becomes smaller.
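The signal flow of this apparatus can be sketched roughly as follows. Several details are assumptions, not taken from the abstract: the synthesis is done by subtraction, the delay is a whole number of samples, the level is a mean absolute value, and the delay control is replaced by a simple search over candidate delays. All function names are hypothetical.

```python
def delay(signal, n):
    """Delay a signal by n samples, padding the start with zeros."""
    return [0.0] * n + list(signal[:len(signal) - n])


def level(signal):
    """Signal level, taken here as the mean absolute value (assumption)."""
    return sum(abs(s) for s in signal) / len(signal)


def level_difference(x1, x2, n):
    """Difference between the directivity level signal and the
    non-directivity level signal for a delay of n samples."""
    # First directivity signal: x1 combined with delayed x2.
    first = [a - b for a, b in zip(x1, delay(x2, n))]
    # Second directivity signal: x2 combined with delayed x1.
    second = [b - a for a, b in zip(delay(x1, n), x2)]
    # Level of the sum versus the sum of the levels.
    non_directivity = level([f + s for f, s in zip(first, second)])
    directivity = level(first) + level(second)
    return abs(directivity - non_directivity)


def best_delay(x1, x2, candidates):
    """Stand-in for the delay control unit: pick the candidate delay
    that gives the smallest level difference."""
    return min(candidates, key=lambda n: level_difference(x1, x2, n))


if __name__ == "__main__":
    mic1 = [0.0, 1.0, 0.5, -0.5, -1.0, 0.0, 0.3, -0.2]
    mic2 = [0.0, 0.0, 1.0, 0.5, -0.5, -1.0, 0.0, 0.3]
    print(best_delay(mic1, mic2, range(0, 4)))
```

An adaptive controller would update the delay incrementally rather than search exhaustively; the search is used here only to keep the sketch short.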
Abstract:
A portable eye tracker device is disclosed which includes a frame, at least one optics holding member, and a control unit. The frame may be adapted for wearing by a user. The at least one optics holding member may include at least one illuminator configured to selectively illuminate at least a portion of at least one eye of the user, and at least one image sensor configured to capture image data representing images of at least a portion of at least one eye of the user. The control unit may be configured to control the at least one illuminator for the selective illumination of at least a portion of at least one eye of the user, and receive the image data from the at least one image sensor.
Abstract:
A method is provided for encoding multiple microphone signals into a composite source-separable audio (SSA) signal suitable for transmission over a voice network. The embodiments enable source separation of the target voice signal from its ambient sound to be performed at any point in the voice communication network, including the internet cloud. Multiple kinds of processing can be applied to the SSA signal, depending on the intended voice application. In one embodiment, the level of processing is adapted to the processing power available at the chosen processing node in the network. An apparatus for separating the target source voice from its ambient sound is also provided. The apparatus includes a directed source separation (DSS) unit, which processes the two virtual microphone signals in the SSA representation to generate a new SSA signal including the enhanced target voice and the enhanced ambient noise.