Abstract:
A device to perform end-of-utterance detection includes a speaker vector extractor configured to receive a frame of an audio signal and to generate a speaker vector that corresponds to the frame. The device also includes an end-of-utterance detector configured to process the speaker vector and to generate an indicator of whether the frame corresponds to an end of an utterance of a particular speaker.
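The abstract describes a two-stage pipeline: frame in, speaker vector out, then a per-frame end-of-utterance decision. Below is a minimal Python sketch of that structure. The random-projection extractor, the cosine-similarity threshold rule, and all class and parameter names are illustrative assumptions, not details from the abstract; a real extractor would be a trained neural network.

```python
import numpy as np

class SpeakerVectorExtractor:
    """Maps an audio frame to a fixed-size speaker embedding.
    (Hypothetical stand-in: a random projection instead of a
    trained network.)"""
    def __init__(self, frame_size=400, embed_dim=64, seed=0):
        rng = np.random.default_rng(seed)
        self.projection = rng.standard_normal((frame_size, embed_dim))

    def extract(self, frame: np.ndarray) -> np.ndarray:
        # Project the raw frame into the embedding space and normalize.
        vec = frame @ self.projection
        return vec / (np.linalg.norm(vec) + 1e-8)

class EndOfUtteranceDetector:
    """Flags an end of utterance when the current frame's speaker
    vector drifts away from the enrolled speaker's reference vector."""
    def __init__(self, reference: np.ndarray, threshold=0.5):
        self.reference = reference
        self.threshold = threshold

    def detect(self, speaker_vector: np.ndarray) -> bool:
        similarity = float(self.reference @ speaker_vector)
        return similarity < self.threshold  # True -> end of utterance

extractor = SpeakerVectorExtractor()
reference = extractor.extract(np.ones(400))             # enrolled speaker
detector = EndOfUtteranceDetector(reference)
frame = np.random.default_rng(1).standard_normal(400)   # incoming frame
print(detector.detect(extractor.extract(frame)))        # per-frame indicator
```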
Abstract:
A device includes a screen and one or more processors configured to provide, at the screen, a graphical user interface (GUI) configured to display data associated with multiple devices. The GUI is also configured to display a label and at least one control input for each device of the multiple devices. The GUI is also configured to provide feedback to a user. The feedback indicates that a verbal command is not recognized as being associated with an action to be performed. The GUI is also configured to provide instructions for the user on how to teach the one or more processors which action is to be performed in response to receiving the verbal command.
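The described behavior (per-device labels and controls, feedback on an unrecognized command, and a teaching step) can be sketched as a small state machine. The console-based Python sketch below is a hypothetical illustration of that logic only; a real implementation would render the GUI on the screen, and all names are invented.

```python
class DeviceControlGUI:
    """Minimal sketch of the described GUI logic (console-based)."""
    def __init__(self, devices):
        self.devices = devices              # label -> list of control inputs
        self.command_actions = {}           # verbal command -> (device, action)

    def render(self):
        # Display a label and at least one control input per device.
        for label, controls in self.devices.items():
            print(f"{label}: controls = {controls}")

    def handle_command(self, command: str):
        if command in self.command_actions:
            device, action = self.command_actions[command]
            print(f"Performing '{action}' on {device}")
        else:
            # Feedback plus teaching instructions, as in the abstract.
            print(f"Command '{command}' is not associated with an action.")
            print("To teach it, call teach(command, device, action).")

    def teach(self, command: str, device: str, action: str):
        self.command_actions[command] = (device, action)

gui = DeviceControlGUI({"Lamp": ["on/off"], "Thermostat": ["set temp"]})
gui.render()
gui.handle_command("make it cozy")          # unrecognized -> feedback
gui.teach("make it cozy", "Thermostat", "set temp to 72")
gui.handle_command("make it cozy")          # now performs the taught action
```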
Abstract:
A method, performed in an electronic device, for connecting to a target device is disclosed. The method includes capturing an image that includes a face of a target person associated with the target device and recognizing an indication of the target person. The indication of the target person may be a pointing object, a speech command, and/or any suitable input command. The face of the target person in the image is detected based on the indication, and at least one facial feature of the face in the image is extracted. Based on the at least one facial feature, the electronic device is connected to the target device.
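The final step, mapping an extracted facial feature to a device address, amounts to a nearest-neighbor lookup. The Python sketch below assumes facial features are embedding vectors and that a registry of (feature, device address) pairs already exists; the face detection and feature extraction stages are stubbed out, and all names are hypothetical.

```python
import numpy as np

def find_target_device(face_feature, registry, threshold=0.8):
    """Match an extracted facial feature against registered
    (feature, device_address) pairs; return the best-matching
    device address, or None if nothing clears the threshold."""
    best_addr, best_sim = None, threshold
    for feature, address in registry:
        # Cosine similarity between the query feature and a registered one.
        sim = float(face_feature @ feature /
                    (np.linalg.norm(face_feature) * np.linalg.norm(feature)))
        if sim > best_sim:
            best_addr, best_sim = address, sim
    return best_addr

rng = np.random.default_rng(0)
alice = rng.standard_normal(128)                 # feature from the image
registry = [(alice, "phone-of-alice.local")]     # enrolled pairs
print(find_target_device(alice, registry))       # -> "phone-of-alice.local"
```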
Abstract:
A method for providing object information for a scene in a wearable computer is disclosed. In this method, an image of the scene is captured. Further, the method includes determining a current location of the wearable computer and a view direction of an image sensor of the wearable computer and extracting at least one feature from the image indicative of at least one object. Based on the current location, the view direction, and the at least one feature, information on the at least one object is determined. Then, the determined information is output.
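The determination step combines three signals: current location, view direction, and image features. A minimal Python sketch of that fusion is shown below, assuming a flat local coordinate frame in meters, a hypothetical object database of (x, y, tags, info) tuples, and an illustrative camera field of view; none of these specifics come from the abstract.

```python
import math

def lookup_object_info(current_loc, view_dir_deg, features, object_db,
                       max_range_m=50.0, fov_deg=60.0):
    """Return info for database objects that lie within the image
    sensor's field of view and whose tags overlap the features
    extracted from the image."""
    results = []
    for (x, y, tags, info) in object_db:
        dx, dy = x - current_loc[0], y - current_loc[1]
        dist = math.hypot(dx, dy)
        # Bearing measured clockwise from the +y (north) axis.
        bearing = math.degrees(math.atan2(dx, dy)) % 360
        angle_off = min(abs(bearing - view_dir_deg),
                        360 - abs(bearing - view_dir_deg))
        if dist <= max_range_m and angle_off <= fov_deg / 2 and tags & features:
            results.append(info)
    return results

db = [(10.0, 0.0, {"sign", "red"}, "Stop sign at Main St.")]
# Wearer at the origin, looking due east, with a "red" feature extracted.
print(lookup_object_info((0.0, 0.0), 90.0, {"red"}, db))
```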
Abstract:
According to an aspect of the present disclosure, a method for controlling access to a plurality of electronic devices is disclosed. The method includes detecting whether a first device is in contact with a user, adjusting a security level of the first device to activate the first device when the first device is in contact with the user, detecting at least one second device within a communication range of the first device, and adjusting a security level of the at least one second device to control access to the at least one second device based on a distance between the first device and the at least one second device.
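The distance-based adjustment reduces to a simple policy function over (contact state, distance). The Python sketch below uses illustrative security levels ("unlocked", "restricted", "locked") and distance thresholds; the abstract specifies neither, so treat both as assumptions.

```python
def adjust_security(first_in_contact: bool, nearby_devices):
    """nearby_devices: list of (device_id, distance_m) detected within
    communication range. Returns device_id -> security level."""
    levels = {}
    # The first device activates only while in contact with the user.
    levels["first_device"] = "unlocked" if first_in_contact else "locked"
    for device_id, distance in nearby_devices:
        if not first_in_contact:
            levels[device_id] = "locked"
        elif distance < 1.0:
            levels[device_id] = "unlocked"      # close: full access
        elif distance < 5.0:
            levels[device_id] = "restricted"    # mid-range: partial access
        else:
            levels[device_id] = "locked"        # far: no access
    return levels

print(adjust_security(True, [("watch", 0.3), ("tv", 3.0), ("car", 20.0)]))
```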
Abstract:
A method of detecting a target keyword for activating a function in an electronic device is disclosed. The method includes receiving an input sound starting from one of a plurality of portions of the target keyword. The input sound may be received periodically based on a duty cycle. The method further includes extracting a plurality of sound features from the input sound and obtaining state information on a plurality of states associated with the portions of the target keyword. Based on the extracted sound features and the state information, the input sound may be detected as the target keyword. The plurality of states includes a predetermined number of entry states indicative of a predetermined number of the plurality of portions.
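The role of the entry states is that decoding may begin at any of the first several keyword portions, so detection still works when duty-cycled capture misses the start of the keyword. The toy Viterbi-style decoder below illustrates this, with one state per keyword portion and a squared-distance emission score; the state model, scoring, and thresholds are illustrative assumptions, not the disclosed implementation.

```python
import numpy as np

def detect_keyword(features, state_means, num_entry_states, threshold):
    """Each keyword portion is one state with a mean feature vector;
    the first `num_entry_states` states may start a path, so decoding
    can begin mid-keyword."""
    n_states = len(state_means)
    neg_inf = -np.inf
    # score[s] = best log-score of a path ending in state s
    score = np.full(n_states, neg_inf)
    for t, feat in enumerate(features):
        emit = -np.array([np.sum((feat - m) ** 2) for m in state_means])
        new = np.full(n_states, neg_inf)
        for s in range(n_states):
            stay = score[s]                              # remain in state s
            advance = score[s - 1] if s > 0 else neg_inf # move from s-1 to s
            # Entry is allowed only at the first frame, into an entry state.
            entry = 0.0 if (t == 0 and s < num_entry_states) else neg_inf
            new[s] = max(stay, advance, entry) + emit[s]
        score = new
    # Keyword detected if a path reaching the final state scores high enough.
    return score[-1] / len(features) > threshold

rng = np.random.default_rng(0)
means = [np.full(8, i, dtype=float) for i in range(4)]   # 4 keyword portions
sound = [m + 0.1 * rng.standard_normal(8) for m in means[1:]]  # starts mid-keyword
print(detect_keyword(sound, means, num_entry_states=2, threshold=-1.0))
```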