SYSTEM AND METHOD FOR PERSONALIZING DIALOGUE BASED ON USER'S APPEARANCES

    Publication No.: WO2019133698A1

    Publication date: 2019-07-04

    Application No.: PCT/US2018/067654

    Filing date: 2018-12-27

    Applicant: DMAI, INC.

    Inventor: SHUKLA, Nishant

    Abstract: The present teaching relates to method, system, medium, and implementations for enabling communication with a user. Information representing surrounding of a user engaged in an on-going dialogue is received via the communication platform, wherein the information includes a current response from the user in the on-going dialogue and is acquired from a current scene in which the user is present and captures characteristics of the user and the current scene. Relevant features are extracted from the information. A state of the user is estimated based on the relevant features and a dialogue context surrounding the current scene is determined based on the relevant features. A feedback directed to the current response of the user is generated based on the state of the user and the dialogue context.

    SYSTEM AND METHOD FOR IDENTIFYING A POINT OF INTEREST BASED ON INTERSECTING VISUAL TRAJECTORIES

    Publication No.: WO2019161241A1

    Publication date: 2019-08-22

    Application No.: PCT/US2019/018270

    Filing date: 2019-02-15

    Applicant: DMAI, INC.

    Inventor: SHUKLA, Nishant

    Abstract: The present teaching relates to method, system, medium, and implementations for identifying object of interest. Image data acquired by a camera with respect to a scene are received. One or more users are detected, during a period of time, from the image data who are present at the scene. Three dimensional (3D) gazing rays of the one or more users during the period of time are estimated. One or more intersections of such 3D gazing rays are identified and are used to determine at least one object of interest of the one or more users.
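
    The intersection step in this abstract can be illustrated with a minimal sketch. It assumes each gaze is already available as a 3D origin and direction, and it treats two rays as intersecting when their closest approach falls within a tolerance. The function name `gaze_intersection` and the `max_gap` parameter are hypothetical and are not taken from the patent.

```python
import math
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def _sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _point_on_ray(origin: Vec3, direction: Vec3, t: float) -> Vec3:
    return tuple(o + t * d for o, d in zip(origin, direction))


def gaze_intersection(o1: Vec3, d1: Vec3, o2: Vec3, d2: Vec3,
                      max_gap: float = 0.1) -> Optional[Vec3]:
    """Return the midpoint of closest approach of two gaze rays, or None.

    Assumption: rays count as intersecting when they pass within
    max_gap (same units as the origins) of each other in front of
    both viewers.
    """
    w = _sub(o1, o2)
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b
    if denom < 1e-12:  # parallel or degenerate directions
        return None
    t1 = (b * e - c * d) / denom  # parameter along ray 1
    t2 = (a * e - b * d) / denom  # parameter along ray 2
    if t1 < 0 or t2 < 0:  # closest approach lies behind a viewer
        return None
    p1 = _point_on_ray(o1, d1, t1)
    p2 = _point_on_ray(o2, d2, t2)
    gap = math.sqrt(_dot(_sub(p1, p2), _sub(p1, p2)))
    if gap > max_gap:
        return None
    return tuple((p + q) / 2 for p, q in zip(p1, p2))
```

    With more than two users, the same test could be run on each pair of rays and the resulting points clustered; that extension is not shown here.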

    SYSTEM AND METHOD FOR INFERRING SCENES BASED ON VISUAL CONTEXT-FREE GRAMMAR MODEL

    Publication No.: WO2019161237A1

    Publication date: 2019-08-22

    Application No.: PCT/US2019/018264

    Filing date: 2019-02-15

    Applicant: DMAI, INC.

    Abstract: The present teaching relates to method, system, medium, and implementations for determining a type of a scene. Image data acquired by a camera with respect to a scene are received and one or more objects present in the scene are detected therefrom. The detected objects are recognized based on object recognition models. The spatial relationships among the detected objects are then determined based on the image data. The recognized objects and their spatial relationships are then used to infer a type of the scene in accordance with at least one scene context-free grammar model.

    SYSTEM AND METHOD FOR SPEECH UNDERSTANDING VIA INTEGRATED AUDIO AND VISUAL BASED SPEECH RECOGNITION

    Publication No.: WO2019161198A1

    Publication date: 2019-08-22

    Application No.: PCT/US2019/018215

    Filing date: 2019-02-15

    Applicant: DMAI, INC.

    Abstract: The present teaching relates to method, system, medium, and implementations for speech recognition. An audio signal is received that represents a speech of a user engaged in a dialogue. A visual signal is received that captures the user uttering the speech. A first speech recognition result is obtained by performing audio based speech recognition based on the audio signal. Based on the visual signal, lip movement of the user is detected and a second speech recognition result is obtained by performing lip reading based speech recognition. The first and the second speech recognition results are then integrated to generate an integrated speech recognition result.
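
    The abstract does not say how the two recognition results are integrated. One simple possibility, shown here purely as an assumption, is a weighted sum of per-hypothesis confidences. The function `integrate_results`, the dictionary representation, and the `audio_weight` parameter are hypothetical.

```python
from typing import Dict


def integrate_results(audio: Dict[str, float], visual: Dict[str, float],
                      audio_weight: float = 0.7) -> str:
    """Pick the hypothesis with the highest weighted confidence.

    audio and visual map candidate transcripts to confidences in [0, 1].
    A hypothesis missing from one recognizer scores 0.0 for it.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must be between 0 and 1")
    hypotheses = set(audio) | set(visual)
    if not hypotheses:
        raise ValueError("no recognition hypotheses supplied")

    def score(hypothesis: str) -> float:
        return (audio_weight * audio.get(hypothesis, 0.0)
                + (1.0 - audio_weight) * visual.get(hypothesis, 0.0))

    # Sorting first makes tie-breaking deterministic.
    return max(sorted(hypotheses), key=score)
```

    In a noisy room the weight could be shifted toward the lip-reading result, which is one reason to keep it a parameter rather than a constant.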

    SYSTEM AND METHOD FOR SELECTIVE ANIMATRONIC PERIPHERAL RESPONSE FOR HUMAN MACHINE DIALOGUE

    Publication No.: WO2019133689A1

    Publication date: 2019-07-04

    Application No.: PCT/US2018/067641

    Filing date: 2018-12-27

    Applicant: DMAI, INC.

    Inventor: SHUKLA, Nishant

    Abstract: The present teaching relates to method, system, medium, and implementation for activating an animatronic device. Information about a user is obtained for whom an animatronic device is to be configured to carry out a dialogue with the user. The animatronic device includes a head portion and a body portion and the head portion is configured based on one of a plurality of selectable head portions. One or more preferences of the user are identified from the obtained information and used to select, from the plurality of selectable head portions, a first selected head portion. A configuration of the head portion of the animatronic device is then configured based on the first selected head portion for carrying out the dialogue.

    SYSTEM AND METHOD FOR RECONSTRUCTING UNOCCUPIED 3D SPACE

    Publication No.: WO2019161229A1

    Publication date: 2019-08-22

    Application No.: PCT/US2019/018253

    Filing date: 2019-02-15

    Applicant: DMAI, INC.

    Inventor: SHUKLA, Nishant

    Abstract: The present teaching relates to method, system, medium, and implementations for understanding a three dimensional (3D) scene. Image data acquired by a camera at different time instances with respect to the 3D scene are received wherein the 3D scene includes a user or one or more objects. The face of the user is detected and tracked at different time instances. With respect to some of the time instances, a 2D user profile representing a region in the image data occupied by the user is generated based on a corresponding face detected and a corresponding 3D space in the 3D scene is estimated based on calibration parameters associated with the camera. Such estimated 3D space occupied by the user in the 3D scene is used to dynamically update a 3D space occupancy record of the 3D scene.
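
    As a rough illustration of estimating a 3D position from a detected face using camera calibration parameters, the sketch below uses a pinhole camera model. It is a simplification: it locates only the face centre rather than the full user profile, and it infers depth from an assumed real-world face width. The names `face_to_3d` and `mark_occupied`, the default face width, and the voxel size are all hypothetical.

```python
import math
from typing import Set, Tuple

Voxel = Tuple[int, int, int]


def face_to_3d(u: float, v: float, face_width_px: float,
               fx: float, fy: float, cx: float, cy: float,
               real_face_width_m: float = 0.16) -> Tuple[float, float, float]:
    """Back-project a face centre (u, v) to camera coordinates in metres.

    fx, fy, cx, cy are pinhole intrinsics. Depth comes from the assumed
    physical face width, which is an illustrative guess.
    """
    if face_width_px <= 0:
        raise ValueError("face_width_px must be positive")
    z = fx * real_face_width_m / face_width_px
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return (x, y, z)


def mark_occupied(occupancy: Set[Voxel], point: Tuple[float, float, float],
                  voxel_size_m: float = 0.25) -> Voxel:
    """Record the voxel containing point in the occupancy set."""
    voxel = tuple(math.floor(c / voxel_size_m) for c in point)
    occupancy.add(voxel)
    return voxel
```

    Calling these two functions for each frame in which a face is tracked would keep a simple occupancy record up to date as the user moves.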

    SYSTEM AND METHOD FOR DISAMBIGUATING A SOURCE OF SOUND BASED ON DETECTED LIP MOVEMENT

    Publication No.: WO2019161196A2

    Publication date: 2019-08-22

    Application No.: PCT/US2019/018212

    Filing date: 2019-02-15

    Applicant: DMAI, INC.

    Abstract: The present teaching relates to method, system, medium, and implementations for detecting a source of speech sound in a dialogue. A visual signal acquired from a dialogue scene is first received, where the visual signal captures a person present in the dialogue scene. A human lip associated with the person is detected from the visual signal and tracked to detect whether lip movement is observed. If lip movement is detected, a first candidate source of sound is generated corresponding to an area in the dialogue scene where the lip movement occurred.
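
    A minimal sketch of the lip-tracking step follows. It assumes an upstream detector already supplies, for each frame, the lip position and a normalized lip opening. The `LipObservation` type, the `candidate_sound_source` function, and the `min_change` threshold are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class LipObservation:
    center: Tuple[float, float]  # lip position in the image, in pixels
    opening: float               # lip gap divided by face height


def candidate_sound_source(track: Sequence[LipObservation],
                           min_change: float = 0.05
                           ) -> Optional[Tuple[float, float]]:
    """Return the mean lip position if the lips moved, else None.

    Movement is declared when the lip opening varies by at least
    min_change across the tracked frames (an assumed criterion).
    """
    if len(track) < 2:
        return None
    openings = [obs.opening for obs in track]
    if max(openings) - min(openings) < min_change:
        return None
    n = len(track)
    return (sum(obs.center[0] for obs in track) / n,
            sum(obs.center[1] for obs in track) / n)
```

    The returned image position stands in for the "area in the dialogue scene"; mapping it to a physical direction would require camera calibration, which is outside this sketch.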

    SYSTEM AND METHOD FOR PERSONALIZED AND ADAPTIVE APPLICATION MANAGEMENT

    Publication No.: WO2019133684A1

    Publication date: 2019-07-04

    Application No.: PCT/US2018/067634

    Filing date: 2018-12-27

    Applicant: DMAI, INC.

    Abstract: The present teaching relates to method, system, and medium for cross network communications. Information related to an application running on a user device is first received, which includes a state of the application and sensor data obtained with respect to a user interacting with the application on the user device. A request is sent to an application server for an instruction of a state transition of the application. A light weight model (LWM) for an object involved in the state transition is received and is personalized based on at least one of the sensor data and one or more preferences related to the user to generate a personalized model (PM) for the object, which is then sent to the user device.

    SYSTEM AND METHOD FOR DISAMBIGUATING A SOURCE OF SOUND BASED ON DETECTED LIP MOVEMENT

    Publication No.: WO2019161196A3

    Publication date: 2019-08-22

    Application No.: PCT/US2019/018212

    Filing date: 2019-02-15

    Applicant: DMAI, INC.

    Abstract: The present teaching relates to method, system, medium, and implementations for detecting a source of speech sound in a dialogue. A visual signal acquired from a dialogue scene is first received, where the visual signal captures a person present in the dialogue scene. A human lip associated with the person is detected from the visual signal and tracked to detect whether lip movement is observed. If lip movement is detected, a first candidate source of sound is generated corresponding to an area in the dialogue scene where the lip movement occurred.

    SYSTEM AND METHOD FOR ADAPTIVE DETECTION OF SPOKEN LANGUAGE VIA MULTIPLE SPEECH MODELS

    Publication No.: WO2019161193A3

    Publication date: 2019-08-22

    Application No.: PCT/US2019/018209

    Filing date: 2019-02-15

    Applicant: DMAI, INC.

    Inventor: SHUKLA, Nishant

    Abstract: The present teaching relates to method, system, medium, and implementations for speech recognition in a spoken language. Upon receiving a speech signal representing an utterance of a speaker in one of a plurality of spoken languages, speech recognition is performed based on the speech signal in accordance with a plurality of speech recognition models corresponding to the plurality of spoken languages to generate a plurality of text strings each of which represents a speech recognition result in a corresponding one of the plurality of spoken languages. With respect to each of the plurality of text strings associated with a corresponding spoken language, a likelihood that the utterance is in the corresponding spoken language is computed. A spoken language of the utterance is determined based on the likelihood with respect to each of the plurality of text strings.
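
    The selection logic described in this abstract can be sketched as follows. The abstract does not specify how the likelihood is computed, so this example substitutes a deliberately crude stand-in: the fraction of recognized words found in a small per-language vocabulary. The names `language_likelihood` and `detect_language` and the toy vocabularies are hypothetical.

```python
from typing import Dict, Set, Tuple


def language_likelihood(text: str, vocabulary: Set[str]) -> float:
    """Fraction of words in text that belong to vocabulary."""
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(word in vocabulary for word in words) / len(words)


def detect_language(transcripts: Dict[str, str],
                    vocabularies: Dict[str, Set[str]]
                    ) -> Tuple[str, Dict[str, float]]:
    """Score each per-language transcript and return the best language.

    transcripts maps a language code to the text produced by that
    language's recognizer for the same utterance.
    """
    if not transcripts:
        raise ValueError("no transcripts supplied")
    likelihoods = {
        lang: language_likelihood(text, vocabularies[lang])
        for lang, text in transcripts.items()
    }
    best = max(likelihoods, key=likelihoods.get)
    return best, likelihoods
```

    A practical system would more plausibly score each text string with a language model for the corresponding language; the vocabulary lookup only keeps the example self-contained.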
