Patent search ap:("Amazon Technologies Page Inc.") AND inv:"Vivek Yadav"

1.

Invention patent application
ACTIVE SPEAKER DETECTION USING IMAGE DATA [In force]

Publication (announcement) number: US20230068798A1

Publication (announcement) date: 2023-03-02

Application number: US17465143

Application date: 2021-09-02

Applicant: Amazon Technologies, Inc.

Inventor: Tyler Jerel Etchart, Vivek Yadav, Pradeep Natarajan

IPC: G10L15/25, G06K9/00, G06T7/70, G10L15/22, G10L25/78

Abstract: A system can operate a speech-controlled device to perform active speaker detection to detect an utterance using image data showing a user speaking the utterance. This enables the device to perform utterance detection using the image data and/or determine which user is speaking the utterance. To perform active speaker detection, the device processes the image data to determine expression parameters associated with the user's face and generates facial measurements based on the expression parameters. For example, the device can use the expression parameters to generate a 3D model including an agnostic facial representation and determine a mouth aspect ratio by measuring a mouth height and a mouth width of the agnostic facial representation. As the mouth aspect ratio changes when the user is speaking, the device can determine that the user is speaking and/or detect an utterance based on an amount of variation of the mouth aspect ratio.
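Illustrative sketch (not from the patent): the mouth-aspect-ratio idea in the abstract above can be pictured roughly as follows. The function names, the use of standard deviation as the "amount of variation", and the 0.05 threshold are all assumptions made for illustration; the abstract does not specify them.

```python
from statistics import pstdev


def mouth_aspect_ratio(mouth_height: float, mouth_width: float) -> float:
    """Ratio of mouth height to mouth width for a single frame."""
    if mouth_width <= 0:
        raise ValueError("mouth_width must be positive")
    return mouth_height / mouth_width


def is_speaking(ratios, threshold=0.05):
    """Flag speech when the ratio varies by more than a threshold.

    Standard deviation and the 0.05 threshold are assumed here; the
    abstract only refers to "an amount of variation".
    """
    if len(ratios) < 2:
        return False
    return pstdev(ratios) > threshold


# Hypothetical per-frame (height, width) measurements of the facial model.
talking_frames = [(0.2, 1.0), (0.6, 1.0), (0.1, 1.0), (0.7, 1.0)]
silent_frames = [(0.2, 1.0), (0.21, 1.0), (0.2, 1.0), (0.19, 1.0)]

talking = is_speaking([mouth_aspect_ratio(h, w) for h, w in talking_frames])
silent = is_speaking([mouth_aspect_ratio(h, w) for h, w in silent_frames])
```

In this sketch `talking` evaluates to True and `silent` to False, because only the first sequence shows large frame-to-frame changes in the ratio.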

2.

Granted invention patent
Natural language processing [In force]

Publication (announcement) number: US11532301B1

Publication (announcement) date: 2022-12-20

Application number: US17113823

Application date: 2020-12-07

Applicant: Amazon Technologies, Inc.

Inventor: Kiana Hajebi, Vivek Yadav, Pradeep Natarajan

IPC: G10L15/22, G10L15/18

Abstract: Devices and techniques are generally described for inference reduction in natural language processing using semantic similarity-based caching. In various examples, first automatic speech recognition (ASR) data representing a first natural language input may be determined. A cache may be searched using the first ASR data. A first skill associated with the first ASR data may be determined from the cache. In some examples, first intent data representing a semantic interpretation of the first natural language input data may be determined by using a first natural language process associated with the first skill.
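Illustrative sketch (not from the patent): the cache lookup described in the abstract above might look roughly like this. Bag-of-words cosine similarity stands in for whatever semantic similarity measure the patent actually uses, and `SkillCache`, the 0.8 threshold, and the skill name are hypothetical.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class SkillCache:
    """Maps previously seen ASR text to the skill that handled it."""

    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold  # assumed similarity cutoff
        self._entries = []  # list of (token counts, skill name)

    def add(self, asr_text: str, skill: str) -> None:
        self._entries.append((Counter(asr_text.lower().split()), skill))

    def lookup(self, asr_text: str):
        """Return the cached skill for the most similar entry, or None."""
        query = Counter(asr_text.lower().split())
        best_skill, best_score = None, 0.0
        for vec, skill in self._entries:
            score = cosine(query, vec)
            if score > best_score:
                best_skill, best_score = skill, score
        return best_skill if best_score >= self.threshold else None


cache = SkillCache()
cache.add("play some jazz music", "MusicSkill")
hit = cache.lookup("play some jazz music please")
miss = cache.lookup("what is the weather")
```

On a hit, only the natural language process associated with the returned skill would need to run, which is the inference reduction the abstract refers to; on a miss, the system would fall back to its full pipeline.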

3.

Invention patent application
TASK-BASED IMAGE MASKING [In force]

Publication (announcement) number: US20210406589A1

Publication (announcement) date: 2021-12-30

Application number: US16913837

Application date: 2020-06-26

Applicant: Amazon Technologies, Inc.

Inventor: Vivek Yadav, Aayush Gupta, Yue Wu, Pradeep Natarajan, Ayush Jaiswal

IPC: G06K9/62, G06N20/00, G06N3/08

Abstract: Techniques for masking images based on a particular task are described. A system masks portions of an image that are not relevant to a particular task, thus, reducing the amount of data used by applications for image processing tasks. For example, images to be processed using a hair color classification model are masked so that only portions that show the person's hair are available for the model to analyze. The system configures different masker components to mask images for different tasks. A masker component can be implemented at a user device to mask images prior to sending to an application/task-specific model.

4.

Granted invention patent
Natural language processing [In force]

Publication (announcement) number: US11626107B1

Publication (announcement) date: 2023-04-11

Application number: US17114116

Application date: 2020-12-07

Applicant: Amazon Technologies, Inc.

Inventor: Kiana Hajebi, Vivek Yadav, Pradeep Natarajan

IPC: G10L15/22, G10L15/18

Abstract: Devices and techniques are generally described for inference reduction in natural language processing using semantic similarity-based caching. In various examples, first automatic speech recognition (ASR) data representing a first natural language input may be determined. A cache may be searched using the first ASR data. A first skill associated with the first ASR data may be determined from the cache. In some examples, first intent data representing a semantic interpretation of the first natural language input data may be determined by using a first natural language process associated with the first skill.

5.

Invention patent application
CUSTOMIZABLE LOCK MANAGEMENT FOR DISTRIBUTED RESOURCES [In force]

Publication (announcement) number: US20210382636A1

Publication (announcement) date: 2021-12-09

Application number: US16938846

Application date: 2020-07-24

Applicant: Amazon Technologies, Inc.

Inventor: Saravana Perumal, Abhijit Chaudhuri, Mahesh H. Dhabade, Vivek Yadav, Nagaprasad K P, Rahul Kamalkishore Agrawal, Pankaj Chawla, Visakh Sakthidharan Nair

IPC: G06F3/06

Abstract: A write lock request for a data object on behalf of a first data accessor is received at a lock manager. The data object is currently locked on behalf of a second data accessor. The lock manager modifies lock metadata associated with the data object to indicate the first data accessor as the primary lock owner, and designates the second data accessor as a non-primary owner.
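Illustrative sketch (not from the patent): the ownership handoff described in the abstract above could be modeled roughly as below. `LockMetadata`, `LockManager`, and `request_write_lock` are hypothetical names, and the sketch always performs the handoff; the abstract does not state the conditions under which the lock manager does so.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class LockMetadata:
    """Per-object lock record: one primary owner plus non-primary owners."""
    primary_owner: Optional[str] = None
    non_primary_owners: List[str] = field(default_factory=list)


class LockManager:
    def __init__(self) -> None:
        self._locks: Dict[str, LockMetadata] = {}

    def request_write_lock(self, object_id: str, accessor: str) -> LockMetadata:
        """Make `accessor` the primary owner of the lock on `object_id`.

        If another accessor currently holds the lock, it is kept in the
        metadata as a non-primary owner instead of being dropped.
        """
        meta = self._locks.setdefault(object_id, LockMetadata())
        current = meta.primary_owner
        if current is not None and current != accessor:
            meta.non_primary_owners.append(current)
        if accessor in meta.non_primary_owners:
            meta.non_primary_owners.remove(accessor)
        meta.primary_owner = accessor
        return meta


manager = LockManager()
manager.request_write_lock("object-1", "second-accessor")
meta = manager.request_write_lock("object-1", "first-accessor")
```

After the second call, `meta.primary_owner` is `"first-accessor"` and `meta.non_primary_owners` contains `"second-accessor"`, mirroring the roles named in the abstract.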

6.

Granted invention patent
Natural language processing [In force]

Publication (announcement) number: US12165636B1

Publication (announcement) date: 2024-12-10

Application number: US17984511

Application date: 2022-11-10

Applicant: Amazon Technologies, Inc.

Inventor: Kiana Hajebi, Vivek Yadav, Pradeep Natarajan

IPC: G10L15/22, G10L15/18

Abstract: Devices and techniques are generally described for inference reduction in natural language processing using semantic similarity-based caching. In various examples, first automatic speech recognition (ASR) data representing a first natural language input may be determined. A cache may be searched using the first ASR data. A first skill associated with the first ASR data may be determined from the cache. In some examples, first intent data representing a semantic interpretation of the first natural language input data may be determined by using a first natural language process associated with the first skill.

7.

Granted invention patent
Task-based image masking [In force]

Publication (announcement) number: US11854116B2

Publication (announcement) date: 2023-12-26

Application number: US17740533

Application date: 2022-05-10

Applicant: Amazon Technologies, Inc.

Inventor: Vivek Yadav, Aayush Gupta, Yue Wu, Pradeep Natarajan, Ayush Jaiswal

IPC: G06T11/00, G06N20/00, G06N3/08, G06F18/2431, G06F18/214, G06V10/772, G06V10/20

CPC classification number: G06T11/00, G06F18/214, G06F18/2431, G06N3/08, G06N20/00, G06V10/20, G06V10/772

Abstract: Techniques for masking images based on a particular task are described. A system masks portions of an image that are not relevant to a particular task, thus, reducing the amount of data used by applications for image processing tasks. For example, images to be processed using a hair color classification model are masked so that only portions that show the person's hair are available for the model to analyze. The system configures different masker components to mask images for different tasks. A masker component can be implemented at a user device to mask images prior to sending to an application/task-specific model.

8.

Granted invention patent
Customizable lock management for distributed resources [In force]

Publication (announcement) number: US11449241B2

Publication (announcement) date: 2022-09-20

Application number: US16938846

Application date: 2020-07-24

Applicant: Amazon Technologies, Inc.

Inventor: Saravana Perumal, Abhijit Chaudhuri, Mahesh H. Dhabade, Vivek Yadav, Nagaprasad K P, Rahul Kamalkishore Agrawal, Pankaj Chawla, Visakh Sakthidharan Nair

IPC: G06F3/06

Abstract: A write lock request for a data object on behalf of a first data accessor is received at a lock manager. The data object is currently locked on behalf of a second data accessor. The lock manager modifies lock metadata associated with the data object to indicate the first data accessor as the primary lock owner, and designates the second data accessor as a non-primary owner.

9.

Granted invention patent
Listener animation [In force]

Publication (announcement) number: US12254548B1

Publication (announcement) date: 2025-03-18

Application number: US18082709

Application date: 2022-12-16

Applicant: Amazon Technologies, Inc.

Inventor: Gourav Datta, Vivek Yadav, Yue Wu, Ayush Jaiswal, Rajiv M Reddy, Prateek Singhal, Karthik Ramakrishnan, Premkumar Natarajan

IPC: G06T7/20, G06T7/70, G06T13/20, G06T13/40, G06V40/16, G10L15/22, G10L25/57, G10L25/60

Abstract: A system configured to perform style-aware listener animation. By representing different listening styles (e.g., facial expressions) using an embedding space, a single model can be trained to generate unique facial animations for a number of distinct listeners. Thus, individual listening styles can be associated with a listener identifier, enabling the system to (i) animate a plurality of different listeners with unique nonverbal behavior and/or (ii) select a particular listener identifier or desired type of listener style with which to animate. This enables the model to be generalized to new listeners to generate additional listener facial responses without needing training data for each new listener. The model may process a listener representation style or listener identifier, along with input data corresponding to a speaker talking, to generate unique facial animation responsive to the speech.

10.

Invention patent application
TASK-BASED IMAGE MASKING [In force]

Publication (announcement) number: US20220405528A1

Publication (announcement) date: 2022-12-22

Application number: US17740533

Application date: 2022-05-10

Applicant: Amazon Technologies, Inc.

Inventor: Vivek Yadav, Aayush Gupta, Yue Wu, Pradeep Natarajan, Ayush Jaiswal

IPC: G06K9/62, G06N20/00, G06N3/08

Abstract: Techniques for masking images based on a particular task are described. A system masks portions of an image that are not relevant to a particular task, thus, reducing the amount of data used by applications for image processing tasks. For example, images to be processed using a hair color classification model are masked so that only portions that show the person's hair are available for the model to analyze. The system configures different masker components to mask images for different tasks. A masker component can be implemented at a user device to mask images prior to sending to an application/task-specific model.
