Patent search ap:("Amazon Technologies Page Inc.") AND inv:"Zhaoqing Ma"

1.

Granted invention patent
Vector norm algorithmic subsystems for improving clustering solutions (status: in force)

Publication No.: US11003959B1

Publication date: 2021-05-11

Application No.: US16440384

Filing date: 2019-06-13

Applicant: AMAZON TECHNOLOGIES, INC.

Inventor: Ilya Levner, Konstantinos Boulis, Gurbinder Gill, Canku Calargun, Prajwal Yadapadithaya, Venkata Krishnan Ramamoorthy, Zhaoqing Ma

IPC: G06K9/62, G06N3/04, G06N3/08, G06K9/68

Abstract: Categorizing images may include training a first neural network to cluster a plurality of images to obtain a first image embedding space, wherein a vector representation is determined for each of the plurality of images based on the training, determining a vector norm value corresponding to each of the plurality of images based on the vector representation for each of the plurality of images, and identifying a first subset of the images for which a corresponding vector norm value satisfies a predetermined vector norm quality threshold. Then, a second neural network may be trained using the first subset of images to obtain a second image embedding space, and the second image embedding space may be used to categorize additional images.
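The norm-based selection step in this abstract can be illustrated with a minimal sketch. It assumes the vector norm is the L2 norm and that "satisfies the threshold" means the norm is at least the threshold; the abstract states neither. The names `l2_norm` and `select_by_norm`, the image IDs, and the threshold value are hypothetical, and the two neural-network training steps are not shown.

```python
import math
from typing import Dict, List, Sequence


def l2_norm(vec: Sequence[float]) -> float:
    """Euclidean (L2) norm of an embedding vector (assumed norm type)."""
    return math.sqrt(sum(x * x for x in vec))


def select_by_norm(embeddings: Dict[str, Sequence[float]],
                   threshold: float) -> List[str]:
    """Return IDs of images whose embedding norm is >= threshold.

    The ">=" direction is an assumption made for illustration only.
    """
    return [image_id for image_id, vec in embeddings.items()
            if l2_norm(vec) >= threshold]


# Hypothetical embeddings from a first network.
embeddings = {
    "img_a": [3.0, 4.0],   # norm 5.0
    "img_b": [0.3, 0.4],   # norm about 0.5
    "img_c": [6.0, 8.0],   # norm 10.0
}

selected = select_by_norm(embeddings, threshold=1.0)
print(selected)  # ['img_a', 'img_c']
```

In the flow the abstract describes, the selected subset would then be used to train the second network.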

2.

Granted invention patent
Adjusting speed of human speech playback (status: in force)

Publication No.: US10276185B1

Publication date: 2019-04-30

Application No.: US15677659

Filing date: 2017-08-15

Applicant: Amazon Technologies, Inc.

Inventor: Zhaoqing Ma, Tony Roy Hardie, Christo Frank Devaraj

IPC: G10L13/00, G10L21/04, G10L25/78, G10L25/27

Abstract: A system configured to vary a speech speed of speech represented in input audio data without changing a pitch of the speech. The system may vary the speech speed based on a number of different inputs, including non-audio data, data associated with a command, or data associated with the voice message itself. The non-audio data may correspond to information about an account, device or user, such as user preferences, calendar entries, location information, etc. The system may analyze audio data associated with the command to determine command speech speed, identity of person listening, etc. The system may analyze the input audio data to determine a message speech speed, background noise level, identity of the person speaking, etc. Using all of these inputs, the system may dynamically determine a target speech speed and may generate output audio data having the target speech speed.
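As a rough illustration of how several inputs might be combined into a target speech speed, the sketch below averages a user preference and the speed of the spoken command, then expresses the result as a clamped playback factor relative to the message speed. The averaging rule, the words-per-minute unit, the clamp limits, and the name `target_speed_factor` are all assumptions made for illustration; the abstract does not specify how the inputs are weighed. The pitch-preserving time-scaling of the audio itself is not shown.

```python
from typing import Optional


def target_speed_factor(message_wpm: float,
                        preferred_wpm: Optional[float] = None,
                        command_wpm: Optional[float] = None,
                        min_factor: float = 0.5,
                        max_factor: float = 2.0) -> float:
    """Return a playback speed factor (1.0 means unchanged).

    Hypothetical policy: the target speed is the mean of whichever of the
    user preference and command speech speed are available.
    """
    candidates = [w for w in (preferred_wpm, command_wpm) if w is not None]
    if not candidates or message_wpm <= 0:
        return 1.0  # nothing to adapt to, so keep the original speed
    target_wpm = sum(candidates) / len(candidates)
    factor = target_wpm / message_wpm
    # Clamp so that playback stays within an assumed intelligible range.
    return max(min_factor, min(max_factor, factor))


factor = target_speed_factor(message_wpm=120.0,
                             preferred_wpm=180.0,
                             command_wpm=150.0)
print(factor)  # 1.375
```

A real system of the kind described would also draw on signals such as background noise level and speaker or listener identity; they are omitted here to keep the sketch short.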

3.

Granted invention patent
Incremental clustering for face recognition systems (status: in force)

Publication No.: US11354936B1

Publication date: 2022-06-07

Application No.: US16929387

Filing date: 2020-07-15

Applicant: Amazon Technologies, Inc.

Inventor: Dharmil Satishbhai Chandarana, Ilya Levner, Zhaoqing Ma, Prajwal Yadapadithaya, Riley James Williams, Canku Alp Calargun, Prama Anand

IPC: G06V40/00, G06V40/16, G06K9/62

Abstract: Techniques for improved image classification are provided. Face embeddings are generated for each face depicted in a collection of images, and the face embeddings are clustered based on the individual whose face is depicted. Based on these clusters, each embedding is assigned a label reflecting the cluster assignments. Some or all of the face embeddings are then used to train a classifier model to generate cluster labels for new input images. This classifier model can then be used to process new images in an efficient manner, and classify them into appropriate clusters.
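The idea of turning cluster assignments into training labels for a classifier can be sketched as follows. A nearest-centroid classifier stands in for the classifier model purely for illustration; the abstract does not say which model is used. The cluster labels are taken as given (the clustering step is not shown), and the names `train_centroids` and `classify` and the toy 2-D embeddings are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def train_centroids(
        labeled: List[Tuple[Sequence[float], int]]) -> Dict[int, List[float]]:
    """'Train' a nearest-centroid classifier: one mean vector per cluster label."""
    groups: Dict[int, List[Sequence[float]]] = defaultdict(list)
    for vec, label in labeled:
        groups[label].append(vec)
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in groups.items()
    }


def classify(centroids: Dict[int, List[float]], vec: Sequence[float]) -> int:
    """Assign a new embedding to the cluster with the closest centroid."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vec))


# Toy face embeddings paired with cluster labels from a prior clustering step.
labeled = [
    ([0.0, 0.0], 0),
    ([0.2, 0.0], 0),
    ([1.0, 1.0], 1),
    ([1.0, 0.8], 1),
]

centroids = train_centroids(labeled)
print(classify(centroids, [0.9, 1.0]))  # 1
print(classify(centroids, [0.1, 0.1]))  # 0
```

Classifying a new embedding this way costs one distance computation per cluster and does not require re-running the clustering, which is the efficiency the abstract points to.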

4.

Granted invention patent
Adjusting speed of human speech playback (status: in force)

Publication No.: US11232808B2

Publication date: 2022-01-25

Application No.: US16394717

Filing date: 2019-04-25

Applicant: Amazon Technologies, Inc.

Inventor: Zhaoqing Ma, Tony Roy Hardie, Christo Frank Devaraj

IPC: G10L21/04, G10L25/78, G10L25/27

Abstract: A system configured to vary a speech speed of speech represented in input audio data without changing a pitch of the speech. The system may vary the speech speed based on a number of different inputs, including non-audio data, data associated with a command, or data associated with the voice message itself. The non-audio data may correspond to information about an account, device or user, such as user preferences, calendar entries, location information, etc. The system may analyze audio data associated with the command to determine command speech speed, identity of person listening, etc. The system may analyze the input audio data to determine a message speech speed, background noise level, identity of the person speaking, etc. Using all of these inputs, the system may dynamically determine a target speech speed and may generate output audio data having the target speech speed.

5.

Invention patent application
ADJUSTING SPEED OF HUMAN SPEECH PLAYBACK (status: under examination, published)

Publication No.: US20190318758A1

Publication date: 2019-10-17

Application No.: US16394717

Filing date: 2019-04-25

Applicant: Amazon Technologies, Inc.

Inventor: Zhaoqing Ma, Tony Roy Hardie, Christo Frank Devaraj

IPC: G10L21/04, G10L25/78

Abstract: A system configured to vary a speech speed of speech represented in input audio data without changing a pitch of the speech. The system may vary the speech speed based on a number of different inputs, including non-audio data, data associated with a command, or data associated with the voice message itself. The non-audio data may correspond to information about an account, device or user, such as user preferences, calendar entries, location information, etc. The system may analyze audio data associated with the command to determine command speech speed, identity of person listening, etc. The system may analyze the input audio data to determine a message speech speed, background noise level, identity of the person speaking, etc. Using all of these inputs, the system may dynamically determine a target speech speed and may generate output audio data having the target speech speed.
