Invention Grant
- Patent Title: Encoder using machine-trained term frequency weighting factors that produces a dense embedding vector
- Application No.: US16368798
- Application Date: 2019-03-28
- Publication No.: US11669558B2
- Publication Date: 2023-06-06
- Inventor: Yan Wang, Ye Wu, Houdong Hu, Surendra Ulabala, Vishal Thakkar, Arun Sacheti
- Applicant: Microsoft Technology Licensing, LLC
- Applicant Address: US WA Redmond
- Assignee: Microsoft Technology Licensing, LLC
- Current Assignee: Microsoft Technology Licensing, LLC
- Current Assignee Address: US WA Redmond
- Main IPC: G06N3/04
- IPC: G06N3/04; G06N5/02; G06N3/045; G06F16/33; G06F16/245; G06F16/248; G06V20/62; G06F18/2413; G06F17/16

Abstract:
A computer-implemented technique generates a dense embedding vector that provides a distributed representation of input text. The technique includes: generating an input term-frequency (TF) vector of dimension g that includes frequency information relating to frequency of occurrence of terms in an instance of input text; using a TF-modifying component to modify the term-specific frequency information in the input TF vector by respective machine-trained weighting factors, to produce an intermediate vector of dimension g; and using a projection component to project the intermediate vector of dimension g into an embedding vector of dimension k, where k is less than g. Both the TF-modifying component and the projection component use respective machine-trained neural networks. An application performs any of a retrieval-based function, a recognition-based function, a recommendation-based function, a classification-based function, etc. based on the embedding vector.
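
The three stages named in the abstract (TF vector of dimension g, term-specific weighting, projection to dimension k) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption for illustration and is not taken from the patent: the function names, the toy vocabulary, the weight values, the random projection matrix, and the simplification of each machine-trained neural network to a single element-wise multiplication and a single linear map.

```python
import random
from collections import Counter


def tf_vector(text, vocab):
    """Count occurrences of each vocabulary term in the text (dimension g)."""
    counts = Counter(text.lower().split())
    return [float(counts[term]) for term in vocab]


def modify_tf(tf, weights):
    """Scale each term frequency by its term-specific weighting factor."""
    return [f * w for f, w in zip(tf, weights)]


def project(vec, matrix):
    """Map a g-dimensional vector to k dimensions using a k x g matrix."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def encode(text, vocab, weights, matrix):
    """Run the three stages in order: TF vector, weighting, projection."""
    return project(modify_tf(tf_vector(text, vocab), weights), matrix)


# Hypothetical toy setup: g = 5 vocabulary terms, k = 2 output dimensions.
vocab = ["red", "dress", "shoes", "blue", "the"]
# Placeholder weights; in the described technique these would be machine-trained.
weights = [1.5, 2.0, 2.0, 1.5, 0.1]
# Placeholder projection; a trained network would supply these values.
rng = random.Random(0)
k = 2
matrix = [[rng.uniform(-1.0, 1.0) for _ in vocab] for _ in range(k)]

embedding = encode("the red dress and the red shoes", vocab, weights, matrix)
print(len(embedding))  # the embedding has k entries, fewer than g
```

In this sketch, terms outside the vocabulary (such as "and") are simply ignored, and the low weight on "the" stands in for the kind of down-weighting a trained factor might apply to a common term.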
Public/Granted literature
- US20200311542A1 Encoder Using Machine-Trained Term Frequency Weighting Factors that Produces a Dense Embedding Vector; Public/Granted day: 2020-10-01