Building training data and similarity relations for semantic space

Invention Grant

US12020175B2 Building training data and similarity relations for semantic space (Active)

Patent Title: Building training data and similarity relations for semantic space
Application No.: US17001311

Application Date: 2020-08-24
Publication No.: US12020175B2

Publication Date: 2024-06-25
Inventor: Eugene Livshitz, Alexander Pashintsev, Boris Gorbatov
Applicant: Evernote Corporation
Applicant Address: US CA Redwood City
Assignee: Bending Spoons S.p.A.
Current Assignee: Bending Spoons S.p.A.
Current Assignee Address: IT Milan
Agency: Morgan, Lewis & Bockius LLP
Main IPC: G06F17/00
IPC: G06F17/00; G06F16/33; G06F40/205; G06N5/04; G06N20/00

Abstract:

A method and system for selecting data from a source text corpus for training a semantic data analysis system. The method includes selecting an item of the text corpus, wherein the item includes at least one section. The method includes extracting a section of the at least one section of the item. The method also includes determining a length of the section of the at least one section of the item. Based on the length of the section being greater than a predetermined amount, the method includes subdividing the section into a plurality of fragments. The fragments of the plurality of fragments are deemed to be similar to one another. Further, the method includes building a training set based on the plurality of fragments. The training set is used to train the semantic data analysis system.
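The steps in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the function names, the representation of a corpus item as a list of section strings, the choice to measure length in words, the fixed fragment size, and the decision to skip sections at or below the threshold are all assumptions made for illustration. The abstract does not specify any of them.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def split_into_fragments(section: str, fragment_words: int) -> List[str]:
    """Subdivide a section into consecutive fragments of at most
    `fragment_words` words each (hypothetical fragmenting rule)."""
    words = section.split()
    return [
        " ".join(words[i:i + fragment_words])
        for i in range(0, len(words), fragment_words)
    ]


def build_training_pairs(
    corpus: Sequence[Sequence[str]],
    min_section_words: int = 8,
    fragment_words: int = 4,
) -> List[Tuple[str, str]]:
    """Return pairs of fragments that are deemed similar because they
    come from the same sufficiently long section.

    `corpus` is assumed to be a sequence of items, each item being a
    sequence of section strings.
    """
    pairs: List[Tuple[str, str]] = []
    for item in corpus:                      # select an item
        for section in item:                 # extract a section
            length = len(section.split())    # determine its length (in words, an assumption)
            if length <= min_section_words:  # only sections longer than the threshold are split
                continue
            fragments = split_into_fragments(section, fragment_words)
            # Every pair of fragments from one section is treated as similar.
            pairs.extend(combinations(fragments, 2))
    return pairs


if __name__ == "__main__":
    demo_corpus = [["a b c d e f g h i j k l", "too short"]]
    for left, right in build_training_pairs(demo_corpus):
        print(repr(left), "~", repr(right))
```

The resulting pairs could then serve as positive examples when training a similarity model; how the training set is actually consumed is outside what the abstract states.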

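IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06F	Electric digital data processing (computer systems based on specific computational models: G06N)
G06F17/00	Digital computing or data processing equipment or methods, specially adapted for specific functions (information retrieval, database structures or file system structures therefor: G06F 16/00)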