- Patent Title: Building training data and similarity relations for semantic space
- Application No.: US17001311
- Application Date: 2020-08-24
- Publication No.: US12020175B2
- Publication Date: 2024-06-25
- Inventor: Eugene Livshitz, Alexander Pashintsev, Boris Gorbatov
- Applicant: Evernote Corporation
- Applicant Address: US CA Redwood City
- Assignee: Bending Spoons S.p.A.
- Current Assignee: Bending Spoons S.p.A.
- Current Assignee Address: IT Milan
- Agency: Morgan, Lewis & Bockius LLP
- Main IPC: G06F17/00
- IPC: G06F17/00; G06F16/33; G06F40/205; G06N5/04; G06N20/00

Abstract:
A method and system for selecting data from a source text corpus for training a semantic data analysis system. The method includes selecting an item of the text corpus, wherein the item includes at least one section. The method includes extracting a section of the at least one section of the item. The method also includes determining a length of the section of the at least one section of the item. Based on the length of the section being greater than a predetermined amount, the method includes subdividing the section into a plurality of fragments. Each fragment of the plurality of fragments is deemed to be similar to each other. Further, the method includes building a training set based on the plurality of fragments. The training set is used to train the semantic data analysis system.
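
The steps in the abstract can be illustrated with a minimal sketch. This is not the patented implementation. It assumes that section length is measured in words, that fragments are fixed-size word windows, and that "deemed to be similar" is recorded as unordered fragment pairs. The names `split_into_fragments`, `build_training_pairs`, `min_section_len`, and `fragment_len` are hypothetical and do not appear in the patent record.

```python
from itertools import combinations


def split_into_fragments(section, fragment_len):
    """Subdivide a section into consecutive fragments of at most fragment_len words."""
    words = section.split()
    return [
        " ".join(words[i:i + fragment_len])
        for i in range(0, len(words), fragment_len)
    ]


def build_training_pairs(corpus, min_section_len=8, fragment_len=4):
    """Build a training set of fragment pairs that are deemed similar.

    corpus: list of items, where each item is a list of section strings.
    Assumption: a section is subdivided only when its word count exceeds
    min_section_len; shorter sections contribute nothing.
    """
    pairs = []
    for item in corpus:                # select an item of the text corpus
        for section in item:           # extract a section of the item
            if len(section.split()) > min_section_len:   # length check
                fragments = split_into_fragments(section, fragment_len)
                # Fragments from the same section are treated as mutually similar.
                pairs.extend(combinations(fragments, 2))
    return pairs


corpus = [["a b c d e f g h i j", "short one"]]
pairs = build_training_pairs(corpus)
```

With the toy corpus above, only the ten-word section exceeds the assumed threshold, so it is split into three fragments and contributes three similar pairs. The two-word section is skipped.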