Invention Application
- Patent Title: SYNTHETIC CRAFTING OF TRAINING AND TEST DATA FOR NAMED ENTITY RECOGNITION
- Application No.: US17248583
- Application Date: 2021-01-29
- Publication No.: US20220245346A1
- Publication Date: 2022-08-04
- Inventor: Shubham Mehrotra, Ankit Chadha
- Applicant: salesforce.com, inc.
- Applicant Address: US CA San Francisco
- Assignee: salesforce.com, inc.
- Current Assignee: salesforce.com, inc.
- Current Assignee Address: US CA San Francisco
- Main IPC: G06F40/295
- IPC: G06F40/295; G06F40/47; G06F40/30

Abstract:
A method and system for extracting and labeling Named-Entity Recognition (NER) data in a target language for use in a multilingual software module has been developed. First, a textual sentence is translated to the target language using a translation module. A named entity is then identified and extracted within the translated sentence. The named entity is identified by one of the following: exact mapping; a semantically similar translated named entity that meets a predetermined minimum threshold of similarity; or a rule-based library for the target language. Once identified, the named entity is labeled with a predetermined category and stored in a retrievable electronic database.
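
The identification order described in the abstract (exact mapping, then a similarity threshold, then a rule-based library) can be illustrated with a minimal Python sketch. This is not the patented implementation. It assumes the sentence has already been translated by some translation module, and it substitutes character-level string similarity from `difflib` for the semantic similarity the abstract describes. The names `find_entity`, `label_and_store`, `RULES`, the table `ner_labels`, the threshold of 0.8, and the example sentence are all hypothetical.

```python
import re
import sqlite3
from difflib import SequenceMatcher

# Hypothetical rule-based library for the target language: one regex per category.
RULES = {"DATE": re.compile(r"\b\d{4}-\d{2}-\d{2}\b")}


def find_entity(translated_sentence, translated_entity, category, threshold=0.8):
    """Return (entity_text, strategy) or None, trying three strategies in order."""
    # 1. Exact mapping: the translated entity appears verbatim in the sentence.
    if translated_entity in translated_sentence:
        return (translated_entity, "exact")

    # 2. Similarity above a minimum threshold. Assumption: string similarity
    #    stands in here for the semantic similarity named in the abstract.
    best_token, best_score = None, 0.0
    for token in translated_sentence.split():
        score = SequenceMatcher(None, token.lower(), translated_entity.lower()).ratio()
        if score > best_score:
            best_token, best_score = token, score
    if best_token is not None and best_score >= threshold:
        return (best_token, "similar")

    # 3. Rule-based library for the target language.
    rule = RULES.get(category)
    if rule is not None:
        found = rule.search(translated_sentence)
        if found:
            return (found.group(0), "rule")
    return None


def label_and_store(conn, sentence, entity, category):
    """Label the entity with its category and store it in a retrievable database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ner_labels (sentence TEXT, entity TEXT, category TEXT)"
    )
    conn.execute("INSERT INTO ner_labels VALUES (?, ?, ?)", (sentence, entity, category))
    conn.commit()


# Assumed output of a translation module for an English source sentence.
sentence = "Buche einen Flug nach München am 2021-01-29"
conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only

match = find_entity(sentence, "München", "CITY")
if match is not None:
    label_and_store(conn, sentence, match[0], "CITY")
```

A production system would presumably replace the `difflib` comparison with an embedding-based or otherwise semantic similarity measure and the in-memory SQLite database with persistent storage.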
Public/Granted literature
- US11853699B2 Synthetic crafting of training and test data for named entity recognition by utilizing a rule-based library; Public/Granted day: 2023-12-26