Invention Application
- Patent Title: GENERATION OF TRAINING DATA FOR MACHINE LEARNING BASED MODELS FOR NAMED ENTITY RECOGNITION FOR NATURAL LANGUAGE PROCESSING
- Application No.: US17202188
- Application Date: 2021-03-15
- Publication No.: US20220222489A1
- Publication Date: 2022-07-14
- Inventor: Jingyuan Liu, Abhishek Sharma, Suhail Sanjiv Barot, Gurkirat Singh, Mridul Gupta, Shiva Kumar Pentyala, Ankit Chadha
- Applicant: salesforce.com, inc.
- Applicant Address: US CA San Francisco
- Assignee: salesforce.com, inc.
- Current Assignee: salesforce.com, inc.
- Current Assignee Address: US CA San Francisco
- Main IPC: G06K9/62
- IPC: G06K9/62; G06F40/295; G06F40/247; G06F40/35; G06F40/284; G06N20/00

Abstract:
A system performs named entity recognition (NER) as part of natural language processing, for example in conversation engines. The system uses context information in NER, incorporating the context of a sentence during both model training and execution. It generates high-quality contextual data for training NER models, uses both labeled and unlabeled contextual data in that training, and provides the trained NER models for execution in production environments. The system uses heuristics to determine whether to use a context-based NER model or a simple NER model that does not use context information. This allows the system to use simple NER models when context is unlikely to improve prediction accuracy.
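
The heuristic choice between a context-based model and a simple model can be illustrated with a minimal Python sketch. The abstract does not disclose the actual heuristics, so everything here is an assumption made for illustration: the rule (a short utterance that follows a non-empty prior turn probably benefits from context), the `max_tokens` threshold, the function names, and the two stand-in models.

```python
from typing import Callable, List, Optional, Tuple

Entity = Tuple[str, str]  # (entity text, entity label)
NerModel = Callable[[str, Optional[str]], List[Entity]]


def context_likely_helps(utterance: str, context: Optional[str],
                         max_tokens: int = 3) -> bool:
    """Hypothetical heuristic, not taken from the application.

    Assumes context is worth using only when a prior turn exists and the
    current utterance is short (at most `max_tokens` whitespace tokens).
    """
    if context is None or not context.strip():
        return False
    return len(utterance.split()) <= max_tokens


def recognize(utterance: str, context: Optional[str],
              simple_model: NerModel, context_model: NerModel) -> List[Entity]:
    """Route to the context-based model only when the heuristic says so."""
    if context_likely_helps(utterance, context):
        return context_model(utterance, context)
    # Otherwise fall back to the simple model, which ignores context.
    return simple_model(utterance, None)


# Stand-in models for the demo; real NER models would replace these.
def toy_simple_model(utterance: str, context: Optional[str]) -> List[Entity]:
    return [(utterance, "UNKNOWN")]


def toy_context_model(utterance: str, context: Optional[str]) -> List[Entity]:
    label = "CITY" if context and "city" in context.lower() else "UNKNOWN"
    return [(utterance, label)]


if __name__ == "__main__":
    question = "Which city are you flying to?"
    print(recognize("Paris", question, toy_simple_model, toy_context_model))
    print(recognize("Paris", None, toy_simple_model, toy_context_model))
```

In this sketch the one-word reply "Paris" is routed to the context-based stand-in when the preceding question is available, and to the simple stand-in when it is not. A production system would presumably use heuristics tuned on real data rather than a fixed token count.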