- Patent title: ENHANCED NAMED ENTITY RECOGNITION (NER) USING CUSTOM-BUILT REGULAR EXPRESSION (REGEX) MATCHER AND HEURISTIC ENTITY RULER
- Application number: EP23201621.2
- Filing date: 2023-10-04
- Publication number: EP4369245A1
- Publication date: 2024-05-15
- Inventors: HOSUDURG, ANANTHA DESIK PURANAM; NAMAN, SUMIRAN; ROY, ASHIM; PATWARDHAN, NIKHIL GIRISH
- Applicant: Tata Consultancy Services Limited
- Applicant address: IN Maharashtra Nirmal Building 9th Floor Nariman Point Mumbai 400 021
- Patentee: Tata Consultancy Services Limited
- Current patentee: Tata Consultancy Services Limited
- Current patentee address: IN Maharashtra Nirmal Building 9th Floor Nariman Point Mumbai 400 021
- Representative: Goddar, Heinz J.
- Priority: IN 202221063669, 2022-11-08
- Main classification: G06F40/295
- IPC classification: G06F40/295; G06F40/205
Abstract:
Pre-trained models for Named Entity Recognition (NER) come with a small, static set of named entity (NE) classes that stays the same irrespective of the domain of the input text, so domain-specific training is required. Embodiments of the present disclosure provide a method and system for enhanced NER using a custom-built regular expression (REGEX) matcher and a heuristic entity ruler. The invention discovers the NEs in a given text through a pipeline-based approach that combines an NLP transformer model, custom-built REGEX patterns, and heuristic entity rules. The method automatically handles class resolution based on the heuristic entity ruler. It also enables a user to customize or add heuristic rules for the entity ruler, or custom REGEX patterns, as a knowledge base used to train the model with automatic relearning and unlearning. The extracted NEs are provided in a structured format for further processing or masking.
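The pipeline described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the abstract does not disclose the rule format, the transformer model, or how class resolution is decided. Everything named here (`Entity`, `REGEX_RULES`, `RULER_PHRASES`, `SOURCE_PRIORITY`, `model_entities`, `extract`) is hypothetical, the transformer stage is replaced by a stub, and the assumption that the entity ruler outranks the REGEX matcher, which outranks the model, is made only for illustration. Relearning and unlearning are not covered.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    start: int
    end: int
    label: str
    source: str  # "model", "regex", or "ruler"


# Hypothetical custom REGEX knowledge base: label -> compiled pattern.
REGEX_RULES = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PATENT_NO": re.compile(r"\bEP\d{7}[A-Z]\d\b"),
}

# Hypothetical heuristic entity ruler: exact phrase -> label.
RULER_PHRASES = {"Tata Consultancy Services": "ORG"}

# Assumed precedence used for class resolution (not taken from the source).
SOURCE_PRIORITY = {"ruler": 3, "regex": 2, "model": 1}


def model_entities(text):
    """Stand-in for a pre-trained transformer NER model; returns no entities."""
    return []


def regex_entities(text):
    return [
        Entity(m.start(), m.end(), label, "regex")
        for label, pattern in REGEX_RULES.items()
        for m in pattern.finditer(text)
    ]


def ruler_entities(text):
    return [
        Entity(m.start(), m.end(), label, "ruler")
        for phrase, label in RULER_PHRASES.items()
        for m in re.finditer(re.escape(phrase), text)
    ]


def resolve(entities):
    """Keep non-overlapping spans, preferring higher priority, then longer spans."""
    ranked = sorted(
        entities,
        key=lambda e: (-SOURCE_PRIORITY[e.source], -(e.end - e.start), e.start),
    )
    kept = []
    for e in ranked:
        if all(e.end <= k.start or e.start >= k.end for k in kept):
            kept.append(e)
    return sorted(kept, key=lambda e: e.start)


def extract(text, model=model_entities):
    """Run all three stages and return the entities as structured records."""
    candidates = model(text) + regex_entities(text) + ruler_entities(text)
    return [
        {"text": text[e.start:e.end], "label": e.label,
         "start": e.start, "end": e.end, "source": e.source}
        for e in resolve(candidates)
    ]
```

Calling `extract(text)` returns a list of dictionaries that downstream code could use for further processing or masking. Passing a different callable as `model` is where a real transformer-based recognizer would plug in.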