CODE-BASED PATTERN EXTRACTION AND APPLICATION IN A NAMED ENTITY RECOGNITION PIPELINE
Abstract:
Various systems and methods are presented for code-based pattern extraction (Code-PE) and for the application of Code-PE to a named entity recognition pipeline. Patterns can be generated from named entities that each have an assigned type. Codes are identified within the entities, which are subsequently vectorized and clustered based upon the presence of the identified codes. Patterns are identified for the respective clusters. The patterns can be applied to an untyped entity; if a pattern matches, the entity can be typed with the type assigned to that pattern. The typed entity can be used to recursively update knowledge regarding typed and untyped entities. If a pattern incorrectly types an entity, the pattern can be retrained with the updated knowledge.
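The flow described above can be illustrated with a minimal Python sketch. The abstract does not define what a "code" is or which clustering method is used, so everything concrete here is an assumption: a code is taken to be a character-class symbol (`A` for uppercase ASCII letters, `a` for lowercase, `9` for digits, punctuation kept as-is), grouping by identical code-presence vectors stands in for clustering, and the function names (`encode`, `vectorize`, `extract_patterns`, `type_entity`) are hypothetical.

```python
import re
from collections import defaultdict

# Assumed mapping from a character-class code to a regex fragment.
CODE_TO_REGEX = {"9": r"\d+", "A": r"[A-Z]+", "a": r"[a-z]+"}


def encode(entity):
    """Map each character to a class code and collapse repeated codes."""
    codes = []
    for ch in entity:
        if ch.isdigit():
            code = "9"
        elif ch.isalpha():
            code = "A" if ch.isupper() else "a"
        else:
            code = ch  # punctuation and spaces act as their own code
        if not codes or codes[-1] != code:
            codes.append(code)
    return "".join(codes)


def vectorize(code_string, vocabulary):
    """Binary vector marking which codes are present in the entity."""
    return tuple(int(code in code_string) for code in vocabulary)


def extract_patterns(typed_entities):
    """Group typed entities by code presence and emit (regex, type) pairs."""
    vocabulary = sorted({c for entity, _ in typed_entities for c in encode(entity)})
    clusters = defaultdict(list)
    for entity, etype in typed_entities:
        # Stand-in for clustering: identical presence vectors share a cluster.
        clusters[(etype, vectorize(encode(entity), vocabulary))].append(entity)
    patterns = []
    for (etype, _), members in clusters.items():
        for code_string in sorted({encode(m) for m in members}):
            regex = "".join(CODE_TO_REGEX.get(c, re.escape(c)) for c in code_string)
            patterns.append((re.compile(regex), etype))
    return patterns


def type_entity(entity, patterns):
    """Return the type of the first pattern that fully matches, else None."""
    for pattern, etype in patterns:
        if pattern.fullmatch(entity):
            return etype
    return None


# Illustrative data only; not taken from the source.
typed = [
    ("AB-1234", "part_number"),
    ("CD-5678", "part_number"),
    ("2021-03-15", "date"),
]
patterns = extract_patterns(typed)
```

With these patterns, `type_entity("XY-42", patterns)` yields `"part_number"` and `type_entity("hello", patterns)` yields `None`. In this sketch, the recursive update would amount to appending each newly typed entity to `typed` and calling `extract_patterns` again, and retraining after an incorrect typing would mean correcting the entry in `typed` before re-extracting.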