Invention Grant
- Patent Title: Applying a structured language model to information extraction
- Patent Title (Chinese): Applying a structured language model to information extraction
- Application No.: US12862001
- Application Date: 2010-08-24
- Publication No.: US08706491B2
- Publication Date: 2014-04-22
- Inventor: Ciprian Chelba, Milind Mahajan
- Applicant: Ciprian Chelba, Milind Mahajan
- Applicant Address: US WA Redmond
- Assignee: Microsoft Corporation
- Current Assignee: Microsoft Corporation
- Current Assignee Address: US WA Redmond
- Agent: Steve Wight; Carole Boelitz; Micky Minhas
- Main IPC: G06F17/27
- IPC: G06F17/27; G10L15/00; G10L15/18; G10L15/04; G10L15/05; G06F17/20; G06F17/28; G10L15/22

Abstract:
One feature of the present invention uses the parsing capabilities of a structured language model in the information extraction process. During training, the structured language model is first initialized with syntactically annotated training data. The model is then trained by generating parses on semantically annotated training data enforcing annotated constituent boundaries. The syntactic labels in the parse trees generated by the parser are then replaced with joint syntactic and semantic labels. The model is then trained by generating parses on the semantically annotated training data enforcing the semantic tags or labels found in the training data. The trained model can then be used to extract information from test data using the parses generated by the model.
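The label-replacement and extraction steps named in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `Node` class, the `:` separator, the `NONE` tag, the example tree, and the slot names `Attendee` and `Time` are all hypothetical, chosen only to show how a syntactic label might be paired with a semantic label and how semantic spans could then be read back off a parse.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

SEP = ":"          # hypothetical separator between syntactic and semantic parts
NULL_TAG = "NONE"  # hypothetical tag for constituents with no semantic label

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


@dataclass
class Node:
    label: str
    start: int
    end: int
    children: List["Node"] = field(default_factory=list)


def relabel(node: Node, semantic_spans: Dict[Span, str]) -> Node:
    """Return a copy of the tree whose labels are joint syntactic:semantic labels."""
    sem = semantic_spans.get((node.start, node.end), NULL_TAG)
    return Node(
        label=f"{node.label}{SEP}{sem}",
        start=node.start,
        end=node.end,
        children=[relabel(child, semantic_spans) for child in node.children],
    )


def extract(node: Node) -> List[Tuple[str, int, int]]:
    """Collect (semantic label, start, end) for every semantically labeled node."""
    _, _, sem = node.label.rpartition(SEP)
    found = [] if sem == NULL_TAG else [(sem, node.start, node.end)]
    for child in node.children:
        found.extend(extract(child))
    return found


# Hypothetical parse of: "schedule meeting with John at noon" (tokens 0..5)
tree = Node("S", 0, 6, [
    Node("VP", 0, 2),
    Node("PP", 2, 4, [Node("NP", 3, 4)]),
    Node("PP", 4, 6, [Node("NP", 5, 6)]),
])

# Hypothetical semantic annotation: token spans mapped to slot names
annotation = {(3, 4): "Attendee", (5, 6): "Time"}

joint = relabel(tree, annotation)
slots = extract(joint)
```

In this sketch `relabel` leaves the original tree untouched and returns a new one, so the syntactic and joint versions of a parse can be compared side by side. Training the structured language model itself, and enforcing boundaries or labels during parsing, are outside what this example covers.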
Public/Granted literature
- US20100318348A1 APPLYING A STRUCTURED LANGUAGE MODEL TO INFORMATION EXTRACTION Public/Granted day: 2010-12-16