Invention Application
WO2016077016A1 AUTOMATIC GENERATION OF N-GRAMS AND CONCEPT RELATIONS FROM LINGUISTIC INPUT DATA
Status: Pending (published)
- Patent Title: AUTOMATIC GENERATION OF N-GRAMS AND CONCEPT RELATIONS FROM LINGUISTIC INPUT DATA
- Patent Title (Chinese, translated): Automatic generation of relations between N-grams and concept input data
- Application No.: PCT/US2015/055490
- Application Date: 2015-10-14
- Publication No.: WO2016077016A1
- Publication Date: 2016-05-19
- Inventor: NAUZE, Fabrice; KISSIG, Christian; ZARAFIN, Madalina; VILLADA-MOIRON, Maria Begona; GENET, Roos
- Applicant: ORACLE INTERNATIONAL CORPORATION
- Applicant Address: 500 Oracle Parkway, M/S 5OP7 Redwood Shores, California 94065 US
- Assignee: ORACLE INTERNATIONAL CORPORATION
- Current Assignee: ORACLE INTERNATIONAL CORPORATION
- Current Assignee Address: 500 Oracle Parkway, M/S 5OP7 Redwood Shores, California 94065 US
- Agency: BERGSTROM, James et al.
- Priority: US62/077,887 (2014-11-10); US62/077,868 (2014-11-10); US14/793,677 (2015-07-07); US14/793,701 (2015-07-07)
- Main IPC: G06F17/30
- IPC: G06F17/30; G06F17/27
Abstract:
A method of automatically generating a lemma dictionary from a web resource may include extracting a plurality of tokens from text-based documents within the web resource, and generating a plurality of N-grams from the plurality of tokens. The method may additionally include receiving one or more filter definitions that identify valid N-grams, and filtering the plurality of N-grams using the one or more filter definitions to generate a lemma dictionary. The method may further include generating an ontology that comprises the lemma dictionary.
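The steps named in the abstract (token extraction, N-gram generation, filtering, and wrapping the result in an ontology) can be illustrated with a minimal Python sketch. This is not the patented implementation. The function names, the regex tokenizer, the stopword-based filter, and the dictionary-shaped ontology are all hypothetical choices made for illustration; the abstract does not specify what a filter definition looks like.

```python
import re
from collections import Counter

def extract_tokens(text):
    """Lowercase alphanumeric tokens (assumed stand-in for real tokenization)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def generate_ngrams(tokens, max_n=3):
    """Yield every contiguous N-gram of length 1..max_n as a tuple."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])

# Hypothetical filter definition: a predicate that marks an N-gram as valid.
STOPWORDS = {"a", "an", "the", "of", "to", "and", "is", "in"}

def no_stopword_edges(ngram):
    """Valid only if the N-gram neither starts nor ends with a stopword."""
    return ngram[0] not in STOPWORDS and ngram[-1] not in STOPWORDS

def build_lemma_dictionary(documents, filters, max_n=3):
    """Count N-grams across documents and keep those passing every filter."""
    counts = Counter()
    for doc in documents:
        counts.update(generate_ngrams(extract_tokens(doc), max_n))
    return {
        " ".join(ngram): count
        for ngram, count in counts.items()
        if all(f(ngram) for f in filters)
    }

def build_ontology(lemma_dictionary):
    """Assumed ontology shape: the lemma dictionary plus an empty relation list."""
    return {"lemmas": lemma_dictionary, "relations": []}

documents = ["Reset the password."]
lemmas = build_lemma_dictionary(documents, [no_stopword_edges])
ontology = build_ontology(lemmas)
print(sorted(lemmas))  # ['password', 'reset', 'reset the password']
```

In this sketch the filter discards "the", "reset the", and "the password" because each starts or ends with a stopword, while "reset the password" survives because only its interior token is a stopword. A real system would more plausibly filter on part-of-speech patterns or frequency, which the abstract leaves open.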