METHODS AND SYSTEMS FOR LANGUAGE-AGNOSTIC MACHINE LEARNING IN NATURAL LANGUAGE PROCESSING USING FEATURE EXTRACTION
    Invention application
    Status: under examination, published

    Publication No.: US20160162467A1

    Publication date: 2016-06-09

    Application No.: US14964525

    Filing date: 2015-12-09

    IPC classification: G06F17/27

    Abstract: Methods, apparatuses, and systems are presented for generating natural language models using a novel system architecture for feature extraction. A method for extracting features for natural language processing comprises: accessing one or more tokens generated from a document to be processed; receiving one or more feature types defined by a user; receiving a selection of one or more feature types from a plurality of system-defined and user-defined feature types, wherein each feature type comprises one or more rules for generating features; receiving one or more parameters for the selected feature types, wherein the one or more rules for generating features are defined at least in part by the parameters; generating features associated with the document to be processed based on the selected feature types and the received parameters; and outputting the generated features in a format common among all feature types.
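
The steps listed in the abstract can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the implementation claimed in the application: the registry, the function names (`register_feature_type`, `extract_features`), the two example feature types, and the choice of a name-to-count dictionary as the common output format are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Assumed common output format for every feature type: feature name -> count.
Features = Dict[str, int]
# A feature type is modeled as one rule taking tokens and parameters.
FeatureRule = Callable[[List[str], dict], Features]


def ngram_features(tokens: List[str], params: dict) -> Features:
    """Hypothetical system-defined feature type: word n-grams of size params['n']."""
    n = params.get("n", 1)
    out: Features = {}
    for i in range(len(tokens) - n + 1):
        key = "ngram=" + " ".join(tokens[i:i + n])
        out[key] = out.get(key, 0) + 1
    return out


# Registry holding system-defined feature types; user-defined ones are added later.
REGISTRY: Dict[str, FeatureRule] = {"ngram": ngram_features}


def register_feature_type(name: str, rule: FeatureRule) -> None:
    """Receive a feature type defined by a user."""
    REGISTRY[name] = rule


def extract_features(tokens: List[str],
                     selections: List[Tuple[str, dict]]) -> Features:
    """Apply each selected feature type with its parameters and merge the results."""
    merged: Features = {}
    for name, params in selections:
        for key, value in REGISTRY[name](tokens, params).items():
            merged[key] = merged.get(key, 0) + value
    return merged


def prefix_features(tokens: List[str], params: dict) -> Features:
    """Hypothetical user-defined feature type: token prefixes of params['length'] characters."""
    length = params.get("length", 3)
    out: Features = {}
    for token in tokens:
        key = "prefix=" + token[:length]
        out[key] = out.get(key, 0) + 1
    return out


register_feature_type("prefix", prefix_features)

# Tokens are assumed to come from an upstream tokenizer; no language-specific logic is used here.
tokens = ["the", "cat", "the"]
features = extract_features(tokens, [("ngram", {"n": 1}), ("prefix", {"length": 2})])
```

Because every rule returns the same dictionary shape, a downstream model trainer in this sketch would not need to know which feature types produced the features, which is one plausible reading of "a format common among all feature types".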
