AUTOMATED BUILDING OF A MODEL FOR BEHAVIORAL TARGETING
    1.
    Patent application
    Status: under examination (published)

    Publication number: US20110231256A1

    Publication date: 2011-09-22

    Application number: US12842934

    Filing date: 2010-07-23

    IPC classification: G06Q30/00

    CPC classification: G06Q30/02, G06Q30/0255

    Abstract: A method for generating a behavioral model for a targeted advertisement category (TAC), including: obtaining click stream data including ad-clicks and events preceding the ad-clicks and performed on web pages; assigning features having categories and keywords associated with the web pages to the events; identifying an ad-click of the ad-clicks and a subset of the events preceding the ad-click that result in the ad-click, where the subset of the events is associated with at least one feature; generating an aggregated event sequence by aggregating the ad-click and the subset of the events; selecting, in response to the at least one feature being associated with the TAC, a training data set including at least the aggregated event sequence; generating the behavioral model for the TAC by applying a learning algorithm to a portion of the training data set; and evaluating the performance of the built models and selecting a model based on the performance results.
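
    Illustrative sketch (not part of the patent record): the aggregation and training-set selection steps named in the abstract could look roughly like the Python below. The `Event` fields, the 30-minute `window` used to decide which preceding events count as leading to an ad-click, and both function names are assumptions made for illustration. The learning and evaluation steps are left out.

```python
from dataclasses import dataclass


@dataclass
class Event:
    user: str
    ts: int                              # timestamp in seconds (assumed unit)
    kind: str                            # "page_view" or "ad_click"
    features: frozenset = frozenset()    # categories/keywords of the visited page


def aggregate_sequences(events, window=1800):
    """For each ad-click, collect the same user's earlier events within `window` seconds."""
    sequences = []
    history_by_user = {}
    for ev in sorted(events, key=lambda e: (e.user, e.ts)):
        history = history_by_user.setdefault(ev.user, [])
        if ev.kind == "ad_click":
            preceding = [e for e in history if ev.ts - e.ts <= window]
            # Union of the features attached to the preceding events.
            features = frozenset().union(*(e.features for e in preceding))
            sequences.append({"click": ev, "events": preceding, "features": features})
        else:
            history.append(ev)
    return sequences


def select_training_set(sequences, tac_features):
    """Keep sequences that share at least one feature with the target ad category."""
    return [s for s in sequences if s["features"] & tac_features]
```

    A sequence with no qualifying preceding events has an empty feature set and is therefore never selected for any category.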


    System and method for modelling and profiling in multiple languages
    2.
    Granted patent
    Status: granted (in force)

    Publication number: US09026542B2

    Publication date: 2015-05-05

    Application number: US12842921

    Filing date: 2010-07-23

    IPC classification: G06F17/30

    CPC classification: G06F17/30867, G06F17/30702

    Abstract: A system and method for generating feature vectors of documents in different languages are provided. The feature vectors provide scores associated with keywords defined in a base language for use by a profiler for generating or updating a user profile. The system and method use a plurality of keyword sets comprising: a base language keyword set comprising a plurality of base language keywords each associated with a respective identifier (ID); and a second language keyword set comprising a plurality of second language keywords each corresponding in meaning to a respective one of the base language keywords and associated with the ID of the corresponding base language keyword. One of a plurality of tokenizers is selected to parse a document based on the language of the document and to generate the feature vector using the keyword set of the corresponding language.
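
    Illustrative sketch (not part of the patent record): the shared-ID idea in the abstract can be shown in a few lines of Python. The keyword sets, the two tokenizers, and the use of relative frequency as the score are all assumptions chosen for illustration. The abstract does not say how scores are computed.

```python
import re
from collections import Counter

# Hypothetical keyword sets. Each second-language keyword carries the ID of
# the base-language keyword it corresponds to in meaning.
KEYWORD_SETS = {
    "en": {"car": 1, "travel": 2, "music": 3},
    "fr": {"voiture": 1, "voyage": 2, "musique": 3},
}


def whitespace_tokenizer(text):
    """Split on whitespace after lowercasing."""
    return text.lower().split()


def letters_tokenizer(text):
    """Extract runs of Unicode letters, dropping punctuation and digits."""
    return re.findall(r"[^\W\d_]+", text.lower())


# One tokenizer per language, selected by the document's language code.
TOKENIZERS = {"en": whitespace_tokenizer, "fr": letters_tokenizer}


def feature_vector(text, lang):
    """Map a document to {keyword ID: score}; scores are relative frequencies."""
    tokens = TOKENIZERS[lang](text)
    ids = KEYWORD_SETS[lang]
    counts = Counter(ids[t] for t in tokens if t in ids)
    total = sum(counts.values())
    return {kid: n / total for kid, n in counts.items()} if total else {}
```

    Because both vectors are keyed by the base-language IDs, a profiler can merge evidence from English and French documents without translating either one.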


    SYSTEM AND METHOD FOR MODELLING AND PROFILING IN MULTIPLE LANGUAGES
    3.
    Patent application
    Status: granted (in force)

    Publication number: US20110276577A1

    Publication date: 2011-11-10

    Application number: US12842921

    Filing date: 2010-07-23

    IPC classification: G06F17/30

    CPC classification: G06F17/30867, G06F17/30702

    Abstract: A system and method for generating feature vectors of documents in different languages are provided. The feature vectors provide scores associated with keywords defined in a base language for use by a profiler for generating or updating a user profile. The system and method use a plurality of keyword sets comprising: a base language keyword set comprising a plurality of base language keywords each associated with a respective identifier (ID); and a second language keyword set comprising a plurality of second language keywords each corresponding in meaning to a respective one of the base language keywords and associated with the ID of the corresponding base language keyword. One of a plurality of tokenizers is selected to parse a document based on the language of the document and to generate the feature vector using the keyword set of the corresponding language.


    CATEGORIZATION AUTOMATION
    4.
    Patent application
    Status: granted (in force)

    Publication number: US20110258152A1

    Publication date: 2011-10-20

    Application number: US13077696

    Filing date: 2011-03-31

    IPC classification: G06F15/18

    CPC classification: G06F17/30876, G06F17/30705

    Abstract: A method for categorization using multiple categories, including obtaining multiple uniform resource locators (URLs) associated with the multiple categories, collecting multiple web pages identified by the multiple URLs, generating multiple vocabulary terms based on the multiple web pages, generating an N-gram file including the multiple vocabulary terms, generating multiple classified URLs by labeling the multiple URLs based on the multiple categories, generating multiple feature vectors by processing the classified URLs and the multiple web pages against the N-gram file, generating a categorization model by applying a machine learning algorithm to the multiple feature vectors, and loading a classifier with the categorization model and the N-gram file.
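
    Illustrative sketch (not part of the patent record): the pipeline in the abstract (vocabulary of N-grams, feature vectors, trained model, classifier) might be approximated as below. The abstract does not name the machine learning algorithm, so a nearest-centroid classifier stands in for it here. All function names, the `min_count` threshold, and the in-memory vocabulary list (in place of an N-gram file) are assumptions.

```python
from collections import Counter


def ngrams(text, n_max=2):
    """Word unigrams and bigrams of a page's text."""
    words = text.lower().split()
    return [" ".join(words[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(words) - n + 1)]


def build_vocabulary(pages, min_count=2):
    """Keep N-grams seen at least `min_count` times across all pages."""
    counts = Counter(g for text in pages.values() for g in ngrams(text))
    return sorted(g for g, c in counts.items() if c >= min_count)


def vectorize(text, vocabulary):
    """Count each vocabulary N-gram in the text."""
    counts = Counter(ngrams(text))
    return [counts[g] for g in vocabulary]


def train_centroids(labeled_urls, pages, vocabulary):
    """Average the feature vectors of each category (stand-in learning step)."""
    sums, sizes = {}, Counter()
    for url, category in labeled_urls.items():
        acc = sums.setdefault(category, [0] * len(vocabulary))
        for i, v in enumerate(vectorize(pages[url], vocabulary)):
            acc[i] += v
        sizes[category] += 1
    return {c: [v / sizes[c] for v in acc] for c, acc in sums.items()}


def classify(text, centroids, vocabulary):
    """Return the category whose centroid is closest in squared distance."""
    vec = vectorize(text, vocabulary)

    def distance(category):
        return sum((a - b) ** 2 for a, b in zip(vec, centroids[category]))

    return min(centroids, key=distance)
```

    The classifier needs both the trained centroids and the vocabulary, which mirrors the abstract's final step of loading a classifier with the model and the N-gram file.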


    Categorization automation based on category ontology
    5.
    Granted patent
    Status: granted (in force)

    Publication number: US08489523B2

    Publication date: 2013-07-16

    Application number: US13077696

    Filing date: 2011-03-31

    IPC classification: G06F15/18

    CPC classification: G06F17/30876, G06F17/30705

    Abstract: A method for categorization using multiple categories, including obtaining multiple uniform resource locators (URLs) associated with the multiple categories, collecting multiple web pages identified by the multiple URLs, generating multiple vocabulary terms based on the multiple web pages, generating an N-gram file including the multiple vocabulary terms, generating multiple classified URLs by labeling the multiple URLs based on the multiple categories, generating multiple feature vectors by processing the classified URLs and the multiple web pages against the N-gram file, generating a categorization model by applying a machine learning algorithm to the multiple feature vectors, and loading a classifier with the categorization model and the N-gram file.
