KEYWORD USAGE SCORE BASED ON FREQUENCY IMPULSE AND FREQUENCY WEIGHT
    1.
    Invention application
    Status: Lapsed

    Publication No.: US20080301117A1

    Publication date: 2008-12-04

    Application No.: US11756740

    Filing date: 2007-06-01

    IPC classification: G06F7/76 G06F17/30

    Abstract: A method and system for assessing keyword usage based on frequency of usage of the keywords during various periods is provided. A keyword usage measurement system is provided with the frequency of keywords during various periods. The measurement system then calculates a recent usage score for a keyword by combining a frequency impulse score for the keyword with a frequency weight for the keyword. The frequency impulse score for a keyword indicates whether a recent change in the frequency of the keyword has occurred. The frequency weight for a keyword indicates a recent measure of the frequency of the keyword.
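
The abstract does not give formulas for the two scores, so the following is only a minimal sketch under stated assumptions: the impulse is taken as the relative jump of the most recent period over the mean of earlier periods, the weight as a log-dampened recent count, and the two are combined by multiplication. The function names and all three definitions are hypothetical.

```python
import math

def frequency_impulse(history, recent):
    """Relative jump of the recent count over the historical mean (assumed definition)."""
    baseline = sum(history) / len(history) if history else 0.0
    # Adding 1 to the baseline keeps the ratio finite for keywords never seen before.
    return (recent - baseline) / (baseline + 1.0)

def frequency_weight(recent):
    """Log-dampened measure of the recent count (assumed definition)."""
    return math.log1p(recent)

def recent_usage_score(frequencies):
    """Combine impulse and weight by multiplication (assumed combination rule).

    `frequencies` lists per-period counts, oldest first; the last entry is the
    most recent period.
    """
    *history, recent = frequencies
    return frequency_impulse(history, recent) * frequency_weight(recent)
```

Under these assumptions a keyword with steady usage scores zero, while a frequent keyword that suddenly spikes scores higher than a rare keyword with a small bump.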

    Keyword usage score based on frequency impulse and frequency weight
    2.
    Granted invention patent
    Status: Lapsed

    Publication No.: US07644075B2

    Publication date: 2010-01-05

    Application No.: US11756740

    Filing date: 2007-06-01

    IPC classification: G06F17/30

    Abstract: A method and system for assessing keyword usage based on frequency of usage of the keywords during various periods is provided. A keyword usage measurement system is provided with the frequency of keywords during various periods. The measurement system then calculates a recent usage score for a keyword by combining a frequency impulse score for the keyword with a frequency weight for the keyword. The frequency impulse score for a keyword indicates whether a recent change in the frequency of the keyword has occurred. The frequency weight for a keyword indicates a recent measure of the frequency of the keyword.

    IDENTIFICATION OF TOPICS FOR ONLINE DISCUSSIONS BASED ON LANGUAGE PATTERNS
    3.
    Invention application
    Status: In force

    Publication No.: US20080313180A1

    Publication date: 2008-12-18

    Application No.: US11763282

    Filing date: 2007-06-14

    IPC classification: G06F17/30

    CPC classification: G06F17/30731 G06Q30/02

    Abstract: A topic identification system identifies topics of online discussions by iteratively identifying topic words or keywords of the online discussions and identifying language patterns associated with those keywords. The topic identification system starts out with an initial set of keywords and identifies language patterns that each include a keyword. The topic identification system then uses the identified language patterns to identify additional keywords of the online discussion that match the patterns. The topic identification system then again identifies language patterns using the keywords including the newly identified keywords. The topic identification system may repeat the process of identifying language patterns and keywords until a termination criterion is satisfied.
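
The iterative loop described in the abstract can be sketched as follows. This is an illustration only: it assumes a "language pattern" is simply the pair of words immediately to the left and right of a keyword, and that the termination criterion is "no new keywords found" or a round limit. Neither assumption comes from the patent text, and all function names are hypothetical.

```python
def extract_patterns(sentences, keywords):
    """Collect (left word, right word) contexts that surround a known keyword."""
    patterns = set()
    for words in sentences:
        for i in range(1, len(words) - 1):
            if words[i] in keywords:
                patterns.add((words[i - 1], words[i + 1]))
    return patterns

def match_keywords(sentences, patterns):
    """Return every word that appears inside one of the known contexts."""
    found = set()
    for words in sentences:
        for i in range(1, len(words) - 1):
            if (words[i - 1], words[i + 1]) in patterns:
                found.add(words[i])
    return found

def identify_topics(posts, seed_keywords, max_rounds=10):
    """Alternate between learning patterns and learning keywords."""
    sentences = [post.lower().split() for post in posts]
    keywords = set(seed_keywords)
    for _ in range(max_rounds):
        patterns = extract_patterns(sentences, keywords)
        new = match_keywords(sentences, patterns) - keywords
        if not new:  # assumed termination criterion: nothing new was learned
            break
        keywords |= new
    return keywords
```

Starting from a single seed keyword, each round can widen the pattern set, which in turn can surface keywords that the seed's own contexts would never have matched.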

    Identification of topics for online discussions based on language patterns
    4.
    Granted invention patent
    Status: In force

    Publication No.: US07739261B2

    Publication date: 2010-06-15

    Application No.: US11763282

    Filing date: 2007-06-14

    IPC classification: G06F17/30

    CPC classification: G06F17/30731 G06Q30/02

    Abstract: A topic identification system identifies topics of online discussions by iteratively identifying topic words or keywords of the online discussions and identifying language patterns associated with those keywords. The topic identification system starts out with an initial set of keywords and identifies language patterns that each include a keyword. The topic identification system then uses the identified language patterns to identify additional keywords of the online discussion that match the patterns. The topic identification system then again identifies language patterns using the keywords including the newly identified keywords. The topic identification system may repeat the process of identifying language patterns and keywords until a termination criterion is satisfied.

    ADVERTISEMENT APPROVAL BASED ON TRAINING DATA
    5.
    Invention application
    Status: Under examination (published)

    Publication No.: US20080300971A1

    Publication date: 2008-12-04

    Application No.: US11755523

    Filing date: 2007-05-30

    IPC classification: G06Q30/00

    Abstract: A system for determining whether to approve a target document (e.g., advertisement) is provided. The system trains a classifier using tuples of words from appropriate documents and tuples of words from inappropriate documents. To approve a target document, the system identifies tuples of words of the target document. The system then applies the classifier to the identified tuples to classify the document as being appropriate or inappropriate. If the document is classified as appropriate, the system automatically approves the document.
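
The abstract names neither the classifier nor the tuple size. As a rough sketch, the example below assumes tuples are adjacent word pairs and uses a naive Bayes model with add-one smoothing and equal class priors; the class and function names are hypothetical.

```python
import math
from collections import Counter

def word_tuples(text, size=2):
    """Split text into overlapping tuples of adjacent words (assumed tuple form)."""
    words = text.lower().split()
    return [tuple(words[i:i + size]) for i in range(len(words) - size + 1)]

class TupleClassifier:
    """Naive Bayes over word tuples; a stand-in for the unspecified classifier."""

    def __init__(self):
        self.counts = {"appropriate": Counter(), "inappropriate": Counter()}

    def train(self, documents, label):
        for doc in documents:
            self.counts[label].update(word_tuples(doc))

    def log_likelihood(self, tuples, label):
        counts = self.counts[label]
        total = sum(counts.values())
        vocab = len(set(self.counts["appropriate"]) | set(self.counts["inappropriate"]))
        # Add-one smoothing, with one extra slot reserved for unseen tuples.
        return sum(math.log((counts[t] + 1) / (total + vocab + 1)) for t in tuples)

    def classify(self, text):
        tuples = word_tuples(text)
        scores = {label: self.log_likelihood(tuples, label) for label in self.counts}
        return max(scores, key=scores.get)

def approve(classifier, ad_text):
    """Automatically approve only documents classified as appropriate."""
    return classifier.classify(ad_text) == "appropriate"
```

In this sketch a tie (for example, a text too short to yield any tuple) falls to "appropriate"; a production system would more plausibly route such cases to manual review.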

    DETERMINING RELEVANCE OF A TERM TO CONTENT USING A COMBINED MODEL
    6.
    Invention application
    Status: Under examination (published)

    Publication No.: US20080103886A1

    Publication date: 2008-05-01

    Application No.: US11553897

    Filing date: 2006-10-27

    IPC classification: G06Q30/00

    Abstract: A method and system for generating and using a combined model to identify whether a bid term is relevant to an advertisement is provided. A relevance system trains a combined model that includes an initial model and a decision tree model that are trained using features that represent relationships between bid terms and advertisements. The relevance system trains the initial model to map initial model features to a modeled relevance. The relevance system trains the decision tree model to map the decision tree features and the modeled relevance to a final relevance. The trained initial model and decision tree model represent the combined model. The relevance system then uses the combined model to determine the relevance of bid terms to advertisements.
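
The two-stage structure in the abstract (an initial model whose output becomes an extra input to a decision tree) can be illustrated as below. The abstract does not say what kind of initial model is used or how deep the tree is; this sketch assumes a logistic regression for the first stage and a depth-one tree (a decision stump) for the second. All names and feature layouts are hypothetical.

```python
import math

def predict_initial(weights, x):
    """Modeled relevance in (0, 1) from the initial-model features."""
    z = sum(w * v for w, v in zip(weights, list(x) + [1.0]))
    return 1.0 / (1.0 + math.exp(-z))

def train_initial_model(rows, labels, epochs=500, rate=0.5):
    """Fit a logistic regression by stochastic gradient steps (assumed model type)."""
    weights = [0.0] * (len(rows[0]) + 1)  # last entry is the bias
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            p = predict_initial(weights, x)
            for j, value in enumerate(list(x) + [1.0]):
                weights[j] += rate * (y - p) * value
    return weights

def train_stump(rows, labels):
    """Pick the single feature/threshold split with the fewest training errors."""
    best = None
    for j in range(len(rows[0])):
        for threshold in sorted({r[j] for r in rows}):
            for above in (0, 1):
                errors = sum((above if r[j] >= threshold else 1 - above) != y
                             for r, y in zip(rows, labels))
                if best is None or errors < best[0]:
                    best = (errors, j, threshold, above)
    return best[1:]

def predict_stump(stump, row):
    j, threshold, above = stump
    return above if row[j] >= threshold else 1 - above

def train_combined(initial_rows, tree_rows, labels):
    """Stage 1 learns modeled relevance; stage 2 sees tree features plus that score."""
    weights = train_initial_model(initial_rows, labels)
    stacked = [list(t) + [predict_initial(weights, x)]
               for x, t in zip(initial_rows, tree_rows)]
    return weights, train_stump(stacked, labels)

def relevance(model, initial_features, tree_features):
    """Final relevance (0 or 1) of a bid term to an advertisement."""
    weights, stump = model
    modeled = predict_initial(weights, initial_features)
    return predict_stump(stump, list(tree_features) + [modeled])
```

Appending the modeled relevance as the last column lets the tree either rely on the first stage or override it with its own features, which is the point of stacking the two models.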

    Scalable probabilistic latent semantic analysis
    8.
    Granted invention patent
    Status: In force

    Publication No.: US07844449B2

    Publication date: 2010-11-30

    Application No.: US11392763

    Filing date: 2006-03-30

    IPC classification: G06F17/27

    CPC classification: G06F17/2785

    Abstract: A scalable two-pass probabilistic latent semantic analysis (PLSA) methodology is disclosed that may perform more efficiently, and in some cases more accurately, than traditional PLSA, especially where large and/or sparse data sets are provided for analysis. The improved methodology can greatly reduce the storage and/or computational costs of training a PLSA model. In the first pass of the two-pass methodology, objects are clustered into groups, and PLSA is performed on the groups instead of the original individual objects. In the second pass, the conditional probability of a latent class, given an object, is obtained. This may be done by extending the training results of the first pass. During the second pass, the most likely latent classes for each object are identified.
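
A compact sketch of the two passes follows. It makes several assumptions that the abstract leaves open: the clustering of objects into groups is supplied by the caller (`group_of`), the first pass is plain EM for PLSA on summed group counts, and the second pass "extends" the first by holding P(w|z) fixed and estimating P(z|object) with a few EM steps (often called folding in). All names are hypothetical.

```python
import random

def normalize(values):
    """Scale to sum 1; fall back to a uniform distribution if everything is zero."""
    total = sum(values)
    return [v / total for v in values] if total else [1.0 / len(values)] * len(values)

def fold_in(counts, p_w_z, iterations=30):
    """Estimate P(z|row) for one count vector while holding P(w|z) fixed."""
    k = len(p_w_z)
    p_z = [1.0 / k] * k
    for _ in range(iterations):
        new = [0.0] * k
        for w, n in enumerate(counts):
            if n:
                post = normalize([p_z[z] * p_w_z[z][w] for z in range(k)])
                for z in range(k):
                    new[z] += n * post[z]
        p_z = normalize(new)
    return p_z

def plsa(matrix, k, iterations=50, seed=0):
    """Plain EM for PLSA on a row-by-word count matrix; returns P(w|z)."""
    rng = random.Random(seed)
    n_words = len(matrix[0])
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(n_words)]) for _ in range(k)]
    p_z_d = [[1.0 / k] * k for _ in matrix]
    for _ in range(iterations):
        acc_w = [[1e-12] * n_words for _ in range(k)]  # tiny floor avoids empty classes
        acc_d = [[0.0] * k for _ in matrix]
        for d, row in enumerate(matrix):
            for w, n in enumerate(row):
                if n:
                    post = normalize([p_z_d[d][z] * p_w_z[z][w] for z in range(k)])
                    for z in range(k):
                        acc_w[z][w] += n * post[z]
                        acc_d[d][z] += n * post[z]
        p_w_z = [normalize(r) for r in acc_w]
        p_z_d = [normalize(r) for r in acc_d]
    return p_w_z

def two_pass_plsa(objects, group_of, k):
    """Pass 1 trains on group totals; pass 2 assigns a latent class to each object."""
    n_words = len(objects[0])
    groups = [[0] * n_words for _ in range(max(group_of) + 1)]
    for row, g in zip(objects, group_of):
        for w, n in enumerate(row):
            groups[g][w] += n
    p_w_z = plsa(groups, k)
    p_z_d = [fold_in(row, p_w_z) for row in objects]
    classes = [max(range(k), key=p.__getitem__) for p in p_z_d]
    return classes, p_z_d
```

The saving comes from the first pass: EM runs over one row per group rather than one row per object, and the per-object work in the second pass is independent for each object.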

    Person disambiguation using name entity extraction-based clustering
    9.
    Granted invention patent
    Status: In force

    Publication No.: US07685201B2

    Publication date: 2010-03-23

    Application No.: US11796818

    Filing date: 2007-04-30

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/3071 G06F17/30696

    Abstract: Described is a technology for disambiguating data corresponding to persons that are located from search results, so that different persons having the same name can be clearly distinguished. Name entity extraction locates words (terms) that are within a certain distance of persons' names in the search results. The terms, such as location information, organization information, career information, and/or partner information, are used in disambiguating search results that correspond to different persons having the same name. In one example, each person is represented as a vector, and similarity among vectors is calculated based on weighting that corresponds to nearness of the terms to a person, and/or the types of terms. Based on the similarity data, the person vectors that represent the same person are then merged into one cluster, so that each cluster represents (to a high probability) only one distinct person.
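
The vector-and-cluster step can be sketched as follows. The abstract gives no weights, similarity measure, or merge rule, so this example assumes hypothetical per-type weights, a weight that decays with the term's distance from the name, cosine similarity, and a greedy pass that adds each vector to the first cluster containing a sufficiently similar member.

```python
import math
from collections import defaultdict

# Hypothetical weights per term type; the abstract names the types but gives no values.
TYPE_WEIGHTS = {"location": 1.0, "organization": 1.5, "career": 1.0, "partner": 2.0}

def person_vector(terms):
    """Build a weighted term vector from (term, type, distance-in-words) triples."""
    vector = defaultdict(float)
    for term, kind, distance in terms:
        # Nearer terms count more; the 1 / (1 + distance) decay is an assumption.
        vector[term] += TYPE_WEIGHTS.get(kind, 1.0) / (1 + distance)
    return dict(vector)

def cosine(a, b):
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def cluster_persons(vectors, threshold=0.3):
    """Greedy pass: add each vector to the first cluster with a similar member."""
    clusters = []
    for index, vector in enumerate(vectors):
        for cluster in clusters:
            if any(cosine(vector, vectors[other]) >= threshold for other in cluster):
                cluster.append(index)
                break
        else:
            clusters.append([index])  # no similar cluster: treat as a new person
    return clusters
```

For three search results on the same name, two mentioning the same employer and one mentioning unrelated terms, this yields two clusters, one per presumed person.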

    IDENTIFYING INFLUENTIAL PERSONS IN A SOCIAL NETWORK
    10.
    Invention application
    Status: In force

    Publication No.: US20080070209A1

    Publication date: 2008-03-20

    Application No.: US11533742

    Filing date: 2006-09-20

    IPC classification: G09B19/00

    CPC classification: G06Q30/02 G06Q10/10

    Abstract: An influential persons identification system and method for identifying a set of influential persons (or influencers) in a social network (such as an online social network). The influential persons set is generated such that, by sending a message to the set, the message will be propagated through the network with the greatest speed and coverage. A ranking of users is generated, and a pruning process is performed starting with the top-ranked user and working down the list. For each user on the list, the user is identified as an influencer and then the user and each of his friends are deleted from the social network users list. Next, the same process is performed for the second-ranked user, the third-ranked user, and so forth. The process terminates when the list of users of the social network is exhausted or the desired number of influencers on the influential person set is reached.
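
The pruning process maps directly onto a short greedy loop. In this sketch the ranking is assumed to be supplied by the caller, since the abstract does not say how it is computed, and the function and parameter names are hypothetical.

```python
def select_influencers(ranked_users, friends, limit=None):
    """Walk the ranking top-down, keeping a user and pruning that user's friends.

    `ranked_users` is ordered best first; `friends` maps a user to a set of friends.
    """
    remaining = set(ranked_users)
    influencers = []
    for user in ranked_users:
        if user not in remaining:
            continue  # already pruned as a friend of an earlier influencer
        influencers.append(user)
        remaining.discard(user)
        remaining -= friends.get(user, set())
        # Stop when the user list is exhausted or enough influencers are found.
        if not remaining or (limit is not None and len(influencers) >= limit):
            break
    return influencers
```

Removing each influencer's friends keeps the selected set spread across the network, so a message sent to the set does not reach the same neighborhood twice.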