Distributed non-negative matrix factorization
    1.
    Granted patent
    Distributed non-negative matrix factorization (status: in force)

    Publication No.: US08356086B2

    Publication date: 2013-01-15

    Application No.: US12750772

    Filing date: 2010-03-31

    CPC classification: G06F17/16

    Abstract: An architecture that scales up the non-negative matrix factorization (NMF) technique to a distributed NMF (denoted DNMF) to handle large matrices, for example, at web scale, which can include millions or billions of data points. To analyze web-scale data, DNMF is applied through parallelism on distributed computer clusters, for example, with thousands of machines. To maximize parallelism and data locality, matrices are partitioned in the short dimension. The probabilistic DNMF can employ not only Gaussian and Poisson NMF techniques, but also exponential NMF for modeling web dyadic data (e.g., dwell time of a user on browsed web pages).

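The partitioning idea in the abstract can be illustrated with a small single-process simulation. This is a minimal sketch, not the patented method: it assumes Gaussian NMF with the standard Lee-Seung multiplicative updates, and it assumes one reading of "partitioned in the short dimension", namely that the tall matrix A is split into row blocks so that each worker holds complete rows. The function names (`dnmf`, `squared_error`) and all parameters are hypothetical.

```python
import random

def matmul(X, Y):
    """Dense product of two list-of-lists matrices."""
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(M, num, den, eps):
    """Element-wise multiplicative update M * num / den."""
    return [[m * n / (d + eps) for m, n, d in zip(mr, nr, dr)]
            for mr, nr, dr in zip(M, num, den)]

def dnmf(a_blocks, k, iters=200, seed=0, eps=1e-9):
    """Factor A ~= W H, where a_blocks are the row blocks of A (one per simulated worker)."""
    rng = random.Random(seed)
    n_cols = len(a_blocks[0][0])
    # Each worker keeps the rows of W that match its rows of A.
    w_blocks = [[[rng.random() + 0.1 for _ in range(k)] for _ in blk] for blk in a_blocks]
    h = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(k)]
    for _ in range(iters):
        # "Map": every block contributes small partial products W_b^T A_b and W_b^T W_b.
        # "Reduce": the partial products are summed into k x n and k x k matrices.
        wta = [[0.0] * n_cols for _ in range(k)]
        wtw = [[0.0] * k for _ in range(k)]
        for wb, ab in zip(w_blocks, a_blocks):
            wbt = transpose(wb)
            wta = add(wta, matmul(wbt, ab))
            wtw = add(wtw, matmul(wbt, wb))
        h = scale(h, wta, matmul(wtw, h), eps)       # H <- H * (W^T A) / (W^T W H)
        ht = transpose(h)
        hht = matmul(h, ht)
        # The W update needs only the local block plus the small shared H.
        w_blocks = [scale(wb, matmul(ab, ht), matmul(wb, hht), eps)
                    for wb, ab in zip(w_blocks, a_blocks)]
    return w_blocks, h

def squared_error(a_blocks, w_blocks, h):
    """Squared Frobenius error of the factorization, summed over blocks."""
    return sum((a - p) ** 2
               for wb, ab in zip(w_blocks, a_blocks)
               for ra, rp in zip(ab, matmul(wb, h))
               for a, p in zip(ra, rp))
```

In this sketch only the small k x n and k x k partial products would cross machine boundaries; the large blocks of A and W stay where they are, which is the data-locality property the abstract emphasizes.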

    DISTRIBUTED NON-NEGATIVE MATRIX FACTORIZATION
    2.
    Patent application
    DISTRIBUTED NON-NEGATIVE MATRIX FACTORIZATION (status: in force)

    Publication No.: US20110246573A1

    Publication date: 2011-10-06

    Application No.: US12750772

    Filing date: 2010-03-31

    IPC classification: G06F15/16

    CPC classification: G06F17/16

    Abstract: An architecture that scales up the non-negative matrix factorization (NMF) technique to a distributed NMF (denoted DNMF) to handle large matrices, for example, at web scale, which can include millions or billions of data points. To analyze web-scale data, DNMF is applied through parallelism on distributed computer clusters, for example, with thousands of machines. To maximize parallelism and data locality, matrices are partitioned in the short dimension. The probabilistic DNMF can employ not only Gaussian and Poisson NMF techniques, but also exponential NMF for modeling web dyadic data (e.g., dwell time of a user on browsed web pages).


    INTERACTIVE WEB CRAWLER
    3.
    Patent application
    INTERACTIVE WEB CRAWLER (status: in force)

    Publication No.: US20120323881A1

    Publication date: 2012-12-20

    Application No.: US13163001

    Filing date: 2011-06-17

    IPC classification: G06F17/30

    Abstract: The claimed subject matter provides a system or method for web crawling hidden files. An exemplary method comprises loading a web page with a browser agent, and executing any dynamic elements hosted on the web page using the browser agent to insert pre-determined values. A list of form controls may be retrieved from the web page using the browser agent, and the controls may be analyzed using a driver component. Form control values may be sent from the driver component to the browser agent, and an event may be submitted to the web page by the browser agent or scripted content may be run to trigger operations on the web page corresponding to the form control values. A URL may be generated for various form control values using a generalizer.

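The last step of the abstract, generating a URL for each combination of form control values, can be sketched as follows. This is an illustrative assumption only: the abstract does not say how the generalizer works, so the sketch assumes a simple GET form whose controls each have a known list of candidate values. The function name `generate_urls` and the example URL are hypothetical.

```python
from itertools import product
from urllib.parse import urlencode

def generate_urls(action_url, control_values):
    """Enumerate one URL per combination of form control values.

    control_values maps a control name to its candidate values,
    e.g. the options of a <select> element (assumed GET form).
    """
    names = list(control_values)
    urls = []
    for combo in product(*(control_values[name] for name in names)):
        query = urlencode(list(zip(names, combo)))  # handles escaping of values
        urls.append(f"{action_url}?{query}")
    return urls

if __name__ == "__main__":
    for url in generate_urls("http://example.com/search",
                             {"make": ["ford", "audi"], "year": ["2010"]}):
        print(url)
```

The number of URLs grows as the product of the value counts, so a real crawler would need some way to limit or prioritize combinations; that policy is outside this sketch.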

    Click chain model
    4.
    Granted patent
    Click chain model (status: in force)

    Publication No.: US08126894B2

    Publication date: 2012-02-28

    Application No.: US12327783

    Filing date: 2008-12-03

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30864

    Abstract: Techniques are described for generating a statistical model from observed click chains. The model can be used to compute a probability that a document is relevant to a given search query. With the model, a probability of a user examining a given document in a given search result conditionally depends on: a probability that a preceding document in the given search result is examined by a user viewing the given search result; and a probability that the preceding document is clicked on by a user viewing the given search result, which conditionally depends directly on the probability that the preceding document is examined and on a probability of relevance of the preceding document.

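The chain of dependencies in the abstract (examination of a document depends on examination and click of the preceding document, and the click depends on examination and relevance) can be sketched numerically. This is a simplified illustration under stated assumptions, not the patented model: relevance values are treated as known numbers, a click is assumed to occur with probability equal to relevance once a document is examined, and the three continuation parameters `a1`, `a2`, `a3` and their default values are hypothetical.

```python
def examination_probabilities(relevances, a1=0.9, a2=0.6, a3=0.3):
    """Return (examined, clicked) marginal probabilities for each rank.

    Assumptions for this sketch:
      - the first document is always examined;
      - P(click | examined) equals the document's relevance r;
      - after no click the user continues with probability a1;
      - after a click the user continues with probability a2*(1-r) + a3*r.
    """
    p_examined = 1.0
    examined, clicked = [], []
    for r in relevances:
        examined.append(p_examined)
        clicked.append(p_examined * r)
        # Probability of moving on to the next document, given this one was examined.
        p_continue = (1 - r) * a1 + r * (a2 * (1 - r) + a3 * r)
        p_examined *= p_continue
    return examined, clicked

if __name__ == "__main__":
    exam, click = examination_probabilities([0.8, 0.4, 0.1])
    for rank, (e, c) in enumerate(zip(exam, click), start=1):
        print(f"rank {rank}: P(examined)={e:.3f}  P(clicked)={c:.3f}")
```

Fitting the relevance values and continuation parameters from observed click chains, which is what the abstract describes, is not shown here.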

    CLICK CHAIN MODEL
    5.
    Patent application
    CLICK CHAIN MODEL (status: in force)

    Publication No.: US20100138410A1

    Publication date: 2010-06-03

    Application No.: US12327783

    Filing date: 2008-12-03

    IPC classification: G06F17/30

    CPC classification: G06F17/30864

    Abstract: Techniques are described for generating a statistical model from observed click chains. The model can be used to compute a probability that a document is relevant to a given search query. With the model, a probability of a user examining a given document in a given search result conditionally depends on: a probability that a preceding document in the given search result is examined by a user viewing the given search result; and a probability that the preceding document is clicked on by a user viewing the given search result, which conditionally depends directly on the probability that the preceding document is examined and on a probability of relevance of the preceding document.


    Interactive web crawler
    6.
    Granted patent
    Interactive web crawler (status: in force)

    Publication No.: US08538949B2

    Publication date: 2013-09-17

    Application No.: US13163001

    Filing date: 2011-06-17

    IPC classification: G06F17/30

    Abstract: The claimed subject matter provides a system or method for web crawling hidden files. An exemplary method includes loading a web page with a browser agent, and executing any dynamic elements hosted on the web page using the browser agent to insert pre-determined values. A list of form controls may be retrieved from the web page using the browser agent, and the controls may be analyzed using a driver component. Form control values may be sent from the driver component to the browser agent, and an event may be submitted to the web page by the browser agent or scripted content may be run to trigger operations on the web page corresponding to the form control values. A URL may be generated for various form control values using a generalizer.


    PROBABILISTIC GRADIENT BOOSTED MACHINES
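    7.
    Patent application
    PROBABILISTIC GRADIENT BOOSTED MACHINES (status: under examination, published)

    Publication No.: US20110264609A1

    Publication date: 2011-10-27

    Application No.: US12764979

    Filing date: 2010-04-22

    Applicant: Chao Liu, Yi-Min Wang

    Inventor: Chao Liu, Yi-Min Wang

    IPC classification: G06F15/18

    CPC classification: G06N20/00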

    Abstract: Probabilistic gradient boosted machines are described herein. A probabilistic gradient boosted machine can be utilized to learn a function based at least in part upon sets of observations of a target attribute that is common across a plurality of entities and feature vectors that are representative of such entities. The sets of observations are assumed to accord to a distribution function in the exponential family. The learned function is utilized to generate values that are employed to parameterize the distribution function, such that sets of observations can be predicted for different entities.

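The idea of learning a function whose output parameterizes an exponential-family distribution can be sketched with a Poisson example. Everything specific here is an assumption made for illustration: the Poisson choice, a single scalar feature per entity, regression stumps as base learners, and pseudo-residuals equal to the log-likelihood gradient scaled by its curvature (a Newton-style step, chosen for numerical stability). The names `fit_stump` and `fit_poisson_boosting` are hypothetical.

```python
import math

def fit_stump(xs, residuals):
    """Least-squares regression stump on one scalar feature (needs >= 2 distinct xs)."""
    best = None
    for t in sorted(set(xs))[:-1]:  # candidate thresholds
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def fit_poisson_boosting(xs, observation_sets, rounds=100, lr=0.1):
    """Learn F(x) so that exp(F(x)) is the Poisson rate for an entity with feature x.

    observation_sets[i] holds the observed counts for the entity with feature xs[i];
    the overall count total must be positive.
    """
    total = sum(sum(obs) for obs in observation_sets)
    count = sum(len(obs) for obs in observation_sets)
    f0 = math.log(total / count)  # start from the pooled rate
    stumps = []

    def log_rate(x):
        return f0 + lr * sum(stump(x) for stump in stumps)

    for _ in range(rounds):
        # Pseudo-residual per entity: (sum(obs) - n * rate) / (n * rate) = mean(obs) / rate - 1.
        residuals = [sum(obs) / len(obs) / math.exp(log_rate(x)) - 1.0
                     for x, obs in zip(xs, observation_sets)]
        stumps.append(fit_stump(xs, residuals))
    return lambda x: math.exp(log_rate(x))

if __name__ == "__main__":
    rate = fit_poisson_boosting([0, 1, 2, 3], [[1, 2, 1], [2, 1], [8, 9, 10], [9, 11]])
    print([round(rate(x), 2) for x in (0, 1, 2, 3)])
```

The returned function gives a rate for any feature value, so a distribution over future observations can be formed for entities that were not in the training data, which matches the prediction goal stated in the abstract.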

    Learning a ranker to rank entities with automatically derived domain-specific preferences
    8.
    Granted patent

    Publication No.: US10235679B2

    Publication date: 2019-03-19

    Application No.: US12764983

    Filing date: 2010-04-22

    Applicant: Chao Liu, Yi-Min Wang

    Inventor: Chao Liu, Yi-Min Wang

    IPC classification: G06Q30/02

    Abstract: A system is described herein that includes a preference deriver component that receives a predefined preference rule that indicates a hierarchy pertaining to entities belonging to a domain, wherein each of the entities has attributes and values for such attributes corresponding thereto, and wherein the preference deriver component outputs preferences between various subsets of entities based at least in part upon the preference rule. The system also includes a learning component that learns a computer-implemented ranker component that is configured to rank the entities belonging to the domain, wherein the learning component learns the computer-implemented ranker based at least in part upon the preferences between the various subsets of the entities output by the preference deriver component.
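
The two components in the abstract, a preference deriver and a learning component, can be sketched in a few lines. This is a hedged illustration, not the patented system: it assumes the preference rule is an ordered list of values of one attribute (best first), that preferences are pairwise, and that the ranker is a linear scorer trained with perceptron-style updates. The names `derive_preferences` and `learn_ranker`, and the attribute names in the example, are hypothetical.

```python
from itertools import combinations

def derive_preferences(entities, attribute, hierarchy):
    """Turn a hierarchy rule (attribute values, best first) into (better, worse) pairs."""
    rank = {value: i for i, value in enumerate(hierarchy)}
    prefs = []
    for a, b in combinations(entities, 2):
        ra, rb = rank[a[attribute]], rank[b[attribute]]
        if ra < rb:
            prefs.append((a, b))
        elif rb < ra:
            prefs.append((b, a))
        # Entities at the same level of the hierarchy yield no preference.
    return prefs

def learn_ranker(prefs, feature_names, epochs=20, lr=0.1):
    """Learn a linear scoring function from pairwise preferences."""
    weights = {f: 0.0 for f in feature_names}

    def score(entity):
        return sum(weights[f] * entity[f] for f in feature_names)

    for _ in range(epochs):
        for better, worse in prefs:
            if score(better) - score(worse) <= 0:  # pair is misordered or tied
                for f in feature_names:
                    weights[f] += lr * (better[f] - worse[f])
    return score

if __name__ == "__main__":
    entities = [
        {"name": "B", "tier": "silver", "ctr": 0.5},
        {"name": "C", "tier": "bronze", "ctr": 0.1},
        {"name": "A", "tier": "gold", "ctr": 0.9},
    ]
    prefs = derive_preferences(entities, "tier", ["gold", "silver", "bronze"])
    score = learn_ranker(prefs, ["ctr"])
    print([e["name"] for e in sorted(entities, key=score, reverse=True)])
```

Because the ranker scores entities from their features rather than from the hierarchy attribute itself, it can rank entities the rule never covered, provided the features carry the same signal.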

    LEARNING A RANKER TO RANK ENTITIES WITH AUTOMATICALLY DERIVED DOMAIN-SPECIFIC PREFERENCES
    9.
    Patent application
    LEARNING A RANKER TO RANK ENTITIES WITH AUTOMATICALLY DERIVED DOMAIN-SPECIFIC PREFERENCES (status: under examination, published)

    Publication No.: US20110264518A1

    Publication date: 2011-10-27

    Application No.: US12764983

    Filing date: 2010-04-22

    Applicant: Chao Liu, Yi-Min Wang

    Inventor: Chao Liu, Yi-Min Wang

    IPC classification: G06F15/18 G06Q30/00 G06F17/30

    CPC classification: G06Q30/02 G06Q30/0251

    Abstract: A system is described herein that includes a preference deriver component that receives a predefined preference rule that indicates a hierarchy pertaining to entities belonging to a domain, wherein each of the entities has attributes and values for such attributes corresponding thereto, and wherein the preference deriver component outputs preferences between various subsets of entities based at least in part upon the preference rule. The system also includes a learning component that learns a computer-implemented ranker component that is configured to rank the entities belonging to the domain, wherein the learning component learns the computer-implemented ranker based at least in part upon the preferences between the various subsets of the entities output by the preference deriver component.


    Method, Electronic Program Menu and Processing Device for Displaying Television Program Related Information
    10.
    Patent application
    Method, Electronic Program Menu and Processing Device for Displaying Television Program Related Information (status: under examination, published)

    Publication No.: US20140351857A1

    Publication date: 2014-11-27

    Application No.: US13813159

    Filing date: 2010-12-03

    Abstract: The present invention relates to a method for displaying information associated with a television program, which includes: fetching a plurality of sequentially arranged program listings and corresponding program notes; generating an electronic program guide according to the program listings and corresponding program notes; and displaying the electronic program guide. The electronic program guide includes a program listing, a program note associated with the program listing, and at least one icon that the user can select to display the previous or the next program listing in the electronic program guide. The present invention further provides an electronic program guide and a processing apparatus for generating the electronic program guide. The electronic program guide can display information associated with a TV program in a more intuitive manner.

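The guide structure described in the abstract (one listing with its note, plus icons for moving to the previous or next listing) can be sketched as a small data structure. This is an illustration under assumptions, not the claimed apparatus: the class name `ProgramGuide`, the text rendering, and the bracketed icon symbols are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ProgramGuide:
    """Sequentially arranged (listing, note) pairs with a cursor on the one displayed."""
    listings: list
    index: int = 0

    def render(self):
        listing, note = self.listings[self.index]
        # An icon is shown only when there is a listing to move to in that direction.
        prev_icon = "[<]" if self.index > 0 else "[ ]"
        next_icon = "[>]" if self.index < len(self.listings) - 1 else "[ ]"
        return f"{prev_icon} {listing} | {note} {next_icon}"

    def next(self):
        self.index = min(self.index + 1, len(self.listings) - 1)

    def previous(self):
        self.index = max(self.index - 1, 0)

if __name__ == "__main__":
    guide = ProgramGuide([("18:00 News", "Evening headlines"),
                          ("19:00 Film", "Feature presentation")])
    print(guide.render())
    guide.next()
    print(guide.render())
```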