Patent search: ap:("Wen-tau Yih" OR "Joshua T. Goodman" OR "Lucretia H. Vanderwende" OR "Hisami Suzuki") AND inv:"Joshua T. Goodman", page 1

1.

Patent application
Document summarization by maximizing informative content words (in force)

Publication No.: US20080109425A1

Publication date: 2008-05-08

Application No.: US11591937

Filing date: 2006-11-02

Applicants: Wen-tau Yih, Joshua T. Goodman, Lucretia H. Vanderwende, Hisami Suzuki

Inventors: Wen-tau Yih, Joshua T. Goodman, Lucretia H. Vanderwende, Hisami Suzuki

IPC classification: G06F17/30, G06F15/18, G06F9/44

CPC classification: G06F17/30719

Abstract: Document summarization is performed by scoring individual words in sentences in a document or document cluster. Sentences from the document or document cluster are selected to form a summary based on the scores of the words contained in those sentences.
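
The word-scoring idea in this abstract can be illustrated with a small sketch. The abstract does not say how words are scored or how sentences are chosen, so everything below is an assumption made for illustration: words are scored by relative frequency, a sentence is scored by the average of its word scores, and the stopword list and function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stopword list; the abstract does not define one.
STOPWORDS = {"the", "a", "an", "of", "in", "is", "and", "to", "on", "it"}

def content_words(sentence):
    """Lowercase the sentence and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]

def word_scores(sentences):
    """Score each word by its relative frequency in the input (an assumed rule)."""
    counts = Counter(w for s in sentences for w in content_words(s))
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def summarize(sentences, max_sentences=2):
    """Select the sentences with the highest average word score, in document order."""
    scores = word_scores(sentences)

    def sentence_score(s):
        words = content_words(s)
        return sum(scores[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: sentence_score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:max_sentences])]

docs = [
    "Spam filters score incoming email.",
    "The weather was pleasant on Tuesday.",
    "Email spam filters learn from labeled email.",
]
summary = summarize(docs, max_sentences=2)
```

Here `summary` keeps the two sentences about spam filtering, because their words recur across the input and therefore score higher than the words of the unrelated sentence.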

2.

Granted patent
Document summarization by maximizing informative content words (in force)

Publication No.: US07702680B2

Publication date: 2010-04-20

Application No.: US11591937

Filing date: 2006-11-02

Applicants: Wen-tau Yih, Joshua T. Goodman, Lucretia H. Vanderwende, Hisami Suzuki

Inventors: Wen-tau Yih, Joshua T. Goodman, Lucretia H. Vanderwende, Hisami Suzuki

IPC classification: G06F7/00, G06F17/30

CPC classification: G06F17/30719

Abstract: Document summarization is performed by scoring individual words in sentences in a document or document cluster. Sentences from the document or document cluster are selected to form a summary based on the scores of the words contained in those sentences.

3.

Granted patent
Classification using a cascade approach (lapsed)

Publication No.: US07693806B2

Publication date: 2010-04-06

Application No.: US11766434

Filing date: 2007-06-21

Applicants: Wen-tau Yih, Joshua T. Goodman, Geoffrey J. Hulten

Inventors: Wen-tau Yih, Joshua T. Goodman, Geoffrey J. Hulten

IPC classification: G06F15/18, G06N3/08

CPC classification: H04L51/12, G06K9/6256, G06Q10/06, G06Q10/10

Abstract: A system and method that facilitates and effectuates optimizing a classifier for greater performance in a specific region of classification that is of interest, such as a low false positive rate or a low false negative rate. A two-stage classification model can be trained and employed, where the first stage classification is optimized over the entire classification region and the second stage classifier is optimized for the specific region of interest. During training, the entire set of training data is employed by a first stage classifier. Only data that is classified by the first stage classifier or by cross validation to fall within a region of interest is used to train the second stage classifier. During classification, data that is classified within the region of interest by the first classification is given the first stage classifier's classification value; otherwise, the classification value for the instance of data from the second stage classifier is used.
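
A two-stage cascade of this kind might look roughly like the sketch below. It rests on several assumptions that are not in the abstract: the toy one-dimensional trainer `train_centroid`, the region test `in_region`, and above all the routing rule. The sketch re-scores with the second stage any instance the first stage places inside the region of interest, which is the reading suggested by the training description; the abstract's final sentence can be read the other way round, so the routing should be treated as an assumption.

```python
def train_centroid(xs, ys):
    """Toy trainer (hypothetical): score is x minus the midpoint of the two class means."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    midpoint = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: x - midpoint

def train_cascade(train_fn, xs, ys, in_region):
    """Train stage one on all data, stage two only on data stage one puts in the region."""
    stage1 = train_fn(xs, ys)
    subset = [(x, y) for x, y in zip(xs, ys) if in_region(stage1(x))]
    stage2 = train_fn([x for x, _ in subset], [y for _, y in subset])
    return stage1, stage2

def classify(stage1, stage2, in_region, x):
    """Assumed routing: use the stage-two score inside the region, stage-one score elsewhere."""
    first = stage1(x)
    return stage2(x) if in_region(first) else first

# Region of interest (assumed): instances the first stage scores as positive.
in_region = lambda score: score > 0

xs = [0.0, 1.0, 2.0, 5.0, 6.0, 7.0, 8.0, 9.0]
ys = [0, 0, 0, 0, 1, 0, 1, 1]
stage1, stage2 = train_cascade(train_centroid, xs, ys, in_region)
```

With this data the first stage scores `6.0` as positive, while the second stage, trained only on the instances near the decision boundary, scores it as negative. Instances far from the region, such as `2.0`, keep their first-stage score.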

4.

Granted patent
Web document keyword and phrase extraction (in force)

Publication No.: US08135728B2

Publication date: 2012-03-13

Application No.: US11619230

Filing date: 2007-01-03

Applicants: Wen-tau Yih, Joshua T. Goodman, Vitor Rocha de Carvalho

Inventors: Wen-tau Yih, Joshua T. Goodman, Vitor Rocha de Carvalho

IPC classification: G06F7/00, G06F17/30, G06F13/14

CPC classification: G06F17/241, G06F17/27, G06F17/30, G06F17/30616

Abstract: Extraction analysis techniques biased, in part, by query frequency information from a query log file and/or search engine cache are employed along with machine learning processes to determine candidate keywords and/or phrases of web documents. Web-oriented features associated with the candidate keywords and/or phrases are also utilized to analyze the web documents. A keyword and/or phrase extraction mechanism can be utilized to score keywords and/or phrases in a web document and estimate a likelihood that the keywords and/or phrases are relevant, for example, in an advertising system and the like.
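
As a rough illustration of scoring candidate phrases with a query-frequency bias, the sketch below combines three features in a logistic function. The feature set, the hand-set weights, the bias, and the sample query counts are all hypothetical; the abstract only says that query frequency information, web-oriented features, and machine learning are used, and a real system would learn the weights from labeled data.

```python
import math
import re
from collections import Counter

def ngram_counts(text, max_len=2):
    """Count every word n-gram of up to max_len words in the text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts

def score_keywords(title, body, query_counts, weights, bias=-3.0):
    """Return (phrase, estimated relevance) pairs, highest relevance first."""
    body_counts = ngram_counts(body)
    title_counts = ngram_counts(title)
    scored = {}
    for phrase, tf in body_counts.items():
        features = {
            "tf": math.log1p(tf),                                    # frequency in the page
            "in_title": 1.0 if phrase in title_counts else 0.0,      # a web-oriented feature
            "query_freq": math.log1p(query_counts.get(phrase, 0)),   # query-log bias
        }
        z = bias + sum(weights[name] * value for name, value in features.items())
        scored[phrase] = 1.0 / (1.0 + math.exp(-z))                  # logistic likelihood
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical query-log counts and weights, for illustration only.
query_counts = {"cheap flights": 5000, "seattle": 1200, "hotel deals": 800}
weights = {"tf": 1.0, "in_title": 1.0, "query_freq": 0.5}
ranked = score_keywords(
    "Cheap flights to Seattle",
    "Find cheap flights and hotel deals. Cheap flights to Seattle leave daily.",
    query_counts,
    weights,
)
```

Phrases that users actually search for rise to the top of `ranked`, while frequent but unsearched fragments such as "flights to" stay low even though they appear in the title.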

5.

Granted patent
Using IP address and domain for email spam filtering (in force)

Publication No.: US07689652B2

Publication date: 2010-03-30

Application No.: US11031672

Filing date: 2005-01-07

Applicants: Manav Mishra, Elissa E. S. Murphy, Geoffrey J. Hulten, Joshua T. Goodman, Wen-tau Yih

Inventors: Manav Mishra, Elissa E. S. Murphy, Geoffrey J. Hulten, Joshua T. Goodman, Wen-tau Yih

IPC classification: G06F15/16, G06F15/173

CPC classification: H04L51/28, H04L29/1215, H04L51/12, H04L61/1564, H04L63/0227, H04L63/1441

Abstract: Email spam filtering is performed based on a combination of IP address and domain. When an email message is received, an IP address and a domain associated with the email message are determined. A cross product of the IP address (or portions of the IP address) and the domain (or portions of the domain) is calculated. If the email message is known to be either spam or non-spam, then a spam score based on the known spam status is stored in association with each (IP address, domain) pair element of the cross product. If the spam status of the email message is not known, then the (IP address, domain) pair elements of the cross product are used to look up previously determined spam scores. A combination of the previously determined spam scores is used to determine whether or not to treat the received email message as spam.
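
The cross-product bookkeeping described in this abstract can be sketched as follows. The choice of IP portions (full address plus two shorter prefixes), the choice of domain portions (the domain and its parent domains), the count-based score, and the averaging rule are all assumptions; the abstract does not specify them. The class and function names are hypothetical, and the addresses come from ranges reserved for documentation.

```python
from collections import defaultdict
from itertools import product

def ip_portions(ip):
    """Full IPv4 address plus its first three and first two octets (assumed portions)."""
    octets = ip.split(".")
    return [".".join(octets[:n]) for n in (4, 3, 2)]

def domain_portions(domain):
    """The domain and each parent domain, excluding the bare top-level label."""
    labels = domain.lower().split(".")
    return [".".join(labels[i:]) for i in range(len(labels) - 1)]

class PairScores:
    """Stores spam statistics for every (IP portion, domain portion) pair."""

    def __init__(self):
        self.counts = defaultdict(lambda: [0, 0])  # pair -> [spam count, total count]

    def pairs(self, ip, domain):
        return list(product(ip_portions(ip), domain_portions(domain)))

    def record(self, ip, domain, is_spam):
        """Update every pair of the cross product for a message of known status."""
        for pair in self.pairs(ip, domain):
            self.counts[pair][0] += int(is_spam)
            self.counts[pair][1] += 1

    def spam_score(self, ip, domain):
        """Average the stored spam fractions of all known pairs; None if none are known."""
        known = [self.counts[p] for p in self.pairs(ip, domain) if p in self.counts]
        if not known:
            return None
        return sum(spam / total for spam, total in known) / len(known)

scores = PairScores()
scores.record("192.0.2.10", "mail.example.com", True)
scores.record("192.0.2.11", "mail.example.com", True)
scores.record("198.51.100.7", "news.example.org", False)
```

Because partial addresses and parent domains are stored too, a message from the previously unseen address `192.0.2.99` and the domain `example.com` still matches pairs recorded for its neighbours.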

6.

Patent application
Classification using a cascade approach (lapsed)

Publication No.: US20080319932A1

Publication date: 2008-12-25

Application No.: US11766434

Filing date: 2007-06-21

Applicants: Wen-tau Yih, Joshua T. Goodman, Geoffrey J. Hulten

Inventors: Wen-tau Yih, Joshua T. Goodman, Geoffrey J. Hulten

IPC classification: G06F15/18

CPC classification: H04L51/12, G06K9/6256, G06Q10/06, G06Q10/10

Abstract: A system and method that facilitates and effectuates optimizing a classifier for greater performance in a specific region of classification that is of interest, such as a low false positive rate or a low false negative rate. A two-stage classification model can be trained and employed, where the first stage classification is optimized over the entire classification region and the second stage classifier is optimized for the specific region of interest. During training, the entire set of training data is employed by a first stage classifier. Only data that is classified by the first stage classifier or by cross validation to fall within a region of interest is used to train the second stage classifier. During classification, data that is classified within the region of interest by the first classification is given the first stage classifier's classification value; otherwise, the classification value for the instance of data from the second stage classifier is used.

7.

Granted patent
Training filters for detecting spasm [sic] based on IP addresses and text-related features (in force)

Publication No.: US07464264B2

Publication date: 2008-12-09

Application No.: US10809163

Filing date: 2004-03-25

Applicants: Joshua T. Goodman, Robert L. Rounthwaite, Geoffrey J. Hulten, Wen-tau Yih

Inventors: Joshua T. Goodman, Robert L. Rounthwaite, Geoffrey J. Hulten, Wen-tau Yih

IPC classification: H04L9/00, G06F21/00

CPC classification: H04L51/12, G06Q10/107

Abstract: The subject invention provides for an intelligent quarantining system and method that facilitates detecting and preventing spam. In particular, the invention employs a machine learning filter specifically trained using origination features such as an IP address as well as destination features such as a URL. Moreover, the system and method involve training a plurality of filters using specific feature data for each filter. The filters are trained independently of each other, thus one feature may not unduly influence another feature in determining whether a message is spam. Because multiple filters are trained and available to scan messages either individually or in combination (at least two filters), the filtering or spam detection process can be generalized to new messages having slightly modified features (e.g., IP address). The invention also involves locating the appropriate IP addresses or URLs in a message as well as guiding filters to weigh origination or destination features more than text-based features.
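
Independently trained filters that are later combined could be sketched as below. The smoothed frequency estimate, the feature extractors, the 2:1 weighting in favour of the IP filter, and all names are assumptions chosen for illustration; the abstract names neither a learning algorithm nor a combination rule.

```python
from collections import Counter

class FeatureFilter:
    """One filter trained on a single feature family, independently of other filters."""

    def __init__(self, extract):
        self.extract = extract   # function mapping a message to a list of features
        self.spam = Counter()
        self.ham = Counter()

    def train(self, messages):
        for msg, is_spam in messages:
            (self.spam if is_spam else self.ham).update(self.extract(msg))

    def score(self, msg):
        """Average smoothed spam fraction of the message's features (0.5 means unknown)."""
        feats = self.extract(msg)
        if not feats:
            return 0.5
        probs = [(self.spam[f] + 1) / (self.spam[f] + self.ham[f] + 2) for f in feats]
        return sum(probs) / len(probs)

def ip_features(msg):
    """Origination features: the full IPv4 address and its first three octets."""
    octets = msg["ip"].split(".")
    return [".".join(octets[:n]) for n in (4, 3)]

def text_features(msg):
    """Text features: lowercase whitespace-separated tokens."""
    return msg["text"].lower().split()

def combined_score(filters, weights, msg):
    """Weighted average of the individual filter scores."""
    return sum(w * f.score(msg) for f, w in zip(filters, weights)) / sum(weights)

training = [
    ({"ip": "192.0.2.10", "text": "win cash now"}, True),
    ({"ip": "192.0.2.11", "text": "cheap pills now"}, True),
    ({"ip": "198.51.100.7", "text": "meeting notes attached"}, False),
]
ip_filter = FeatureFilter(ip_features)
text_filter = FeatureFilter(text_features)
for f in (ip_filter, text_filter):
    f.train(training)

# A new sender address in the same range as known spam, with innocuous text.
new_msg = {"ip": "192.0.2.55", "text": "quarterly meeting notes"}
verdict = combined_score([ip_filter, text_filter], [2.0, 1.0], new_msg)
```

The text filter alone rates `new_msg` as probably legitimate, but the IP filter recognises the address range, and the heavier weight on origination features pushes `verdict` above 0.5.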

8.

Granted patent
Search engine that identifies and uses social networks in communications, retrieval, and electronic commerce (in force)

Publication No.: US09396269B2

Publication date: 2016-07-19

Application No.: US11427288

Filing date: 2006-06-28

Applicants: Christopher A. Meek, Eric J. Horvitz, Joshua T. Goodman, Gary W. Flake, Oliver Hurst-Hiller, Anoop Gupta, Ramez Naam, Kenneth A. Moss, William H. Gates, III, John C. Platt, Trenholme J. Griffin, Bradly A. Brunell

Inventors: Christopher A. Meek, Eric J. Horvitz, Joshua T. Goodman, Gary W. Flake, Oliver Hurst-Hiller, Anoop Gupta, Ramez Naam, Kenneth A. Moss, William H. Gates, III, John C. Platt, Trenholme J. Griffin, Bradly A. Brunell

IPC classification: G06F17/30

CPC classification: G06F17/30867

Abstract: Architecture that monitors interaction data (e.g., search queries, query results and click-through rates), and provides users with links to other users that fall into similar categories with respect to the foregoing monitored activities (e.g., providing links to individuals and groups that share common interests and/or profiles). A search engine can be interactively coupled with one or more social networks and maps individuals and/or groups within respective social networks to subsets of categories associated with searches. A database stores mapped information which can be continuously updated and reorganized as links within the system mapping become stronger or weaker. The architecture can comprise a social network system that includes a database for mapping search-related information to an entity of a social network, and a search component for processing a search query for search results and returning a link to an entity of a social network based on the search query.

9.

Granted patent
Content presentation based on user preferences (in force)

Publication No.: US07997485B2

Publication date: 2011-08-16

Application No.: US11427748

Filing date: 2006-06-29

Applicants: Gary W. Flake, Eric J. Horvitz, Joshua T. Goodman, Eric D. Brill, Bradly A. Brunell, Susan T. Dumais, Alexander G. Gounares, Trenholme J. Griffin, Oliver Hurst-Hiller, Raymond E. Ozzie

Inventors: Gary W. Flake, Eric J. Horvitz, Joshua T. Goodman, Eric D. Brill, Bradly A. Brunell, Susan T. Dumais, Alexander G. Gounares, Trenholme J. Griffin, Oliver Hurst-Hiller, Raymond E. Ozzie

IPC classification: G06K15/00

CPC classification: G06F21/6245, G06Q30/02, G06Q30/0273, H04L63/0227, H04L63/102, H04L63/1483, H04L67/306, H04W12/12

Abstract: Architecture is provided that facilitates user-controlled access to user profile information. A user is allowed to selectively expose (or mask) portions of his/her profile to third parties. Additionally, advertisers and/or content providers can offer incentives or enticements, in response to the acceptance of which a user exposes larger portions of his/her profile. The architecture comprises a system that facilitates profile management utilizing a profile component that facilitates creation and storage of an electronic profile of a user, and a control component under control of the user for controlling access to the profile. Machine learning and reasoning are provided to make inferences and automate aspects thereof.

10.

Granted patent
Storage abuse prevention (in force)

Publication No.: US07848501B2

Publication date: 2010-12-07

Application No.: US11042245

Filing date: 2005-01-25

Applicants: Joshua T. Goodman, Carl M. Kadie, Christopher A. Meek

Inventors: Joshua T. Goodman, Carl M. Kadie, Christopher A. Meek

IPC classification: H04M3/42, G06F15/16

CPC classification: G06F21/606, G06F21/316, G06F21/552, G06F2221/2135, H04L51/00, H04L63/1408

Abstract: The subject invention provides a unique system and method that facilitates mitigation of storage abuse in connection with free storage provided by messaging service providers such as email, instant messaging, chat, blogging, and/or web hosting service providers. The system and method involve measuring the outbound volume of stored data. When the volume satisfies a threshold, a cost can be imposed on the account to mitigate the suspicious or abusive activity. Other factors can be considered as well that can modify the cost imposed on the account, such as by increasing the cost. Machine learning can be employed as well to predict a level or degree of suspicion. The various factors or the text of the messages can be used as input for the machine learning system.
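
The threshold-and-cost rule in this abstract might be expressed along these lines. The threshold value, the unit of cost, and the two extra factors (account age and number of distinct downloaders) are hypothetical; the abstract only says that the outbound volume is compared with a threshold and that other factors can increase the cost.

```python
from dataclasses import dataclass

@dataclass
class Account:
    outbound_bytes: int        # measured outbound volume of stored data
    account_age_days: int      # hypothetical extra factor
    distinct_downloaders: int  # hypothetical extra factor

def imposed_cost(account, threshold_bytes=1_000_000_000, base_cost=1.0):
    """Return an abstract cost: zero below the threshold, raised by suspicious factors."""
    if account.outbound_bytes < threshold_bytes:
        return 0.0
    cost = base_cost
    if account.account_age_days < 7:        # very new accounts are treated as riskier
        cost *= 2
    if account.distinct_downloaders > 100:  # many downloaders suggests file distribution
        cost *= 2
    return cost

quiet = Account(outbound_bytes=10_000, account_age_days=1, distinct_downloaders=1000)
busy = Account(outbound_bytes=2_000_000_000, account_age_days=365, distinct_downloaders=3)
suspect = Account(outbound_bytes=2_000_000_000, account_age_days=2, distinct_downloaders=500)
```

Under these assumed factors `quiet` incurs no cost, `busy` incurs the base cost, and `suspect` incurs four times the base cost. A learned model could replace the fixed multipliers, taking the same factors as input.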
