-
Publication number: US09424346B2
Publication date: 2016-08-23
Application number: US13453901
Application date: 2012-04-23
Applicant: Abdur R. Chowdhury , Steven Michael Beitzel , David Dolan Lewis , Aleksander Kolcz
Inventor: Abdur R. Chowdhury , Steven Michael Beitzel , David Dolan Lewis , Aleksander Kolcz
IPC: G06F17/30
CPC classification number: G06F17/30707 , G06F17/30657 , G06F17/30864
Abstract: A query phrase may be automatically classified to one or more topics of interest (e.g., categories) to assist in routing the query phrase to one or more appropriate backend databases. A selectional preference query classification technique may be used to classify the query phrase based on a comparison between the query phrase and patterns of query phrases. Additionally, or alternatively, a combination of query classification techniques may be used to classify the query phrase. Topical classification of a query phrase also may be used to assist a search system in delivering auxiliary information to a user who entered the query phrase. Advertisements, for instance, may be tailored based on classification rather than query keywords.
-
Publication number: US20130173518A1
Publication date: 2013-07-04
Application number: US13621034
Application date: 2012-09-15
Applicant: Joshua Alspector , Aleksander Kolcz , Abdur R. Chowdhury
Inventor: Joshua Alspector , Aleksander Kolcz , Abdur R. Chowdhury
IPC: G06N5/02
CPC classification number: G06F16/35 , G06F16/1748 , G06F16/353 , G06N5/02 , H04L51/12
Abstract: A classification system includes a signature-based duplicate detector and an inductive classifier that share attribute information. To perform the duplicate detection and the classification, the duplicate detector and inductive classifier are first initialized by generating a lexicon of attributes for the duplicate detector and a classification model for the classifier. To develop a classification model, a training set of documents of known class are used by the classifier to determine the attributes of the documents that are most useful in classifying an unknown document. The model is developed from these attributes. Attribute information containing the attributes determined by the classifier is then passed to the duplicate detector and the duplicate detector uses the attribute information to generate the lexicon of attributes.
-
Publication number: US20130007026A1
Publication date: 2013-01-03
Application number: US13612840
Application date: 2012-09-13
Applicant: Joshua ALSPECTOR , Aleksander KOLCZ , Abdur R. CHOWDHURY
Inventor: Joshua ALSPECTOR , Aleksander KOLCZ , Abdur R. CHOWDHURY
IPC: G06F7/00
CPC classification number: G06F17/30156 , G06F17/30011 , Y10S707/99943 , Y10S707/99945
Abstract: In a single-signature duplicate document system, a secondary set of attributes is used in addition to a primary set of attributes so as to improve the precision of the system. When the projection of a document onto the primary set of attributes is below a threshold, then a secondary set of attributes is used to supplement the primary lexicon so that the projection is above the threshold.
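Illustrative sketch (not taken from the patent): the fallback described in this abstract could look roughly like the Python below. The abstract does not say how the projection is measured or how the signature is computed, so the sketch assumes the projection is the set of document terms found in a lexicon, its size is compared with the threshold, and SHA-1 over the sorted terms serves as the signature. The names `project_onto` and `single_signature` are hypothetical.

```python
import hashlib


def project_onto(tokens, lexicon):
    """Return the document terms that also appear in the lexicon."""
    return {t for t in tokens if t in lexicon}


def single_signature(tokens, primary, secondary, threshold):
    """Build one signature, supplementing the primary lexicon when needed.

    Assumption: "projection below a threshold" means fewer than
    `threshold` distinct terms survive the projection.
    """
    projection = project_onto(tokens, primary)
    if len(projection) < threshold:
        # Too few primary terms: add whatever the secondary lexicon matches.
        projection |= project_onto(tokens, secondary)
    if len(projection) < threshold:
        # Still too sparse to produce a reliable signature.
        return None
    joined = " ".join(sorted(projection))
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()
```

Under these assumptions, two documents receive the same signature only when their supplemented projections are identical, which is why a larger projection tends to reduce false matches.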
-
Publication number: US20120197913A1
Publication date: 2012-08-02
Application number: US13363806
Application date: 2012-02-01
Applicant: Ophir FRIEDER , Abdur R. Chowdhury
Inventor: Ophir FRIEDER , Abdur R. Chowdhury
IPC: G06F17/30
CPC classification number: G06F17/30687 , Y10S707/99948
Abstract: A document is compared to the documents in a document collection using a hash algorithm and collection statistics to detect if the document is similar to any of the documents in the document collection.
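Illustrative sketch (not taken from the patent): one way to combine a hash algorithm with collection statistics is shown below in Python. The abstract names neither the statistic nor the hash, so this sketch assumes inverse document frequency (IDF) as the collection statistic, keeps only terms whose IDF meets a cutoff, and hashes the sorted survivors with SHA-1. The function names and the `min_idf` parameter are hypothetical.

```python
import hashlib
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def idf_table(collection):
    """Collection statistic: inverse document frequency of each term."""
    n = len(collection)
    df = Counter()
    for doc in collection:
        df.update(set(tokenize(doc)))
    return {term: math.log(n / count) for term, count in df.items()}


def signature(text, idf, min_idf):
    """Hash the sorted set of terms whose IDF is at least min_idf."""
    kept = sorted({t for t in tokenize(text) if idf.get(t, 0.0) >= min_idf})
    if not kept:
        return None
    return hashlib.sha1(" ".join(kept).encode("utf-8")).hexdigest()


def find_similar(doc, collection, min_idf=0.5):
    """Return indices of collection documents sharing doc's signature."""
    idf = idf_table(collection)
    target = signature(doc, idf, min_idf)
    if target is None:
        return []
    return [i for i, other in enumerate(collection)
            if signature(other, idf, min_idf) == target]
```

In this sketch, terms that occur in every document, and terms unseen in the collection, get an IDF of zero and are dropped before hashing, so documents that differ only in such terms map to the same signature.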
-
Publication number: US20100169329A1
Publication date: 2010-07-01
Application number: US12643662
Application date: 2009-12-21
Applicant: Ophir FRIEDER , Abdur R. Chowdhury
Inventor: Ophir FRIEDER , Abdur R. Chowdhury
IPC: G06F17/30
CPC classification number: G06F17/30687 , Y10S707/99948
Abstract: A document is compared to the documents in a document collection using a hash algorithm and collection statistics to detect if the document is similar to any of the documents in the document collection.
-
Publication number: US07562069B1
Publication date: 2009-07-14
Application number: US11023643
Application date: 2004-12-29
Applicant: Abdur R. Chowdhury , Gregory S. Pass
Inventor: Abdur R. Chowdhury , Gregory S. Pass
IPC: G06F17/30
CPC classification number: G06F17/30528 , G06F17/30395 , G06F17/30424 , G06F17/30598 , G06F17/30646 , G06F17/30719 , G06F17/30864 , G06F17/30867 , G06Q30/0601 , Y10S707/948 , Y10S707/99932 , Y10S707/99933 , Y10S707/99934 , Y10S707/99944
Abstract: A search query is resolved prior to being submitted to one or more search engines. The query is resolved such that the query unambiguously corresponds to a category included in a query ontology that relates search queries to query categories. The query may be resolved by supplementing the query with additional information corresponding to the category. For example, the query may be formatted into a canonical form of the query for the category. Alternatively or additionally, the query may be supplemented with one or more keywords that are associated with the category and that represent words or phrases that appear in a high percentage of search results for queries from the category. Resolving the query yields search results that more closely reflect search results desired by a user submitting the query.
-
Publication number: US07392262B1
Publication date: 2008-06-24
Application number: US11016959
Application date: 2004-12-21
Applicant: Joshua Alspector , Aleksander Kolcz , Abdur R. Chowdhury
Inventor: Joshua Alspector , Aleksander Kolcz , Abdur R. Chowdhury
IPC: G06F17/00
CPC classification number: G06F17/30156 , G06F17/30011 , Y10S707/99943 , Y10S707/99945
Abstract: In a single-signature duplicate document system, a secondary set of attributes is used in addition to a primary set of attributes so as to improve the precision of the system. When the projection of a document onto the primary set of attributes is below a threshold, then a secondary set of attributes is used to supplement the primary lexicon so that the projection is above the threshold.
-
Publication number: US07272597B2
Publication date: 2007-09-18
Application number: US11023648
Application date: 2004-12-29
Applicant: Abdur R. Chowdhury , Gregory S. Pass , Gerald Frederick Campbell
Inventor: Abdur R. Chowdhury , Gregory S. Pass , Gerald Frederick Campbell
CPC classification number: G06F17/30864 , G06F17/30675 , G06F17/30876 , Y10S707/99933
Abstract: Expert domains for a query category represent domains from which a high percentage of search results for queries associated with the query category are retrieved. The expert domains are identified by establishing a base statistical model that indicates frequencies of appearance for domains in search results retrieved for queries corresponding to multiple categories. In addition, frequencies of domain appearance are determined for search results retrieved for queries associated with a category. Domains that appear more frequently in the search results corresponding to the category are identified as expert domains for the category. A user may be allowed to customize expert domains related to one or more categories by adding or removing expert domains for the category.
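Illustrative sketch (not taken from the patent): the comparison of category-level domain frequencies against a base model could be approximated as below in Python. The abstract does not specify the statistical model or the decision rule, so this sketch assumes relative frequencies over retrieved results and flags a domain when its category frequency is at least `min_ratio` times its base frequency. The function names, the input layout, and `min_ratio` are hypothetical.

```python
from collections import Counter


def domain_frequencies(result_domains):
    """Relative frequency of each domain in a list of result domains."""
    counts = Counter(result_domains)
    total = sum(counts.values())
    return {d: c / total for d, c in counts.items()} if total else {}


def expert_domains(results_by_category, category, min_ratio=1.5):
    """Domains over-represented in one category relative to all categories.

    results_by_category maps a category name to the domains of the
    search results retrieved for that category's queries.
    """
    # Base model: domain frequencies pooled over every category.
    pooled = [d for domains in results_by_category.values() for d in domains]
    base = domain_frequencies(pooled)
    # Category model: domain frequencies for the chosen category only.
    in_category = domain_frequencies(results_by_category[category])
    return sorted(d for d, freq in in_category.items()
                  if freq >= min_ratio * base[d])
```

A user-customization step, as the abstract mentions, could then be layered on top by adding to or removing from the returned list.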
-
Publication number: US08768940B2
Publication date: 2014-07-01
Application number: US13612840
Application date: 2012-09-13
Applicant: Joshua Alspector , Abdur R. Chowdhury , Aleksander Kolcz
Inventor: Joshua Alspector , Abdur R. Chowdhury , Aleksander Kolcz
IPC: G06F17/30
CPC classification number: G06F17/30156 , G06F17/30011 , Y10S707/99943 , Y10S707/99945
Abstract: In a single-signature duplicate document system, a secondary set of attributes is used in addition to a primary set of attributes so as to improve the precision of the system. When the projection of a document onto the primary set of attributes is below a threshold, then a secondary set of attributes is used to supplement the primary lexicon so that the projection is above the threshold.
-
Publication number: US08521713B2
Publication date: 2013-08-27
Application number: US13180156
Application date: 2011-07-11
Applicant: Abdur R. Chowdhury , Gregory S. Pass , Gerald Frederick Campbell
Inventor: Abdur R. Chowdhury , Gregory S. Pass , Gerald Frederick Campbell
CPC classification number: G06F17/30864 , G06F17/30675 , G06F17/30876 , Y10S707/99933
Abstract: Expert domains for a query category represent domains from which a high percentage of search results for queries associated with the query category are retrieved. The expert domains are identified by establishing a base statistical model that indicates frequencies of appearance for domains in search results retrieved for queries corresponding to multiple categories. In addition, frequencies of domain appearance are determined for search results retrieved for queries associated with a category. Domains that appear more frequently in the search results corresponding to the category are identified as expert domains for the category. A user may be allowed to customize expert domains related to one or more categories by adding or removing expert domains for the category.