Patent search ap:"Aleksander Kolcz" Page 1

1.

Granted invention patent
Duplicate document detection (Active)

Publication (announcement) number: US08768940B2

Publication (announcement) date: 2014-07-01

Application number: US13612840

Application date: 2012-09-13

Applicant: Joshua Alspector, Abdur R. Chowdhury, Aleksander Kolcz

Inventor: Joshua Alspector, Abdur R. Chowdhury, Aleksander Kolcz

IPC: G06F17/30

CPC classification number: G06F17/30156, G06F17/30011, Y10S707/99943, Y10S707/99945

Abstract: In a single-signature duplicate document system, a secondary set of attributes is used in addition to a primary set of attributes so as to improve the precision of the system. When the projection of a document onto the primary set of attributes is below a threshold, then a secondary set of attributes is used to supplement the primary lexicon so that the projection is above the threshold.
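The fallback described in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: it assumes the "projection" is the set of distinct document terms found in a lexicon, that its size is compared against a hypothetical `min_terms` threshold, and that the signature is a SHA-1 hash of the sorted projected terms. The function and parameter names are invented for the example.

```python
import hashlib
import re

def signature(text, primary, secondary, min_terms=3):
    """Return one signature per document, or None if too few lexicon terms match.

    Assumption: projection size = number of distinct lexicon terms in the text.
    """
    tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
    projection = tokens & primary
    if len(projection) < min_terms:
        # Projection onto the primary lexicon is too small:
        # supplement it with terms from the secondary lexicon.
        projection |= tokens & secondary
    if len(projection) < min_terms:
        return None  # still too little evidence to form a reliable signature
    joined = " ".join(sorted(projection))
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()
```

Two documents that differ only in terms outside both lexicons receive the same signature, which is what lets a single hash comparison flag them as duplicates.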

2.

Granted invention patent
Reliability of duplicate document detection algorithms (Active)

Publication (announcement) number: US08429178B2

Publication (announcement) date: 2013-04-23

Application number: US13185238

Application date: 2011-07-18

Applicant: Joshua Alspector, Aleksander Kolcz, Abdur R. Chowdhury

Inventor: Joshua Alspector, Aleksander Kolcz, Abdur R. Chowdhury

IPC: G06F17/30

CPC classification number: G06F17/30156, G06F17/30011, Y10S707/99943, Y10S707/99945

Abstract: In a single-signature duplicate document system, a secondary set of attributes is used in addition to a primary set of attributes so as to improve the precision of the system. When the projection of a document onto the primary set of attributes is below a threshold, then a secondary set of attributes is used to supplement the primary lexicon so that the projection is above the threshold.

3.

Granted invention patent
Detecting spam from metafeatures of an email message (Active)

Publication (announcement) number: US08370930B2

Publication (announcement) date: 2013-02-05

Application number: US12039727

Application date: 2008-02-28

Applicant: Chad Mills, Ryan Colvin, Kevin Chan, Robert McCann, Aleksander Kolcz

Inventor: Chad Mills, Ryan Colvin, Kevin Chan, Robert McCann, Aleksander Kolcz

IPC: G06F11/00

CPC classification number: H04L51/12

Abstract: Detecting spam from metafeatures of an email message. As a part of detecting spam, the email message is accessed and a distribution of numerical values is accorded to a set of features of the email message. It is determined whether the distribution of numerical values accorded the set of features of the email message is consistent with that of spam. Access is provided to the determination of whether the email message has a distribution of numerical values accorded the set of features that is consistent with that of spam.
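One way to read "a distribution of numerical values consistent with that of spam" is sketched below. The abstract does not say how the values are produced or compared, so everything here is an assumption: the values are taken to be per-feature weights, the distribution is summarized as a normalized histogram, and consistency is decided by L1 distance to hypothetical reference histograms for spam and legitimate mail.

```python
def histogram(values, edges):
    """Fraction of values in each half-open bin [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    total = sum(counts) or 1  # avoid division by zero for empty input
    return [c / total for c in counts]

def l1_distance(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

def consistent_with_spam(weights, spam_hist, ham_hist, edges):
    """True if the message's histogram is closer to the spam reference (assumed rule)."""
    h = histogram(weights, edges)
    return l1_distance(h, spam_hist) < l1_distance(h, ham_hist)
```

The reference histograms would have to be estimated from labeled mail; the bin edges and the distance measure are design choices of this sketch, not details from the patent.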

4.

Granted invention patent
Filtering system for providing personalized information in the absence of negative data (Active)

Publication (announcement) number: US08060507B2

Publication (announcement) date: 2011-11-15

Application number: US12987046

Application date: 2011-01-07

Applicant: Joshua Alspector, Aleksander Kolcz

Inventor: Joshua Alspector, Aleksander Kolcz

IPC: G06F17/30

CPC classification number: G06F17/30867, G06Q30/0255, Y10S707/99935, Y10S707/99937

Abstract: Systems and methods are provided for personalizing advertising for a user. In accordance with certain implementations, information is accessed indicating which documents were selected by a user and which documents were not selected by a user. At least one positive word vector is generated using words contained in at least one of the selected documents, and at least one negative word vector is generated using words contained in at least one of the unselected documents. Document word vectors are generated, and a document rank order is established based on a vector space relationship analysis. Categories associated with the documents are ranked based on the document rank order, and the ranked categories are sent to an ad server. Advertising material associated with the ranked categories may then be received from the ad server in a selected context.
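The vector-space ranking in this abstract can be illustrated with a small sketch. It assumes raw term counts as word vectors, cosine similarity as the "vector space relationship", and a score equal to similarity to the positive vector minus similarity to the negative vector; the abstract fixes none of these, and all names are hypothetical.

```python
import math
from collections import Counter

def word_vector(docs):
    """Bag-of-words count vector over one or more documents."""
    v = Counter()
    for d in docs:
        v.update(d.lower().split())
    return v

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(x * x for x in a.values()))
    nb = math.sqrt(sum(x * x for x in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_documents(candidates, selected, unselected):
    """Order candidates by closeness to selected docs and distance from unselected ones."""
    pos, neg = word_vector(selected), word_vector(unselected)
    def score(doc):
        v = word_vector([doc])
        return cosine(v, pos) - cosine(v, neg)
    return sorted(candidates, key=score, reverse=True)

def rank_categories(ranked_docs, category_of):
    """Categories in order of their best-ranked document."""
    ordered = []
    for d in ranked_docs:
        if category_of[d] not in ordered:
            ordered.append(category_of[d])
    return ordered
```

The resulting category order is what would be sent to the ad server in the flow the abstract describes.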

5.

Invention patent application
RELIABILITY OF DUPLICATE DOCUMENT DETECTION ALGORITHMS (Active)

Publication (announcement) number: US20080319995A1

Publication (announcement) date: 2008-12-25

Application number: US12144021

Application date: 2008-06-23

Applicant: Joshua Alspector, Aleksander Kolcz, Abdur R. Chowdhury

Inventor: Joshua Alspector, Aleksander Kolcz, Abdur R. Chowdhury

IPC: G06F7/20, G06F17/30

CPC classification number: G06F17/30156, G06F17/30011, Y10S707/99943, Y10S707/99945

Abstract: In a single-signature duplicate document system, a secondary set of attributes is used in addition to a primary set of attributes so as to improve the precision of the system. When the projection of a document onto the primary set of attributes is below a threshold, then a secondary set of attributes is used to supplement the primary lexicon so that the projection is above the threshold.

6.

Granted invention patent
Classifier tuning based on data similarities (Active)

Publication (announcement) number: US07089241B1

Publication (announcement) date: 2006-08-08

Application number: US10740821

Application date: 2003-12-22

Applicant: Joshua Alspector, Aleksander Kolcz, Abdur Chowdhury

Inventor: Joshua Alspector, Aleksander Kolcz, Abdur Chowdhury

IPC: G06F7/00

CPC classification number: G06Q10/107, H04L51/12, Y10S707/99937, Y10S707/99945

Abstract: A probabilistic classifier is used to classify data items in a data stream. The probabilistic classifier is trained, and an initial classification threshold is set, using unique training and evaluation data sets (i.e., data sets that do not contain duplicate data items). Unique data sets are used for training and in setting the initial classification threshold so as to prevent the classifier from being improperly biased as a result of similarity rates in the training and evaluation data sets that do not reflect similarity rates encountered during operation. During operation, information regarding the actual similarity rates of data items in the data stream is obtained and used to adjust the classification threshold such that misclassification costs are minimized given the actual similarity rates.
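The threshold adjustment described here can be sketched as a cost minimization over a duplicate-free evaluation set. The sketch assumes a binary classifier, that "similarity rate" can be summarized as the average number of copies per unique item in each class, and that each error on a unique item is multiplied by that rate. The function name, the cost values, and the `dup_rate` mapping are all hypothetical.

```python
def best_threshold(scored, dup_rate, cost_fp=10.0, cost_fn=1.0):
    """Pick the score threshold with the lowest expected misclassification cost.

    scored:   (score, is_positive) pairs from a duplicate-free evaluation set
    dup_rate: {True: copies per unique positive item, False: copies per unique negative item}
    """
    # Candidate thresholds: every observed score, plus "classify nothing as positive".
    candidates = sorted({s for s, _ in scored}) + [float("inf")]

    def cost(t):
        total = 0.0
        for s, is_positive in scored:
            predicted = s >= t
            if predicted and not is_positive:
                total += cost_fp * dup_rate[False]  # false positive, weighted by copies
            elif not predicted and is_positive:
                total += cost_fn * dup_rate[True]   # false negative, weighted by copies
        return total

    return min(candidates, key=cost)
```

With equal duplicate rates the threshold is driven by the cost ratio alone; when one class turns out to be heavily duplicated in the live stream, errors on that class dominate and the chosen threshold shifts accordingly.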

7.

Granted invention patent
Web query classification (Active)

Publication (announcement) number: US09424346B2

Publication (announcement) date: 2016-08-23

Application number: US13453901

Application date: 2012-04-23

Applicant: Abdur R. Chowdhury, Steven Michael Beitzel, David Dolan Lewis, Aleksander Kolcz

Inventor: Abdur R. Chowdhury, Steven Michael Beitzel, David Dolan Lewis, Aleksander Kolcz

IPC: G06F17/30

CPC classification number: G06F17/30707, G06F17/30657, G06F17/30864

Abstract: A query phrase may be automatically classified to one or more topics of interest (e.g., categories) to assist in routing the query phrase to one or more appropriate backend databases. A selectional preference query classification technique may be used to classify the query phrase based on a comparison between the query phrase and patterns of query phrases. Additionally, or alternatively, a combination of query classification techniques may be used to classify the query phrase. Topical classification of a query phrase also may be used to assist a search system in delivering auxiliary information to a user who entered the query phrase. Advertisements, for instance, may be tailored based on classification rather than query keywords.

8.

Invention patent application
Simplifying Lexicon Creation in Hybrid Duplicate Detection and Inductive Classifier System (Under examination, published)

Publication (announcement) number: US20130173518A1

Publication (announcement) date: 2013-07-04

Application number: US13621034

Application date: 2012-09-15

Applicant: Joshua Alspector, Aleksander Kolcz, Abdur R. Chowdhury

Inventor: Joshua Alspector, Aleksander Kolcz, Abdur R. Chowdhury

IPC: G06N5/02

CPC classification number: G06F16/35, G06F16/1748, G06F16/353, G06N5/02, H04L51/12

Abstract: A classification system includes a signature-based duplicate detector and an inductive classifier that share attribute information. To perform the duplicate detection and the classification, the duplicate detector and inductive classifier are first initialized by generating a lexicon of attributes for the duplicate detector and a classification model for the classifier. To develop a classification model, a training set of documents of known class are used by the classifier to determine the attributes of the documents that are most useful in classifying an unknown document. The model is developed from these attributes. Attribute information containing the attributes determined by the classifier is then passed to the duplicate detector and the duplicate detector uses the attribute information to generate the lexicon of attributes.
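The hand-off from classifier to duplicate detector can be sketched as follows. The abstract does not name a feature-selection criterion, so this example assumes a two-class training set and scores each term by the absolute difference in document frequency between the classes; the selected terms then serve directly as the duplicate detector's lexicon. All function names are hypothetical.

```python
import hashlib
from collections import Counter

def select_attributes(training, k):
    """Top-k terms by |document-frequency difference| between exactly two classes."""
    labels = sorted({label for _, label in training})
    a, b = labels  # assumption: exactly two classes
    n = Counter(label for _, label in training)
    df = {label: Counter() for label in labels}
    for text, label in training:
        df[label].update(set(text.lower().split()))
    vocab = set(df[a]) | set(df[b])

    def score(w):
        return abs(df[a][w] / n[a] - df[b][w] / n[b])

    # Ties are broken alphabetically so the result is deterministic.
    return sorted(vocab, key=lambda w: (-score(w), w))[:k]

def signature(text, lexicon):
    """Duplicate-detector signature: hash of the document's terms found in the lexicon."""
    kept = sorted(set(text.lower().split()) & lexicon)
    return hashlib.sha1(" ".join(kept).encode("utf-8")).hexdigest()
```

Reusing the classifier's attributes means the lexicon does not have to be built in a separate pass over the corpus, which appears to be the simplification the title refers to.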

9.

Invention patent application
META-MODEL DISTRIBUTED QUERY CLASSIFICATION (Under examination, published)

Publication (announcement) number: US20130091131A1

Publication (announcement) date: 2013-04-11

Application number: US13267163

Application date: 2011-10-06

Applicant: JAKUB SZYMANSKI, LI JIANG, ALEKSANDER KOLCZ

Inventor: JAKUB SZYMANSKI, LI JIANG, ALEKSANDER KOLCZ

IPC: G06F17/30

CPC classification number: G06F16/353

Abstract: Systems and methods are provided for classifying a search query. A first group of query classifiers can be used to evaluate a query relative to various subject matter domains. The evaluation results from the first group of domain classifiers can then be used by a second group of meta-classifiers. The meta-classifiers are associated with meta-classifier categories that may correspond to a domain or that may correspond to a plurality of domains. The assigned meta-classifier category for a query can be used in any convenient manner, such as by triggering additional uses of the search query to match images or other alternative types of documents, or such as by allowing a subject matter domain to be assigned to the query.
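The two-stage arrangement can be sketched as below. The domain classifiers are stand-ins that score keyword overlap, and each meta-classifier is a fixed linear combination of domain scores. The keyword sets, weights, and category names are invented for illustration; a real system would learn both stages from data.

```python
# Hypothetical first-stage domains, each represented by a keyword set.
DOMAIN_KEYWORDS = {
    "images": {"photo", "picture", "wallpaper"},
    "celebrities": {"actor", "singer", "movie"},
    "shopping": {"buy", "price", "cheap"},
}

# Hypothetical second-stage meta-classifiers: weights over first-stage domain scores.
# A meta-category may draw on one domain or on several.
META_WEIGHTS = {
    "visual": {"images": 1.0, "celebrities": 0.5, "shopping": 0.2},
    "commerce": {"shopping": 1.0},
}

def domain_scores(query):
    """Stage 1: score the query against every subject-matter domain."""
    tokens = set(query.lower().split())
    return {d: len(tokens & kw) / len(kw) for d, kw in DOMAIN_KEYWORDS.items()}

def meta_category(query):
    """Stage 2: combine the domain scores and return the best meta-category, if any."""
    scores = domain_scores(query)
    meta = {
        m: sum(w * scores.get(d, 0.0) for d, w in weights.items())
        for m, weights in META_WEIGHTS.items()
    }
    best = max(meta, key=meta.get)
    return best if meta[best] > 0 else None
```

A result such as `"visual"` could then trigger an additional image search for the same query, one of the uses the abstract mentions.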

10.

Invention patent application
Reliability of Duplicate Document Detection Algorithms (Active)

Publication (announcement) number: US20130007026A1

Publication (announcement) date: 2013-01-03

Application number: US13612840

Application date: 2012-09-13

Applicant: Joshua ALSPECTOR, Aleksander KOLCZ, Abdur R. CHOWDHURY

Inventor: Joshua ALSPECTOR, Aleksander KOLCZ, Abdur R. CHOWDHURY

IPC: G06F7/00

CPC classification number: G06F17/30156, G06F17/30011, Y10S707/99943, Y10S707/99945

Abstract: In a single-signature duplicate document system, a secondary set of attributes is used in addition to a primary set of attributes so as to improve the precision of the system. When the projection of a document onto the primary set of attributes is below a threshold, then a secondary set of attributes is used to supplement the primary lexicon so that the projection is above the threshold.
