Trees of classifiers for detecting email spam
    1.
    Patent application
    Trees of classifiers for detecting email spam (in force)

    Publication No.: US20070038705A1

    Publication date: 2007-02-15

    Application No.: US11193691

    Filing date: 2005-07-29

    IPC classification: G06F15/16

    CPC classification: H04L51/12

    Abstract: Decision trees populated with classifier models provide enhanced spam detection by using separate email classifiers for each feature of an email. Tailoring each classifier model in this way raises the probability of detecting spam, because spam is determined more accurately on a feature-by-feature basis. Classifiers can be constructed from linear models such as logistic-regression models, support vector machines (SVMs), and the like. The classifiers can also be constructed from decision trees. “Compound features” based on internal and/or external nodes of a decision tree can be used to provide linear classifier models as well. When training data is sparse, the spam detection results can be smoothed by using classifier models from other nodes within the decision tree. This forms a base model for branches of the decision tree that may not have received substantial training data.
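
The smoothing idea in this abstract can be illustrated with a minimal Python sketch. It is not the patent's implementation: the `Node` class, the `n_train / (n_train + k)` interpolation weight, and the constant `k` are all assumptions chosen for illustration. Each tree node carries its own logistic-regression model, and a node that saw little training data falls back on the estimate inherited from its ancestors.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, Optional

Features = Dict[str, float]


@dataclass
class Node:
    """A decision-tree node holding its own linear (logistic) spam model."""
    weights: Dict[str, float]
    bias: float = 0.0
    n_train: int = 0  # training emails that reached this node (hypothetical bookkeeping)
    split: Optional[Callable[[Features], bool]] = None  # None marks a leaf
    yes: Optional["Node"] = None
    no: Optional["Node"] = None

    def prob(self, x: Features) -> float:
        z = self.bias + sum(w * x.get(f, 0.0) for f, w in self.weights.items())
        return 1.0 / (1.0 + math.exp(-z))


def spam_probability(root: Node, x: Features, k: float = 50.0) -> float:
    """Walk root to leaf, blending each node's estimate with its ancestors'.

    Assumed smoothing rule: a node trusted with weight n_train / (n_train + k),
    so sparsely trained nodes defer to the model above them.
    """
    node = root
    p = node.prob(x)
    while node.split is not None:
        node = node.yes if node.split(x) else node.no
        lam = node.n_train / (node.n_train + k)
        p = lam * node.prob(x) + (1.0 - lam) * p
    return p
```

With this rule, a leaf that received no training data returns exactly its parent's smoothed estimate, which matches the abstract's description of a base model for under-trained branches.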


    Finding phishing sites
    4.
    Patent application
    Finding phishing sites (in force)

    Publication No.: US20070192855A1

    Publication date: 2007-08-16

    Application No.: US11335902

    Filing date: 2006-01-18

    IPC classification: G06F12/14

    Abstract: Described is a technology by which phishing-related data sources are processed into aggregated data, and a given site is evaluated against the aggregated data using a predictive model to automatically determine whether the given site is likely to be a phishing site. The predictive model may be built using machine learning based on training data, e.g., including known phishing sites and/or known non-phishing sites. To determine whether an object corresponding to a site is likely a phishing-related object, various criteria are evaluated, including one or more features of the object. The determination is output in some way, e.g., made available to a reputation service, used to block access to a site or warn a user before allowing access, and/or used to help a hand grader evaluate sites more efficiently.
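
As a rough illustration of the pipeline this abstract describes (aggregate the data sources, score a site with a predictive model, act on the result), here is a minimal Python sketch. The source names, the fixed `WEIGHTS` and `BIAS` standing in for a learned model, and the block/warn thresholds are all hypothetical; the abstract does not specify any of them.

```python
import math
from collections import defaultdict


def aggregate(reports):
    """Collapse (site, source) events into per-site counts for each source."""
    agg = defaultdict(lambda: defaultdict(int))
    for site, source in reports:
        agg[site][source] += 1
    return agg


# Hypothetical stand-in for a model learned from known phishing / non-phishing sites.
WEIGHTS = {"user_report": 1.2, "spam_trap_link": 0.8, "known_good_list": -3.0}
BIAS = -2.0


def phishing_score(features):
    """Logistic score in [0, 1] computed from the aggregated counts of one site."""
    z = BIAS + sum(WEIGHTS.get(name, 0.0) * count for name, count in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def decide(score, block_at=0.9, warn_at=0.5):
    """Map a score to an action; thresholds are illustrative assumptions."""
    if score >= block_at:
        return "block"
    if score >= warn_at:
        return "warn"
    return "allow"
```

The same score could equally be published to a reputation service or used to order a hand grader's queue instead of driving a block/warn decision.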


    Phishing Detection, Prevention, and Notification
    6.
    Patent application
    Phishing Detection, Prevention, and Notification (in force)

    Publication No.: US20070039038A1

    Publication date: 2007-02-15

    Application No.: US11537641

    Filing date: 2006-09-30

    IPC classification: H04L9/32 G06K9/00

    Abstract: Phishing detection, prevention, and notification are described. In an embodiment, a messaging application facilitates communication via a messaging user interface, and receives a communication, such as an email message, from a domain. A phishing detection module detects a phishing attack in the communication by determining that the domain is similar to a known phishing domain, or by detecting suspicious network properties of the domain. In another embodiment, a Web browsing application receives content, such as data for a Web page, from a network-based resource, such as a Web site or domain. The Web browsing application initiates a display of the content, and a phishing detection module detects a phishing attack in the content by determining that a domain of the network-based resource is similar to a known phishing domain, or that an address of the network-based resource from which the content is received has suspicious network properties.
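
The abstract does not say how similarity to a known phishing domain is measured. One plausible reading, sketched below in Python purely as an assumption, is a small edit distance between domain names; the function names and the `max_distance` threshold are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = cur
    return prev[-1]


def similar_to_known_phishing(domain, known_phishing, max_distance=2):
    """Return the first known phishing domain within max_distance edits, else None."""
    domain = domain.lower().rstrip(".")
    for bad in known_phishing:
        if edit_distance(domain, bad) <= max_distance:
            return bad
    return None
```

A messaging client or browser could run such a check on the sender or page domain before display; a production system would also need an allow list so that legitimate domains close to a listed one are not flagged.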


    Incremental anti-spam lookup and update service
    7.
    Patent application
    Incremental anti-spam lookup and update service (in force)

    Publication No.: US20060015561A1

    Publication date: 2006-01-19

    Application No.: US10879626

    Filing date: 2004-06-29

    IPC classification: G06F15/16

    CPC classification: G06Q10/107 H04L51/12

    Abstract: The present invention provides a system and method that facilitate incrementally updating spam filters in real time or near real time. Incremental updates can be generated in part by difference learning. Difference learning involves training a new spam filter on new data and then looking for the differences between the new spam filter and the existing spam filter. Differences can be determined at least in part by comparing the absolute values of parameter changes (the weight changes of a feature between the two filters). Other factors, such as the frequency of parameters, can be employed as well. In addition, available updates for particular features or messages can be looked up using one or more lookup tables or databases. When incremental and/or feature-specific updates are available, they can be downloaded, for example by a client. Incremental updates can be provided automatically or on request, according to client or server preferences.
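
Difference learning as summarized here can be sketched in a few lines of Python. This is an illustration under assumptions, not the patented method: filters are modeled as plain feature-to-weight dictionaries, and `min_change` is a hypothetical threshold on the absolute weight change.

```python
def diff_filters(old, new, min_change=0.1):
    """Return only the feature weights that moved by at least min_change.

    Features missing from one filter are treated as having weight 0.0.
    """
    update = {}
    for feature in old.keys() | new.keys():
        new_weight = new.get(feature, 0.0)
        if abs(new_weight - old.get(feature, 0.0)) >= min_change:
            update[feature] = new_weight
    return update


def apply_update(current, update):
    """Client side: patch the existing filter with an incremental update."""
    patched = dict(current)
    patched.update(update)
    return patched
```

Because only the changed weights are shipped, the update stays small enough to be looked up and downloaded frequently, which is the point of the incremental service.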


    Spam filtering with probabilistic secure hashes
    9.
    Patent application
    Spam filtering with probabilistic secure hashes (in force)

    Publication No.: US20060036693A1

    Publication date: 2006-02-16

    Application No.: US10917077

    Filing date: 2004-08-12

    IPC classification: G06F15/16

    Abstract: Disclosed are signature-based systems and methods that facilitate spam detection and prevention at least in part by calculating hash values for an incoming message and then determining a probability that the hash values indicate spam. In particular, the signatures generated for each incoming message can be compared to a database of both spam and good signatures. A count of the number of matches can be divided by a denominator value. The denominator value can be, for example, the overall volume of messages sent to the system per signature. The denominator value can be discounted to account for different treatments and timing of incoming messages. Furthermore, secure hashes can be generated by combining portions of multiple hashing components. A secure hash can be made from a combination of multiple hashing components or from multiple such combinations. The signature-based system can also be integrated with machine learning systems to optimize spam prevention.
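
The match-count-over-denominator estimate can be illustrated with the following Python sketch. Everything specific in it is an assumption: the whitespace and case normalization, the way `signature` joins prefixes of two SHA-256 digests to mimic "combining portions of multiple hashing components", and the `discount` factor with clamping to 1.0.

```python
import hashlib
from collections import Counter


def signature(message: str) -> str:
    """Hypothetical signature built from portions of two hash components."""
    normalized = " ".join(message.lower().split())
    body_hash = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    # Second component: a coarse length bucket, hashed separately.
    length_hash = hashlib.sha256(str(len(normalized) // 100).encode("utf-8")).hexdigest()
    return body_hash[:16] + length_hash[:8]


class SignatureStore:
    def __init__(self):
        self.spam = Counter()  # spam matches per signature
        self.seen = Counter()  # total message volume per signature (the denominator)

    def observe(self, message: str, is_spam: bool) -> None:
        sig = signature(message)
        self.seen[sig] += 1
        if is_spam:
            self.spam[sig] += 1

    def spam_probability(self, message: str, discount: float = 1.0):
        """Spam matches divided by the (optionally discounted) volume, or None if unseen."""
        sig = signature(message)
        denom = self.seen[sig] * discount
        if denom <= 0:
            return None
        return min(1.0, self.spam[sig] / denom)
```

The resulting probability could then be fed as one input to a machine-learning filter, in line with the integration the abstract mentions.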
