Methods and systems for transductive data classification
    4.
    Granted patent
    Methods and systems for transductive data classification (status: in force)

    Publication number: US08374977B2

    Publication date: 2013-02-12

    Application number: US12721393

    Filing date: 2010-03-10

    IPC classification: G06N5/00

    CPC classification: G06N99/005

    Abstract: A system, method, data processing apparatus, and article of manufacture are provided for classifying data. Labeled data points are received, each of the labeled data points having at least one label indicating whether the data point is a training example for data points for being included in a designated category or a training example for data points being excluded from a designated category; receiving unlabeled data points; receiving at least one predetermined cost factor of the labeled data points and unlabeled data points; training a transductive classifier using MED through iterative calculation using the at least one cost factor and the labeled data points and the unlabeled data points as training examples; applying the trained classifier to classify at least one of the unlabeled data points, the labeled data points, and input data points; and outputting a classification of the classified data points, or derivative thereof.
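
The iterative use of labeled points, unlabeled points, and cost factors described in the abstract can be illustrated with a much simpler stand-in. The sketch below is not maximum entropy discrimination (MED): it swaps in a nearest-centroid classifier whose centroids are refit in a loop from soft labels on the unlabeled points. The function names, the default cost values, and the distance-based soft label are all assumptions made for illustration.

```python
from math import exp


def _sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def _weighted_mean(points, weights):
    """Weighted mean of equal-length tuples; the total weight must be positive."""
    total = sum(weights)
    dims = len(points[0])
    return tuple(
        sum(w * p[d] for p, w in zip(points, weights)) / total for d in range(dims)
    )


def _soft_positive(x, pos, neg):
    """Soft membership in the 'included' class, from relative centroid distance."""
    z = max(-50.0, min(50.0, _sq_dist(x, neg) - _sq_dist(x, pos)))
    return 1.0 / (1.0 + exp(-z))


def train_transductive(labeled, unlabeled, cost_labeled=1.0, cost_unlabeled=0.5,
                       iterations=10):
    """labeled: list of (point, label), label +1 (included) or -1 (excluded).
    unlabeled: list of points. Returns (pos_centroid, neg_centroid).
    Hypothetical stand-in for the patented MED training, not the real method."""
    pos_pts = [x for x, y in labeled if y == 1]
    neg_pts = [x for x, y in labeled if y == -1]
    if not pos_pts or not neg_pts:
        raise ValueError("need at least one labeled example of each kind")
    # Start from the labeled data alone.
    pos = _weighted_mean(pos_pts, [cost_labeled] * len(pos_pts))
    neg = _weighted_mean(neg_pts, [cost_labeled] * len(neg_pts))
    for _ in range(iterations):
        # Re-estimate soft labels for the unlabeled points.
        soft = [_soft_positive(x, pos, neg) for x in unlabeled]
        # Refit both centroids, weighting each point by its cost factor.
        pos = _weighted_mean(
            pos_pts + list(unlabeled),
            [cost_labeled] * len(pos_pts) + [cost_unlabeled * p for p in soft],
        )
        neg = _weighted_mean(
            neg_pts + list(unlabeled),
            [cost_labeled] * len(neg_pts) + [cost_unlabeled * (1.0 - p) for p in soft],
        )
    return pos, neg


def classify(x, model):
    """Return +1 if x is at least as close to the 'included' centroid, else -1."""
    pos, neg = model
    return 1 if _sq_dist(x, neg) >= _sq_dist(x, pos) else -1
```

Keeping the unlabeled cost factor below the labeled one means the unlabeled points can shift the decision boundary without overriding the labeled examples.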

    Effective multi-class support vector machine classification
    5.
    Granted patent
    Effective multi-class support vector machine classification (status: in force)

    Publication number: US07386527B2

    Publication date: 2008-06-10

    Application number: US10412163

    Filing date: 2003-04-10

    CPC classification: G06K9/6269

    Abstract: An improved method of classifying examples into multiple categories using a binary support vector machine (SVM) algorithm. In one preferred embodiment, the method includes the following steps: storing a plurality of user-defined categories in a memory of a computer; analyzing a plurality of training examples for each category so as to identify one or more features associated with each category; calculating at least one feature vector for each of the examples; transforming each of the at least one feature vectors so as to reflect information about all of the training examples; and building an SVM classifier for each one of the plurality of categories, wherein the process of building an SVM classifier further includes: assigning each of the examples in a first category to a first class and all other examples belonging to other categories to a second class, wherein if any one of the examples belongs to another category as well as the first category, such examples are assigned to the first class only; optimizing at least one tunable parameter of an SVM classifier for the first category, wherein the SVM classifier is trained using the first and second classes; and optimizing a function that converts the output of the binary SVM classifier into a probability of category membership.
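
Two steps from the abstract are easy to show in isolation: the per-category class assignment and the conversion of a binary SVM output into a probability. The sketch below covers only those two steps and omits SVM training and parameter optimization. The function names are hypothetical, and the sigmoid form with parameters `a` and `b` is an assumption (a commonly used choice), since the abstract does not name the conversion function.

```python
from math import exp


def one_vs_rest_labels(examples, category):
    """Build the binary training set for one category.

    examples: list of (feature_vector, set_of_categories).
    An example that belongs to `category` goes to the first class (+1),
    even if it also belongs to other categories; all others go to the
    second class (-1).
    """
    return [(x, 1 if category in cats else -1) for x, cats in examples]


def score_to_probability(score, a=-1.0, b=0.0):
    """Map a raw binary-classifier score to a membership probability.

    Assumed sigmoid form 1 / (1 + exp(a * score + b)); in practice a and b
    would be fitted on held-out data rather than fixed as they are here.
    """
    return 1.0 / (1.0 + exp(a * score + b))


examples = [
    ((1.0, 0.0), {"invoice"}),
    ((0.0, 1.0), {"contract"}),
    ((1.0, 1.0), {"invoice", "contract"}),  # multi-category example
]
invoice_training_set = one_vs_rest_labels(examples, "invoice")
```

With a negative `a`, larger positive scores map to probabilities closer to 1, so the per-category probabilities can be compared across the independently trained binary classifiers.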

    Methods and systems for improved transductive maximum entropy discrimination classification
    6.
    Granted patent
    Methods and systems for improved transductive maximum entropy discrimination classification (status: in force)

    Publication number: US07761391B2

    Publication date: 2010-07-20

    Application number: US11752634

    Filing date: 2007-05-23

    IPC classification: G06N5/00

    CPC classification: G06N99/005

    Abstract: A system, method, data processing apparatus, and article of manufacture are provided for classifying data. Labeled data points are received, each of the labeled data points having at least one label indicating whether the data point is a training example for data points for being included in a designated category or a training example for data points being excluded from a designated category; receiving unlabeled data points; receiving at least one predetermined cost factor of the labeled data points and unlabeled data points; training a transductive classifier using MED through iterative calculation using the at least one cost factor and the labeled data points and the unlabeled data points as training examples; applying the trained classifier to classify at least one of the unlabeled data points, the labeled data points, and input data points; and outputting a classification of the classified data points, or derivative thereof.

    System And Method For Developing A Risk Profile For An Internet Service
    8.
    Patent application
    System And Method For Developing A Risk Profile For An Internet Service (status: in force)

    Publication number: US20100269168A1

    Publication date: 2010-10-21

    Application number: US12709504

    Filing date: 2010-02-21

    IPC classification: G06F17/00 G06F17/30 G06F15/18

    Abstract: A method and system for controlling access to an Internet resource is disclosed herein. When a request for an Internet resource, such as a Web site, is transmitted by an end-user of a LAN, a security appliance for the LAN analyzes a reputation index for the Internet resource before transmitting the request over the Internet. The reputation index is based on a reputation vector which includes a plurality of factors for the Internet resource such as country of domain registration, country of service hosting, country of an internet protocol address block, age of a domain registration, popularity rank, internet protocol address, number of hosts, top-level domain, a plurality of run-time behaviors, JavaScript block count, picture count, immediate redirect and response latency. If the reputation index for the Internet resource is at or above a threshold value established for the LAN, then access to the Internet resource is permitted. If the reputation index for the Internet resource is below a threshold value established for the LAN, then access to the Internet resource is denied.
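
The threshold rule in the abstract can be sketched in a few lines. The abstract does not say how the reputation vector is reduced to a single index, so the weighted sum below, the factor names, the weights, and the threshold are all hypothetical values chosen for illustration.

```python
def reputation_index(vector, weights):
    """Combine reputation factors into one score.

    Assumption: a plain weighted sum over numeric factor values; the
    abstract does not specify the actual scoring function.
    """
    return sum(weights[name] * value for name, value in vector.items())


def access_permitted(index, threshold):
    """Permit access when the index is at or above the LAN's threshold."""
    return index >= threshold


# Hypothetical factor encoding and weights for a single Internet resource.
weights = {
    "domain_age_years": 2.0,
    "popularity_score": 1.0,
    "immediate_redirect": -5.0,
}
vector = {
    "domain_age_years": 3.0,
    "popularity_score": 4.0,
    "immediate_redirect": 1.0,
}
index = reputation_index(vector, weights)  # 2*3 + 1*4 - 5*1 = 5.0
```

Because the comparison is "at or above", a resource whose index equals the threshold exactly is permitted, matching the wording of the abstract.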

    EFFECTIVE MULTI-CLASS SUPPORT VECTOR MACHINE CLASSIFICATION
    9.
    Patent application
    EFFECTIVE MULTI-CLASS SUPPORT VECTOR MACHINE CLASSIFICATION (status: in force)

    Publication number: US20080183646A1

    Publication date: 2008-07-31

    Application number: US12050096

    Filing date: 2008-03-17

    IPC classification: G06F15/18

    CPC classification: G06K9/6269

    Abstract: An improved method of classifying examples into multiple categories using a binary support vector machine (SVM) algorithm. In one preferred embodiment, the method includes the following steps: storing a plurality of user-defined categories in a memory of a computer; analyzing a plurality of training examples for each category so as to identify one or more features associated with each category; calculating at least one feature vector for each of the examples; transforming each of the at least one feature vectors so as to reflect information about all of the training examples; and building an SVM classifier for each one of the plurality of categories, wherein the process of building an SVM classifier further includes: assigning each of the examples in a first category to a first class and all other examples belonging to other categories to a second class, wherein if any one of the examples belongs to another category as well as the first category, such examples are assigned to the first class only; optimizing at least one tunable parameter of an SVM classifier for the first category, wherein the SVM classifier is trained using the first and second classes; and optimizing a function that converts the output of the binary SVM classifier into a probability of category membership.

    METHODS AND SYSTEMS FOR TRANSDUCTIVE DATA CLASSIFICATION
    10.
    Patent application
    METHODS AND SYSTEMS FOR TRANSDUCTIVE DATA CLASSIFICATION (status: in force)

    Publication number: US20080097936A1

    Publication date: 2008-04-24

    Application number: US11752634

    Filing date: 2007-05-23

    IPC classification: G06F15/18

    CPC classification: G06N99/005

    Abstract: A system, method, data processing apparatus, and article of manufacture are provided for classifying data. Labeled data points are received, each of the labeled data points having at least one label indicating whether the data point is a training example for data points for being included in a designated category or a training example for data points being excluded from a designated category; receiving unlabeled data points; receiving at least one predetermined cost factor of the labeled data points and unlabeled data points; training a transductive classifier using MED through iterative calculation using the at least one cost factor and the labeled data points and the unlabeled data points as training examples; applying the trained classifier to classify at least one of the unlabeled data points, the labeled data points, and input data points; and outputting a classification of the classified data points, or derivative thereof.
