METHOD AND APPARATUS FOR STRUCTURAL DATA CLASSIFICATION
    11.
    Invention application
    Status: In force

    Publication No.: US20090319457A1

    Publication date: 2009-12-24

    Application No.: US12141251

    Filing date: 2008-06-18

    IPC classification: G06N5/02

    CPC classification: G06N99/005

    Abstract: Techniques for classifying structural data with skewed distribution are disclosed. By way of example, a method for classifying structural input data comprises a computer system performing the following steps. Multiple classifiers are constructed, wherein each classifier is constructed on a subset of training data, using one or more selected composite features from the subset of training data. A consensus among the multiple classifiers is computed in accordance with a voting scheme such that at least a portion of the structural input data is assigned to a particular class in accordance with the computed consensus. Such techniques for structured data classification are capable of handling skewed class distribution and partial feature coverage issues.
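
The subset-plus-voting idea in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the representation of a structure as a set of feature names, the balanced resampling of the majority class, the difference-of-counts feature score, and all function names are assumptions made for illustration.

```python
import random
from collections import Counter

def select_features(subset, k=2):
    """Pick the k features seen most often in positives relative to negatives."""
    pos = Counter(f for feats, y in subset if y == 1 for f in feats)
    neg = Counter(f for feats, y in subset if y == 0 for f in feats)
    score = {f: pos[f] - neg[f] for f in pos}
    # Sort by descending score, then by name so ties resolve deterministically.
    return set(sorted(score, key=lambda f: (-score[f], f))[:k])

def build_classifier(subset, k=2):
    """Return a classifier that predicts 1 if any selected feature is present."""
    selected = select_features(subset, k)
    return lambda feats: 1 if selected & feats else 0

def build_ensemble(train, n_classifiers=5, seed=0):
    """Train each classifier on all positives plus an equal-sized negative sample
    (an assumed way of coping with a skewed class distribution)."""
    rng = random.Random(seed)
    pos = [ex for ex in train if ex[1] == 1]
    neg = [ex for ex in train if ex[1] == 0]
    ensemble = []
    for _ in range(n_classifiers):
        sample = rng.sample(neg, min(len(neg), len(pos)))
        ensemble.append(build_classifier(pos + sample))
    return ensemble

def vote(ensemble, feats):
    """Majority vote over the individual predictions."""
    votes = Counter(clf(feats) for clf in ensemble)
    return votes.most_common(1)[0][0]

train = [
    ({"a", "b"}, 1), ({"a", "c"}, 1),
    ({"c", "d"}, 0), ({"d", "e"}, 0), ({"e"}, 0),
    ({"d"}, 0), ({"c", "e"}, 0), ({"d", "f"}, 0),
]
ensemble = build_ensemble(train)
```

Calling `vote(ensemble, {"a", "x"})` asks every classifier for a prediction and returns the majority label. An odd number of classifiers avoids ties in the two-class case.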

    System and method for scalable processing of multi-way data stream correlations
    12.
    Granted invention patent
    Status: Lapsed

    Publication No.: US07548937B2

    Publication date: 2009-06-16

    Application No.: US11417838

    Filing date: 2006-05-04

    IPC classification: G06F7/00 G06F15/16

    Abstract: A computer-implemented method, apparatus, and computer-usable program code for processing multi-way stream correlations. Stream data are received for correlation. A task is formed for continuously partitioning a multi-way stream correlation workload into smaller workload pieces. Each of the smaller workload pieces may be processed by a single host. The stream data are sent to different hosts for correlation processing.

    System and method for learning models from scarce and skewed training data
    14.
    Invention application
    Status: Lapsed

    Publication No.: US20080071721A1

    Publication date: 2008-03-20

    Application No.: US11506226

    Filing date: 2006-08-18

    IPC classification: G06N5/02

    CPC classification: G06N99/005

    Abstract: A system and method for learning models from scarce and/or skewed training data includes partitioning a data stream into a sequence of time windows. A most likely current class distribution to classify portions of the data stream is determined based on observing training data in a current time window and based on concept drift probability patterns using historical information.
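
As a rough illustration of the windowing step, the following Python sketch splits a labelled stream into fixed-width time windows and estimates the class distribution of one window by blending its observed counts with a historical prior. The fixed window width, the pseudo-count blending, and the names `windows`, `class_distribution`, and `prior_weight` are assumptions. The abstract does not say how the concept drift probability patterns are computed, and this sketch does not attempt to model them.

```python
from collections import Counter, defaultdict

def windows(stream, width):
    """Group (timestamp, label) pairs into consecutive windows of `width` time units."""
    buckets = defaultdict(list)
    for t, label in stream:
        buckets[int(t // width)].append(label)
    return [buckets[i] for i in sorted(buckets)]

def class_distribution(labels, prior, prior_weight=2.0):
    """Blend observed label counts with a historical prior.

    `prior` maps each class to its historical probability; `prior_weight` is the
    number of pseudo-observations the prior is worth (an assumed smoothing scheme).
    """
    counts = Counter(labels)
    total = len(labels) + prior_weight
    return {c: (counts[c] + prior_weight * p) / total for c, p in prior.items()}

stream = [(0, "a"), (1, "a"), (2, "b"), (5, "b"), (6, "b")]
parts = windows(stream, width=4)
current = class_distribution(parts[-1], prior={"a": 0.5, "b": 0.5})
```

With scarce data in the current window, the prior keeps a class that was not observed from being assigned zero probability.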

    System and method for creating a unified printable collection of hyperlinked documents
    16.
    Granted invention patent
    Status: Lapsed

    Publication No.: US06948116B2

    Publication date: 2005-09-20

    Application No.: US09727897

    Filing date: 2000-12-01

    IPC classification: G06F17/24 G06F17/30 G06F3/00

    CPC classification: G06F17/30882 G06F17/243

    Abstract: The present invention relates to a method for creating a meta-document. The method collects at least one hyperlinked document based on a seed document and cross-references the documents within the collection. Cross-referencing includes resolving an anchor and an object, and indexing the resolved anchor and object based on respective locations within a meta-document. The method organizes the collected documents and seed documents. The method also publishes the meta-document including the cross-referenced documents. Preferably, the method of collecting includes accepting the seed document having an anchor pointing to an object, and adding a document containing the object to the collection. In addition, collecting includes the step of manually modifying the collection. The meta-document is a collection of the seed document and the hyperlinked document. Further, the index is one of a footnote, an end note, a table of contents, and an appendix.

    Arrangements and methods for latency-sensitive hashing for collaborative web caching
    17.
    Granted invention patent
    Status: Lapsed

    Publication No.: US06823377B1

    Publication date: 2004-11-23

    Application No.: US09493904

    Filing date: 2000-01-28

    IPC classification: G06F15/173

    Abstract: Systems and methods for collaborative web caching among geographically distributed cache servers, particularly, latency-sensitive hashing systems and methods for collaborative web caching among geographically distributed proxy caches. Network latency delays as well as proxy load conditions are taken into consideration during hashing. As a result, requests can be hashed into geographically closer proxy caches if the load conditions permit. Otherwise, requests will be hashed into geographically distant proxy caches to better balance the load among the caches.
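
The selection rule described in this abstract (prefer a nearby proxy when its load permits, otherwise fall back to balance load) can be sketched in Python as follows. This is a hedged illustration, not the patented scheme: the rendezvous-style candidate ranking, the `max_load` threshold, the least-loaded fallback, and all names are assumptions.

```python
import hashlib

def candidates(url, proxies, n=3):
    """Rank proxies by a per-(URL, proxy) hash and keep the first n candidates."""
    def key(proxy):
        return hashlib.sha256(f"{url}|{proxy}".encode()).hexdigest()
    return sorted(proxies, key=key)[:n]

def pick_proxy(url, proxies, latency_ms, load, max_load=0.8, n=3):
    """Choose the lowest-latency candidate whose load is acceptable.

    If every candidate is above `max_load`, fall back to the least-loaded
    candidate, even if it is farther away.
    """
    cands = candidates(url, proxies, n)
    acceptable = [p for p in cands if load[p] <= max_load]
    if acceptable:
        return min(acceptable, key=lambda p: latency_ms[p])
    return min(cands, key=lambda p: load[p])

proxies = ["near", "mid", "far"]
latency_ms = {"near": 5, "mid": 20, "far": 80}
url = "http://example.com/index.html"

light = pick_proxy(url, proxies, latency_ms, {"near": 0.5, "mid": 0.3, "far": 0.1})
busy = pick_proxy(url, proxies, latency_ms, {"near": 0.95, "mid": 0.3, "far": 0.1})
overloaded = pick_proxy(url, proxies, latency_ms, {"near": 0.95, "mid": 0.9, "far": 0.85})
```

Because the candidate set depends only on the URL, repeated requests for the same object keep landing on the same small group of caches, and the latency and load checks only choose within that group.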

    System and method for classification using time sequences
    18.
    Granted invention patent
    Status: In force

    Publication No.: US06721719B1

    Publication date: 2004-04-13

    Application No.: US09361381

    Filing date: 1999-07-26

    IPC classification: G06N5/02

    CPC classification: G06N5/025

    Abstract: System and method for generating classification using time sequences comprises inputting a set of time-dependent feature variable graphs along with a set of time-dependent category variable graphs; finding frequent shapes in the time-dependent feature variable graphs; utilizing the frequent shapes to generate combinations of frequent shapes; generating rules relating one or more patterns of combinations of frequent shapes to a category variable; and performing a categorization utilizing the rules generated.

    Sender-specified delivery customization
    19.
    Granted invention patent
    Status: Lapsed

    Publication No.: US06643684B1

    Publication date: 2003-11-04

    Application No.: US09168248

    Filing date: 1998-10-08

    IPC classification: G06F15/16

    Abstract: A system and method that enables a given sending user to specify a set of delivery policies and have them used for the electronic delivery of a given message, the message potentially having several heterogeneous parts (e.g., text and pictures) each of which is handled differently, and delivered to multiple heterogeneous devices (e.g., PCs, Smartphones, fax machines), and possibly to several distinct recipients. The factors with which a sender can qualify their delivery policies include: time/date; transmission cost; whether the transmission can be forwarded; receiving device capability; and network reliability, speed, and security transmission. Methods are also provided enabling a sender to specify that particular transmissions be redirected or copied, e.g., “send fax copy to my broker and my accountant.” In one embodiment, the delivery policies may be specified using PICS.

    Apparatus and method for dynamic meta-tagging of compound documents
    20.
    Granted invention patent
    Status: Lapsed

    Publication No.: US6094657A

    Publication date: 2000-07-25

    Application No.: US942171

    Filing date: 1997-10-01

    IPC classification: G06F17/30 G06F17/00

    Abstract: A method and apparatus to dynamically maintain META-tag information specifying categorization and/or degree of compound documents, which are collections or hierarchy of collections of objects (possibly web pages), for efficient retrieval of leaf or intermediate objects with specific characteristics without the need to search any content of the collection. The specific characteristic and the contents of the collection can change constantly both qualitatively and quantitatively (including the insertion, deletion and update of objects). While dynamically maintaining the META-tag information, there are no inclusion restrictions on these compound documents, i.e., any collection can contain itself either directly or recursively; and all objects within a META-tagged compound document are not required to participate. The PICS protocol may be used to specify this META-tag information with both categorization and degree; to reflect the obsolescence, currency or freshness of an object; to validate a given object using a digital signature; and to enable charging for the META-tag service. Aggregation methods are provided to enable maximization, minimization, and averaging; to limit the propagation of META-tags; and to handle the time-out of META-tag and information validity.
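
The aggregation idea in this abstract (maximum, minimum, or average over a collection that may contain itself) can be sketched in Python. The dictionary-based containment graph, the numeric `rating` per leaf, and the function names are assumptions for illustration. The sketch does not cover PICS encoding, propagation limits, or time-outs.

```python
from statistics import mean

def leaf_ratings(node, children, rating, seen=None):
    """Collect ratings of all rated leaves reachable from `node`.

    `seen` guards against collections that contain themselves, directly or
    through another collection; each object is visited at most once.
    """
    seen = set() if seen is None else seen
    if node in seen:
        return []
    seen.add(node)
    kids = children.get(node, [])
    if not kids:
        # Leaves without a rating simply do not participate.
        return [rating[node]] if node in rating else []
    values = []
    for kid in kids:
        values.extend(leaf_ratings(kid, children, rating, seen))
    return values

def meta_tag(node, children, rating, aggregate=max):
    """Aggregate leaf ratings into one tag value; None if nothing participates."""
    values = leaf_ratings(node, children, rating)
    return aggregate(values) if values else None

children = {
    "site": ["docs", "site", "p3"],        # "site" contains itself directly
    "docs": ["p1", "p2", "p4", "site"],    # and is contained again via "docs"
}
rating = {"p1": 2, "p2": 4, "p3": 9}       # "p4" carries no rating

highest = meta_tag("site", children, rating, max)
lowest = meta_tag("site", children, rating, min)
average = meta_tag("site", children, rating, mean)
```

Swapping the `aggregate` argument switches between maximization, minimization, and averaging without changing the traversal.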
