Systems and methods for maintaining closed frequent itemsets over a data stream sliding window
    91.
    Granted patent
    Status: lapsed

    Publication No.: US07496592B2

    Publication date: 2009-02-24

    Application No.: US11046926

    Filing date: 2005-01-31

    IPC classification: G06F17/00

    Abstract: To mine closed frequent itemsets over a sliding window using limited memory space, a synopsis data structure monitors transactions in the sliding window so that the current closed frequent itemsets can be output at any time. Due to time and memory constraints, the synopsis data structure cannot monitor all possible itemsets, but monitoring only frequent itemsets makes it difficult to detect new itemsets when they become frequent. Herein, there is introduced a compact data structure, the closed enumeration tree (CET), to maintain a dynamically selected set of itemsets over a sliding window. The selected itemsets include a boundary between closed frequent itemsets and the rest of the itemsets. Because the boundary is relatively stable, the cost of mining closed frequent itemsets over a sliding window is dramatically reduced to that of mining transactions that can possibly cause boundary movements in the CET.
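
As a rough illustration of the problem this abstract describes, the following Python sketch recomputes the closed frequent itemsets of a sliding window by brute force. It is not the CET method: it enumerates every itemset of every transaction in the window, which is the cost the CET is said to avoid. The names `SlidingWindowMiner`, `closed_frequent_itemsets`, and `min_support` are hypothetical and do not come from the patent.

```python
from collections import deque
from itertools import combinations


def closed_frequent_itemsets(window, min_support):
    """Brute-force baseline (NOT the CET): count every itemset in the window."""
    counts = {}
    for txn in window:
        items = sorted(txn)
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1

    # Frequent: support (absolute count) reaches the threshold.
    frequent = {s: c for s, c in counts.items() if c >= min_support}

    # Closed: no proper superset has the same support. Any such superset
    # would itself be frequent, so searching `frequent` is sufficient.
    closed = {}
    for itemset, support in frequent.items():
        absorbed = any(
            itemset < other and support == other_support
            for other, other_support in frequent.items()
        )
        if not absorbed:
            closed[itemset] = support
    return closed


class SlidingWindowMiner:
    """Hypothetical helper: keeps the last `size` transactions."""

    def __init__(self, size, min_support):
        self.window = deque(maxlen=size)  # oldest transaction drops out automatically
        self.min_support = min_support

    def add(self, transaction):
        self.window.append(frozenset(transaction))

    def closed_itemsets(self):
        return closed_frequent_itemsets(self.window, self.min_support)
```

Calling `add` for each arriving transaction and `closed_itemsets` on demand mirrors the "output at any time" requirement, at exponential cost per transaction; the baseline is only practical for short transactions and small windows.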

    METHOD, APPARATUS AND COMPUTER PROGRAM PRODUCT FOR PRESERVING PRIVACY IN DATA MINING
    92.
    Patent application
    Status: in force

    Publication No.: US20090049069A1

    Publication date: 2009-02-19

    Application No.: US11836171

    Filing date: 2007-08-09

    IPC classification: G06F17/30

    Abstract: Privacy in data mining of sparse high dimensional data records is preserved by transforming the data records into anonymized data records. This transformation involves creating a sketch-based private representation of each data record, each data record containing only a small number of non-zero attribute values in relation to the high dimensionality of the data records.

    METHOD, SYSTEM, AND STORAGE MEDIUM FOR IMPLEMENTING A MULTI-STAGE, MULTI-CLASSIFICATION SALES OPPORTUNITY MODELING SYSTEM
    93.
    Patent application
    Status: under examination (published)

    Publication No.: US20080215419A1

    Publication date: 2008-09-04

    Application No.: US12120475

    Filing date: 2008-05-14

    IPC classification: G06Q10/00

    Abstract: A method for implementing a multi-stage, multi-classification sales opportunity modeling system. The method includes receiving operational data relating to past sales activities and receiving parameters identified as being relevant in determining a likelihood of whether exploitation of a sales opportunity will be successful. The method also includes generating a multi-stage model by applying the operational data and the parameters to an analytic engine for evaluating different factors affecting success of the sales opportunity.

    Query integrity assurance in database outsourcing
    94.
    Patent application
    Status: in force

    Publication No.: US20080183656A1

    Publication date: 2008-07-31

    Application No.: US11626847

    Filing date: 2007-01-25

    IPC classification: G06F17/30

    Abstract: A method, system and computer program product for confirming the validity of data returned from a data store. A data store contains a primary data set encrypted using a first encryption and a secondary data set encrypted using a second encryption. The secondary data set is a subset of the primary data set. A client issues a substantive query against the data store to retrieve a primary data result belonging to the primary data set. A query interface issues at least one validating query against the data store. Each validating query returns a secondary data result belonging to the secondary data set. The query interface receives the secondary data result and provides a data invalid notification if data satisfying the substantive query included in an unencrypted form of the secondary data result is not contained in an unencrypted form of the primary data result.
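
The containment check at the end of this abstract can be sketched in a few lines of Python. This is a minimal illustration that assumes both results have already been decrypted into hashable rows; the encryption schemes, the choice of validating queries, and the names `check_integrity` and `satisfies_query` are assumptions made for the example, not details taken from the patent.

```python
def check_integrity(primary_result, secondary_result, satisfies_query):
    """Return True if the primary result looks valid, False for 'data invalid'.

    primary_result:   decrypted rows returned for the substantive query
    secondary_result: decrypted rows returned by a validating query
    satisfies_query:  predicate implementing the substantive query's condition
    (All three parameters are hypothetical stand-ins for this sketch.)
    """
    primary = set(primary_result)
    # Rows of the secondary set that the substantive query should have returned.
    expected = [row for row in secondary_result if satisfies_query(row)]
    # Because the secondary set is a subset of the primary set, each expected
    # row must also appear in the primary result.
    return all(row in primary for row in expected)


# Example substantive query: people older than 30.
older_than_30 = lambda row: row[1] > 30
honest = [("alice", 35), ("bob", 42)]
tampered = [("alice", 35)]  # the data store silently dropped a row
audit_rows = [("bob", 42), ("carol", 25)]
```

With these sample rows, `check_integrity(honest, audit_rows, older_than_30)` passes, while the `tampered` result fails because `("bob", 42)` satisfies the query but is missing.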

    System and method for providing service for searching web site addresses
    95.
    Granted patent
    Status: in force

    Publication No.: US07383299B1

    Publication date: 2008-06-03

    Application No.: US09565395

    Filing date: 2000-05-05

    IPC classification: G06F15/16

    Abstract: A method for searching for a partially specified Uniform Resource Locator (URL) address includes receiving a user request, from a user, including a partially specified URL address. A URL search request handler is invoked to search for the partially specified URL address within an inverted index of web site URLs. A web search request handler is invoked to rank the search results of the search for the partially specified URL address based on one or more keywords specified in the user request, a list of recently accessed URLs, and a user profile. Search results are returned to the user comprising a list of URL addresses based on the search for the partially specified URL and ranked based on the user search data.
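
One way to picture the two handlers in this abstract is the Python sketch below: an inverted index from URL tokens to URLs, a substring lookup for the partially specified address, and a simple score for ranking. The tokenization, the scoring weights, the `descriptions` mapping, and all function names are assumptions made for illustration; the user profile mentioned in the abstract is left out.

```python
import re
from collections import defaultdict

TOKEN = re.compile(r"[a-z0-9]+")


def build_index(urls):
    """Inverted index: URL token -> set of URLs containing that token."""
    index = defaultdict(set)
    for url in urls:
        for token in TOKEN.findall(url.lower()):
            index[token].add(url)
    return index


def search_partial(index, partial):
    """URLs in which every fragment of `partial` occurs inside some token."""
    result = None
    for part in TOKEN.findall(partial.lower()):
        matches = set()
        for token, urls in index.items():
            if part in token:
                matches |= urls
        result = matches if result is None else result & matches
    return result or set()


def rank(candidates, keywords, recent, descriptions):
    """Order candidates by keyword hits plus a bonus for recently accessed URLs."""
    def score(url):
        text = descriptions.get(url, "").lower()
        hits = sum(1 for kw in keywords if kw.lower() in text)
        return hits + (1 if url in recent else 0)
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(candidates, key=lambda u: (-score(u), u))


urls = ["www.ibm.com/research", "www.ibm.com/products", "www.example.org/ibm-news"]
descriptions = {  # hypothetical page summaries used only for keyword matching
    "www.ibm.com/research": "IBM Research labs",
    "www.ibm.com/products": "hardware and software products",
}
index = build_index(urls)
```

Here `search_partial(index, "ibm.com")` keeps only the two `www.ibm.com` addresses, and `rank` then orders them by the keywords and the recently accessed set supplied with the request.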

    Systems and methods for condensation-based privacy in strings
    96.
    Patent application
    Status: lapsed

    Publication No.: US20080082566A1

    Publication date: 2008-04-03

    Application No.: US11540406

    Filing date: 2006-09-30

    IPC classification: G06F7/00

    CPC classification: G06F21/6245

    Abstract: Novel methods and systems for the privacy preserving mining of string data with the use of simple template based models. Such template based models are effective in practice, and preserve important statistical characteristics of the strings such as intra-record distances. Discussed herein is the condensation model for anonymization of string data. Summary statistics are created for groups of strings, and these statistics are used to generate pseudo-strings. It will be seen that the aggregate behavior of a new set of strings maintains key characteristics such as composition, the order of the intra-string distances, and the accuracy of data mining algorithms such as classification. The preservation of intra-string distances is a key goal in many string and biological applications which are deeply dependent upon the computation of such distances, while it can be shown that the accuracy of applications such as classification is not affected by the anonymization process.
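
The condensation idea (group the strings, keep only summary statistics per group, then generate pseudo-strings from those statistics) can be sketched in Python as follows. This is a simplified stand-in, not the template model of the patent: it assumes all strings have equal length, forms groups by sorting rather than by string similarity, and uses per-position symbol counts as the summary statistics. The names `condense` and `generate` are hypothetical.

```python
import random
from collections import Counter


def condense(strings, k):
    """Split strings into groups of about k and keep only per-position counts.

    Assumption for this sketch: every string has the same length, and
    lexicographic sorting stands in for similarity-based grouping.
    """
    ordered = sorted(strings)
    groups = [ordered[i:i + k] for i in range(0, len(ordered), k)]
    stats = []
    for group in groups:
        length = len(group[0])
        position_counts = [Counter(s[pos] for s in group) for pos in range(length)]
        stats.append((len(group), position_counts))
    return stats  # the original strings are not needed after this point


def generate(stats, rng):
    """Draw one pseudo-string per original string from each group's statistics."""
    pseudo = []
    for size, position_counts in stats:
        for _ in range(size):
            chars = []
            for counts in position_counts:
                symbols = list(counts.keys())
                weights = list(counts.values())
                chars.append(rng.choices(symbols, weights=weights, k=1)[0])
            pseudo.append("".join(chars))
    return pseudo


stats = condense(["ACGT", "ACGA", "TTGA", "TTGC"], k=2)
pseudo = generate(stats, random.Random(0))
```

Each pseudo-string follows the symbol composition of its group at every position, so aggregate properties are approximately kept while no output is tied to one specific input record.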

    Method and apparatus for adaptive in-operator load shedding
    97.
    Patent application
    Status: under examination (published)

    Publication No.: US20080005391A1

    Publication date: 2008-01-03

    Application No.: US11447433

    Filing date: 2006-06-05

    IPC classification: G06F3/00

    Abstract: One embodiment of the present method and apparatus for adaptive in-operator load shedding includes receiving at least two data streams (each comprising a plurality of tuples, or data items) into respective sliding windows of memory. A throttling fraction is then calculated based on input rates associated with the data streams and on currently available processing resources. Tuples are then selected for processing from the data streams in accordance with the throttling fraction, where the selected tuples represent a subset of all tuples contained within the sliding window.
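
A minimal Python sketch of the throttling step is given below. The abstract does not state a formula, so the one used here (available capacity divided by the combined input rate, capped at 1) and the policy of keeping the most recent tuples are assumptions chosen for illustration; `throttling_fraction` and `select_tuples` are hypothetical names.

```python
import math
from collections import deque


def throttling_fraction(input_rates, capacity):
    """Fraction of window tuples that can be processed (assumed formula).

    input_rates: tuples per second arriving on each stream
    capacity:    tuples per second the operator can currently process
    """
    demand = sum(input_rates)
    if demand <= 0:
        return 1.0  # nothing arriving, so no shedding is needed
    return min(1.0, capacity / demand)


def select_tuples(window, fraction):
    """Keep the most recent share of the window given by `fraction`."""
    keep = math.ceil(fraction * len(window))
    items = list(window)
    return items[len(items) - keep:]


window = deque(range(10), maxlen=10)  # sliding window holding tuples 0..9
z = throttling_fraction([600, 400], capacity=500)
selected = select_tuples(window, z)
```

With two streams at 600 and 400 tuples per second and capacity for 500, half of each window is processed; when capacity meets or exceeds demand the fraction is 1 and no tuple is shed.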

    Identifying optimal multi-scale patterns in time-series streams
    98.
    Patent application
    Status: under examination (published)

    Publication No.: US20070294247A1

    Publication date: 2007-12-20

    Application No.: US11471002

    Filing date: 2006-06-20

    IPC classification: G06F17/30

    CPC classification: G06K9/00496

    Abstract: A method, system, and computer readable medium for identifying local patterns in at least one time series data stream are disclosed. The method comprises generating multiple ordered levels of hierarchical approximation functions. The multiple ordered levels are generated directly from at least one given time series data stream including at least one set of time series data. The hierarchical approximation functions for each level of the multiple levels are based upon creating a set of approximating functions. The hierarchical approximation functions are also based upon selecting a current window with a current window length from a set of varying window lengths. The current window is selected for a current level of the multiple levels.

    Method for optimizing profits in electronic delivery of digital objects
    99.
    Granted patent
    Status: lapsed

    Publication No.: US06631413B1

    Publication date: 2003-10-07

    Application No.: US09239008

    Filing date: 1999-01-28

    IPC classification: G06F13/00

    CPC classification: G06Q10/08

    Abstract: In accordance with the present invention, a method for selecting a channel and delivery time for digital objects for a broadcast delivery service including multiple channels of varying bandwidths includes the steps of selecting digital objects to be sent over the multiple channels, generating a schedule and pricing for the digital objects based on the digital objects selected and existing delivery commitments, and manipulating the schedule and pricing to provide a profitable delivery of the digital objects. A system is also included.
