SYSTEMS AND METHODS FOR COMMUNITY DETECTION
    32.
    Patent application
    Legal status: Under examination (published)

    Publication No.: US20100185935A1

    Publication date: 2010-07-22

    Application No.: US12629047

    Filing date: 2009-12-02

    IPC classification: G06F15/16 G06F17/00

    CPC classification: G06Q99/00

    Abstract: Systems and methods are disclosed to detect communities of a social network by receiving linked documents from the social network; generating one or more conditional link models and one or more discriminative content models from the linked documents; creating a discriminative model by combining the one or more conditional link models and discriminative content models; and applying the discriminative model to the social network.

    SOCIAL NETWORK ANALYSIS WITH PRIOR KNOWLEDGE AND NON-NEGATIVE TENSOR FACTORIZATION
    33.
    Patent application
    Legal status: In force

    Publication No.: US20100185578A1

    Publication date: 2010-07-22

    Application No.: US12469043

    Filing date: 2009-05-20

    IPC classification: G06N7/02 G06N5/02 G06F17/10

    CPC classification: G06Q30/02

    Abstract: Systems and methods are disclosed to analyze a social network by generating a data tensor from social networking data; applying a non-negative tensor factorization (NTF) with user prior knowledge and preferences to generate a core tensor and facet matrices; and rendering information to social networking users based on the core tensor and facet matrices.

    SYSTEMS AND METHODS FOR CHARACTERIZING LINKED DOCUMENTS USING A LATENT TOPIC MODEL
    34.
    Patent application
    Legal status: In force

    Publication No.: US20100161611A1

    Publication date: 2010-06-24

    Application No.: US12629043

    Filing date: 2009-12-01

    IPC classification: G06F17/30 G06F15/18 G06N5/04

    CPC classification: G06N7/005 G06F17/30014

    Abstract: Systems and methods are disclosed for extracting characteristics from a corpus of linked documents by deriving a content-link model that explicitly captures direct and indirect relations represented by the links, and extracting document topics and the topic distributions for all the documents in the corpus using the content-link model.

    Systems and methods for maintaining closed frequent itemsets over a data stream sliding window
    36.
    Granted patent
    Legal status: Lapsed

    Publication No.: US07496592B2

    Publication date: 2009-02-24

    Application No.: US11046926

    Filing date: 2005-01-31

    IPC classification: G06F17/00

    Abstract: Towards mining closed frequent itemsets over a sliding window using limited memory space, a synopsis data structure is used to monitor transactions in the sliding window so that one can output the current closed frequent itemsets at any time. Due to time and memory constraints, the synopsis data structure cannot monitor all possible itemsets, but monitoring only frequent itemsets makes it difficult to detect new itemsets when they become frequent. Herein, there is introduced a compact data structure, the closed enumeration tree (CET), to maintain a dynamically selected set of itemsets over a sliding window. The selected itemsets include a boundary between closed frequent itemsets and the rest of the itemsets. Because the boundary is relatively stable, the cost of mining closed frequent itemsets over a sliding window is dramatically reduced to that of mining transactions that can possibly cause boundary movements in the CET.
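To make the term "closed frequent itemset over a sliding window" concrete, here is a minimal brute-force sketch. It is not the closed enumeration tree (CET) described in the abstract: it recomputes everything from scratch on each window and enumerates every subset of every transaction, which is exactly the cost the CET is meant to avoid. The function name, window size, and support threshold are illustrative assumptions.

```python
from collections import deque
from itertools import combinations


def closed_frequent_itemsets(window, min_support):
    """Return {itemset: support} for the closed frequent itemsets in `window`.

    Brute force: counts every non-empty subset of every transaction, so it is
    only practical for tiny examples.
    """
    support = {}
    for transaction in window:
        items = sorted(transaction)
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                key = frozenset(combo)
                support[key] = support.get(key, 0) + 1

    # Frequent: support reaches the threshold.
    frequent = {s: c for s, c in support.items() if c >= min_support}
    # Closed: no proper superset has the same support.
    return {
        s: c
        for s, c in frequent.items()
        if not any(s < t and c == ct for t, ct in frequent.items())
    }


# Sliding window over the 3 most recent transactions (assumed size).
window = deque(maxlen=3)
stream = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
snapshots = []
for transaction in stream:
    window.append(transaction)  # the oldest transaction drops out automatically
    if len(window) == window.maxlen:
        snapshots.append(closed_frequent_itemsets(window, min_support=2))
```

Each entry of `snapshots` is the answer for one window position; comparing consecutive snapshots shows how the set of closed frequent itemsets shifts as a transaction enters and another leaves.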

    SYSTEMS AND METHODS FOR TREND EXTRACTION AND ANALYSIS OF DYNAMIC DATA
    37.
    Patent application
    Legal status: Under examination (published)

    Publication No.: US20070100875A1

    Publication date: 2007-05-03

    Application No.: US11556091

    Filing date: 2006-11-02

    IPC classification: G06F7/00

    CPC classification: G06Q30/02

    Abstract: The invention is directed generally to providing methods and systems for trend extraction and analysis. Embodiments include methods and systems for trend extraction and analysis of information extracted from dynamically changing data included in computer systems and/or networks. Various exemplary embodiments are provided that may generate characteristic indicators for trend(s) and/or distribution(s) for one or more data sources by use of, for example, temporal indicators derived through analysis of the difference in contribution of separate portions of the data to the whole data set being considered, contribution of individual sources, and/or the interaction of the separate portions of the data with one another. Some exemplary approaches may include the use of singular value decomposition (SVD) and higher-order singular value decomposition (HOSVD) data extraction and analysis techniques. One use of these techniques is in the analysis of the dynamic data contained in Weblogs and the blogosphere.

    System and method for load shedding in data mining and knowledge discovery from stream data

    Publication No.: US20060184527A1

    Publication date: 2006-08-17

    Application No.: US11058944

    Filing date: 2005-02-16

    IPC classification: H04L27/28

    CPC classification: G06K9/6297 H04L43/028

    Abstract: Load shedding schemes for mining data streams are disclosed. A scoring function is used to rank the importance of stream elements, and those elements with high importance are investigated. In the context of not knowing the exact feature values of a data stream, the use of a Markov model is proposed herein for predicting the feature distribution of a data stream. Based on the predicted feature distribution, one can make classification decisions to maximize the expected benefits. In addition, there is proposed herein the employment of a quality of decision (QoD) metric to measure the level of uncertainty in decisions and to guide load shedding. A load shedding scheme such as presented herein assigns available resources to multiple data streams to maximize the quality of classification decisions. Furthermore, such a load shedding scheme is able to learn and adapt to changing data characteristics in the data streams.
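The prediction step in the abstract can be illustrated with a small sketch: when a stream element is shed (not observed), a Markov model propagates the last known feature state forward to get a predicted distribution, and the decision with the highest expected benefit under that distribution is chosen. The transition matrix, the benefit table, and both function names are hypothetical; the QoD metric is not defined in the abstract and is not modeled here.

```python
def predict_distribution(transition, last_state, steps):
    """Distribution over feature states `steps` transitions after `last_state`.

    `transition[i][j]` is the assumed probability of moving from state i to j.
    """
    n = len(transition)
    dist = [0.0] * n
    dist[last_state] = 1.0
    for _ in range(steps):
        dist = [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]
    return dist


def best_decision(dist, benefit):
    """Pick the decision with the highest expected benefit.

    `benefit[d][s]` is the assumed benefit of decision d when the true state is s.
    """
    expected = [sum(p * b for p, b in zip(dist, row)) for row in benefit]
    best = max(range(len(benefit)), key=lambda d: expected[d])
    return best, expected[best]


# Two feature states; the stream was last observed in state 0, two steps ago.
transition = [[0.9, 0.1],
              [0.5, 0.5]]
predicted = predict_distribution(transition, last_state=0, steps=2)

# Benefit 1 for a correct classification, 0 otherwise (illustrative values).
benefit = [[1.0, 0.0],
           [0.0, 1.0]]
decision, expected_benefit = best_decision(predicted, benefit)
```

A stream whose predicted distribution is nearly flat gives a low expected benefit for every decision, which is the kind of uncertainty a load shedder could use to decide where to spend its observation budget.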

    Highly scalable cost based SLA-aware scheduling for cloud services
    39.
    Granted patent
    Legal status: In force

    Publication No.: US08776076B2

    Publication date: 2014-07-08

    Application No.: US12985038

    Filing date: 2011-01-05

    Abstract: An efficient cost-based scheduling method called incremental cost-based scheduling, iCBS, maps each job, based on its arrival time and SLA function, to a fixed point in the dual space of linear functions. Due to this mapping, in the dual space, jobs will not change their locations over time. Instead, at the time of selecting the next job with the highest priority to execute, a line with an appropriate angle in the query space is used to locate the current job with the highest CBS score in logarithmic time. Because only those points that are located on the convex hull in the dual space can be chosen, a dynamic convex hull maintaining method incrementally maintains the job with the highest CBS score over time.
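The geometric idea in the abstract can be sketched as follows, under a simplifying assumption that is not taken from the patent: each job is a fixed point (x, y) and its score at query parameter t is x*t + y. The maximum of such a linear score always lies on the upper convex hull, so once the hull is built the best job can be found by binary search. This sketch builds a static hull; the dynamic hull maintenance the abstract mentions is not shown, and all names are illustrative.

```python
def cross(o, a, b):
    """Cross product of vectors OA and OB; negative means a clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def upper_hull(points):
    """Upper convex hull, left to right (monotone chain)."""
    hull = []
    for p in sorted(set(points)):
        # Drop points that fall on or below the chain ending at p.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def best_point(hull, t):
    """Hull point maximizing x*t + y, found in O(log n) steps.

    Edge slopes decrease along the upper hull, so the score rises and then
    falls; the search finds the first edge along which it stops rising.
    """
    lo, hi = 0, len(hull) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        (x0, y0), (x1, y1) = hull[mid], hull[mid + 1]
        if t * (x1 - x0) + (y1 - y0) > 0:
            lo = mid + 1
        else:
            hi = mid
    return hull[lo]


# Each job is a fixed point; only the query parameter t changes over time.
jobs = [(0, 0), (1, 3), (2, 4), (3, 3), (4, 0), (2, 1)]
hull = upper_hull(jobs)
chosen = best_point(hull, t=0.5)
```

The points never move; only `t` changes between queries, which is why a hull computed once (or maintained incrementally as jobs arrive and leave) can answer every "highest score now" query.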

    Admission control in cloud databases under service level agreements
    40.
    Granted patent
    Legal status: In force

    Publication No.: US08768875B2

    Publication date: 2014-07-01

    Application No.: US13251215

    Filing date: 2011-10-01

    Abstract: An admission control system for a cloud database includes a machine learning prediction module to estimate a predicted probability for a newly arrived query with a deadline, if admitted into the cloud database, to finish its execution before said deadline, wherein the prediction considers query characteristics and current system conditions. The system also includes a decision module applying the predicted probability to admit a query into the cloud database with a target of profit maximization with an expected profit determined using one or more service level agreements (SLAs).
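The decision step can be sketched with a simple expected-profit rule. This assumes a step-function SLA (a fixed revenue if the deadline is met, a fixed penalty if it is missed) and that rejecting a query yields zero profit; neither assumption comes from the abstract. The probability is passed in directly, standing in for the machine learning prediction module, and the function names are hypothetical.

```python
def expected_profit(p_on_time, revenue, penalty):
    """Expected profit of admitting a query under a step-function SLA.

    p_on_time: predicted probability of finishing before the deadline.
    revenue:   amount earned if the deadline is met (assumed SLA term).
    penalty:   amount paid if the deadline is missed (assumed SLA term).
    """
    return p_on_time * revenue - (1.0 - p_on_time) * penalty


def admit(p_on_time, revenue, penalty):
    """Admit only when the expected profit beats rejecting (profit 0)."""
    return expected_profit(p_on_time, revenue, penalty) > 0.0


# A likely-on-time query is admitted; a coin-flip query with a stiff penalty is not.
likely = admit(0.75, revenue=10.0, penalty=20.0)
risky = admit(0.5, revenue=10.0, penalty=20.0)
```

This rule looks at one query in isolation; a fuller treatment would also account for how admitting the query changes the completion odds of queries already in the system.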
