Data allocation and replication across distributed storage system
    31.
    Type: granted invention patent
    Status: in force

    Publication No.: US08380960B2

    Publication date: 2013-02-19

    Application No.: US12264274

    Filing date: 2008-11-04

    IPC classification: G06F12/02

    Abstract: In a distributed storage system such as those in a data center or web-based service, user characteristics and characteristics of the hardware such as storage size and storage throughput impact the capacity and performance of the system. In such systems, an allocation is a mapping from the user to the physical storage devices where data/information pertaining to the user will be stored. Policies regarding quality of service and reliability including replication of user data/information may be provided by the entity managing the system. A policy may define an objective function which quantifies the value of a given allocation. Maximizing the value of the allocation will optimize the objective function. This optimization may include the dynamics in terms of changes in patterns of user characteristics and the cost of moving data/information between the physical devices to satisfy a particular allocation.
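
    The objective-function idea in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: the `User` and `Device` fields, the scoring terms (peak utilization, overload penalty, per-gigabyte migration cost) and their default weights are all hypothetical, and replication is left out. The exhaustive search is only workable for toy inputs.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class User:
    name: str
    size_gb: float    # storage occupied by the user's data (assumed unit)
    load_iops: float  # throughput the user demands (assumed unit)


@dataclass(frozen=True)
class Device:
    name: str
    capacity_gb: float
    throughput_iops: float


def allocation_value(allocation, users, devices, previous=None,
                     move_cost_per_gb=0.01, overload_penalty=100.0):
    """Hypothetical objective function: a higher value means a better allocation.

    allocation and previous map user name -> device name.
    """
    used_gb = {d.name: 0.0 for d in devices}
    used_iops = {d.name: 0.0 for d in devices}
    moved_gb = 0.0
    for u in users:
        target = allocation[u.name]
        used_gb[target] += u.size_gb
        used_iops[target] += u.load_iops
        # Data that changes device relative to the previous allocation must be moved.
        if previous is not None and previous[u.name] != target:
            moved_gb += u.size_gb

    value = 0.0
    peak = 0.0
    for d in devices:
        size_util = used_gb[d.name] / d.capacity_gb
        load_util = used_iops[d.name] / d.throughput_iops
        peak = max(peak, size_util, load_util)
        # Heavily penalize exceeding a device's size or throughput.
        value -= overload_penalty * (max(0.0, size_util - 1.0) + max(0.0, load_util - 1.0))
    # Prefer a low peak utilization and little data movement.
    return value - peak - move_cost_per_gb * moved_gb


def best_allocation(users, devices, previous=None, move_cost_per_gb=0.01):
    """Brute-force search over every user -> device mapping (toy sizes only)."""
    user_names = [u.name for u in users]
    device_names = [d.name for d in devices]
    candidates = (dict(zip(user_names, combo))
                  for combo in product(device_names, repeat=len(users)))
    return max(candidates,
               key=lambda a: allocation_value(a, users, devices, previous,
                                              move_cost_per_gb=move_cost_per_gb))


if __name__ == "__main__":
    users = [User("alice", 10, 80), User("bob", 10, 80), User("carol", 10, 10)]
    devices = [Device("d1", 100, 100), Device("d2", 100, 100)]
    print(best_allocation(users, devices))
```

    Passing a `previous` allocation together with a large `move_cost_per_gb` makes the search favor leaving data where it is, which mirrors the abstract's point that the cost of moving data is part of the optimization.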

    TIME MODULATED GENERATIVE PROBABILISTIC MODELS FOR AUTOMATED CAUSAL DISCOVERY USING A CONTINUOUS TIME NOISY-OR (CT-NOR) MODELS
    32.
    Type: invention patent application
    Status: in force

    Publication No.: US20110113004A1

    Publication date: 2011-05-12

    Application No.: US13007643

    Filing date: 2011-01-16

    IPC classification: G06N5/02

    Abstract: Dependencies between different channels or different services in a client or server may be determined from the observation of the times of the incoming and outgoing of the packets constituting those channels or services. A probabilistic model may be used to formally characterize these dependencies. The probabilistic model may be used to list the dependencies between input packets and output packets of various channels or services, and may be used to establish the expected strength of the causal relationship between the different events surrounding those channels or services. Parameters of the probabilistic model may be either based on prior knowledge, or may be fit using statistical techniques based on observations about the times of the events of interest. Expected times of occurrence between events may be observed, and dependencies may be determined in accordance with the probabilistic model.
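
    To make the idea of a probabilistic model over packet times concrete, the sketch below attributes one output packet to the input packets that may have caused it. It is an illustration in the spirit of a noisy-OR, not the model claimed in the application: the exponential delay kernel, the per-channel `weights`, the `background` term for outputs with no observed cause, and all function names are assumptions made for this example.

```python
import math


def delay_density(delay, rate):
    """Assumed exponential kernel: density of an output `delay` seconds after its cause."""
    return rate * math.exp(-rate * delay) if delay >= 0 else 0.0


def responsibilities(output_time, inputs, weights, rate=1.0, background=1e-3):
    """Posterior probability that each candidate cause produced the output packet.

    inputs:     list of (channel, time) pairs for observed input packets
    weights:    channel -> assumed strength of that channel's influence
    background: assumed intensity of outputs that have no observed cause
    """
    scores = {"background": background}
    for channel, t in inputs:
        # An input that arrives after the output gets density 0 and so no credit.
        scores[(channel, t)] = weights[channel] * delay_density(output_time - t, rate)
    total = sum(scores.values())
    return {cause: score / total for cause, score in scores.items()}


if __name__ == "__main__":
    inputs = [("A", 9.9), ("B", 5.0), ("A", 12.0)]
    weights = {"A": 1.0, "B": 1.0}
    for cause, p in responsibilities(10.0, inputs, weights).items():
        print(cause, round(p, 4))
```

    Averaging such responsibilities over many output packets gives a rough per-channel dependency strength, and the weights and rate could in turn be fit from observed times rather than set by hand, as the abstract allows for either prior knowledge or statistical fitting.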

    Automatic discovery of service/host dependencies in computer networks
    33.
    Type: granted invention patent
    Status: in force

    Publication No.: US07821947B2

    Publication date: 2010-10-26

    Application No.: US11739312

    Filing date: 2007-04-24

    IPC classification: G01R31/08 G06F15/173

    CPC classification: H04L43/04 H04L43/16

    Abstract: An activity model is generated at a computer. The activity model may be generated by monitoring incoming and outgoing channels for packets for a predetermined window of time. To generate an activity model, an input and an output channel are selected. A probability distribution function describing the observed waiting time between packet arrivals on the selected input channel and the selected output channel is generated by mining the data collected during the selected window of time. A probability distribution function describing the observed waiting time between a randomly chosen instant and receiving a packet on the selected input channel is also generated. The distance between the two generated probability distribution functions is computed. If the computed distance is greater than a predefined confidence level, then the two selected channels are deemed to be related. Otherwise, the selected channels are deemed to be unrelated. The activity model is further generated by comparing each input and output channel pair entering or leaving a particular computer.
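
    The comparison of the two waiting-time distributions can be sketched as below. Several choices here are assumptions, since the abstract does not fix them: waits are measured backwards to the most recent input packet, the distance is the two-sample Kolmogorov-Smirnov statistic, and the `threshold` of 0.3 stands in for the predefined confidence level. The function names are hypothetical.

```python
import bisect
import random


def waits_since_last_input(instants, input_times):
    """For each instant, the time elapsed since the most recent input packet.

    input_times must be sorted in ascending order.
    """
    waits = []
    for t in instants:
        i = bisect.bisect_right(input_times, t)
        if i > 0:  # skip instants that precede the first input packet
            waits.append(t - input_times[i - 1])
    return waits


def ks_distance(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect.bisect_right(a, x) / len(a) - bisect.bisect_right(b, x) / len(b))
        for x in a + b
    )


def channels_related(input_times, output_times, window,
                     threshold=0.3, samples=2000, seed=0):
    """Deem the channels related when the two wait distributions differ enough.

    window is the (start, end) of the monitoring period.
    """
    rng = random.Random(seed)
    input_times = sorted(input_times)
    # Distribution 1: waits observed at the output channel's packet times.
    observed = waits_since_last_input(output_times, input_times)
    # Distribution 2: waits observed at randomly chosen instants in the window.
    random_instants = [rng.uniform(*window) for _ in range(samples)]
    baseline = waits_since_last_input(random_instants, input_times)
    if not observed or not baseline:
        return False
    return ks_distance(observed, baseline) > threshold


if __name__ == "__main__":
    rng = random.Random(1)
    inputs = sorted(rng.uniform(0, 1000) for _ in range(200))
    dependent = [t + 0.05 for t in inputs]                # output follows each input
    independent = [rng.uniform(0, 1000) for _ in range(200)]
    print(channels_related(inputs, dependent, (0, 1000)))
    print(channels_related(inputs, independent, (0, 1000)))
```

    Running the same test over every input and output channel pair on a machine would yield the pairwise relation table that the abstract describes as the activity model.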

    AUTOMATED HEALTH MODEL GENERATION AND REFINEMENT
    34.
    Type: invention patent application
    Status: in force

    Publication No.: US20100241903A1

    Publication date: 2010-09-23

    Application No.: US12408570

    Filing date: 2009-03-20

    IPC classification: G06F11/28 G06F15/00

    Abstract: The present invention extends to methods, systems, and computer program products for automatically generating and refining health models. Embodiments of the invention use machine learning tools to analyze historical telemetry data from a server deployment. The tools output fingerprints, for example, small groupings of specific metrics-plus-behavioral parameters, that uniquely identify and describe past problem events mined from the historical data. Embodiments automatically translate the fingerprints into health models that can be directly applied to monitoring the running system. Fully automated feedback loops for identifying past problems and giving advance notice as those problems emerge in the future are facilitated without any operator intervention. In some embodiments, a single portion of expert knowledge, for example, Key Performance Indicator (KPI) data, initiates health model generation. Once initiated, the feedback loop can be fully automated to access further telemetry and refine health models based on the further telemetry.

    Automatic Discovery Of Service/Host Dependencies In Computer Networks
    35.
    Type: invention patent application
    Status: in force

    Publication No.: US20080267083A1

    Publication date: 2008-10-30

    Application No.: US11739312

    Filing date: 2007-04-24

    IPC classification: G01R31/08

    CPC classification: H04L43/04 H04L43/16

    Abstract: An activity model is generated at a computer. The activity model may be generated by monitoring incoming and outgoing channels for packets for a predetermined window of time. To generate an activity model, an input and an output channel are selected. A probability distribution function describing the observed waiting time between packet arrivals on the selected input channel and the selected output channel is generated by mining the data collected during the selected window of time. A probability distribution function describing the observed waiting time between a randomly chosen instant and receiving a packet on the selected input channel is also generated. The distance between the two generated probability distribution functions is computed. If the computed distance is greater than a predefined confidence level, then the two selected channels are deemed to be related. Otherwise, the selected channels are deemed to be unrelated. The activity model is further generated by comparing each input and output channel pair entering or leaving a particular computer.