Cross-trace scalable issue detection and clustering
    1.
    Type: Granted patent
    Status: In force

    Publication No.: US08538897B2

    Publication date: 2013-09-17

    Application No.: US12960015

    Filing date: 2010-12-03

    IPC classification: G06F11/07 G06F15/18

    Abstract: Techniques and systems for cross-trace scalable issue detection and clustering that scale-up trace analysis for issue detection and root-cause clustering using a machine learning based approach are described herein. These techniques enable a scalable performance analysis framework for computing devices addressing issue detection, which is designed as a multiple scale feature for learning based on issue detection, and root cause clustering. In various embodiments the techniques employ a cross-trace similarity model, which is defined to hierarchically cluster problems detected in the learning based issue detection via butterflies of trigram stacks. The performance analysis framework is scalable to manage millions of traces, which include high problem complexity.

    CROSS-TRACE SCALABLE ISSUE DETECTION AND CLUSTERING
    2.
    Type: Patent application
    Status: In force

    Publication No.: US20120143795A1

    Publication date: 2012-06-07

    Application No.: US12960015

    Filing date: 2010-12-03

    IPC classification: G06F11/07 G06F15/18

    Abstract: Techniques and systems for cross-trace scalable issue detection and clustering that scale-up trace analysis for issue detection and root-cause clustering using a machine learning based approach are described herein. These techniques enable a scalable performance analysis framework for computing devices addressing issue detection, which is designed as a multiple scale feature for learning based issue detection, and root cause clustering. In various embodiments the techniques employ a cross-trace similarity model, which is defined to hierarchically cluster problems detected in the learning based issue detection via butterflies of trigram stacks. The performance analysis framework is scalable to manage millions of traces, which include high problem complexity.

    Analyzing Program Execution
    3.
    Type: Patent application
    Status: Under examination (published)

    Publication No.: US20120278659A1

    Publication date: 2012-11-01

    Application No.: US13095336

    Filing date: 2011-04-27

    IPC classification: G06F11/36

    Abstract: A call pattern database is mined to identify frequently occurring call patterns related to program execution instances. An SVM classifier is iteratively trained based at least in part on classifications provided by human analysts; at each iteration, the SVM classifier identifies boundary cases, and requests human analysis of these cases. The trained SVM classifier is then applied to call pattern pairs to produce similarity measures between respective call patterns of each pair, and the call patterns are clustered based on the similarity measures.
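
As a rough illustration of two steps named in this abstract, the sketch below picks "boundary cases" as the samples whose classifier decision values are closest to zero, and then groups call patterns whose pairwise similarity exceeds a threshold. The function names, the use of absolute decision value as the boundary criterion, and the single-link threshold clustering are all assumptions made for illustration; the abstract does not specify how the patent performs either step, and no actual SVM is trained here.

```python
def boundary_cases(decision_values, k):
    """Return indices of the k samples nearest the decision boundary.

    Assumption: "nearest" means smallest absolute decision value, as would
    be produced by a trained binary classifier such as an SVM.
    """
    order = sorted(range(len(decision_values)),
                   key=lambda i: abs(decision_values[i]))
    return order[:k]


def cluster_by_similarity(n, similarity, threshold):
    """Group n call patterns using single-link threshold clustering.

    similarity maps index pairs (i, j) to a similarity measure; any pair
    at or above the threshold is merged into one cluster (union-find).
    """
    parent = list(range(n))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (i, j), score in similarity.items():
        if score >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical decision values for five call pattern pairs.
to_review = boundary_cases([2.1, -0.1, 0.4, -3.0, 0.05], k=2)

# Hypothetical similarity measures between four call patterns.
clusters = cluster_by_similarity(
    4, {(0, 1): 0.9, (1, 2): 0.2, (2, 3): 0.8, (0, 3): 0.1}, threshold=0.5)
```

In an iterative loop, the indices in `to_review` would be the cases sent to human analysts before the classifier is retrained.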

    Frequent pattern mining
    6.
    Type: Granted patent
    Status: In force

    Publication No.: US09348852B2

    Publication date: 2016-05-24

    Application No.: US13095415

    Filing date: 2011-04-27

    IPC classification: G06F17/30

    Abstract: A system for frequent pattern mining uses two layers of processing: a plurality of computing nodes, and a plurality of processors within each computing node. Within each computing node, the data set against which the frequent pattern mining is to be performed is stored in shared memory, accessible concurrently by each of the processors. The search space is partitioned among the computing nodes, and sub-partitioned among the processors of each computing node. If a processor completes its sub-partition, it requests another sub-partition. The partitioning and sub-partitioning may be performed dynamically, and adjusted in real time.
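
The scheduling idea in this abstract can be sketched for a single computing node as follows. This is a minimal illustration under several assumptions, not the patented system: threads stand in for processors, the transaction list held in memory stands in for the shared data set, each sub-partition is taken to be "all itemsets whose smallest item is a given item", and brute-force counting replaces a real mining algorithm. The names `mine_prefix`, `worker`, and `mine` are hypothetical. Because of Python's global interpreter lock, the sketch shows the work-request pattern rather than a real speedup.

```python
import queue
import threading
from itertools import combinations


def mine_prefix(transactions, prefix_item, items, min_support, max_len=3):
    """Count frequent itemsets whose smallest item is prefix_item."""
    later = [i for i in items if i > prefix_item]
    found = {}
    for extra in range(max_len):
        for rest in combinations(later, extra):
            candidate = frozenset((prefix_item,) + rest)
            support = sum(1 for t in transactions if candidate <= t)
            if support >= min_support:
                found[candidate] = support
    return found


def worker(transactions, items, min_support, tasks, results, lock):
    """Process sub-partitions until none remain, requesting one at a time."""
    while True:
        try:
            prefix = tasks.get_nowait()  # request another sub-partition
        except queue.Empty:
            return
        local = mine_prefix(transactions, prefix, items, min_support)
        with lock:
            results.update(local)


def mine(transactions, min_support, num_workers=4):
    """Mine frequent itemsets with dynamically assigned sub-partitions."""
    shared = [frozenset(t) for t in transactions]  # read by every worker
    items = sorted({i for t in shared for i in t})
    tasks = queue.Queue()
    for item in items:
        tasks.put(item)
    results, lock = {}, threading.Lock()
    threads = [
        threading.Thread(target=worker,
                         args=(shared, items, min_support, tasks, results, lock))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
frequent = mine(data, min_support=3)
```

A worker that finishes a cheap sub-partition simply pulls the next one from the queue, so uneven sub-partition sizes do not leave workers idle. The same pattern could be repeated one level up to distribute partitions among nodes.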

    Frequent Pattern Mining
    7.
    Type: Patent application
    Status: In force

    Publication No.: US20120278346A1

    Publication date: 2012-11-01

    Application No.: US13095415

    Filing date: 2011-04-27

    IPC classification: G06F17/30

    Abstract: A system for frequent pattern mining uses two layers of processing: a plurality of computing nodes, and a plurality of processors within each computing node. Within each computing node, the data set against which the frequent pattern mining is to be performed is stored in shared memory, accessible concurrently by each of the processors. The search space is partitioned among the computing nodes, and sub-partitioned among the processors of each computing node. If a processor completes its sub-partition, it requests another sub-partition. The partitioning and sub-partitioning may be performed dynamically, and adjusted in real time.

    Combining online and offline recognizers in a handwriting recognition system

    Publication No.: US08363950B2

    Publication date: 2013-01-29

    Application No.: US13426427

    Filing date: 2012-03-21

    IPC classification: G06K9/00 G06F17/00

    Abstract: Described is a technology by which online recognition of handwritten input data is combined with offline recognition and processing to obtain a combined recognition result. In general, the combination improves overall recognition accuracy. In one aspect, online and offline recognition is separately performed to obtain online and offline character-level recognition scores for candidates (hypotheses). A statistical analysis-based combination algorithm, an AdaBoost algorithm, and/or a neural network-based combination may determine a combination function to combine the scores to produce a result set of one or more results. Online and offline radical-level recognition may be performed. For example, an HMM recognizer may generate online radical scores used to build a radical graph, which is then rescored using the offline radical recognition scores. Paths in the rescored graph are then searched to provide the combined recognition result, e.g., corresponding to the path with the highest score.
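
To make the character-level score combination concrete, the sketch below merges per-candidate scores from two recognizers with a fixed linear weight and ranks the candidates. The linear form, the weight value, the floor score for candidates missing from one recognizer, and the name `combine_scores` are assumptions for illustration only. The abstract says the combination function may instead be determined by statistical analysis, AdaBoost, or a neural network, and none of those is implemented here.

```python
def combine_scores(online, offline, weight_online=0.5, floor=0.0):
    """Rank candidates by a weighted sum of online and offline scores.

    online and offline map candidate characters to recognition scores.
    A candidate missing from one recognizer receives the floor score.
    """
    candidates = set(online) | set(offline)
    combined = {
        c: weight_online * online.get(c, floor)
           + (1.0 - weight_online) * offline.get(c, floor)
        for c in candidates
    }
    # Highest combined score first.
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical scores for three candidate characters.
online_scores = {"木": 0.6, "本": 0.3}
offline_scores = {"木": 0.4, "本": 0.5, "未": 0.1}
ranked = combine_scores(online_scores, offline_scores)
best_candidate = ranked[0][0]
```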

    Radical Set Determination For HMM Based East Asian Character Recognition
    9.
    Type: Patent application
    Status: Lapsed

    Publication No.: US20080205761A1

    Publication date: 2008-08-28

    Application No.: US11680566

    Filing date: 2007-02-28

    IPC classification: G06K9/18

    Abstract: Exemplary techniques are described for selecting radical sets for use in probabilistic East Asian character recognition algorithms. An exemplary technique includes applying a decomposition rule to each East Asian character of the set to generate a progressive splitting graph where the progressive splitting graph comprises radicals as nodes, formulating an optimization problem to find an optimal set of radicals to represent the set of East Asian characters using maximum likelihood and minimum description length and solving the optimization problem for the optimal set of radicals. Another exemplary technique includes selecting an optimal set of radicals by using a general function that characterizes a radical with respect to other East Asian characters and a complex function that characterizes complexity of a radical.

    Feature design for character recognition
    10.
    Type: Granted patent
    Status: In force

    Publication No.: US08463043B2

    Publication date: 2013-06-11

    Application No.: US13526236

    Filing date: 2012-06-18

    IPC classification: G06K9/00 G06K9/46

    CPC classification: G06K9/00416 G06K2209/011

    Abstract: An exemplary method for online character recognition of characters includes acquiring time sequential, online ink data for a handwritten character, conditioning the ink data to produce conditioned ink data where the conditioned ink data includes information as to writing sequence of the handwritten character and extracting features from the conditioned ink data where the features include a tangent feature, a curvature feature, a local length feature, a connection point feature and an imaginary stroke feature. Such a method may determine neighborhoods for ink data and extract features for each neighborhood. An exemplary character recognition system may use various exemplary methods for training and character recognition.
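
Three of the features named in this abstract can be illustrated with simple geometry on time-ordered pen points. The definitions below are common textbook choices and are assumptions, not the patent's definitions: the tangent feature is taken as the unit direction between consecutive points, the curvature feature as the signed turning angle at each interior point, and an imaginary stroke as the pen-up segment joining the end of one stroke to the start of the next. All function names are hypothetical.

```python
import math


def tangent_features(points):
    """Unit direction vectors between consecutive points of one stroke."""
    feats = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        norm = math.hypot(dx, dy)
        # Repeated points have no direction; report a zero vector.
        feats.append((dx / norm, dy / norm) if norm else (0.0, 0.0))
    return feats


def curvature_features(points):
    """Signed turning angle (radians) at each interior point of a stroke."""
    headings = [math.atan2(y1 - y0, x1 - x0)
                for (x0, y0), (x1, y1) in zip(points, points[1:])]
    turns = []
    for a0, a1 in zip(headings, headings[1:]):
        delta = a1 - a0
        # Wrap the difference into the range [-pi, pi].
        turns.append(math.atan2(math.sin(delta), math.cos(delta)))
    return turns


def imaginary_strokes(strokes):
    """Pen-up segments from the end of each stroke to the start of the next."""
    return [(a[-1], b[0]) for a, b in zip(strokes, strokes[1:])]


# A hypothetical two-stroke character in writing order.
stroke_1 = [(0, 0), (1, 0), (1, 1)]
stroke_2 = [(2, 2), (3, 3)]
tangents = tangent_features(stroke_1)
turns = curvature_features(stroke_1)
pen_up = imaginary_strokes([stroke_1, stroke_2])
```

Because the points are kept in writing order, these features retain the writing-sequence information that the abstract says the conditioned ink data includes.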
