Structural data classification
    41.
    Type: Granted invention patent
    Status: Active

    Publication No.: US08121967B2

    Publication date: 2012-02-21

    Application No.: US12141251

    Filing date: 2008-06-18

    IPC classification: G06F17/00 G06N5/02

    CPC classification: G06N99/005

    Abstract: Techniques for classifying structural data with skewed distribution are disclosed. By way of example, a method for classifying structural input data comprises a computer system performing the following steps. Multiple classifiers are constructed, wherein each classifier is constructed on a subset of training data, using one or more selected composite features from the subset of training data. A consensus among the multiple classifiers is computed in accordance with a voting scheme such that at least a portion of the structural input data is assigned to a particular class in accordance with the computed consensus. Such techniques for structured data classification are capable of handling skewed class distribution and partial feature coverage issues.
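
The voting step described in this abstract can be illustrated with a minimal Python sketch. It assumes that each example is a set of feature names plus a label, that a classifier abstains when none of its selected features is present (one reading of partial feature coverage), and that the consensus is a simple majority vote. The function names, feature names, and toy data are hypothetical and are not taken from the patent.

```python
from collections import Counter

def build_classifier(subset, features):
    """Map each selected feature to the majority label among subset examples containing it."""
    votes = {f: Counter() for f in features}
    for present, label in subset:
        for f in features:
            if f in present:
                votes[f][label] += 1
    # Features never seen in this subset are dropped, so they cannot vote.
    return {f: c.most_common(1)[0][0] for f, c in votes.items() if c}

def classify(classifier, present):
    """Return a label, or None to abstain when no selected feature covers the input."""
    labels = Counter(classifier[f] for f in classifier if f in present)
    return labels.most_common(1)[0][0] if labels else None

def consensus(classifiers, present, default):
    """Majority vote over the classifiers that do not abstain."""
    ballots = Counter()
    for clf in classifiers:
        label = classify(clf, present)
        if label is not None:
            ballots[label] += 1
    return ballots.most_common(1)[0][0] if ballots else default

# Hypothetical training data: (features present, class label).
training = [
    ({"ring", "amine"}, "active"),
    ({"ring"}, "active"),
    ({"chain"}, "inactive"),
    ({"chain", "ester"}, "inactive"),
    ({"ester"}, "inactive"),
    ({"amine"}, "active"),
]
subsets = [training[0:3], training[2:5], training[3:6]]
feature_sets = [["ring", "chain"], ["chain", "ester"], ["ester", "amine"]]
classifiers = [build_classifier(s, f) for s, f in zip(subsets, feature_sets)]
```

Calling `consensus(classifiers, {"ring", "amine"}, "inactive")` lets the first and third classifiers vote while the second abstains, because it was built on features the input does not contain.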

    Automatically and adaptively determining execution plans for queries with parameter markers
    42.
    Type: Granted invention patent
    Status: Lapsed

    Publication No.: US07958113B2

    Publication date: 2011-06-07

    Application No.: US12125221

    Filing date: 2008-05-22

    IPC classification: G06F7/00 G06F17/30 G06F15/16

    CPC classification: G06F17/30469

    Abstract: A method and system for automatically and adaptively determining query execution plans for parametric queries. A first classifier trained by an initial set of training points is generated. A query workload and/or database statistics are dynamically updated. A new set of training points is collected off-line. Using the new set of training points, the first classifier is modified into a second classifier. A database query is received at a runtime subsequent to the off-line phase. The query includes predicates having parameter markers bound to actual values. The predicates are associated with selectivities. A mapping of the selectivities into a plan determines the query execution plan. The determined query execution plan is included in an augmented set of training points, where the augmented set includes the initial set and the new set.
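
The mapping from selectivities to a plan can be pictured with a small Python sketch. It uses a nearest-neighbour lookup as a stand-in classifier, which is an assumption: the abstract does not say which kind of classifier is used. The class name, method names, and plan labels are hypothetical.

```python
import math

class PlanClassifier:
    """Hypothetical nearest-neighbour stand-in for the classifier in the abstract."""

    def __init__(self, training_points):
        # Each training point is (selectivity vector, plan label).
        self.points = list(training_points)

    def retrain(self, new_points):
        """Build a second classifier from the initial points plus newly collected ones."""
        return PlanClassifier(self.points + list(new_points))

    def choose_plan(self, selectivities):
        """Map a selectivity vector to the plan of the closest training point."""
        nearest = min(self.points, key=lambda p: math.dist(p[0], selectivities))
        return nearest[1]

# Initial training set, then an augmented set after an off-line collection phase.
first = PlanClassifier([
    ((0.01, 0.02), "index_nested_loop"),
    ((0.90, 0.80), "hash_join"),
])
second = first.retrain([((0.50, 0.50), "merge_join")])
```

With this toy data, the selectivity vector `(0.5, 0.6)` maps to `hash_join` under the first classifier and to `merge_join` under the second, which shows how new training points can change the chosen plan.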

    Electrode tuning method and apparatus for a layered heater structure
    44.
    Type: Granted invention patent
    Status: Active

    Publication No.: US07777160B2

    Publication date: 2010-08-17

    Application No.: US12002381

    Filing date: 2007-12-17

    IPC classification: H05B3/68 C23C16/00

    Abstract: A layered heater structure including an electrode layer, and a localized tuning method for tuning the electrode layer of a layered heater structure with high precision, are provided. The localized tuning method tunes the electrode layer to its proper local resistance to minimize temperature offsets on the heater surface and thus provide a desired thermal profile. This is in marked contrast to conventional, non-localized resistance tuning approaches based on thickness trimming practices, such as grinding or blasting, or on resistivity adjustment, such as local heat treatment.

    SYSTEM AND METHOD FOR SCALABLE COST-SENSITIVE LEARNING
    45.
    Type: Invention patent application
    Status: Active

    Publication No.: US20100169252A1

    Publication date: 2010-07-01

    Application No.: US12690502

    Filing date: 2010-01-20

    IPC classification: G06N3/12 G06F15/18

    CPC classification: G06N99/005

    Abstract: A method (and structure) for processing an inductive learning model for a dataset of examples includes dividing the dataset of examples into a plurality of subsets of data and generating, using a processor on a computer, a learning model using examples of a first subset of data of the plurality of subsets of data. The learning model generated for the first subset comprises an initial stage of an evolving aggregate learning model (ensemble model) for an entirety of the dataset, the ensemble model thereby providing an evolving estimated learning model for the entirety of the dataset if all the subsets were to be processed. The generating of the learning model using data from a subset includes calculating a value for at least one parameter that provides an objective indication of an adequacy of a current stage of the ensemble model.
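
The idea of growing an ensemble subset by subset, while tracking a parameter that indicates whether the current stage is adequate, can be sketched in Python as follows. This is only an illustration under strong assumptions: the base learner is a trivial mean estimator, and the adequacy parameter is taken to be the standard error of the per-subset estimates. Neither choice comes from the abstract, and all names are hypothetical.

```python
import math
import statistics

def learn_mean(subset):
    """Toy base learner: the 'model' is just the mean target value of its subset."""
    return statistics.mean(subset)

def standard_error(models):
    """Assumed adequacy parameter: standard error of the per-subset estimates."""
    if len(models) < 2:
        return math.inf
    return statistics.stdev(models) / math.sqrt(len(models))

def progressive_ensemble(subsets, tolerance):
    """Add one model per subset and stop once the ensemble looks adequate."""
    models = []
    for subset in subsets:
        models.append(learn_mean(subset))
        if standard_error(models) <= tolerance:
            break  # remaining subsets are never processed
    return models

def ensemble_predict(models):
    """Aggregate the stage models by averaging them."""
    return statistics.mean(models)

subsets = [[10, 12], [11, 13], [12, 12], [11, 12]]
models = progressive_ensemble(subsets, tolerance=0.4)
```

In this toy run the loop stops after the third subset, so the fourth subset is never read. The early stop is the point of the sketch: the adequacy value lets the process end before all subsets have been processed.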

System and method for tree structure indexing that provides at least one constraint sequence to preserve query-equivalence between XML document structure match and subsequence match
    46.
    Type: Granted invention patent
    Status: Lapsed

    Publication No.: US07475070B2

    Publication date: 2009-01-06

    Application No.: US11035889

    Filing date: 2005-01-14

    IPC classification: G06F17/30 G06F17/00

    Abstract: Sequence-based XML indexing aims at avoiding expensive join operations in query processing. It transforms structured XML data into sequences so that a structured query can be answered holistically through subsequence matching. Herein, there is addressed the problem of query equivalence with respect to this transformation, and there is introduced a performance-oriented principle for sequencing tree structures. With query equivalence, XML queries can be performed through subsequence matching without join operations, post-processing, or other special handling for problems such as false alarms. There is identified a class of sequencing methods for this purpose, and there is presented a novel subsequence matching algorithm that observes query equivalence. Also introduced is a performance-oriented principle to guide the sequencing of tree structures. For any given XML dataset, the principle finds an optimal sequencing strategy according to its schema and its data distribution; there is thus presented herein a novel method that realizes this principle.
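
The basic transformation that this abstract builds on, turning a tree into a sequence and answering a structural query by subsequence matching, can be sketched in Python. The sketch assumes a preorder sequence of root-to-node label paths as the encoding and uses a naive matcher. It does not implement the constraint sequences of the patent, so on other inputs it could report the kind of false alarms the abstract mentions. All names are hypothetical.

```python
def sequence(tree, prefix=()):
    """Preorder list of root-to-node label paths for a (label, children) tree."""
    label, children = tree
    path = prefix + (label,)
    result = [path]
    for child in children:
        result.extend(sequence(child, path))
    return result

def is_subsequence(query, data):
    """True if the query items appear in data in order, not necessarily contiguously."""
    remaining = iter(data)
    # `item in remaining` advances the iterator, so order is enforced.
    return all(item in remaining for item in query)

document = ("book", [
    ("title", []),
    ("author", [("name", [])]),
    ("year", []),
])
query = ("book", [("author", [("name", [])])])

doc_seq = sequence(document)
query_seq = sequence(query)
```

Here `is_subsequence(query_seq, doc_seq)` holds because the three query paths occur in the document sequence in the same order, with the `title` path skipped in between. No join over separate element lists is needed.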

    System and method for continuous diagnosis of data streams
    47.
    Type: Granted invention patent
    Status: Lapsed

    Publication No.: US07464068B2

    Publication date: 2008-12-09

    Application No.: US10880913

    Filing date: 2004-06-30

    IPC classification: G06F17/30

    Abstract: In connection with the mining of time-evolving data streams, there is provided a general framework that mines changes and reconstructs models from a data stream with unlabeled instances or a limited number of labeled instances. In particular, there are defined herein statistical profiling methods that extend a classification tree in order to guess the percentage of drifts in the data stream without any labeled data. Exact error can be estimated by actively sampling a small number of true labels. If the estimated error is significantly higher than empirical expectations, a small number of true labels is preferably re-sampled to reconstruct the decision tree from the leaf node level.
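
The step of estimating error from a small sample of true labels can be sketched in Python. The sketch assumes uniform random sampling and a normal-approximation 95% margin, and it flags the model for reconstruction only when the lower end of the interval exceeds the expected error. These choices and all names are assumptions for illustration; the abstract does not specify them.

```python
import math
import random

def estimate_error(predictions, true_label_of, sample_size, rng):
    """Estimate the error rate from a small random sample of true labels.

    Returns (error_rate, margin), where margin is a normal-approximation
    95% half-width. Sampling scheme and margin are illustrative assumptions.
    """
    indices = rng.sample(range(len(predictions)), sample_size)
    wrong = sum(predictions[i] != true_label_of(i) for i in indices)
    rate = wrong / sample_size
    margin = 1.96 * math.sqrt(rate * (1.0 - rate) / sample_size)
    return rate, margin

def needs_rebuild(rate, margin, expected_error):
    """Flag the model when even the optimistic end of the interval exceeds expectations."""
    return rate - margin > expected_error

# Hypothetical stream segment where the model's predictions no longer match reality.
predictions = [0] * 100
drifted_truth = [1] * 100
rate, margin = estimate_error(predictions, lambda i: drifted_truth[i], 20, random.Random(0))
```

Only 20 of the 100 labels are requested here, which reflects the point of the abstract: true labels are costly, so they are sampled sparingly and only used to confirm a suspected drift.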

    Etch resistant wafer processing apparatus and method for producing the same
    48.
    Type: Granted invention patent
    Status: Lapsed

    Publication No.: US07446284B2

    Publication date: 2008-11-04

    Application No.: US11322809

    Filing date: 2005-12-30

    IPC classification: H05B3/68 H01L23/58

    Abstract: A wafer processing apparatus is fabricated by depositing a film electrode onto the surface of a base substrate; the structure is then overcoated with a protective coating film layer comprising at least one of a nitride, carbide, carbonitride or oxynitride of elements selected from a group consisting of B, Al, Si, Ga, refractory hard metals, transition metals, and combinations thereof. The film electrode has a coefficient of thermal expansion (CTE) that closely matches the CTE of the underlying base substrate layer as well as the CTE of the protective coating layer.

    System and method for indexing weighted-sequences in large databases
    49.
    Type: Granted invention patent
    Status: Active

    Publication No.: US07418455B2

    Publication date: 2008-08-26

    Application No.: US10723229

    Filing date: 2003-11-26

    IPC classification: G06F7/00 G06F17/00

    Abstract: The present invention provides an index structure for managing weighted-sequences in large databases. A weighted-sequence is defined as a two-dimensional structure in which each element in the sequence is associated with a weight. A series of network events, for instance, is a weighted-sequence because each event is associated with a timestamp. Querying a large sequence database by events' occurrence patterns is a first step towards understanding the temporal causal relationships among the events. The index structure proposed herein enables the efficient retrieval from the database of all subsequences (contiguous and non-contiguous) that match a given query sequence both by events and by weights. The index structure also takes into consideration the nonuniform frequency distribution of events in the sequence data.
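
What it means to match a query against a weighted-sequence "both by events and by weights" can be shown with a brute-force Python sketch. It assumes that weights are integer timestamps and that a match requires the same time gaps between events as in the query. This is one plausible reading of the abstract, not the patented index structure, and the function name is hypothetical.

```python
def find_matches(data, query):
    """Return the start timestamps at which the query matches the data.

    data and query are lists of (event, timestamp). A match at start time t0
    requires every query event to occur in data at t0 plus its offset from the
    first query event; other events may occur in between (non-contiguous match).
    """
    occurrences = set(data)
    first_event, first_time = query[0]
    offsets = [(event, time - first_time) for event, time in query]
    return [
        t0
        for event, t0 in data
        if event == first_event
        and all((e, t0 + dt) in occurrences for e, dt in offsets)
    ]

# Hypothetical network event log: (event name, timestamp).
log = [("a", 1), ("b", 2), ("c", 4), ("a", 5), ("b", 6), ("c", 7)]
```

For example, the query `[("a", 0), ("c", 3)]` matches only at timestamp 1, because the second `a` at timestamp 5 is followed by `c` after a gap of 2 rather than 3. This scan touches every candidate start, which is the cost an index of the kind described in the abstract is meant to avoid.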

    SYSTEMS AND METHODS FOR SEQUENTIAL MODELING IN LESS THAN ONE SEQUENTIAL SCAN
    50.
    Type: Invention patent application
    Status: Lapsed

    Publication No.: US20080052255A1

    Publication date: 2008-02-28

    Application No.: US11931129

    Filing date: 2007-10-31

    IPC classification: G06F15/18 G06N7/00

    CPC classification: G06N99/005 Y10S707/99931

    Abstract: Most recent research on scalable inductive learning over very large streaming datasets focuses on eliminating memory constraints and reducing the number of sequential data scans. However, state-of-the-art algorithms still require multiple scans over the dataset and use sophisticated control mechanisms and data structures. There is discussed herein a general inductive learning framework that scans the dataset exactly once. Then, there is proposed an extension based on Hoeffding's inequality that scans the dataset less than once. The proposed frameworks are applicable to a wide range of inductive learners.
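
Hoeffding's inequality, which this abstract names as the basis for scanning less than the full dataset, can be made concrete with a short Python sketch. For n independent observations of a quantity with range R, the sample mean is within epsilon = sqrt(R^2 ln(1/delta) / (2n)) of the true mean on one side with probability at least 1 - delta. How the patent applies the bound is not stated in the abstract, so the early-stopping interpretation here is an assumption, and the function names are hypothetical.

```python
import math

def hoeffding_bound(value_range, delta, n):
    """One-sided Hoeffding deviation after n observations, holding with probability 1 - delta."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))

def examples_needed(value_range, delta, epsilon):
    """Smallest n for which the Hoeffding deviation is at most epsilon."""
    return math.ceil(value_range ** 2 * math.log(1.0 / delta) / (2.0 * epsilon ** 2))

# For a quantity in [0, 1], 95% confidence, and a tolerance of 0.1:
n = examples_needed(value_range=1.0, delta=0.05, epsilon=0.1)
```

Under these assumptions `n` is 150 regardless of how large the stream is, which is why a learner relying on such a bound may be able to stop before one full pass over the data.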
