Abstract:
A general framework is described for mining concept-drifting data streams using weighted ensemble classifiers. An ensemble of classification models, such as C4.5, RIPPER, or naive Bayesian classifiers, is trained from sequential chunks of the data stream. The classifiers in the ensemble are judiciously weighted based on their expected classification accuracy on the test data under the time-evolving environment. Thus, the ensemble approach improves both the efficiency of learning the model and the accuracy of classification. An empirical study shows that the proposed methods have a substantial advantage over single-classifier approaches in prediction accuracy, and that the ensemble framework is effective for a variety of classification models.
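The chunk-wise weighting idea can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the base learner is a toy one-dimensional threshold rule rather than C4.5 or RIPPER, the names `train_threshold`, `build_ensemble` and `predict` are hypothetical, and accuracy on the most recent chunk stands in as a proxy for the expected accuracy on test data.

```python
from collections import defaultdict


def train_threshold(chunk):
    """Toy base learner (assumption): best single threshold on a 1-D feature.

    chunk is a list of (x, y) pairs with y in {0, 1}.
    """
    best = None
    for t in sorted({x for x, _ in chunk}):
        for polarity in (0, 1):  # label predicted when x >= t
            hits = sum((polarity if x >= t else 1 - polarity) == y
                       for x, y in chunk)
            acc = hits / len(chunk)
            if best is None or acc > best[0]:
                best = (acc, t, polarity)
    _, t, polarity = best
    return lambda x: polarity if x >= t else 1 - polarity


def accuracy(model, chunk):
    return sum(model(x) == y for x, y in chunk) / len(chunk)


def build_ensemble(chunks, max_models=3):
    """Train one model per recent chunk and weight it by its accuracy on
    the newest chunk (a stand-in for expected accuracy on test data)."""
    latest = chunks[-1]
    models = [train_threshold(c) for c in chunks[-max_models:]]
    return [(m, accuracy(m, latest)) for m in models]


def predict(ensemble, x):
    """Weighted vote over the ensemble members."""
    votes = defaultdict(float)
    for model, weight in ensemble:
        votes[model(x)] += weight
    return max(votes, key=votes.get)
```

When the concept drifts, models trained on outdated chunks score poorly on the newest chunk and therefore contribute little to the vote, which is the behaviour the weighting is meant to capture.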
Abstract:
In connection with the mining of time-evolving data streams, a general framework is described that mines changes and reconstructs models from a data stream with unlabeled instances or a limited number of labeled instances. In particular, statistical profiling methods are defined herein that extend a classification tree in order to estimate the percentage of drift in the data stream without any labeled data. The exact error can be estimated by actively sampling a small number of true labels. If the estimated error is significantly higher than empirical expectations, a small number of true labels is preferably re-sampled to reconstruct the decision tree from the leaf node level.
Abstract:
A system for the real-time and in-situ macro and micro measurement of in-plane deformations of a microelectronic package or the like comprises a closed environmental chamber (3) within which a test sample may be subjected to thermal cycle loading and/or humidity loading, an incoherent white light source (6) for illuminating the sample, a long-working-distance microscope (2) and image acquisition means (7) for capturing speckle patterns from the surface of the sample during loading, and a control (8) for automating the co-ordination of the various components and for analysing the speckle images using digital image speckle correlation.
Abstract:
An improved microfluidic system with an improved microfluidic valve module is disclosed. The microfluidic system includes a microfluidic chip and one or more valve modules. The microfluidic chip has microfluidic channels and one or more cavities formed in the chip, each of the one or more cavities designed to receive one of the one or more valve modules. Each of the one or more valve modules includes a first layer, a control layer and one or more second layers. The first layer includes a deformable material. The control layer has a microfluidic control chamber formed in a portion of it. The control layer is also located adjoining the first layer and the deformable material of the first layer forms a deformable surface of the control chamber. The one or more second layers include an input microfluidic channel and an output microfluidic channel. The input microfluidic channel and the output microfluidic channel are fluidically coupled to the microfluidic control chamber, and fluid flow through the input microfluidic channel, the microfluidic control chamber and the output microfluidic channel is controlled in response to a force deforming the deformable material of the first layer at least a predetermined amount.
Abstract:
A horizontal anomaly detection method includes receiving a plurality of objects described in a plurality of information sources, wherein each individual information source captures a plurality of similarity relationships between the objects, combining the information sources to determine a similarity matrix whose entries represent quantitative scores of similarity between pairs of the objects, and identifying at least one horizontal anomaly of the objects within the similarity matrix, wherein the horizontal anomalies are anomalous relationships across the plurality of information sources.
Abstract:
An interconnect structure, an interconnect structure for interconnecting first and second components, an interconnect structure for interconnecting a multiple component stack and a substrate, and a method of fabricating an interconnect structure. The interconnect structure comprises a base portion formed on a mounting surface of a first component; a pillar portion extending from the base portion and substantially perpendicularly to the mounting surface; and a head portion formed on the pillar portion and having larger lateral dimensions than the pillar portion; wherein the base portion and the pillar portion are integrally formed of a homogeneous material.
Abstract:
A method (and structure) for processing an inductive learning model for a dataset of examples, includes dividing the dataset of examples into a plurality of subsets of data and generating, using a processor on a computer, a learning model using examples of a first subset of data of the plurality of subsets of data. The learning model being generated for the first subset comprises an initial stage of an evolving aggregate learning model (ensemble model) for an entirety of the dataset, the ensemble model thereby providing an evolving estimated learning model for the entirety of the dataset if all the subsets were to be processed. The generating of the learning model using data from a subset includes calculating a value for at least one parameter that provides an objective indication of an adequacy of a current stage of the ensemble model.
Abstract:
Unlike traditional clustering methods that focus on grouping objects with similar values on a set of dimensions, clustering by pattern similarity finds objects that exhibit a coherent pattern of rise and fall in subspaces. Pattern-based clustering extends the concept of traditional clustering and benefits a wide range of applications, including e-Commerce target marketing, bioinformatics (large-scale scientific data analysis), and automatic computing (web usage analysis). However, state-of-the-art pattern-based clustering methods (e.g., the pCluster algorithm) can only handle datasets of thousands of records, which makes them inappropriate for many real-life applications. Furthermore, besides the huge data volume, many datasets are also characterized by their sequentiality; for instance, customer purchase records and network event logs are usually modeled as data sequences. Hence, it becomes important to enable pattern-based clustering methods i) to handle large datasets, and ii) to discover pattern similarity embedded in data sequences. There is presented herein a novel method that offers this capability.
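To make "a coherent pattern of rise and fall" concrete, the following Python sketch uses the pScore test from the pCluster literature, which the abstract cites as prior art. It is not the novel method presented in the abstract, and the function names `p_score` and `is_delta_pcluster` are hypothetical. Two objects x and y are coherent on columns a and b when |(x_a - x_b) - (y_a - y_b)| is at most a user-chosen threshold delta.

```python
from itertools import combinations


def p_score(x, y, a, b):
    """pScore of the 2x2 submatrix formed by objects x, y and columns a, b."""
    return abs((x[a] - x[b]) - (y[a] - y[b]))


def is_delta_pcluster(rows, columns, delta):
    """True if every pair of rows is coherent on every pair of columns.

    rows    : list of equal-length numeric sequences (the objects)
    columns : column indices defining the subspace
    delta   : maximum tolerated pScore (user-chosen assumption)
    """
    for x, y in combinations(rows, 2):
        for a, b in combinations(columns, 2):
            if p_score(x, y, a, b) > delta:
                return False
    return True
```

Objects such as (1, 5, 3) and (11, 15, 13) are far apart in value yet rise and fall together, so they pass the test with delta = 0, whereas a distance-based method would not group them. The exhaustive pairwise check is quadratic in both rows and columns, which hints at why such methods struggle with large datasets.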
Abstract:
Most recent research on scalable inductive learning over very large streaming datasets focuses on eliminating memory constraints and reducing the number of sequential data scans. However, state-of-the-art algorithms still require multiple scans over the dataset and use sophisticated control mechanisms and data structures. There is discussed herein a general inductive learning framework that scans the dataset exactly once. Then, there is proposed an extension based on Hoeffding's inequality that scans the dataset less than once. The proposed frameworks are applicable to a wide range of inductive learners.
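As a rough illustration of how Hoeffding's inequality can justify scanning less than the full dataset, consider the standard bound: for n independent observations of a variable with range R, the sample mean lies within epsilon = sqrt(R^2 ln(1/delta) / (2n)) of the true mean with probability at least 1 - delta. The Python sketch below applies this bound to a running mean and stops reading once the bound is tight enough. The function names and the stopping rule are assumptions for illustration, not the framework's actual procedure.

```python
import math


def hoeffding_epsilon(value_range, n, delta):
    """Hoeffding bound on |sample mean - true mean| after n observations."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def samples_needed(value_range, epsilon, delta):
    """Smallest n for which the bound is at most epsilon."""
    return math.ceil(value_range ** 2 * math.log(1.0 / delta)
                     / (2.0 * epsilon ** 2))


def mean_with_early_stop(stream, value_range, epsilon, delta):
    """Consume the stream only until the Hoeffding bound reaches epsilon.

    Returns (estimated mean, number of items actually read).
    Assumes the items are independent and lie within value_range.
    """
    total, n = 0.0, 0
    for x in stream:
        total += x
        n += 1
        if hoeffding_epsilon(value_range, n, delta) <= epsilon:
            break
    if n == 0:
        raise ValueError("stream is empty")
    return total / n, n
```

With values in [0, 1], epsilon = 0.05 and delta = 0.01, the scan can stop after 922 items regardless of how long the stream is, since the required sample size depends only on R, epsilon and delta.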
Abstract:
A high thermal conductivity/low coefficient of thermal expansion thermally conductive composite material for heat sinks and an electronic apparatus comprising a heat sink formed from such composites. The thermally conductive composite comprises a high thermal conductivity layer disposed between two substrates having a low coefficient of thermal expansion. The substrates have a low coefficient of thermal expansion and a relatively high modulus of elasticity, and the composite exhibits high thermal conductivity and low coefficient of thermal expansion even for composites with high loadings of the thermally conductive material.