Abstract:
In certain embodiments, generating a hierarchy of terms includes accessing a corpus comprising terms. The following is performed for one or more terms to yield parent-child relationships: one or more parent terms of a term are identified according to directional affinity; and one or more parent-child relationships are established from the parent terms and each term. A hierarchical graph is automatically generated from the parent-child relationships.
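The abstract does not define directional affinity or the rule for picking parents, so the following is only a minimal sketch under two stated assumptions: the directional affinity from term A to term B is taken to be the share of documents containing A that also contain B, and B is treated as a parent of A when that affinity meets a hypothetical `threshold` and exceeds the affinity in the reverse direction. The function names and the toy corpus are illustrative.

```python
from collections import defaultdict

def directional_affinity(docs, a, b):
    """Assumed definition: share of documents containing `a` that also contain `b`."""
    with_a = [d for d in docs if a in d]
    if not with_a:
        return 0.0
    return sum(1 for d in with_a if b in d) / len(with_a)

def build_hierarchy(docs, threshold=0.8):
    """Return a parent -> children mapping (the edges of the hierarchical graph)."""
    terms = sorted(set().union(*docs))
    children = defaultdict(list)
    for term in terms:
        for candidate in terms:
            if candidate == term:
                continue
            forward = directional_affinity(docs, term, candidate)
            backward = directional_affinity(docs, candidate, term)
            # Assumed parent rule: the term almost always appears with the
            # candidate, but the candidate also appears without the term.
            if forward >= threshold and backward < forward:
                children[candidate].append(term)
    return dict(children)

# Toy corpus: each document is the set of terms it contains.
corpus = [
    {"animal", "dog"},
    {"animal", "dog", "poodle"},
    {"animal", "cat"},
    {"animal"},
]
hierarchy = build_hierarchy(corpus)
```

Under these assumptions a term may have more than one parent (here "poodle" sits under both "dog" and "animal"), which is why the result is kept as a graph rather than a strict tree.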
Abstract:
In one embodiment, modeling topics includes accessing a corpus comprising documents that include words. Words of a document are selected as keywords of the document. The documents are clustered according to the keywords to yield clusters, where each cluster corresponds to a topic. A statistical distribution is generated for a cluster from words of the documents of the cluster. A topic is modeled using the statistical distribution generated for the cluster corresponding to the topic.
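The steps in this abstract can be sketched as follows. The abstract does not say how keywords are chosen, how documents are clustered, or which distribution is used, so this sketch assumes TF-IDF scoring for keyword selection, grouping of documents that share the same keyword tuple as a stand-in for clustering, and a normalized word-frequency table as the statistical distribution. All names are hypothetical.

```python
import math
from collections import Counter, defaultdict

def keywords(docs, k=1):
    """Pick the k highest TF-IDF words of each document (assumed selection rule)."""
    n = len(docs)
    doc_freq = Counter(w for d in docs for w in set(d))
    result = []
    for d in docs:
        tf = Counter(d)
        # Sort by descending TF-IDF, breaking ties alphabetically.
        ranked = sorted(tf, key=lambda w: (-tf[w] * math.log(n / doc_freq[w]), w))
        result.append(tuple(ranked[:k]))
    return result

def model_topics(docs, k=1):
    """Cluster documents by keyword tuple and return one word distribution per cluster."""
    clusters = defaultdict(list)
    for d, kw in zip(docs, keywords(docs, k)):
        clusters[kw].append(d)
    topics = {}
    for kw, members in clusters.items():
        counts = Counter(w for d in members for w in d)
        total = sum(counts.values())
        topics[kw] = {w: c / total for w, c in counts.items()}
    return topics

# Toy corpus: each document is a list of words.
docs = [
    ["bdd", "bdd", "bdd", "node"],
    ["bdd", "bdd", "bdd", "order"],
    ["sensor", "sensor", "sensor", "sample"],
    ["sensor", "sensor", "sensor", "minterm"],
]
topics = model_topics(docs)
```

Each entry of `topics` maps a cluster (identified here by its keyword tuple) to the word distribution that models the corresponding topic.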
Abstract:
According to certain embodiments, a set of samples of sensor data is accessed. The set of samples records measurements taken by one or more sensors. Each sample is represented as a minterm to yield a set of minterms. A characteristic function is generated from the set of minterms. The characteristic function indicates whether a given minterm is a member of the set of minterms.
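As a minimal sketch of the idea, assume each sample is a tuple of small non-negative integer readings and that each reading is encoded with a fixed number of bits (`bits_per_value` is a hypothetical parameter). A minterm is then the full bit assignment for one sample, and the characteristic function returns 1 exactly for the minterms that were recorded. A plain set stands in for whatever compact representation an implementation would actually use.

```python
def to_minterm(sample, bits_per_value=4):
    """Encode a tuple of integer readings as one tuple of bits (a minterm)."""
    bits = []
    for value in sample:
        if not 0 <= value < 2 ** bits_per_value:
            raise ValueError(f"reading {value} does not fit in {bits_per_value} bits")
        # Most significant bit first.
        bits.extend((value >> i) & 1 for i in reversed(range(bits_per_value)))
    return tuple(bits)

def characteristic_function(minterms):
    """Return chi, where chi(m) is 1 if m is in the recorded set and 0 otherwise."""
    members = frozenset(minterms)

    def chi(minterm):
        return 1 if minterm in members else 0

    return chi

# Toy samples: (sensor_a, sensor_b) readings.
samples = [(3, 10), (3, 11), (4, 10)]
chi = characteristic_function(to_minterm(s) for s in samples)
```

Here `chi(to_minterm((3, 10)))` is 1 because that sample was recorded, while `chi(to_minterm((5, 5)))` is 0.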
Abstract:
In particular embodiments, a method includes receiving data sets, constructing a first binary decision diagram (BDD) representing the data sets, iteratively adding data from the data sets to the first BDD until a compression rate of the first BDD reaches a threshold compression rate, constructing a second BDD representing data from the data sets received after the compression rate of the first BDD reaches the threshold compression rate, and iteratively adding data from the data sets to the second BDD.
Abstract:
In particular embodiments, a method includes receiving from a remote system a binary decision diagram (BDD) representing data streams from sensors, an input, and a first hash code; transforming the received BDD to a second arithmetic function by performing an arithmetic transformation on the received BDD; calculating a second hash code from the second arithmetic function and the input; and indicating that the received BDD is uncorrupted data if the first hash code equals the second hash code, and corrupted data otherwise.
Abstract:
In particular embodiments, a method includes accessing first binary decision diagrams (BDDs) representing data streams from sensors, selecting portions from the first BDDs based on ease-of-analysis, and constructing a second BDD by performing an OR operation between the selected portions of the first BDDs.
Abstract:
According to certain embodiments, one or more sets of model samples of model sensor data are accessed. Each set comprises one or more model samples corresponding to an annotation of one or more annotations. The following are performed for each set to yield one or more annotated model characteristic functions: each model sample of the set is represented as a model minterm to yield a set of model minterms; a model characteristic function is generated from the set of model minterms, the model characteristic function indicating whether a given minterm is a member of the set of model minterms; and the model characteristic function is annotated to yield an annotated model characteristic function. A general model characteristic function is generated from the one or more annotated model characteristic functions.
Abstract:
One embodiment accesses a binary decision diagram (BDD) representing a function having $n$ variables, where $n \geq 2$, wherein the BDD comprises $n$ layers corresponding to the $n$ variables, respectively; separates the $n$ variables into $\frac{n!}{2^{\lfloor n/2 \rfloor}}$ groups, wherein each group comprises $\lceil n/2 \rceil$ ordered sets, and each set in each group comprises 1 or 2 variables; for each of the $\frac{n!}{2^{\lfloor n/2 \rfloor}}$ groups, determines a locally optimum variable order that yields a smallest size among $2^{\lfloor n/2 \rfloor}$ different variable orders of the BDD obtained within the group; and selects from $\frac{n!}{2^{\lfloor n/2 \rfloor}}$ locally optimum variable orders corresponding to the $\frac{n!}{2^{\lfloor n/2 \rfloor}}$ groups an optimum variable order of the BDD that yields a smallest size among the $\frac{n!}{2^{\lfloor n/2 \rfloor}}$ locally optimum variable orders.
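The counts in this abstract can be checked with a few lines of arithmetic. The sketch below only computes the three quantities, reading the number of groups as n! divided by 2 to the power floor(n/2), the number of ordered sets per group as ceil(n/2), and the number of orders examined per group as 2 to the power floor(n/2). It does not perform any BDD reordering, and the function name is hypothetical.

```python
from math import factorial

def reorder_search_counts(n):
    """Return (groups, sets_per_group, orders_per_group) for n BDD variables."""
    if n < 2:
        raise ValueError("the abstract requires n >= 2")
    orders_per_group = 2 ** (n // 2)              # 2^floor(n/2)
    groups = factorial(n) // orders_per_group     # n! / 2^floor(n/2)
    sets_per_group = -(-n // 2)                   # ceil(n/2)
    return groups, sets_per_group, orders_per_group

counts_4 = reorder_search_counts(4)  # (6, 2, 4)
counts_5 = reorder_search_counts(5)  # (30, 3, 4)
```

For every n, the number of groups times the number of orders per group equals n!, which is consistent with the groups together accounting for every possible variable order.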
Abstract:
According to certain embodiments, a first Boolean function and a second Boolean function are received. The first Boolean function represents a first data set, and the second Boolean function represents a second data set. The first Boolean function and the second Boolean function are transformed to a first arithmetic function and a second arithmetic function, respectively. A first hash code and a second hash code are calculated from the first arithmetic function and the second arithmetic function, respectively. If the first hash code equals the second hash code, the first Boolean function and the second Boolean function are designated as equivalent; otherwise, the first Boolean function and the second Boolean function are designated as not equivalent.
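The abstract does not specify the arithmetic transformation or the hash. The sketch below assumes one common choice: the multilinear arithmetic form obtained by Shannon expansion, A[f] = x * A[f with x = 1] + (1 - x) * A[f with x = 0], evaluated at an integer point modulo a prime. The prime, the function names, and the representation of Boolean functions as Python callables are all hypothetical, and the recursion is exponential in the number of variables, so it is suitable only for small illustrations.

```python
import random

PRIME = 2_147_483_647  # 2**31 - 1; an arbitrary modulus chosen for this sketch

def hash_code(func, variables, point, assignment=()):
    """Evaluate the multilinear arithmetic form of `func` at `point`, modulo PRIME."""
    depth = len(assignment)
    if depth == len(variables):
        # All variables fixed to Boolean values: the function value is 0 or 1.
        return int(bool(func(*assignment)))
    x = point[variables[depth]]
    high = hash_code(func, variables, point, assignment + (True,))
    low = hash_code(func, variables, point, assignment + (False,))
    return (x * high + (1 - x) * low) % PRIME

def probably_equivalent(f, g, variables, seed=0):
    """Compare hash codes of f and g at one randomly chosen integer point."""
    rng = random.Random(seed)
    point = {name: rng.randrange(PRIME) for name in variables}
    return hash_code(f, variables, point) == hash_code(g, variables, point)

# De Morgan: NOT (a AND b) is the same function as (NOT a) OR (NOT b).
f = lambda a, b: not (a and b)
g = lambda a, b: (not a) or (not b)
h = lambda a, b: a or b

same = probably_equivalent(f, g, ["a", "b"])
```

Equivalent functions always produce the same hash code under this scheme. Non-equivalent functions can collide at an unlucky point, so in this sketch equal hash codes indicate equivalence only with high probability.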
Abstract:
In one embodiment, determining a document specificity includes accessing a record that records the clusters of documents. The number of themes of a document is determined from the number of clusters of the document. The specificity of the document is determined from the number of themes.
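A minimal sketch of this idea, assuming the record is a mapping from a cluster name to the set of document identifiers in that cluster, that each cluster a document belongs to counts as one theme, and that specificity is taken as the reciprocal of the number of themes. The abstract states only that specificity is determined from the theme count, so the reciprocal is an illustrative choice, as are the names.

```python
def theme_counts(cluster_record):
    """Count, for each document, how many clusters it appears in."""
    counts = {}
    for documents in cluster_record.values():
        for doc in documents:
            counts[doc] = counts.get(doc, 0) + 1
    return counts

def specificity(cluster_record, doc):
    """Assumed measure: fewer themes means a more specific document."""
    themes = theme_counts(cluster_record).get(doc, 0)
    if themes == 0:
        raise KeyError(f"document {doc!r} is not in any cluster")
    return 1.0 / themes

# Toy record: cluster name -> documents assigned to it.
record = {
    "sports": {"d1", "d2"},
    "finance": {"d2"},
    "health": {"d2", "d3"},
}
scores = {doc: specificity(record, doc) for doc in ("d1", "d2", "d3")}
```

In this toy record, "d2" appears in three clusters and so scores lower than "d1" and "d3", which each appear in one.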