DATA ALLOCATION BASED ON SECURE INFORMATION RETRIEVAL

    Publication number: US20180322304A1

    Publication date: 2018-11-08

    Application number: US15774708

    Filing date: 2015-11-10

    CPC classification number: G06F21/6227 G06F16/2455 G06F16/24578 G06F21/62

    Abstract: Data allocation based on secure information retrieval is disclosed. One example is a system including an information processor communicatively linked to a query processor and a plurality of data processors respectively associated with a plurality of datasets. The information processor receives a request from the query processor for identification of a target dataset to be associated with a query term. The information processor generates a random permutation, and receives a secure version of the query term from the query processor, and receives secure versions of a collection of candidate terms from each of a plurality of data processors, each candidate term representing a cluster of similar terms in the associated dataset. The information processor determines similarity scores between the secure version of the query term and secure versions of the candidate terms, and identifies the target dataset of the plurality of datasets based on the determined similarity scores.
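
The flow in this abstract can be illustrated with a minimal sketch. It assumes terms are fixed-length bit vectors and uses a shared random permutation as a stand-in for the "secure version" of a term; the abstract does not say how the secure versions are actually built, and a permutation alone is not a complete security scheme. The names `secure`, `overlap`, and `pick_target` are hypothetical.

```python
import random

def secure(term_bits, permutation):
    """Reorder a term's bits with the shared permutation (stand-in for securing)."""
    return tuple(term_bits[p] for p in permutation)

def overlap(a, b):
    """Similarity score: number of positions where both vectors have a 1."""
    return sum(x & y for x, y in zip(a, b))

def pick_target(secure_query, secure_candidates):
    """Return the dataset whose best candidate term is most similar to the query.

    secure_candidates maps a dataset name to its list of secured candidate terms.
    """
    best = {name: max(overlap(secure_query, c) for c in cands)
            for name, cands in secure_candidates.items()}
    return max(best, key=best.get)

# The information processor draws one permutation and shares it (assumption).
permutation = list(range(6))
random.Random(0).shuffle(permutation)

query = (1, 1, 0, 0, 1, 0)
datasets = {
    "A": [(1, 1, 0, 0, 0, 0), (0, 0, 1, 1, 0, 0)],
    "B": [(0, 0, 0, 1, 0, 1)],
}
secured = {name: [secure(c, permutation) for c in cands]
           for name, cands in datasets.items()}
target = pick_target(secure(query, permutation), secured)
```

Because every party applies the same permutation, overlap counts are unchanged, so scoring the permuted vectors gives the same ranking as scoring the originals.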

    SENTENCE CONSTRUCTION FOR DNA CLASSIFICATION
    Patent application

    Publication number: US20180113978A1

    Publication date: 2018-04-26

    Application number: US15298412

    Filing date: 2016-10-20

    CPC classification number: G06F19/28 G06F17/30707 G06N99/005

    Abstract: In some examples, a method may include obtaining, from a DNA sequence, a DNA bin that includes a number of consecutive DNA elements equal to a bin length parameter and constructing sentences from the DNA bin to form a constructed sentence set that includes a number of sentences equal to a size parameter. Each sentence of the constructed sentence set may be constructed by partitioning the DNA bin into words, each word comprising a number of DNA elements equal to the size parameter. Each sentence of the constructed sentence set may include overlapping DNA elements with other sentences of the constructed sentence set and may start with a different DNA element of the DNA bin. The method may further include using the constructed sentence set to train a classifier and determining a DNA classification for an unclassified DNA subsequence through the classifier trained using the constructed sentence set.
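
The sentence construction step can be sketched as follows. This is one reading of the abstract, assuming the DNA bin is a plain string, that each of the `size` sentences starts at a different offset into the bin, and that a trailing partial word is dropped; the function name `construct_sentences` is hypothetical.

```python
def construct_sentences(dna_bin: str, size: int) -> list[list[str]]:
    """Build `size` sentences from one DNA bin, one per starting offset."""
    sentences = []
    for offset in range(size):
        # Partition the bin into consecutive words of `size` elements,
        # beginning at this offset. A trailing partial word is dropped
        # (assumption; the abstract does not cover this case).
        words = [dna_bin[i:i + size]
                 for i in range(offset, len(dna_bin) - size + 1, size)]
        sentences.append(words)
    return sentences

sentences = construct_sentences("ACGTACGT", 3)
```

With a bin of `ACGTACGT` and a size parameter of 3, the three sentences start at `A`, `C`, and `G` respectively, and their words overlap with those of the other sentences.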

    HYPERPLANE DETERMINATION THROUGH SPARSE BINARY TRAINING VECTORS

    Publication number: US20170316340A1

    Publication date: 2017-11-02

    Application number: US15142798

    Filing date: 2016-04-29

    CPC classification number: G06N20/00

    Abstract: In some examples, a system includes an access engine and a hyperplane determination engine. The access engine may access a training vector set that includes sparse binary training vectors and a set of labels classifying each of the sparse binary training vectors through a positive label or a negative label. The hyperplane determination engine may initialize a candidate hyperplane vector and maintain a scoring vector including scoring vector elements to track separation variances of the sparse binary training vectors with respect to the candidate hyperplane vector. Through iterations of identifying, according to the scoring vector, a particular sparse binary training vector with a greatest separation variance with respect to the candidate hyperplane vector, the hyperplane determination engine may incrementally update the candidate hyperplane vector and incrementally update the scoring vector to adjust separation variances affected by updates to the candidate hyperplane vector.
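
A rough sketch of the iteration described above, under stated assumptions: each sparse binary vector is a set of active indices, the scoring vector holds the signed margin of each training vector, "greatest separation variance" is taken to mean the lowest margin, and the update is a perceptron-style step. None of these choices is confirmed by the abstract, and `train` is a hypothetical name.

```python
from collections import defaultdict

def train(vectors, labels, max_iters=100):
    """Perceptron-style training with an incrementally maintained scoring vector."""
    dim = 1 + max(i for v in vectors for i in v)
    w = [0.0] * dim                      # candidate hyperplane vector
    scores = [0.0] * len(vectors)        # scores[n] = labels[n] * (w . x_n)
    index = defaultdict(list)            # feature index -> vectors containing it
    for n, v in enumerate(vectors):
        for i in v:
            index[i].append(n)

    for _ in range(max_iters):
        # Pick the training vector with the lowest margin.
        worst = min(range(len(vectors)), key=scores.__getitem__)
        if scores[worst] > 0:
            break                        # every vector is on its correct side
        y = labels[worst]
        for i in vectors[worst]:
            w[i] += y
            # Only vectors sharing feature i are affected by this change.
            for n in index[i]:
                scores[n] += labels[n] * y
    return w, scores

vectors = [{0, 1}, {1, 2}, {3}, {3, 4}]
labels = [1, 1, -1, -1]
w, scores = train(vectors, labels)
```

The inverted index is what keeps the score update incremental: changing one hyperplane component touches only the training vectors that have that component set, which is cheap when the vectors are sparse.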

    PROXIMITY OF DATA TERMS BASED ON WALSH-HADAMARD TRANSFORMS

    Publication number: US20170206202A1

    Publication date: 2017-07-20

    Application number: US15324058

    Filing date: 2014-07-23

    CPC classification number: G06F16/24578 G06F16/2228 G06F16/24534 G06F16/285

    Abstract: Determining proximity of data terms based on Walsh-Hadamard transforms is disclosed. One example is a system including a modifier, a Walsh-Hadamard transformer, an indexer, and an evaluator. A dataset, including a plurality of numerical data terms, is received via a processing system. The modifier extends a given data term of the plurality of data terms, the extension based on multiple concatenations of the given data term with itself. The Walsh-Hadamard transformer provides coefficients of the Walsh-Hadamard transform of the modified given data term. The indexer provides a set of keys based on the coefficients, and associates the set of keys with the given data term. The evaluator determines a similarity measure for a pair of data terms of the plurality of data terms, the similarity measure based on a number of overlaps between respective sets of keys, and indicative of proximity of the pair of data terms.
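
The four components named in the abstract can be sketched as below. The fast Walsh-Hadamard transform is standard; the other choices are assumptions made for illustration: the number of self-concatenations (`copies`), using the indices of the largest-magnitude coefficients as keys, and the hypothetical names `keys_for` and `similarity`.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform of a power-of-two-length list."""
    a = list(values)
    if not a or len(a) & (len(a) - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < len(a):
        for start in range(0, len(a), 2 * h):
            for i in range(start, start + h):
                x, y = a[i], a[i + h]
                a[i], a[i + h] = x + y, x - y
        h *= 2
    return a

def keys_for(term, copies=2, num_keys=2):
    """Modifier + transformer + indexer: extend the term, transform it, derive keys."""
    extended = list(term) * copies       # concatenate the term with itself
    coeffs = fwht(extended)
    # Keys are the indices of the largest-magnitude coefficients (assumption).
    ranked = sorted(range(len(coeffs)), key=lambda i: (-abs(coeffs[i]), i))
    return set(ranked[:num_keys])

def similarity(a, b, copies=2, num_keys=2):
    """Evaluator: number of keys the two terms have in common."""
    return len(keys_for(a, copies, num_keys) & keys_for(b, copies, num_keys))

same = similarity([1, 2, 3, 4], [1, 2, 3, 4])
```

The term length times `copies` must be a power of two for this transform, so a term of four values with two copies yields eight coefficients.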

    Hash suppression
    Granted patent

    Publication number: US11709798B2

    Publication date: 2023-07-25

    Application number: US17500843

    Filing date: 2021-10-13

    CPC classification number: G06F16/137

    Abstract: An example method is provided in accordance with one implementation of the present disclosure. The method comprises generating, via a processor, a set of hashes for each of a plurality of objects. The method also comprises computing, via the processor, a high-dimensional sparse vector for each object, where the vector represents the set of hashes for each object. The method further comprises computing, via the processor, a combined high-dimensional sparse vector from the high-dimensional sparse vectors for all objects and computing a hash suppression threshold. The method also comprises determining, via the processor, a group of hashes to be suppressed by using the hash suppression threshold, and suppressing, via the processor, the group of selected hashes when performing an action.
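
A minimal sketch of the suppression steps, assuming each object's sparse vector is represented by its set of hashes, the combined vector counts how many objects contain each hash, and the threshold is a fixed fraction of the object count. The abstract does not say how the threshold is computed, so `max_fraction` and both function names are hypothetical.

```python
from collections import Counter

def suppressed_hashes(hash_sets, max_fraction=0.5):
    """Return the hashes that occur in more than max_fraction of the objects."""
    # Combined sparse vector: hash -> number of objects containing it.
    combined = Counter()
    for hashes in hash_sets:
        combined.update(set(hashes))
    threshold = max_fraction * len(hash_sets)   # assumed threshold rule
    return {h for h, count in combined.items() if count > threshold}

def filter_hashes(hash_sets, suppressed):
    """Drop suppressed hashes from every object before performing an action."""
    return [set(hashes) - suppressed for hashes in hash_sets]

hash_sets = [{1, 2, 3}, {1, 4}, {1, 5}, {2, 6}]
suppressed = suppressed_hashes(hash_sets)
filtered = filter_hashes(hash_sets, suppressed)
```

Here hash `1` appears in three of four objects and is suppressed, while hash `2` appears in exactly half and is kept because the comparison is strict.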

    Data stream analytics
    Granted patent

    Publication number: US11599561B2

    Publication date: 2023-03-07

    Application number: US15142504

    Filing date: 2016-04-29

    Abstract: Examples disclosed herein involve data stream analytics. In examples herein, a data stream may be analyzed by computing a set of hashes of a real-valued vector, the real-valued vector corresponding to a sample data object of a data stream; generating a list of data objects from a database corresponding to the sample data object based on the set of hashes, the list of data objects ordered based on similarity of the data objects to the sample data object of the data stream; and updating a data structure representative of activity of the sample data object in the data stream based on the list of data objects, the data structure to provide incremental analysis corresponding to the sample data object.
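
The three steps in this abstract can be sketched as follows. The hash family is an assumption: random-hyperplane sign hashes are swapped in because the abstract does not name one. Ranking by the number of shared hashes and tracking activity with a counter are also illustrative choices, and `make_hasher` and `StreamIndex` are hypothetical names.

```python
import random
from collections import Counter, defaultdict

def make_hasher(dim, num_hashes=8, bits=4, seed=0):
    """Return a function mapping a real-valued vector to num_hashes sign-pattern hashes."""
    rng = random.Random(seed)
    tables = [[[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]
              for _ in range(num_hashes)]

    def hashes(vec):
        out = []
        for t, planes in enumerate(tables):
            code = 0
            for plane in planes:
                dot = sum(p * x for p, x in zip(plane, vec))
                code = (code << 1) | int(dot >= 0)
            out.append((t, code))        # tag each code with its table number
        return out
    return hashes

class StreamIndex:
    def __init__(self, hasher):
        self.hasher = hasher
        self.buckets = defaultdict(set)  # hash -> ids of stored objects
        self.activity = Counter()        # id -> number of stream samples it matched

    def add(self, obj_id, vec):
        for h in self.hasher(vec):
            self.buckets[h].add(obj_id)

    def observe(self, vec):
        """Rank stored objects by shared hashes with the sample, then update activity."""
        shared = Counter()
        for h in self.hasher(vec):
            shared.update(self.buckets.get(h, ()))
        ranked = sorted(shared.items(), key=lambda kv: (-kv[1], kv[0]))
        self.activity.update(obj_id for obj_id, _ in ranked)
        return ranked

idx = StreamIndex(make_hasher(dim=3))
idx.add("a", [1.0, 2.0, 3.0])
idx.add("b", [-1.0, -2.0, -3.0])
ranked = idx.observe([1.0, 2.0, 3.0])
```

Each call to `observe` updates `activity` in place, so the per-object statistics grow incrementally as samples arrive rather than being recomputed over the whole stream.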

    Incremental automatic update of ranked neighbor lists based on k-th nearest neighbors

    Publication number: US10810458B2

    Publication date: 2020-10-20

    Application number: US16073891

    Filing date: 2015-12-03

    Abstract: Incremental automatic update of ranked neighbor lists based on k-th nearest neighbors is disclosed. One example is a system including an indexing module to retrieve an incoming data stream, and retrieve ranked neighbor lists for received data objects. An evaluator determines similarity measures between the received data objects and their respective k-th nearest neighbors. A threshold determination module determines a statistical distribution based on the determined similarity measures, and a threshold based on the statistical distribution. The evaluator determines additional similarity measures between a new data object in the data stream and the received data objects. A neighbor update module automatically selects a sub-plurality of the received data objects by comparing the additional similarity measures to the threshold, and determines, for each selected data object, if the respective retrieved neighbor list is to be incrementally updated based on neighborhood comparisons for the new data object and the selected data object.
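
The update logic can be sketched under several assumptions: neighbor lists are stored as `(similarity, neighbor_id)` pairs sorted from most to least similar, the threshold is the mean of the k-th neighbor similarities minus one population standard deviation, and a list is updated when the new object is more similar than the current k-th neighbor. The abstract specifies none of these rules, and all function names are hypothetical.

```python
import statistics

def kth_similarities(neighbor_lists, k):
    """Similarity between each object and its k-th nearest neighbor."""
    return {obj: nbrs[k - 1][0] for obj, nbrs in neighbor_lists.items()}

def threshold_from(kth_sims):
    """Threshold from the distribution of k-th neighbor similarities (assumed rule)."""
    values = list(kth_sims.values())
    return statistics.mean(values) - statistics.pstdev(values)

def update_lists(neighbor_lists, k, new_id, sims_to_new):
    """Insert new_id into the lists of objects it is close enough to; return those objects."""
    kth = kth_similarities(neighbor_lists, k)
    threshold = threshold_from(kth)
    updated = []
    for obj, sim in sims_to_new.items():
        if sim < threshold:
            continue                     # not selected: skip the neighborhood check
        if sim > kth[obj]:
            nbrs = neighbor_lists[obj] + [(sim, new_id)]
            nbrs.sort(key=lambda pair: -pair[0])
            neighbor_lists[obj] = nbrs[:k]
            updated.append(obj)
    return updated

neighbor_lists = {
    "a": [(0.9, "b"), (0.5, "c")],
    "b": [(0.9, "a"), (0.7, "c")],
    "c": [(0.7, "b"), (0.5, "a")],
}
updated = update_lists(neighbor_lists, k=2, new_id="d",
                       sims_to_new={"a": 0.8, "b": 0.6, "c": 0.1})
```

In this example only `a` is updated: `c` falls below the threshold and is never examined, while `b` passes the threshold but already has a closer k-th neighbor.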

    Data allocation based on secure information retrieval

    Publication number: US10783268B2

    Publication date: 2020-09-22

    Application number: US15774708

    Filing date: 2015-11-10

    Abstract: Data allocation based on secure information retrieval is disclosed. One example is a system including an information processor communicatively linked to a query processor and a plurality of data processors respectively associated with a plurality of datasets. The information processor receives a request from the query processor for identification of a target dataset to be associated with a query term. The information processor generates a random permutation, and receives a secure version of the query term from the query processor, and receives secure versions of a collection of candidate terms from each of a plurality of data processors, each candidate term representing a cluster of similar terms in the associated dataset. The information processor determines similarity scores between the secure version of the query term and secure versions of the candidate terms, and identifies the target dataset of the plurality of datasets based on the determined similarity scores.
