Abstract:
A privacy-preserving index system addresses the problem of providing privacy-preserving search over distributed access-controlled content. Because indexed documents can be readily reconstructed from the inverted indexes conventionally used in search, the system instead builds a centralized privacy-preserving index in conjunction with a distributed, access-control-enforcing search protocol. The index is constructed with a randomized algorithm and is strongly resilient to privacy breaches. The system allows content providers to maintain complete control in defining access groups and ensuring compliance with them, and further allows system implementors to retain tunable knobs to balance privacy and efficiency concerns for their particular domains.
Abstract:
A system and method for using numbers to query a corpus of documents, particularly but not exclusively for data spaces that have low reflectivity, i.e., for a point x_i described by one or more numbers, the data space does not contain very many permutations of the numbers. For each document to be searched, each query number is matched with one and only one document number, preferably using a bipartite graph or heuristic rule, such that a distance function is minimized. The distance function can, but need not, take into account attribute names and unit names. A limiting algorithm can be used to limit the number of documents that must be searched.
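The one-to-one matching step can be illustrated with a minimal sketch. It assumes absolute difference as the distance function and finds the minimum-cost matching by brute force over small number sets. The names `match_distance` and `rank_documents` are hypothetical and are not taken from the abstract, and the sketch ignores attribute names, unit names, and the limiting algorithm.

```python
from itertools import permutations

def match_distance(query, doc_numbers):
    """Best one-to-one matching of query numbers to document numbers.

    Assumption: distance is the sum of absolute differences.
    Returns (total_distance, matched_pairs).
    """
    if len(query) > len(doc_numbers):
        return float("inf"), []  # not every query number can be matched
    best_cost, best_pairs = float("inf"), []
    # Brute force: try every ordered choice of distinct document numbers.
    for chosen in permutations(doc_numbers, len(query)):
        cost = sum(abs(q - d) for q, d in zip(query, chosen))
        if cost < best_cost:
            best_cost, best_pairs = cost, list(zip(query, chosen))
    return best_cost, best_pairs

def rank_documents(query, docs):
    """Order document names by increasing matching distance."""
    scored = [(match_distance(query, nums)[0], name) for name, nums in docs.items()]
    return [name for _, name in sorted(scored)]

docs = {"a": [2.8, 250, 512], "b": [100, 7]}
cost, pairs = match_distance([256, 3.0], docs["a"])
ranking = rank_documents([256, 3.0], docs)
```

Brute force is only practical for a handful of numbers per document; a larger system would presumably use a polynomial-time assignment algorithm on the bipartite graph instead.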
Abstract:
A system and method for mining data while preserving a user's privacy includes perturbing user-related information at the user's computer and sending the perturbed data to a Web site. At the Web site, perturbed data from many users is aggregated, and from the distribution of the perturbed data, the distribution of the original data is reconstructed, although individual records cannot be reconstructed. Based on the reconstructed distribution, a decision tree classification model or a Naive Bayes classification model is developed. The model is then provided back to the users, who can apply it to their individual data to generate classifications that are sent back to the Web site, so that the Web site can display a page appropriately configured for the user's classification. Alternatively, the classification model need not be provided to users; the Web site can instead use the model to, e.g., send search results and a ranking model to a user, with the ranking model being used at the user computer to rank the search results based on the user's individual classification data.
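The idea of recovering an aggregate distribution from perturbed records can be sketched as follows. This is not the reconstruction procedure of the abstract, which does not specify one; it swaps in a simple randomized-response scheme for a single categorical attribute. The functions `perturb` and `reconstruct_distribution` and the parameter `keep_prob` are assumptions made for illustration.

```python
import random
from collections import Counter

def perturb(value, num_values, keep_prob, rng):
    """Client side: report the true value with probability keep_prob,
    otherwise report a uniformly random value in range(num_values)."""
    if rng.random() < keep_prob:
        return value
    return rng.randrange(num_values)

def reconstruct_distribution(reports, num_values, keep_prob):
    """Server side: estimate the original value distribution.

    Observed frequency = keep_prob * true + (1 - keep_prob) / num_values,
    so the estimate inverts that relation for each value.
    """
    n = len(reports)
    counts = Counter(reports)
    noise = (1 - keep_prob) / num_values
    return [(counts[v] / n - noise) / keep_prob for v in range(num_values)]

# Simulate 10,000 users whose true values follow weights 4:3:2:1.
rng = random.Random(0)
true_values = rng.choices(range(4), weights=[4, 3, 2, 1], k=10_000)
reports = [perturb(v, 4, 0.5, rng) for v in true_values]
estimate = reconstruct_distribution(reports, 4, 0.5)
```

The server sees only `reports`, so no single report reveals a user's true value with certainty, while `estimate` approaches the true proportions as the number of users grows.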
Abstract:
The present invention provides a method and system of partitioning authors on a given topic in a newsgroup into two opposing classes. In an exemplary embodiment, the method and system include identifying all links among the authors, where each link represents a response from one of the authors to another of the authors, and analyzing the identified links, where the identified links are assumed to be more likely to be antagonistic than non-antagonistic. In an exemplary embodiment, the identifying includes assigning a vertex of a graph to each of the authors and assigning an edge of the graph to each interaction between two of the assigned vertices corresponding to two of the authors. In an exemplary embodiment, the analyzing includes solving a min-weight approximately balanced cut problem on a co-citation matrix of the graph, thereby generating the two opposing classes of the authors.
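A toy version of the cut step might look like the sketch below. It assumes the co-citation weight of two authors is the number of authors both have exchanged replies with, and it solves the approximately balanced min-weight cut by brute force, which is feasible only for very small graphs. The function `split_authors` and its `slack` parameter are hypothetical, and at least two authors are required.

```python
from itertools import combinations

def split_authors(replies, slack=1):
    """Split authors into two camps from (author, author) reply links.

    Assumption: co-citation weight = number of shared reply partners.
    Authors who argue with the same people tend to share a camp, so the
    cut with the least co-citation weight across it separates the camps.
    """
    authors = sorted({a for pair in replies for a in pair})
    neighbours = {a: set() for a in authors}
    for a, b in replies:
        neighbours[a].add(b)
        neighbours[b].add(a)

    def weight(a, b):
        return len(neighbours[a] & neighbours[b])

    n = len(authors)
    best = None
    # Only consider sides whose size is within `slack` of an even split.
    for size in range(max(1, n // 2 - slack), n // 2 + 1):
        for side in combinations(authors, size):
            side = set(side)
            cut = sum(weight(a, b) for a in side for b in authors if b not in side)
            if best is None or cut < best[0]:
                best = (cut, side)
    _, side = best
    return side, set(authors) - side

replies = [("a1", "b1"), ("a1", "b2"), ("a2", "b1"), ("a2", "b2")]
left, right = split_authors(replies)
```

In this example every reply crosses between the `a` and `b` authors, so the co-citation weight is concentrated inside each camp and the zero-weight cut separates them.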
Abstract:
A database including vertical tables useful for storing large numbers of objects having potentially thousands of attributes in, e.g., e-commerce applications. To support querying the vertical database using conventional SQL, a horizontal view over the underlying vertical tables is defined, and then queries are posed against the view. The queries are automatically transformed and executed against the vertical tables. If desired, the query results can be transformed back to a horizontal format. In this way, it appears to the user that a conventional horizontal data format is being used.
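The horizontal-view idea can be sketched with Python's standard `sqlite3` module. The table name `item_attrs`, the view name `items`, and the sample attributes are assumptions chosen for illustration. Here the SQLite engine expands the view itself, which only approximates the automatic query transformation the abstract describes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Vertical table: one row per (object, attribute, value).
CREATE TABLE item_attrs (oid INTEGER, attr TEXT, val TEXT);
INSERT INTO item_attrs VALUES
  (1, 'name', 'lamp'), (1, 'color', 'red'),
  (2, 'name', 'desk'), (2, 'color', 'oak'), (2, 'width_cm', '120');

-- Horizontal view: one row per object, one column per attribute.
CREATE VIEW items AS
SELECT oid,
       MAX(CASE WHEN attr = 'name'     THEN val END) AS name,
       MAX(CASE WHEN attr = 'color'    THEN val END) AS color,
       MAX(CASE WHEN attr = 'width_cm' THEN val END) AS width_cm
FROM item_attrs
GROUP BY oid;
""")

# The user writes ordinary SQL against the horizontal view.
rows = conn.execute(
    "SELECT oid, name, width_cm FROM items WHERE color = 'oak'"
).fetchall()
missing = conn.execute("SELECT width_cm FROM items WHERE oid = 1").fetchall()
```

Objects that lack an attribute simply have no row for it in the vertical table and show up as NULL in the view, which is what makes the vertical layout compact for sparse data with many possible attributes.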
Abstract:
A user can easily organize computerized document folders by associating a few sample documents in the document database with each folder. The present invention learns folder profiles based on the sample documents and moves the remaining documents into the folders accordingly. In this way, the user can construct new folders, or rearrange existing folders, or cause the computer to automatically rearrange and maintain the folders. This is particularly useful for managing a database of perhaps thousands of emails.
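One simple way to picture folder profiles learned from sample documents is the sketch below. It assumes a bag-of-words profile per folder and cosine similarity for assignment. The abstract does not say which learning method is used, so `learn_profiles`, `assign_folder`, and the sample data are purely illustrative.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase the text and keep purely alphabetic words."""
    return [w for w in text.lower().split() if w.isalpha()]

def learn_profiles(samples):
    """Build one word-count profile per folder from its sample documents."""
    return {
        folder: Counter(w for text in texts for w in tokenize(text))
        for folder, texts in samples.items()
    }

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def assign_folder(text, profiles):
    """Pick the folder whose profile is most similar to the document."""
    vec = Counter(tokenize(text))
    return max(profiles, key=lambda folder: cosine(vec, profiles[folder]))

samples = {
    "travel": ["flight booking confirmed", "hotel booking receipt"],
    "work": ["project status meeting", "meeting agenda attached"],
}
profiles = learn_profiles(samples)
folder = assign_folder("your flight and hotel details", profiles)
```

Rearranging folders then amounts to changing the sample documents, relearning the profiles, and reassigning the remaining documents.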
Abstract:
A cryocooler system comprising a heat exchanger for cooling a compressed returning warmed cryogenic fluid stream, the heat exchanger having a bypass loop to produce a major stream and a minor stream exiting the heat exchanger. The minor stream is further cooled by expansion and used as a heat exchange medium for an external heat load, after which it is compressed and returned to the heat exchanger for heat exchange with the compressed returning warmed cryogenic fluid stream. The major stream is further cooled by expansion and recirculated to the heat exchanger to cool the compressed returning warmed cryogenic fluid stream. The major stream and minor stream are combined either inside or outside of the heat exchanger to form the warmed cryogenic fluid inlet stream for compression.
Abstract:
A single bed pressure swing adsorption process with at least one transfer tank is utilized to separate less adsorbable components from more adsorbable components such as the separation of oxygen from air. Depressurization gas is collected in the transfer tank and is used later exclusively for purging the bed during the regeneration period.
Abstract:
A method and apparatus for mining text databases, employing sequential pattern phrase identification and shape queries, to discover trends. The method passes over a desired database using a dynamically generated shape query. Documents within the database are selected based on specific classifications and user-defined partitions. Once a partition is specified, transaction IDs are assigned to the words in the text documents depending on their placement within each document. The transaction IDs encode both the position of each word within the document and sentence, paragraph, and section breaks, and are represented in one embodiment as long integers with the sentence boundaries. The maximum and minimum gap between words in a phrase, and the minimum support that all phrases must meet for the selected time period, may be specified. A generalized sequential pattern method is used to generate those phrases in each partition that meet the minimum support threshold. The shape query engine takes the set of phrases for the partition of interest and selects those that match a given shape query. A query may take the form of requesting a trend such as “recent upwards trend”, “recent spikes in usage”, “downward trends”, and “resurgence of usage”. Once the phrases matching the shape query are found, they are presented to the user.