Abstract:
A method, computer program product, and computer system for latent collaborative retrieval are described. A first mathematical representation of a query received from a user is generated. A second mathematical representation of a user profile is generated. A plurality of mathematical representations associated with a plurality of items is accessed. The first mathematical representation, the second mathematical representation, and the plurality of mathematical representations are transformed to have a uniform length. A first result subset of items is generated, based upon, at least in part, a first similarity measurement of the first mathematical representation and the plurality of mathematical representations. A second result subset of items is generated based upon, at least in part, a second similarity measurement of the second mathematical representation and the plurality of mathematical representations. A result set of items is generated based upon, at least in part, the first and second result subsets.
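The retrieval flow in this abstract can be illustrated with a minimal sketch. It assumes that the "mathematical representations" are plain numeric vectors, that zero-padding gives them a uniform length, that cosine similarity is the similarity measurement, and that the final result set is the ordered union of the two subsets. None of these choices is stated in the abstract, and the helper names (`pad`, `cosine`, `top_k`, `retrieve`) are hypothetical.

```python
import math

def pad(vec, length):
    """Zero-pad (or truncate) a vector to the given length (assumed normalization)."""
    return (list(vec) + [0.0] * length)[:length]

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(vec, items, k):
    """Names of the k items whose vectors are most similar to vec."""
    ranked = sorted(items, key=lambda name: cosine(vec, items[name]), reverse=True)
    return ranked[:k]

def retrieve(query_vec, profile_vec, item_vecs, k=2):
    # Transform every representation to a uniform length.
    length = max(len(query_vec), len(profile_vec),
                 *(len(v) for v in item_vecs.values()))
    q = pad(query_vec, length)
    u = pad(profile_vec, length)
    items = {name: pad(v, length) for name, v in item_vecs.items()}
    by_query = top_k(q, items, k)      # first result subset
    by_profile = top_k(u, items, k)    # second result subset
    # Combine the subsets: ordered union without duplicates (an assumption).
    merged = list(dict.fromkeys(by_query + by_profile))
    return by_query, by_profile, merged

# Toy data: item vectors of different lengths, one query, one user profile.
items = {"a": [1.0, 0.0], "b": [0.0, 1.0, 0.0], "c": [1.0, 1.0, 0.0]}
by_query, by_profile, merged = retrieve([1.0, 0.1], [0.0, 1.0, 0.0], items, k=1)
```

In this toy run the query is closest to item "a" and the profile is closest to item "b", so the merged result set contains both.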
Abstract:
A method and system for labeling a selected word of a sentence using a deep neural network includes, in one exemplary embodiment, determining an index term corresponding to each feature of the word, transforming the index term or terms of the word into a vector, and predicting a label for the word using the vector. The method and system, in another exemplary embodiment, includes determining, for each word in the sentence, an index term corresponding to each feature of the word, transforming the index term or terms of each word in the sentence into a vector, applying a convolution operation to the vector of the selected word and at least one of the vectors of the other words in the sentence, to transform the vectors into a matrix of vectors, each of the vectors in the matrix including a plurality of row values, constructing a single vector from the vectors in the matrix, and predicting a label for the selected word using the single vector.
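The first embodiment (feature index, then vector, then label) can be sketched as a lookup-table forward pass. This is only an illustration: the two features (word identity and capitalization), the table values, the output weights, and the label set are all hypothetical and hand-set rather than trained, and the convolutional second embodiment is not covered.

```python
# Hypothetical dictionaries mapping each feature value to an index term.
word_index = {"<unk>": 0, "paris": 1, "runs": 2}
caps_index = {"lower": 0, "capitalized": 1}

# Hypothetical lookup tables: one row of values per index term.
word_table = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
caps_table = [[0.0], [1.0]]

# Hypothetical hand-set output layer, one weight row per label.
labels = ["NOUN", "VERB"]
out_weights = [[1.0, -1.0, 0.5], [-1.0, 1.0, -0.5]]

def featurize(word):
    """Determine the index term for each feature of the word."""
    w = word_index.get(word.lower(), word_index["<unk>"])
    c = caps_index["capitalized" if word[:1].isupper() else "lower"]
    return w, c

def to_vector(word):
    """Transform the index terms into one vector by concatenating table rows."""
    w, c = featurize(word)
    return word_table[w] + caps_table[c]

def predict(word):
    """Score each label with a linear layer and return the best one."""
    vec = to_vector(word)
    scores = [sum(wt * v for wt, v in zip(row, vec)) for row in out_weights]
    return labels[scores.index(max(scores))]

label_paris = predict("Paris")
label_runs = predict("runs")
```

In a trained network the table rows and output weights would be learned from labeled data; here they are fixed so the data flow is easy to follow.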
Abstract:
A network-based system is provided for performing data analysis services using a support vector machine for analyzing data received from a remote user connected to the network. The user transmits a data set to be analyzed along with an account identifier that allows the analysis service provider to collect payment for the processing services. Once payment has been confirmed, the service provider's server transmits the analysis results to the remote user.
Abstract:
Identification of a determinative subset of features from within a group of features is performed by training a support vector machine using training samples with class labels to determine a value of each feature, where features are removed based on their values. One or more features having the smallest values are removed and an updated kernel matrix is generated using the remaining features. The process is repeated until a predetermined number of features remain which are capable of accurately separating the data into different classes. In some embodiments, features are eliminated by a ranking criterion based on a Lagrange multiplier corresponding to each training sample.
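The train, rank, and remove loop described here can be sketched as follows. For brevity the sketch swaps in a linear SVM trained by full-batch subgradient descent and ranks features by squared weight. That is an assumption: the abstract instead speaks of an updated kernel matrix and a criterion based on Lagrange multipliers. The function names and hyperparameters are hypothetical.

```python
def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Full-batch subgradient descent on the L2-regularized hinge loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        gw = [lam * wi for wi in w]   # gradient of the regularizer
        gb = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:            # sample violates the margin
                for j, xj in enumerate(xi):
                    gw[j] -= yi * xj / n
                gb -= yi / n
        w = [wi - lr * g for wi, g in zip(w, gw)]
        b -= lr * gb
    return w, b

def recursive_feature_elimination(X, y, n_keep):
    """Repeatedly retrain and drop the feature with the smallest squared weight."""
    remaining = list(range(len(X[0])))
    while len(remaining) > n_keep:
        X_reduced = [[row[j] for j in remaining] for row in X]
        w, _ = train_linear_svm(X_reduced, y)
        worst = min(range(len(remaining)), key=lambda i: w[i] ** 2)
        del remaining[worst]
    return remaining

# Toy data: feature 0 separates the classes; features 1 and 2 carry no signal.
X = [[2.0, 0.1, 0.0], [1.5, -0.1, 0.0], [-2.0, 0.1, 0.0], [-1.5, -0.1, 0.0]]
y = [1, 1, -1, -1]
kept = recursive_feature_elimination(X, y, n_keep=1)
```

On this toy data only feature 0 receives a nonzero weight, so it is the one feature that survives elimination.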
Abstract:
A system and method for determining a similarity between a document and a query includes building a weight vector for each of a plurality of documents in a corpus of documents stored in memory and building a weight vector for a query input into a document retrieval system. A weight matrix is generated which distinguishes between relevant documents and lower ranked documents by comparing document/query tuples using a gradient step approach. A similarity score is determined between weight vectors of the query and documents in a corpus by determining a product of a document weight vector, a query weight vector and the weight matrix.
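A minimal sketch of the scoring and training idea, under stated assumptions: the similarity score is taken to be the bilinear product q^T W d, and the "gradient step approach" is taken to be a margin ranking update over (query, relevant document, lower-ranked document) tuples. The margin, learning rate, identity initialization, and function names are hypothetical choices.

```python
def score(W, q, d):
    """Similarity as the product of query vector, weight matrix, and document vector."""
    return sum(q[i] * W[i][j] * d[j]
               for i in range(len(q)) for j in range(len(d)))

def train(tuples, dim, lr=0.1, epochs=50, margin=1.0):
    """Learn W so relevant documents outscore lower-ranked ones by a margin."""
    # Start from the identity, so the initial score is a plain dot product.
    W = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    for _ in range(epochs):
        for q, d_pos, d_neg in tuples:
            if score(W, q, d_pos) - score(W, q, d_neg) < margin:
                # Gradient step on the hinge ranking loss for this tuple.
                for i in range(dim):
                    for j in range(dim):
                        W[i][j] += lr * q[i] * (d_pos[j] - d_neg[j])
    return W

# Toy weight vectors over a three-term vocabulary. The query shares no term
# with either document, so a plain dot product cannot tell them apart.
q = [1.0, 0.0, 0.0]
d_relevant = [0.0, 1.0, 0.0]
d_other = [0.0, 0.0, 1.0]

W = train([(q, d_relevant, d_other)], dim=3)
gap = score(W, q, d_relevant) - score(W, q, d_other)
```

After training, the off-diagonal entries of W link the query term to the term in the relevant document, which is what lets the learned score separate the two documents.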
Abstract:
A system and method for semantic extraction using a neural network architecture includes indexing each word in an input sentence into a dictionary and using these indices to map each word to a d-dimensional vector (the features of which are learned). Together with this, position information for a word of interest (the word to be labeled) and a verb of interest (the verb that the semantic role is being predicted for) with respect to a given word are also used. These positions are integrated by employing a linear layer that is adapted to the input sentence. Several linear transformations and squashing functions are then applied to output class probabilities for semantic role labels. All the weights for the whole architecture are trained by backpropagation.
Abstract:
Features are preprocessed (204) to minimize classification error in a Support Vector Machine (200) used to identify patterns in large databases. Pre-processing (204) is performed to constrain features used to train (210) the SVM learning machine. Live data (226) is collected and processed (232) with the SVM.
Abstract:
Methods and systems to associate semantically-related items of a plurality of item types using a joint embedding space are disclosed. The disclosed methods and systems are scalable to large, web-scale training data sets. According to an embodiment, a method for associating semantically-related items of a plurality of item types includes embedding training items of a plurality of item types in a joint embedding space configured in a memory coupled to at least one processor, learning one or more mappings into the joint embedding space for each of the item types to create a trained joint embedding space and one or more learned mappings, and associating one or more embedded training items with a first item based upon a distance in the trained joint embedding space from the first item to each said associated embedded training items. Exemplary item types that may be embedded in the joint embedding space include images, annotations, audio and video.
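The association step (finding the embedded items nearest to a given item in the joint space) can be sketched as below. The sketch assumes a linear mapping for images and Euclidean distance. The mapping matrix, annotation embeddings, and feature values are hypothetical and hand-set, whereas the abstract describes learning them from training data.

```python
import math

def apply_map(M, x):
    """Map a feature vector into the joint space with a linear mapping M."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def nearest(point, embedded, k=1):
    """Names of the k embedded items closest to point (Euclidean distance)."""
    return sorted(embedded, key=lambda name: math.dist(point, embedded[name]))[:k]

# Hypothetical mapping from 3-d image features into a 2-d joint space.
image_map = [[1.0, 0.0, 0.0],
             [0.0, 1.0, 1.0]]

# Hypothetical annotation embeddings already placed in the joint space.
annotations = {"dolphin": [1.0, 0.0], "tower": [0.0, 1.0]}

image_features = [0.9, 0.1, 0.0]
embedded_image = apply_map(image_map, image_features)
best_annotations = nearest(embedded_image, annotations)
```

Because images and annotations live in the same space, the same `nearest` lookup could equally retrieve images for an annotation.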