Abstract:
A first mapping function automatically maps each of a plurality of documents to a concept of an ontology to create a documents-to-ontology distribution. An ontology-to-class distribution that maps concepts in the ontology to respective class labels is received, and a classifier is generated that labels a selected document with an associated class identified based on the documents-to-ontology distribution and the ontology-to-class distribution.
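The abstract does not say how the two distributions are combined. As a minimal sketch, assuming both are conditional probability tables and that the class score is obtained by summing over concepts, a classifier could look like the following. The function name `classify`, the concept names, and the class labels are hypothetical and chosen only for illustration.

```python
def classify(doc_to_concept, concept_to_class):
    """Label one document by combining two distributions.

    doc_to_concept:   dict concept -> P(concept | document)
    concept_to_class: dict concept -> dict label -> P(label | concept)

    Assumption (not stated in the abstract): the class score is
    sum over concepts of P(concept | document) * P(label | concept).
    """
    scores = {}
    for concept, p_concept in doc_to_concept.items():
        for label, p_label in concept_to_class.get(concept, {}).items():
            scores[label] = scores.get(label, 0.0) + p_concept * p_label
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical documents-to-ontology distribution for one document.
doc_to_concept = {"machine_learning": 0.7, "databases": 0.3}

# Hypothetical ontology-to-class distribution.
concept_to_class = {
    "machine_learning": {"AI": 0.9, "Systems": 0.1},
    "databases": {"AI": 0.2, "Systems": 0.8},
}

label, scores = classify(doc_to_concept, concept_to_class)
print(label, scores)
```

With these illustrative numbers the document is labeled "AI", because most of its concept mass falls on a concept that maps mainly to that class.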
Abstract:
A method, system and computer program product for inferring topic evolution and emergence in a set of documents. In one embodiment, the method comprises forming a group of matrices using text in the documents, and analyzing these matrices to identify a first group of topics as evolving topics and a second group of topics as emerging topics. The matrices include a first matrix X identifying a multitude of words in each of the documents, a second matrix W identifying a multitude of topics in each of the documents, and a third matrix H identifying a multitude of words for each of the multitude of topics. These matrices are analyzed to identify the evolving and emerging topics. In an embodiment, the documents form a streaming dataset, and two forms of temporal regularizers are used to help identify the evolving topics and the emerging topics in the streaming dataset.
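The roles of X, W and H match the shapes used in non-negative matrix factorization, where X (documents by words) is approximated by the product of W (documents by topics) and H (topics by words). The sketch below assumes that reading and uses the standard multiplicative update rules for the squared error. It does not implement the temporal regularizers mentioned in the abstract, and the helper names and the toy matrix are hypothetical.

```python
import random


def matmul(A, B):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def nmf(X, k, iters=200, seed=0, eps=1e-9):
    """Approximate X (docs x words) as W (docs x topics) times H (topics x words).

    Uses multiplicative updates for the squared reconstruction error.
    No temporal regularization is applied (an intentional simplification).
    """
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T X) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, X)
        den = matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(X, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return W, H


def frob_error(X, W, H):
    """Frobenius norm of X - W H."""
    WH = matmul(W, H)
    return sum((x - y) ** 2 for rx, ry in zip(X, WH) for x, y in zip(rx, ry)) ** 0.5


# Toy word-count matrix: 4 documents, 4 words, two obvious topics.
X = [
    [2, 1, 0, 0],
    [4, 2, 0, 0],
    [0, 0, 1, 3],
    [0, 0, 2, 6],
]
W, H = nmf(X, k=2)
print("reconstruction error:", frob_error(X, W, H))
```

Each row of W indicates how strongly a document expresses each topic, and each row of H indicates which words characterize a topic. A streaming variant would refit or extend H as new documents arrive.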
Abstract:
A method, system and computer program product for inferring topic evolution and emergence in a set of documents. In one embodiment, the method comprises forming a group of matrices using text in the documents, and analyzing these matrices to identify evolving topics and emerging topics. The matrices include a matrix X identifying a multitude of words in each of the documents, a matrix W identifying a multitude of topics in each of the documents, and a matrix H identifying a multitude of words for each of the multitude of topics. These matrices are analyzed to identify the evolving and emerging topics. In an embodiment, two forms of temporal regularizers are used to help identify the evolving and emerging topics. In another embodiment, a two-stage approach involving detection and clustering is used to help identify the evolving and emerging topics.
Abstract:
In response to issues of high dimensionality and sparsity in machine learning, it is proposed to use a multiple output regression modeling module that takes into account information on groups of related predictor features and groups of related regressions, both given as input, and outputs a regression model with selected feature groups. Optionally, the method can be employed as a component in methods of causal influence detection, which are applied to a time series training data set representing the time-evolving content generated by community members and which output a model of causal relationships and a ranking of the members according to their influence.
Abstract:
In a computerized social network, expert and user chat sessions are stored and rated probabilistically. Later user requests for information are met with an expert ranking based on a balance of similarity between expert profile and questions, similarity between expert profile and prior chat sessions, and dynamically updated chat session ratings. New sessions can be rated automatically with reference to keywords distilled from past sessions responsive to user ratings, and based on session length.
Abstract:
Systems and methods for processing Machine Learning (ML) algorithms in a MapReduce environment are described. In one embodiment of a method, the method includes receiving an ML algorithm to be executed in the MapReduce environment. The method further includes parsing the ML algorithm into a plurality of statement blocks in a sequence, wherein each statement block comprises a plurality of basic operations (hops). The method also includes automatically determining an execution plan for each statement block, wherein at least one of the execution plans comprises one or more low-level operations (lops). The method further includes implementing the execution plans in the sequence of the plurality of the statement blocks.
Abstract:
There are provided a system, a method and a computer program product for increasing the productivity of a sales force in a first entity. The system locates or constructs at least one enterprise social network in the first entity. The system constructs at least one market social network. The system creates at least one connection between the enterprise social network and the market social network. Sales representatives in the first entity expand new sales operations and/or identify new markets via the connected social networks.
Abstract:
A system, method and computer program product automatically present at least one product to at least one client for at least one possible purchase. The system applies a matrix factorization on a binary matrix X representing which clients purchased which products. The system optimizes zero-valued elements in the matrix X that correspond to unknown client-product affinities. The system constructs, based on the optimization, a prediction matrix X̂ in which each element value represents a likelihood that a corresponding client purchases a corresponding product. The system identifies at least one client-product pair with the highest value in the matrix X̂. The system recommends at least one product to at least one client according to the client-product pair with the highest value.
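The abstract does not give the optimization used for the zero-valued elements. As a stand-in, the sketch below uses a weighted low-rank factorization in which zeros are treated as weak negatives (a common technique for implicit feedback, not necessarily the one claimed), then picks the highest-scoring pair among products the client has not yet bought. Restricting the search to unpurchased pairs, and all function names, weights and data, are assumptions made for illustration.

```python
import random


def factorize(X, k=2, zero_weight=0.1, lam=0.01, lr=0.05, iters=500, seed=0):
    """Return a prediction matrix X_hat = U V^T for a binary matrix X.

    Purchases (1) get weight 1.0; zeros get a small weight because they
    may be unknown affinities rather than true dislikes (assumption).
    """
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    U = [[rng.uniform(0, 0.5) for _ in range(k)] for _ in range(n)]
    V = [[rng.uniform(0, 0.5) for _ in range(k)] for _ in range(m)]
    for _ in range(iters):
        for i in range(n):
            for j in range(m):
                w = 1.0 if X[i][j] == 1 else zero_weight
                err = X[i][j] - sum(U[i][f] * V[j][f] for f in range(k))
                for f in range(k):
                    u, v = U[i][f], V[j][f]
                    # Gradient step on the weighted squared error with L2 penalty.
                    U[i][f] += lr * (w * err * v - lam * u)
                    V[j][f] += lr * (w * err * u - lam * v)
    return [[sum(U[i][f] * V[j][f] for f in range(k)) for j in range(m)]
            for i in range(n)]


def best_unpurchased_pair(X, X_hat):
    """Return (client, product, score) with the highest score where X is 0."""
    candidates = [
        (X_hat[i][j], i, j)
        for i in range(len(X))
        for j in range(len(X[0]))
        if X[i][j] == 0
    ]
    score, client, product = max(candidates)
    return client, product, score


# Rows are clients, columns are products; 1 means "purchased".
X = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
]
X_hat = factorize(X)
client, product, score = best_unpurchased_pair(X, X_hat)
print(f"recommend product {product} to client {client} (score {score:.2f})")
```

Lowering `zero_weight` makes the model more willing to predict purchases for unobserved pairs, while a value of 1.0 treats every zero as a confirmed non-purchase.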