摘要:
A method, system and computer program product are disclosed for predicting influence in a social network. In one embodiment, the method comprises identifying a set of users of the social network, and identifying a subset of the users as influential users based on defined criteria. A multitude of measures are identified as predictors of which ones of the set of users are the influential users. These measures are aggregated, and a composite predictor model is formed based on this aggregation. This composite predictor model is used to predict which ones of the set of users will have a specified influence in the social network in the future. In one embodiment, the specified influence is based on messages sent from the users, and for example, may be based on the number of the messages sent from each user that are re-sent by other users.
摘要:
A system and method a Multi-Task Multi-View (M2TV) learning problem. The method uses the label information from related tasks to make up for the lack of labeled data in a single task. The method further uses the consistency among different views to improve the performance. It is tailored for the above complicated dual heterogeneous problems where multiple related tasks have both shared and task-specific views (features), since it makes full use of the available information.
摘要:
A novel domain adaption/transfer learning method applied to the problem of classifying abbreviated documents, e.g., short text messages, instant messages, tweets. The method uses a large number of multi-labeled examples (source domain) to improve the learning on the partial observations (target domain). Specifically, a hidden, higher-level abstraction space is learned that is meaningful for the multi-labeled examples in the source domain. This is done by simultaneously minimizing the document reconstruction error and the error in a classification model learned in the hidden space using known labels from the source domain. The partial observations in the target space are then mapped to the same hidden space, and classified into the label space determined by the source domain.
摘要:
A first mapping function automatically maps a plurality of documents each with a concept of ontology to create a documents-to-ontology distribution. An ontology-to-class distribution that maps concepts in the ontology to class labels, respectively, is received, and a classifier is generated that labels a selected document with an associated class identified based on the documents-to-ontology distribution and the ontology-to-class distribution.
摘要:
System, method and computer program product provides a novel domain adaption/transfer learning approach applied to the problem of classifying abbreviated documents, e.g., short text messages, instant messages, tweets. The proposed method uses a large number of multi-labeled examples (source domain) to improve the learning on the partial observations (target domain). Specifically, a hidden, higher-level abstraction space is learned that is meaningful for the multi-labeled examples in the source domain. This is done by simultaneously minimizing the document reconstruction error and the error in a classification model learned in the hidden space using known labels from the source domain. The partial observations in the target space are then mapped to the same hidden space, and classified into the label space determined by the source domain. Exemplary results provided for a Twitter dataset demonstrate that the method identifies meaningful hidden topics and provides useful classifications of specific tweets.
摘要:
A method for identifying companies with specific business objectives that includes using existing sources of company firmographic data to identify a broad set of companies and associated websites, crawling the websites associated with the identified companies and indexing web site content for each of the identified companies with the specific business objective to realize indexed web content. The method further includes joining the company firmographic data with the indexed web content using a business objective common identifier to generate a store of joined structured firmographic data and indexed web content and presenting a display image representation of the store of joined structured firmographic data and indexed web content for user review. The display image further receives user input to score each of said companies identified therein, and using a search interface, querying the store of scored, joined structured firmographic data and indexed web content. The method further includes augmenting the search interface, or search results from a query, with predictive, machine-leaning processes that allow rapid identification of companies possibly missed in the query.
摘要:
A method (and structure) for developing a distribution function for the probability of winning a bid by a seller for a product or service, using the seller's own historical data for winning bids and lost bids, includes normalizing the data for winning bids and the data for lost bids and merging the normalized data into a single set of data.
摘要:
A method, system and computer program product for inferring topic evolution and emergence in a set of documents. In one embodiment, the method comprises forming a group of matrices using text in the documents, and analyzing these matrices to identify a first group of topics as evolving topics and a second group of topics as emerging topics. The matrices includes a first matrix X identifying a multitude of words in each of the documents, a second matrix W identifying a multitude of topics in each of the documents, and a third matrix H identifying a multitude of words for each of the multitude of topics. These matrices are analyzed to identify the evolving and emerging topics. In an embodiment, the documents form a streaming dataset, and two forms of temporal regularizers are used to help identify the evolving topics and the emerging topics in the streaming dataset.
摘要:
A first mapping function automatically maps a plurality of documents each with a concept of ontology to create a documents-to-ontology distribution. An ontology-to-class distribution that maps concepts in the ontology to class labels, respectively, is received, and a classifier is generated that labels a selected document with an associated class identified based on the documents-to-ontology distribution and the ontology-to-class distribution.
摘要:
A model for impact analysis determines impact of part removal from a product. An entity is identifies that includes a plurality of sub-components. One or more performance measures associated with the entity are identified. One or more of the sub-components to be removed from the entity are identified. A substitution impact function is defined. Impact on said one or more performance measures is determined using the substitution impact function.