Abstract:
Described is a system for learning and predicting key phrases. The system learns from a dataset of historical forecasting questions, their associated time-series data for a quantity of interest, and associated keyword sets. The system learns the optimal policy of actions to take given the associated keyword sets and the optimal set of keywords that are predictive of the quantity of interest. Given a new forecasting question, the system extracts an initial keyword set from the question and perturbs it to generate an optimal predictive key-phrase set. Key-phrase time-series data are extracted for the optimal predictive key-phrase set and used to generate a forecast of future values for a value of interest. The forecast can be used for a variety of purposes, such as advertising online.
Abstract:
Described is a system for predicting future social activity. The system extracts social activities from spatial-temporal social network data collected in a first time period ranging from hours to days to capture spatial structures of social activities in a graph network representation. A graph matching technique is applied over a set of spatial-temporal social network data collected in a second time period ranging from weeks to months to capture temporal structures of the social activities. A spatial-temporal structure of each social activity is represented as an activity core, where each activity core is defined as active nodes that participate in the social activity with a frequency over a predetermined threshold over the second time period. For each activity core, the system computes statistics of the social activity and uses the statistics to generate a prediction of future behaviors of the social activity.
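The activity-core definition above can be illustrated with a minimal sketch. It assumes each observation in the second time period is available as a set of node IDs active in one social activity, and that the predetermined threshold is expressed as a fraction of observations. The function name `activity_core` and the sample data are hypothetical and not taken from the abstract.

```python
from collections import Counter


def activity_core(snapshots, threshold):
    """Return the nodes that are active in more than `threshold` (a fraction
    between 0 and 1) of the snapshots. Both the snapshot representation and
    the fractional threshold are assumptions made for this sketch."""
    if not snapshots:
        return set()
    # Count, for every node, the number of snapshots in which it is active.
    counts = Counter(node for active in snapshots for node in set(active))
    total = len(snapshots)
    return {node for node, count in counts.items() if count / total > threshold}


# Hypothetical data: four snapshots of one activity over the longer period.
snapshots = [{"a", "b", "c"}, {"a", "b"}, {"a", "d"}, {"a", "b", "d"}]
core = activity_core(snapshots, threshold=0.5)
```

Statistics of the activity (for example, how the core size changes over time) could then be computed per core; the abstract does not specify which statistics are used.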
Abstract:
Described is a system for identifying communication behavior patterns in communication activity time series. For each pair of variables in the communication activity time series, the system determines a transfer entropy measure, an effective transfer entropy measure from a randomly reordered version of the communication activity time series, and a partial effective transfer entropy measure. A dependency matrix is generated using pairwise effective transfer entropy measures and partial effective transfer entropy measures, where each element in the matrix represents a total influence of a communication activity time series on another communication activity time series in the future. The dependency matrix is compared with dependency matrices generated from a predefined set of communication patterns to identify the communication behavior pattern. The system generates instructions regarding positioning of a sensor, such that the instructions provide guidance regarding placement of the sensor at a geographical region related to the identified communication pattern.
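For orientation, the pairwise measures named above can be sketched as follows. This is a simplified plug-in estimate for discrete-valued series with a history length of one, and the effective measure is taken as the raw transfer entropy minus the average over randomly reordered copies of the source. The function names, the history length, and the number of shuffles are assumptions; the partial effective transfer entropy is not covered here.

```python
import math
import random
from collections import Counter


def transfer_entropy(source, target):
    """Plug-in estimate of TE(source -> target) in bits, history length 1."""
    # Each triple is (target at t+1, target at t, source at t).
    triples = list(zip(target[1:], target[:-1], source[:-1]))
    n = len(triples)
    c_full = Counter(triples)
    c_past = Counter((y0, x0) for _, y0, x0 in triples)
    c_self = Counter((y1, y0) for y1, y0, _ in triples)
    c_y0 = Counter(y0 for _, y0, _ in triples)
    te = 0.0
    for (y1, y0, x0), count in c_full.items():
        p_given_both = count / c_past[(y0, x0)]      # p(y1 | y0, x0)
        p_given_self = c_self[(y1, y0)] / c_y0[y0]   # p(y1 | y0)
        te += (count / n) * math.log2(p_given_both / p_given_self)
    return te


def effective_transfer_entropy(source, target, n_shuffles=50, seed=0):
    """Raw TE minus the mean TE obtained from shuffled copies of the source."""
    rng = random.Random(seed)
    shuffled = list(source)
    baseline = 0.0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        baseline += transfer_entropy(shuffled, target)
    return transfer_entropy(source, target) - baseline / n_shuffles


# Hypothetical data: y copies x with a one-step delay, so x should drive y.
rng = random.Random(1)
x = [rng.randint(0, 1) for _ in range(2000)]
y = [0] + x[:-1]
ete_xy = effective_transfer_entropy(x, y)
ete_yx = effective_transfer_entropy(y, x)
```

Filling a matrix with such values for every ordered pair of series gives one possible form of the dependency matrix described above.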
Abstract:
Described is a system for extracting multi-scale hierarchical clustering on customer observables (COs) data in a vehicle. The system selects a parameter for a set of incident data of COs data. Simplicial complexes are generated from the COs data based on the selected parameter. Face networks are generated from the simplicial complexes. For each face network, a set of connected components is extracted. Each connected component is transformed to a cluster of related COs, resulting in a first extracted relation between COs. The first extracted relation is used to automatically generate an alert at a client device when a second extracted relation different from the first extracted relation results from the transformation.
Abstract:
Described is a system for tracking and predicting social events. The system filters a time series of data obtained from a social media source. Enhanced filtered signals (EFS) are extracted from the filtered time series data based on an amplification signal obtained via a summation of signals relevant to a process of interest in the filtered time series data. A level of human social activity in the social media source is monitored by comparing the extracted EFS to an event database to detect an increase in a number of social activity events in the social media source compared to the event database.
Abstract:
Described is a system for using social media data to supplement survey data for discrete choice analysis. Survey data from consumers is segmented into demographic groups. Individual demographic attributes and consumer product attribute preferences are extracted from a set of social media data. Consumer product attribute preferences are determined for each demographic group using the set of social media data. Consumers' preference coefficients are generated for each demographic group. Finally, individualized incentives for a target consumer product are determined using the consumers' preference coefficients.
Abstract:
Described is a system for identifying emerging trends in a consumer product from heterogeneous online data sources. Data extracted from heterogeneous data sources is fused, and consumer product data is identified from the fused data. A baseline distribution for consumer issues related to consumer products is generated from the set of consumer product data. A deviation value from the baseline distribution is determined for a specific consumer product. Indicators for future consumer issues regarding the specific consumer product are identified based on the deviation value. The indicators are reported to a system analyst.
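One way to picture the baseline-and-deviation step is sketched below. The abstract does not say how the deviation value is computed; this example assumes issue reports are already labeled with a category and uses the Kullback-Leibler divergence between a product's issue distribution and the baseline as the deviation value. The category names, counts, and function names are hypothetical.

```python
import math
from collections import Counter


def distribution(issues, categories, smoothing=1.0):
    """Smoothed relative frequency of each issue category."""
    counts = Counter(issues)
    total = len(issues) + smoothing * len(categories)
    return {c: (counts[c] + smoothing) / total for c in categories}


def kl_divergence(p, q):
    """KL divergence D(p || q) in bits; q must be nonzero wherever p is."""
    return sum(p[c] * math.log2(p[c] / q[c]) for c in p if p[c] > 0)


# Hypothetical labeled issue reports.
categories = ["brakes", "engine", "paint"]
baseline_issues = ["brakes"] * 30 + ["engine"] * 50 + ["paint"] * 20
product_issues = ["brakes"] * 15 + ["engine"] * 3 + ["paint"] * 2

baseline = distribution(baseline_issues, categories)
product = distribution(product_issues, categories)
deviation = kl_divergence(product, baseline)

# Categories over-represented for this product relative to the baseline.
indicators = [c for c in categories if product[c] > baseline[c]]
```

A deviation above some chosen cutoff would mark the product for reporting; the cutoff and the choice of divergence are design decisions that the abstract leaves open.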
Abstract:
Described is a system for automatic date detection in digital texts written in Farsi. The system includes a unique date tagger that reviews text strings and detects dates written either in accordance with Farsi grammar or in popular yet unofficial ways of writing dates in Farsi.
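A very small pattern-based tagger gives a feel for the task. The abstract does not describe the tagger's rules, so the two formats handled here (a numeric year/month/day form and a day followed by a Persian month name with an optional year) are assumptions chosen for illustration, as are the names `find_dates`, `NUMERIC`, and `NAMED`.

```python
import re

# The twelve month names of the Persian (Solar Hijri) calendar.
MONTHS = [
    "فروردین", "اردیبهشت", "خرداد", "تیر", "مرداد", "شهریور",
    "مهر", "آبان", "آذر", "دی", "بهمن", "اسفند",
]
DIGIT = "[0-9۰-۹]"  # ASCII digits and Persian digits
MONTH_ALT = "|".join(MONTHS)

# Numeric form such as 1402/07/20 (assumed year/month/day order).
NUMERIC = re.compile(DIGIT + "{2,4}/" + DIGIT + "{1,2}/" + DIGIT + "{1,2}")
# Day, month name, and an optional year, separated by whitespace.
NAMED = re.compile(
    DIGIT + r"{1,2}\s+(?:" + MONTH_ALT + r")(?:\s+" + DIGIT + "{2,4})?"
)


def find_dates(text):
    """Return date-like substrings in order of appearance in the text."""
    matches = list(NUMERIC.finditer(text)) + list(NAMED.finditer(text))
    matches.sort(key=lambda m: m.start())
    return [m.group(0) for m in matches]


sample = "جلسه در ۱۵ مهر ۱۴۰۲ برگزار شد و گزارش آن در 1402/07/20 منتشر شد."
dates = find_dates(sample)
```

Short month names such as "دی" and "مهر" are also ordinary words, which is one reason a real tagger would need more context than these patterns use.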
Abstract:
Described is a cyber security system for digital artifact genetic modeling and forensic analysis. The system identifies the provenance (origin) of a digital artifact by first receiving a plurality of digital artifacts, each digital artifact possessing features. Raw features are extracted from the digital artifacts. The raw features are classified into descriptive genotype-phenotype structures. Finally, lineage, heredity, and provenance of the digital artifacts are determined based on mapping of the genotype-phenotype structures.