Abstract:
The present invention relates to a system and methodology to facilitate extraction of information from large unstructured corpora such as the World Wide Web and/or other unstructured sources. Information in the form of answers to questions can be automatically composed from such sources via probabilistic models and cost-benefit analyses that guide resource-intensive information-extraction procedures employed by a knowledge-based question answering system. The analyses can leverage predictions, provided by Bayesian or other statistical models, of the ultimate quality of answers generated by the system. Such predictions, when coupled with a utility model, can provide the system with the ability to make decisions about the number of queries issued to a search engine (or engines), given the cost of queries and the expected value of query results in refining an ultimate answer. Given a preference model, the information extraction actions with the highest expected utility can be taken. In this manner, the accuracy of answers to questions can be balanced with the cost of the information extraction and analysis needed to compose the answers.
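The cost-benefit decision described above can be illustrated with a minimal sketch. It assumes a hypothetical table `predicted_quality`, where entry `n` is a statistical model's predicted probability that the answer is correct after `n` queries, and a simple linear utility (value of a correct answer minus a fixed cost per query). The function name, the utility form, and all numbers are illustrative assumptions, not the method of the invention.

```python
def choose_query_count(predicted_quality, value_of_correct_answer, cost_per_query):
    """Return (n, net_utility) for the query count with the highest expected net utility.

    predicted_quality[n] is an assumed model output: the predicted probability
    that the composed answer is correct after issuing n queries.
    """
    best_n, best_utility = 0, float("-inf")
    for n, p_correct in enumerate(predicted_quality):
        # Expected benefit of the answer minus the cumulative cost of n queries.
        utility = p_correct * value_of_correct_answer - n * cost_per_query
        if utility > best_utility:
            best_n, best_utility = n, utility
    return best_n, best_utility


# Illustrative numbers only: quality improves with diminishing returns.
quality = [0.0, 0.40, 0.60, 0.70, 0.74, 0.75]
best_n, best_u = choose_query_count(quality, value_of_correct_answer=10.0, cost_per_query=0.5)
print(best_n, best_u)
```

With these assumed numbers the fourth and fifth queries add less expected value than they cost, so the sketch stops at three queries.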
Abstract:
The present invention relates to systems and methods that employ user models to personalize generalized queries and/or search results according to information that is relevant to respective user characteristics. A system is provided that facilitates generating personalized searches of information. The system includes a user model to determine characteristics of a user. The user model may be assembled automatically via an analysis of a user's content, activities, and overall context. A personalization component automatically modifies queries and/or search results in view of the user model in order to personalize information searches for the user. A user interface receives the queries and displays the search results from one or more local and/or remote search engines, wherein the interface can be adjusted in a range from more personalized searches to more generalized searches.
Abstract:
One or more models of memorability are provided that facilitate various computer-based applications, including those centering on the storage, retrieval, and processing of information, applications that remind people about items they risk not recalling or overlooking, and applications that facilitate communication of reminders. In one application, the models are used to help compose and navigate large personal stores of information about a user's activities, communications, images, and other content. In another application, views of files in directories are extended with the addition of memory landmarks and a means for controlling the number of landmarks provided by changing a threshold on inferred memorability. Another application centers on the use of models of memorability to select subsets of images from larger sets representing events, for display in a slide show or ambient photo display. In another application, a system is provided that facilitates computer-based searching for information by providing for the design and analysis of timeline visualizations in connection with displaying results to queries based at least in part on an index of content. A query is received by a query component (which can be part of a search engine that provides a unified index of information a user has been exposed to). The query component parses the query into portions relevant to effecting a meaningful search in accordance with the subject invention. The query component can access and populate a data store, which may include the information searched for. A landmark component receives and/or accesses information from the query component as well as the data store, and anchors public and/or personal landmark events to search results-related information.
Abstract:
Various components and processes are provided to enable data processing on multiple data types, where aspects of the history of user activity, attention, interest, location, or other interaction with data are determined and employed to enhance information storage and access. In one particular aspect, a data manipulation system is provided. The system includes one or more data items that are associated with one or more tags and indicate at least one user's interaction or activity with the data items. A manipulation tool processes the data items to determine a subset of data items based at least in part on the user's interaction with the data items. Methods are described for using the manipulation tool to weight terms in an index, to compress indexes, to influence the rank of items returned in a search, to generate additional queries for data items either automatically or with user direction, or to improve presentation of data items.
Abstract:
A system and methodology are provided for filtering temporal streams of information, such as news stories, by statistical measures of information novelty. Various techniques can be applied to custom-tailor news feeds or other types of information based on information that a user has already reviewed. Methods for analyzing information novelty are provided along with a system that personalizes and filters information for users by identifying the novelty of stories in the context of stories they have already reviewed. The system employs novelty-analysis algorithms that represent articles as a bag of words and named entities. The algorithms analyze inter- and intra-document dynamics by considering how information evolves over time from article to article, as well as within individual articles.
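As a rough illustration of novelty-based filtering over a bag-of-words representation, the sketch below scores a candidate story by the fraction of its tokens that do not appear in any story the user has already read. This particular measure, the tokenizer, the threshold, and the function names are assumptions chosen for simplicity; the abstract does not specify which statistical measures are used, and named-entity handling is omitted here.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def novelty(candidate, already_read):
    """Fraction of the candidate's tokens not seen in any already-read story."""
    seen = Counter()
    for article in already_read:
        seen.update(bag_of_words(article))
    words = bag_of_words(candidate)
    total = sum(words.values())
    if total == 0:
        return 0.0
    new_tokens = sum(count for word, count in words.items() if word not in seen)
    return new_tokens / total


def filter_feed(feed, already_read, threshold):
    """Keep stories whose novelty meets the threshold, in feed order.

    Each kept story is treated as read, so later near-duplicates are dropped.
    """
    read = list(already_read)
    kept = []
    for story in feed:
        if novelty(story, read) >= threshold:
            kept.append(story)
            read.append(story)
    return kept


read_so_far = ["storm hits coast"]
feed = ["storm hits coast again", "rescue teams arrive tonight", "storm hits coast"]
kept = filter_feed(feed, read_so_far, threshold=0.5)
print(kept)
```

Processing the feed in time order and adding each kept story to the read set is one simple way to reflect how information evolves from article to article.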
Abstract:
The present invention relates to systems and methods providing content-access-based information retrieval. Information items from a plurality of disparate information sources that have been previously accessed or considered are automatically indexed in a data store, and a multifaceted user interface is provided to efficiently retrieve the items in a cognitively relevant manner. Various display output arrangements are possible for the retrieved information items, including timeline visualizations and multidimensional grid visualizations. Input options include explicit, implicit, and standing queries for retrieving data, along with explicit and implicit tagging of items for ease of recall and retrieval. In one aspect, an automated system is provided that facilitates concurrent searching across a plurality of information sources. A usage analyzer determines user-accessed items and a content analyzer stores subsets of data corresponding to the items, wherein at least two of the items are associated with disparate information sources, respectively. An automated indexing component indexes the data subsets according to past data access patterns as determined by the usage analyzer. A search component responds to a search query, initiates a search across the indexed data, and outputs links to locations of a subset and/or provides sparse representations of the subset.
Abstract:
The subject invention relates to probabilistic models that are trained from transitions among various topics of pages visited by a sample population of search users. In one aspect, probabilistic models of topic transitions are learned for individual users and groups of users. Topic transitions for individuals versus larger groups are analyzed, wherein the relative accuracy of personal models of topic dynamics is compared with that of models constructed from sets of pages drawn from similar groups and from a larger population of users. To exploit temporal dynamics, the accuracy of these models is tested for predicting transitions in topics of visits at increasingly distant times in the future. The models can be applied to search topic dynamics of tagged pages, and then utilized to predict topics of subsequent pages visited by users.
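One simple way to picture a probabilistic model of topic transitions is a first-order Markov model estimated from sequences of page topics. The sketch below is an assumption made for illustration: the abstract does not state the model family, and the topic labels, function names, and sample sessions are hypothetical.

```python
from collections import Counter, defaultdict


def train_transitions(topic_sequences):
    """Estimate P(next topic | current topic) from observed topic sequences."""
    counts = defaultdict(Counter)
    for sequence in topic_sequences:
        # Count each adjacent (current, following) pair in the sequence.
        for current, following in zip(sequence, sequence[1:]):
            counts[current][following] += 1
    return {
        topic: {nxt: c / sum(followers.values()) for nxt, c in followers.items()}
        for topic, followers in counts.items()
    }


def predict_next(model, current_topic):
    """Return the most probable next topic, or None if the topic was never a source."""
    followers = model.get(current_topic)
    if not followers:
        return None
    return max(followers, key=followers.get)


# Hypothetical topic-labeled page visits; one list per user session.
sessions = [
    ["news", "sports", "sports", "shopping"],
    ["news", "sports", "news"],
]
model = train_transitions(sessions)
print(predict_next(model, "news"))
```

Training the same function on one user's sessions, on a similar group's sessions, or on the whole population's sessions yields the personal, group, and population models whose accuracy the abstract compares.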
Abstract:
Techniques and systems are disclosed that provide a risk-based assessment for a user based on user location information. Incident data is acquired for incidents that involve potential risks (e.g., to people and/or property) from a plurality of locations and contexts, considering such factors as date, time, weather, traffic, and velocity. The incident data is matched to the user's location and context, directly or indirectly, to provide one or more potential outcomes of interest (e.g., accidents, injuries, fatalities), and inferences regarding the likelihood of events are made available. These measures are compared to desired risk thresholds for the user. In one embodiment, some routes, times, and conditions of travel may be preferred over other routes, times, and conditions. In another embodiment, users may be notified of a condition, or a vehicle's maximum velocity may be reduced, when the matched incident data meets or exceeds a user's risk threshold.
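The matching-and-threshold step can be sketched as follows. The record layout (a context tuple with trip and incident counts), the per-trip rate as the risk measure, and the action labels are all assumptions made for illustration; the abstract does not specify how incident data is stored or how likelihoods are inferred.

```python
def estimate_risk(records, context):
    """Incidents per trip among records matching the context, or None if no data.

    Each record is an assumed dict with keys 'context', 'trips', and 'incidents'.
    """
    matching = [r for r in records if r["context"] == context]
    trips = sum(r["trips"] for r in matching)
    if trips == 0:
        return None
    return sum(r["incidents"] for r in matching) / trips


def choose_action(risk, threshold):
    """Compare the estimated risk with the user's threshold."""
    if risk is None:
        return "unknown"
    return "notify" if risk >= threshold else "ok"


# Hypothetical aggregated incident data keyed by (weather, time of day).
records = [
    {"context": ("rain", "night"), "trips": 1000, "incidents": 5},
    {"context": ("clear", "day"), "trips": 4000, "incidents": 4},
]
user_threshold = 0.002
for context in [("rain", "night"), ("clear", "day"), ("snow", "day")]:
    print(context, choose_action(estimate_risk(records, context), user_threshold))
```

The same comparison could drive route or departure-time preferences by evaluating each candidate route's context and preferring those whose estimated risk falls below the threshold.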