Abstract:
A system and method for maintaining persistent object identifiers across versions of a collection of data. According to one embodiment of the present invention, a first collection of objects is compared to a second collection of objects. If an object in the first collection matches an object in the second collection, a reference is added to the object in the first collection referring to the object in the second collection, allowing the identifier to persist in both collections of objects. Additionally, according to one embodiment of the present invention, the data (or “facts”) associated with the object from the first collection are moved to the object from the second collection. In this way, data associated with matching objects is combined between two collections of objects while maintaining persistent object identifiers.
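The merge described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how objects are matched, so the sketch assumes a match means equal names, and `FactObject`, `forward_to`, `merge_collections`, and `resolve` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FactObject:
    object_id: str
    name: str  # assumption: objects "match" when their names are equal
    facts: list = field(default_factory=list)
    forward_to: Optional[str] = None  # reference to the matching object, if any


def merge_collections(first, second):
    """Link matching objects and move their facts from `first` to `second`.

    Both arguments are dicts mapping object_id -> FactObject.
    """
    by_name = {obj.name: obj for obj in second.values()}
    for obj in first.values():
        match = by_name.get(obj.name)
        if match is None:
            continue
        # The old identifier stays valid: it now refers to the matching object.
        obj.forward_to = match.object_id
        # Move the facts so the combined data lives on one object.
        match.facts.extend(obj.facts)
        obj.facts = []


def resolve(object_id, first, second):
    """Look up an identifier from either collection, following references."""
    obj = first.get(object_id)
    if obj is None:
        return second.get(object_id)
    if obj.forward_to is not None:
        return second[obj.forward_to]
    return obj
```

After merging, a lookup by the old identifier through `resolve` returns the object from the second collection, which now holds the facts of both.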
Abstract:
A fact repository supports searches of facts relevant to search queries comprising keywords and phrases. A service engine retrieves the objects that are associated with facts relevant to the query. The objects are displayed on a search results page. Each object is displayed with a selection of the facts associated with the object. The selected facts are ordered according to their relevance to the query.
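A minimal sketch of this retrieval and ordering is shown below. The abstract does not define relevance, so the sketch assumes a simple count of query terms shared with a fact; `score`, `search`, and `max_facts` are hypothetical names.

```python
def score(fact, terms):
    """Assumed relevance measure: number of query terms found in the fact."""
    attribute, value = fact
    words = set(f"{attribute} {value}".lower().split())
    return len(words & terms)


def search(repository, query, max_facts=2):
    """Return (object name, selected facts) pairs for objects with a relevant fact.

    `repository` maps an object name to a list of (attribute, value) facts.
    """
    terms = set(query.lower().split())
    results = []
    for name, facts in repository.items():
        # Order the object's facts from most to least relevant.
        ranked = sorted(facts, key=lambda f: score(f, terms), reverse=True)
        if ranked and score(ranked[0], terms) > 0:
            # Display only a selection of the object's facts.
            results.append((name, ranked[:max_facts]))
    return results
```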
Abstract:
Links between facts associated with objects are automatically created and maintained in a fact repository. Names of objects are automatically identified in the facts and collected into a list of names. The facts are then processed to identify such names in the facts. Identified names are used as anchor text for search links. A search link includes a search query for a service engine, which searches the fact repository for facts associated with objects having the same name.
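The linking step can be sketched as below. The abstract does not specify a link format, so the HTML anchor, the `/search?q=` URL prefix, and the function name `link_names` are assumptions made for illustration.

```python
import html
import re
from urllib.parse import quote_plus


def link_names(fact_text, names, search_url="/search?q="):
    """Wrap each known object name found in `fact_text` in a search link."""
    if not names:
        return fact_text
    # Try longer names first so "New York" wins over "York".
    ordered = sorted(names, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(n) for n in ordered) + r")\b")

    def to_link(match):
        name = match.group(0)
        # The name is the anchor text; the link carries a query for that name.
        return f'<a href="{search_url}{quote_plus(name)}">{html.escape(name)}</a>'

    return pattern.sub(to_link, fact_text)
```

Following such a link would run a search for objects with that name, so the links need no stored target and stay valid as objects are added or removed.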
Abstract:
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for providing geographically relevant search results. In one aspect, a method includes receiving a geotoken for a resource. The geotoken can be a resource token that references a geographic location. A semantic geotoken can be selected using the received geotoken. The semantic geotoken is a standardized representation of the geographic location that includes one or more location-specific terms. The semantic geotoken is stored with a reference to the resource. Neighboring locations for the geographic location are determined. The neighboring locations are within a predetermined distance of the geographic location. Semantic geotokens for the neighboring locations are selected and stored with the reference to the resource. Data specifying the semantic geotokens and the reference to the resource are provided.
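The indexing steps above can be sketched as follows. The `GAZETTEER` table, its coordinates, the 50 km default, and the function names are hypothetical; the abstract does not say how distance is measured, so the sketch assumes great-circle (haversine) distance.

```python
import math

# Hypothetical lookup: raw geotoken -> (semantic geotoken, latitude, longitude).
GAZETTEER = {
    "sf": ("san francisco california", 37.7749, -122.4194),
    "oakland": ("oakland california", 37.8044, -122.2712),
    "la": ("los angeles california", 34.0522, -118.2437),
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def index_resource(index, resource, geotoken, max_km=50.0):
    """Store the resource under its semantic geotoken and those of its neighbours."""
    semantic, lat, lon = GAZETTEER[geotoken]
    index.setdefault(semantic, set()).add(resource)
    for other, other_lat, other_lon in GAZETTEER.values():
        # Neighbouring locations lie within the predetermined distance.
        if other != semantic and haversine_km(lat, lon, other_lat, other_lon) <= max_km:
            index.setdefault(other, set()).add(resource)
    return index
```

With this table, a resource tagged `sf` is also stored under the Oakland geotoken but not the Los Angeles one, so a search for Oakland can return it.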
Abstract:
A fact repository stores objects. Each object includes a collection of facts, where a fact comprises an attribute and a value. A set of objects from the fact repository is designated for analysis. A presentation engine presents the facts of the objects in a user interface (UI) having a table. Through manipulation of the UI, an end-user can add or remove facts from the table, and sort the table based on the values of particular facts. The presentation engine also presents the facts of the objects in a UI having a graph. Through manipulation of the UI, the end-user can add or remove facts from the graph, and can sort the facts shown in the graph based on values that are shown, or not shown, in the graph. The presentation engine can further present the facts of the objects in UIs including maps and timelines.
Abstract:
A fact repository supports searches of facts relevant to search queries comprising keywords and phrases. A service engine retrieves the objects that are associated with facts relevant to a query. The query language described is designed for use with such a repository of facts and searches both the attributes of facts and the values of the attributes.
Abstract:
A set of objects having facts is established. Facts of objects having positions in an order are identified. Some facts explicitly describe the positions in the linear order, while other facts do not explicitly describe the positions. The facts are presented in the order on a linear graph, such as a timeline. Facts of the objects describing geographic positions are presented on a map.
Abstract:
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for machine learning. In one aspect, a method includes receiving a collection of facts, each fact represented as an entity-attribute-value tuple; identifying expected values for one or more individual attributes, where the identifying expected values includes, for each particular attribute: identifying facts having the attribute, calculating a value score for facts of the collection of facts having the particular attribute for each particular value, calculating a global score for all facts of the collection having the attribute, and comparing the value score to the global score such that a value is identified as an expected value if the comparison satisfies a specified threshold.
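A minimal sketch of the expected-value test is given below. The abstract does not define the scores, so the sketch assumes the value score is the number of facts with a given attribute and value, the global score is the number of facts with that attribute, and the comparison is their ratio against a threshold; `expected_values` and `threshold` are hypothetical names.

```python
from collections import Counter, defaultdict


def expected_values(facts, threshold=0.2):
    """Map each attribute to the set of values considered expected.

    `facts` is an iterable of (entity, attribute, value) tuples.
    """
    value_counts = defaultdict(Counter)
    for _entity, attribute, value in facts:
        value_counts[attribute][value] += 1

    expected = {}
    for attribute, counts in value_counts.items():
        # Assumed global score: all facts that have this attribute.
        global_score = sum(counts.values())
        expected[attribute] = {
            value
            for value, value_score in counts.items()
            # Assumed comparison: share of the attribute's facts with this value.
            if value_score / global_score >= threshold
        }
    return expected
```

Under these assumptions, a rare misspelled value falls below the threshold and is left out of the expected set, while common values are kept.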
Abstract:
A system and method for resolving ambiguities in date values associated with an attribute in a memory of a computer system. If a first text string conforms to one or more date formats, a confidence value is assigned for each of the date formats for the first text string based on the amount of specificity with which the first text string conforms to each date format. Similarly, if a second text string conforms to one or more date formats, a confidence value is assigned for each of the date formats for the second text string based on the amount of specificity with which the second text string conforms to each date format. The date format with the highest confidence value for the first text string and the date format with the highest confidence value for the second text string are merged to obtain a date value for the attribute.
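The procedure above can be sketched as follows. The format list, the confidence numbers, and the merge rule (fields from the more confident parse override the other) are all assumptions for illustration; the abstract does not specify them. `best_parse` and `merge_dates` are hypothetical names.

```python
import re

# Hypothetical formats with assumed confidence values; more specific is higher.
FORMATS = [
    (re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})$"), 0.9),
    (re.compile(r"(?P<month>\d{1,2})/(?P<day>\d{1,2})/(?P<year>\d{4})$"), 0.6),
    (re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})$"), 0.5),
    (re.compile(r"(?P<day>\d{1,2})/(?P<month>\d{1,2})/(?P<year>\d{4})$"), 0.4),
    (re.compile(r"(?P<year>\d{4})$"), 0.3),
]


def best_parse(text):
    """Return (confidence, fields) for the most confident matching format, or None."""
    candidates = []
    for pattern, confidence in FORMATS:
        match = pattern.match(text.strip())
        if match is None:
            continue
        fields = {key: int(value) for key, value in match.groupdict().items()}
        # Discard readings with an impossible month (e.g. 14/03 read as month 14).
        if not 1 <= fields.get("month", 1) <= 12:
            continue
        candidates.append((confidence, fields))
    if not candidates:
        return None
    return max(candidates, key=lambda candidate: candidate[0])


def merge_dates(first_text, second_text):
    """Merge the best parse of each string into one dict of date fields."""
    parses = [p for p in (best_parse(first_text), best_parse(second_text)) if p]
    # Apply the less confident parse first so the more confident one overrides it.
    parses.sort(key=lambda parse: parse[0])
    merged = {}
    for _confidence, fields in parses:
        merged.update(fields)
    return merged
```

Here a bare year from one string is completed by the month and day of a more specific string. Where the two disagree, the field from the more confident format is kept.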