Abstract:
Techniques are provided for the efficient location, processing, and retrieval of local product information derived from web pages that are generally locatable through form queries, a part of the web often referred to as the “deep” or “hidden” web. In an embodiment, information such as product information and dealer-location information is located on a web page form such as a dealer-locator form. Once a suitable web page form has been located, editorial wrapping is performed to create an automated information extractor. Using the automated information extractor, deep-web crawling is performed. A grid-based extraction of individual business records is performed, and matching and ingestion are performed in conjunction with a business listing database. Finally, metadata tags are added to entries in the business listing database. Metadata tags may also be added to entries in other databases.
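The final matching and tagging steps can be illustrated with a minimal sketch. It assumes that dealer records have already been extracted and that a listing matches a dealer when the normalized name and postal code agree; the function names, field names, and sample data are hypothetical and are not taken from the abstract.

```python
from typing import Dict, List


def _key(record: Dict[str, str]) -> tuple:
    # Hypothetical match key: case-insensitive, whitespace-normalized name plus postal code.
    return (" ".join(record["name"].lower().split()), record["zip"])


def tag_listings(listings: List[dict], dealers: List[Dict[str, str]], tag: str) -> int:
    """Add `tag` to each business listing that matches an extracted dealer record."""
    by_key = {_key(listing): listing for listing in listings}
    matched = 0
    for dealer in dealers:
        listing = by_key.get(_key(dealer))
        if listing is not None:
            tags = listing.setdefault("tags", [])
            if tag not in tags:
                tags.append(tag)
            matched += 1
    return matched


# Illustrative data only.
listings = [
    {"name": "Main Street Hardware", "zip": "94110"},
    {"name": "Corner Cafe", "zip": "94110"},
]
dealers = [
    {"name": "MAIN STREET  HARDWARE", "zip": "94110"},
    {"name": "Unknown Shop", "zip": "10001"},
]
matched = tag_listings(listings, dealers, "sells:AcmeTools")
```

Exact-key matching is the simplest possible choice; a production matcher would likely need fuzzy comparison of names and addresses.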
Abstract:
A method for identifying a brand name is described herein. The method involves obtaining category keywords associated with a category, designating a subgroup of the category keywords as brand name keywords for a particular brand name, receiving a search term, determining that the search term is a brand name keyword, and identifying the particular brand name corresponding to the brand name keyword.
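The lookup described above can be sketched as follows. This is a minimal illustration that assumes category keywords are plain strings and that the brand designation is supplied by hand; the class `BrandIndex`, its methods, and the sample keywords are hypothetical.

```python
from typing import Dict, Iterable, Optional, Set


class BrandIndex:
    """Hypothetical helper that maps brand name keywords to a brand name."""

    def __init__(self, category_keywords: Iterable[str]) -> None:
        # Category keywords obtained for one category, normalized for matching.
        self.category_keywords: Set[str] = {self._norm(k) for k in category_keywords}
        self._brand_by_keyword: Dict[str, str] = {}

    @staticmethod
    def _norm(term: str) -> str:
        # Lowercase and collapse runs of whitespace.
        return " ".join(term.lower().split())

    def designate(self, brand: str, keywords: Iterable[str]) -> None:
        """Designate a subgroup of the category keywords as keywords for `brand`."""
        for keyword in keywords:
            key = self._norm(keyword)
            if key not in self.category_keywords:
                raise ValueError(f"{keyword!r} is not a category keyword")
            self._brand_by_keyword[key] = brand

    def identify(self, search_term: str) -> Optional[str]:
        """Return the brand for a search term, or None if it is not a brand keyword."""
        return self._brand_by_keyword.get(self._norm(search_term))


# Illustrative data only.
index = BrandIndex(["running shoes", "acme runner", "acme trail", "sneakers"])
index.designate("Acme", ["acme runner", "acme trail"])
```

With this data, `index.identify("  ACME Runner ")` resolves to the brand, while a generic category keyword such as `"sneakers"` yields `None`.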
Abstract:
Methods, systems, and computer-readable media are provided for indexing network resources. One method includes accessing, using one or more computer systems, a data store of menu items. The method further includes accessing identification information associated with one or more food providers from one or more data sources. One or more network resources are crawled based on the identification information to search for one or more menu items in the data store of menu items associated with corresponding ones of the food providers. Using the one or more computer systems, an index feed is generated, the index feed comprising the identification information of one or more of the food providers, and one or more menu items associated with the identification information of corresponding food providers based on the crawl and search.
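The feed-generation step can be sketched as below. The sketch assumes each provider's identification information includes a URL and that a menu item counts as found when its name appears in the page text; the `fetch` callable stands in for a real crawler, and all names and data are hypothetical.

```python
from typing import Callable, Dict, Iterable, List


def build_index_feed(
    menu_items: Iterable[str],
    providers: Iterable[Dict[str, str]],
    fetch: Callable[[str], str],
) -> List[Dict[str, object]]:
    """Pair each food provider with the known menu items found on its page."""
    items = sorted({item.lower() for item in menu_items})
    feed = []
    for provider in providers:
        # "Crawl" the resource named by the provider's identification information.
        text = fetch(provider["url"]).lower()
        # Naive substring search; an assumption made for brevity.
        found = [item for item in items if item in text]
        if found:
            feed.append({"provider": provider, "menu_items": found})
    return feed


# Stand-in for a real crawler: canned page text keyed by URL.
PAGES = {
    "https://example.com/luigi": "Try our Margherita Pizza and fresh tiramisu!",
    "https://example.com/bob": "Hours: 9 to 5. Call for catering.",
}
providers = [
    {"name": "Luigi's", "url": "https://example.com/luigi"},
    {"name": "Bob's Diner", "url": "https://example.com/bob"},
]
feed = build_index_feed(
    ["Margherita Pizza", "Tiramisu", "Pad Thai"],
    providers,
    lambda url: PAGES.get(url, ""),
)
```

Passing the fetcher in as a parameter keeps the example runnable without network access and makes the crawl step easy to replace.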
Abstract:
A computer-implemented method to determine a robust wrapper includes developing a model indicative of the temporal history of a document, such as a web document written in a markup language. Based on the developed model, robustness characteristics are determined for a plurality of different wrappers, each representing an associated path to a data item in a representation of the document. Based on a result of the determining operation, a result wrapper of the plurality of wrappers is provided. The result wrapper has a desired robustness characteristic.
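The idea of comparing wrappers by robustness can be illustrated with a deliberately simplified sketch. Instead of the model of temporal history that the abstract describes, it replays each candidate wrapper against a few archived snapshots and uses the fraction of correct extractions as the robustness score. The snapshots, the candidate paths, and this scoring rule are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Hypothetical archived versions of one page, each paired with the true data item.
SNAPSHOTS: List[Tuple[str, str]] = [
    ("<html><body><div><span class='price'>$10</span></div></body></html>", "$10"),
    ("<html><body><p><span>Sale!</span></p>"
     "<div><span class='price'>$9</span></div></body></html>", "$9"),
    ("<html><body><section><div><span class='price'>$8</span></div>"
     "</section></body></html>", "$8"),
]

# Candidate wrappers: paths to the data item, in ElementTree's XPath subset.
CANDIDATES = ["./body/div/span", ".//span", ".//span[@class='price']"]


def robustness(wrapper: str) -> float:
    """Fraction of snapshots in which the wrapper extracts the correct item."""
    hits = 0
    for markup, expected in SNAPSHOTS:
        node = ET.fromstring(markup).find(wrapper)
        if node is not None and node.text == expected:
            hits += 1
    return hits / len(SNAPSHOTS)


scores = {wrapper: robustness(wrapper) for wrapper in CANDIDATES}
# The result wrapper is the candidate with the best robustness score.
result_wrapper = max(CANDIDATES, key=scores.get)
```

Here the absolute path breaks when the page gains a wrapping element and the bare `.//span` path breaks when an unrelated `span` appears earlier, so the attribute-based path is selected.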
Abstract:
Methods and apparatuses are provided for dynamically reorganizing the data within a replicated database system. One method, for example, includes performing a split operation across a plurality of replicated databases to divide an existing partition therein into two new partitions, wherein the existing partition comprises a plurality of data records and the two new partitions each include at least a portion of the plurality of data records, and allowing at least one type of access to the plurality of data records during the split operation.
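A minimal single-replica sketch of a split that keeps reads available is shown below. It assumes key-range partitions and a copy-then-swap approach, and it relies on reference assignment being atomic in CPython; replication and writes during the split are not modeled. The class and its methods are hypothetical.

```python
import threading
from typing import Dict, List, Optional, Tuple


class PartitionedStore:
    """Hypothetical single-replica store with sorted, key-range partitions."""

    def __init__(self, records: Dict[str, str]) -> None:
        self._lock = threading.Lock()
        # Each partition is (lowest key it covers, records); one partition at start.
        self._partitions: List[Tuple[str, Dict[str, str]]] = [("", dict(records))]

    def read(self, key: str) -> Optional[str]:
        # Readers grab the current partition list; it is valid before and after a swap.
        partitions = self._partitions
        for low, data in reversed(partitions):
            if key >= low:
                return data.get(key)
        return None

    def split(self, index: int, split_key: str) -> None:
        """Split partition `index` at `split_key` without blocking read()."""
        with self._lock:  # serializes concurrent splits only
            low, data = self._partitions[index]
            left = {k: v for k, v in data.items() if k < split_key}
            right = {k: v for k, v in data.items() if k >= split_key}
            new_list = list(self._partitions)
            new_list[index:index + 1] = [(low, left), (split_key, right)]
            # A single reference assignment publishes both new partitions at once.
            self._partitions = new_list


# Illustrative data only.
store = PartitionedStore({"apple": "1", "mango": "2", "zebra": "3"})
store.split(0, "m")
```

Because the old partition list is never mutated, a reader that started before the swap still sees a consistent view of every record.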
Abstract:
Methods, systems, and programs are described for providing one or more explanations. An inquiry about how a set of entities is related is received via a communication platform. Information is retrieved from a knowledge storage in accordance with the set of entities; the information describes a plurality of entities and the relationships existing among them. Based on the retrieved information, one or more explanations are generated with respect to each relationship by which the set of entities is connected. The one or more explanations are then transmitted as a response to the inquiry.
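One way to picture the explanation step is a breadth-first search over relationship triples, with each edge on the connecting path rendered as a phrase. The sketch below handles only a pair of entities and a single shortest path; the triple store, the function `explain`, and the sample facts are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

# Hypothetical knowledge storage: (subject, relationship, object) triples.
TRIPLES = [
    ("Alice", "works at", "Acme"),
    ("Bob", "works at", "Acme"),
    ("Acme", "is headquartered in", "Springfield"),
]


def explain(source: str, target: str, triples=TRIPLES) -> Optional[str]:
    """Return a sentence describing the shortest chain linking two entities."""
    # Build an undirected adjacency list that remembers how to phrase each edge.
    graph: Dict[str, List[Tuple[str, str]]] = {}
    for subj, rel, obj in triples:
        phrase = f"{subj} {rel} {obj}"
        graph.setdefault(subj, []).append((obj, phrase))
        graph.setdefault(obj, []).append((subj, phrase))

    queue = deque([(source, [])])
    seen = {source}
    while queue:
        entity, phrases = queue.popleft()
        if entity == target:
            # An empty path means source and target are the same entity.
            return "; ".join(phrases) + "." if phrases else None
        for neighbor, phrase in graph.get(entity, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, phrases + [phrase]))
    return None  # The entities are not connected in the stored information.
```

For the sample facts, `explain("Alice", "Bob")` links the two people through their shared employer.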
Abstract:
Means are provided for implementing an interface to view and explore socially relevant concepts of an entity graph including, for example, means of a social network system to perform operations including retrieving contextually relevant data for a plurality of concepts within an entity graph of the social network system; retrieving socially relevant data for a user's node within a social graph of the social network system; identifying intersects between the plurality of concepts within the entity graph and the socially relevant data for the user's node within the social graph; selecting one of the plurality of concepts within the entity graph based on the intersects identified; and displaying the one of the plurality of concepts within the entity graph at a user interface associated with the user's node.
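The intersect-and-select operations can be sketched with plain set intersection. The sketch assumes each concept is represented by a set of related entity names and that the concept with the largest overlap is selected; the function name, the scoring rule, and the sample data are hypothetical.

```python
from typing import Dict, Optional, Set


def select_concept(
    concept_entities: Dict[str, Set[str]], user_social_data: Set[str]
) -> Optional[str]:
    """Pick the concept sharing the most entities with the user's social data."""
    best, best_overlap = None, 0
    for concept in sorted(concept_entities):  # sorted for deterministic tie-breaks
        # The "intersect" between this concept and the user's social data.
        overlap = len(concept_entities[concept] & user_social_data)
        if overlap > best_overlap:
            best, best_overlap = concept, overlap
    return best


# Illustrative data only.
concepts = {
    "Jazz": {"Miles Davis", "Blue Note", "saxophone"},
    "Hiking": {"Yosemite", "trail maps"},
}
user_data = {"Blue Note", "Miles Davis", "Yosemite"}
selected = select_concept(concepts, user_data)
```

A concept with no overlap is never selected, so a user with no matching social data gets `None` rather than an arbitrary concept.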