Abstract:
A system and method of generating bid values for sponsored search includes the steps or acts of: receiving a bid phrase for an advertisement for an item, wherein the bid phrase specifies a search query for which the advertisement should be displayed; receiving first information at a first input/output interface, the first information relating to a bidding behavior of the advertiser; receiving second information at a second input/output interface, the second information relating to a history of bids by other advertisers for the bid phrase; and generating a bid value for the bid phrase submitted for the advertisement for the search query, based on the first and second information received.
Abstract:
Automatic generation of bid phrases for online advertising comprises storing a computer code representation of a landing page for use with a language model and a translation model (with a parallel corpus) to produce a set of candidate bid phrases that probabilistically correspond to the landing page and/or to web search phrases. Operations include extracting a set of raw candidate bid phrases from a landing page and generating a set of translated candidate bid phrases using a parallel corpus in conjunction with the raw candidate bid phrases. To score and/or reduce the number of candidate bid phrases, a translation table is used to capture the probability that a raw candidate bid phrase is generated from a translated candidate bid phrase. Scoring and ranking operations reduce the translated candidate bid phrases to just those most relevant to the landing page inputs.
Abstract:
Described are a system and method for determining an event occurrence rate. A sample set of content items may be obtained. Each of the content items may be associated with at least one region in a hierarchical data structure. A first impression volume may be determined for the at least one region as a function of a number of impressions registered for the content items associated with the at least one region. A scale factor may be applied to the first impression volume to generate a second impression volume. The scale factor may be selected so that the second impression volume is within a predefined range of a third impression volume. A click-through-rate (CTR) may be estimated as a function of the second impression volume and a number of clicks on the content item.
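The scaling step can be illustrated with a minimal Python sketch. The abstract does not say how the scale factor is selected, so the ratio of a reference (third) impression volume to the sampled (first) volume is assumed here; the names `region_impressions`, `estimate_ctr`, and `tolerance` are hypothetical, and the hierarchy is flattened to a single region label per item.

```python
from collections import defaultdict


def region_impressions(items):
    """Sum sampled impressions per region (the 'first impression volume')."""
    totals = defaultdict(int)
    for item in items:
        totals[item["region"]] += item["impressions"]
    return dict(totals)


def estimate_ctr(sample_impressions, reference_impressions, clicks, tolerance=0.05):
    """Scale the sampled volume toward a reference volume, then compute CTR.

    Assumption: the scale factor is the ratio reference / sample. The
    abstract only requires that the scaled (second) volume land within a
    predefined range of the reference (third) volume.
    """
    if sample_impressions <= 0 or reference_impressions <= 0:
        raise ValueError("impression volumes must be positive")
    scale = reference_impressions / sample_impressions
    scaled = sample_impressions * scale  # the 'second impression volume'
    if abs(scaled - reference_impressions) > tolerance * reference_impressions:
        raise ValueError("scaled volume is outside the allowed range")
    return clicks / scaled


sample = [
    {"region": "sports", "impressions": 100},
    {"region": "sports", "impressions": 300},
    {"region": "news", "impressions": 50},
]
volumes = region_impressions(sample)
sports_ctr = estimate_ctr(volumes["sports"], 4000, clicks=20)
```

With the ratio rule the range check always passes; it is kept so that a different selection rule for the scale factor could be swapped in without changing the rest of the sketch.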
Abstract:
Provided is a method for modeling the cost of XML as well as relational operators. As with traditional relational cost estimation, a set of system catalog statistics that summarizes the XML data is exploited; however, the novel use of a set of simple path statistics is also proposed. A new statistical learning technique called transform regression is utilized instead of detailed analytical models to predict the overall cost of an operator. Additionally, a query optimizer in a database is enabled to be self-tuning, automatically adapting to changes over time in the query workload and in the system environment.
Abstract:
Provided are techniques for processing a query. A query is received, wherein the query is formed by one or more paths, and wherein each path includes one or more steps. A hierarchical document including one or more document nodes is received. While processing the query and traversing the hierarchical document, one or more extraction entries are constructed, wherein each extraction entry includes a step instance match candidate identifying a document node and a step instance ancestor path for the document node, and one or more tuples are constructed using the one or more extraction entries by associating the step instance match candidate from one of the one or more extraction entries with the step instance match candidate from at least one of the one or more other extraction entries.
Abstract:
A system, method, and computer program product for updating a partitioned index of a dataset. A document is indexed by separating it into indexable sections, such that different ones of the indexable sections may be contained in different partitions of the partitioned index. The partitioned index is updated using an updated version of the document by updating only those sections of the index corresponding to sections of the document that have been updated in the updated version.
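A section-level update of this kind might look like the following Python sketch. Everything here is an illustrative assumption rather than the patented design: the class name `SectionedIndex`, the hash-based assignment of sections to partitions, and the use of a content digest to detect changed sections. Removal of deleted sections is not handled.

```python
import hashlib


class SectionedIndex:
    """Toy partitioned index keyed by (doc_id, section_id)."""

    def __init__(self, num_partitions=4):
        # Each partition maps (doc_id, section_id) -> (digest, tokens).
        self.partitions = [dict() for _ in range(num_partitions)]

    def _partition_for(self, doc_id, section_id):
        # Assumption: sections are spread over partitions by hashing their key,
        # so sections of one document may live in different partitions.
        key = f"{doc_id}:{section_id}".encode()
        return int(hashlib.sha256(key).hexdigest(), 16) % len(self.partitions)

    def update(self, doc_id, sections):
        """Re-index only new or changed sections; return their ids."""
        changed = []
        for section_id, text in sections.items():
            partition = self.partitions[self._partition_for(doc_id, section_id)]
            digest = hashlib.sha256(text.encode()).hexdigest()
            entry = partition.get((doc_id, section_id))
            if entry is None or entry[0] != digest:
                partition[(doc_id, section_id)] = (digest, text.lower().split())
                changed.append(section_id)
        return changed


index = SectionedIndex()
first = index.update("d1", {"intro": "Partitioned index", "body": "Old text"})
second = index.update("d1", {"intro": "Partitioned index", "body": "New text"})
```

On the second call only the `body` section is rewritten; the unchanged `intro` entry is left untouched in whichever partition holds it.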
Abstract:
An XML wrapper queries an XML document in an on-the-fly manner so that only parent nodes in the document that satisfy the query are extracted and then unnested. The parent nodes and associated descendant nodes are located using XPath expressions contained as options in data definition language (DDL) statements. The parent nodes satisfying the query and associated descendant nodes are extracted and stored outside of a database according to a relational schema. The wrapper facilitates applications that use conventional SQL queries and views to operate on that information stored according to the relational schema. The wrapper also responds to query optimizer requests for costs associated with queries against external data sources associated with the wrapper.
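The extract-and-unnest step can be sketched in Python with the standard library's `xml.etree.ElementTree`, which supports a limited XPath subset. The function `unnest`, the sample `orders` document, and the column names are hypothetical; the DDL options and optimizer cost interface mentioned in the abstract are not modeled.

```python
import xml.etree.ElementTree as ET

ORDERS_XML = """
<orders>
  <order id="1" status="open">
    <item><sku>A</sku><qty>2</qty></item>
    <item><sku>B</sku><qty>1</qty></item>
  </order>
  <order id="2" status="closed">
    <item><sku>C</sku><qty>5</qty></item>
  </order>
</orders>
"""


def unnest(xml_text, parent_xpath, child_xpath, parent_key, columns):
    """Return one relational row per child of each matching parent node.

    Only parents matched by parent_xpath are visited, so children of
    non-matching parents are never turned into rows.
    """
    root = ET.fromstring(xml_text)
    rows = []
    for parent in root.findall(parent_xpath):
        for child in parent.findall(child_xpath):
            # Repeat the parent key on every child row (the unnesting step).
            values = tuple(child.findtext(column) for column in columns)
            rows.append((parent.get(parent_key),) + values)
    return rows


rows = unnest(ORDERS_XML, "./order[@status='open']", "./item", "id", ["sku", "qty"])
```

Each resulting tuple follows the flat schema `(order_id, sku, qty)`, which is the shape a conventional SQL view over the wrapper's output would expect.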
Abstract:
A system and method for parsing documents in query processing comprises producing at least one index of a document written in a mark-up language, associating the index with the document, scanning the document, and selectively skipping portions of the document based on instructions from the index. Furthermore, the mark-up language comprises any of HTML and XML; the skipped portions of the document comprise portions irrelevant to the query; the index comprises a plurality of elements representing textual categories of the query; and the instructions match the elements to the query. If the elements do not match the query, then the parser uses the index to skip the portions of the document corresponding to the unmatched elements. Moreover, each of the elements corresponds to a position in the document, wherein the position comprises an end position, which determines where to resume scanning the document upon skipping the portions of the document.
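A simplified Python sketch of index-guided skipping follows. It assumes a flat document whose elements are plain `<tag>...</tag>` spans without attributes, and it treats the tag name as the "textual category"; `build_index` and `scan` are hypothetical names, not part of the described system.

```python
def build_index(doc, tags):
    """Record (tag, start, end) for each <tag>...</tag> span in the document."""
    index = []
    for tag in tags:
        open_tag, close_tag = f"<{tag}>", f"</{tag}>"
        pos = 0
        while True:
            start = doc.find(open_tag, pos)
            if start == -1:
                break
            end = doc.find(close_tag, start)
            if end == -1:
                break
            end += len(close_tag)  # end position: where scanning can resume
            index.append((tag, start, end))
            pos = end
    index.sort(key=lambda entry: entry[1])
    return index


def scan(doc, index, wanted):
    """Read only spans whose tag matches the query; jump over the rest."""
    pos = 0
    matched = []
    for tag, start, end in index:
        if start < pos:
            continue  # span lies inside a region already read or skipped
        if tag in wanted:
            matched.append(doc[start:end])
        pos = end  # resume after this span, whether it was read or skipped
    return matched


doc = "<title>XML</title><body>long text</body><author>Ann</author>"
index = build_index(doc, ["title", "body", "author"])
authors = scan(doc, index, {"author"})
```

The index is built once per document; each later query only touches the spans whose category it asks for and uses the stored end positions to jump past everything else.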
Abstract:
A method is disclosed for expansion of rare queries to improve advertisement results, including receiving a query from a user by a search engine; determining that the query does not match an entry in an ad query lookup table coupled with the search engine; retrieving one or more expanded queries located within a query feature index whose features relate to one or more features of the received query, wherein the query feature index includes a plurality of queries expanded based on at least corresponding search results; generating, in real time and by the search engine, an ad query including an expanded version of the received query based on features of the retrieved expanded queries; and selecting one or more advertisements based on the generated ad query, wherein the one or more advertisements are displayed to the user in response to the query received from the user.
Abstract:
A method and apparatus are provided for better web ad matching by combining relevance with consumer click feedback. In one example, the method includes receiving a query page, extracting features from the query page, re-weighting the query page, evaluating the query page in light of each ad in order to score each ad and pick substantially best ad matches of the indexed ads, and returning the substantially best ad matches to the consumer computer.