Abstract:
Systems and methods of this disclosure are directed to recommending content in real-time or near real-time. The system comprises a number of pipelines, updated at different frequencies, that process temporally different sets of web property visit data. Within each pipeline, the system can employ a different number of algorithms to process visit data to generate content recommendations. One algorithm is a content filter that filters out of the visit data those URLs determined to be unsuitable as recommendations. Another is a content analyzer that analyzes the content of each URL in the visit data by topic category and attribute. Another is an item-to-item collaborative filter that determines a correlation score for each URL based on the URLs that appear in the visit data within a single session. Another is a device-to-item matrix factorization that determines an affinity score for each URL based on visit data, context information, and topic category.
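The item-to-item collaborative filter described above could be sketched as follows. The abstract does not state how the correlation score is computed, so this minimal sketch assumes a cosine-style co-occurrence score over sessions; the function name `session_correlations` and the sample URLs are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def session_correlations(sessions):
    """Return {(url_a, url_b): score} for URL pairs seen in the same session.

    Assumption: score = co-visits / sqrt(visits_a * visits_b). The abstract
    only says the score is based on URLs visited within a single session.
    """
    url_counts = Counter()
    pair_counts = Counter()
    for session in sessions:
        urls = sorted(set(session))  # count each URL once per session
        url_counts.update(urls)
        pair_counts.update(combinations(urls, 2))
    return {
        pair: n / sqrt(url_counts[pair[0]] * url_counts[pair[1]])
        for pair, n in pair_counts.items()
    }


# Hypothetical visit data: one list of URLs per session.
sessions = [["/a", "/b"], ["/a", "/b", "/c"], ["/a"]]
scores = session_correlations(sessions)
```

Pairs that never share a session receive no entry, which keeps the result sparse for large URL sets.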
Abstract:
Systems and methods for determining correlation scores for product pairs are provided. Contextual user behavior indicator data relating to a plurality of user behavior indicator types is received. A correlation score is computed for a first product and a second product for each user behavior indicator type from the plurality of user behavior indicator types. A final correlation score is computed for the first product and the second product by combining the computed correlation scores for each user behavior indicator type. The computed final correlation score for the first product and the second product is stored into a first data storage.
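The combination step above could look like the following minimal sketch. The abstract does not say how the per-indicator scores are combined, so a weighted average is assumed here; the indicator names, the weights, and the dictionary standing in for the first data storage are all hypothetical.

```python
def final_correlation(scores_by_type, weights):
    """Combine per-indicator correlation scores into one final score.

    Assumption: a weighted average. The abstract only says the scores
    for each user behavior indicator type are combined.
    """
    total_weight = sum(weights[t] for t in scores_by_type)
    if total_weight == 0:
        return 0.0
    weighted = sum(weights[t] * s for t, s in scores_by_type.items())
    return weighted / total_weight


# Hypothetical indicator types and weights.
weights = {"view": 1.0, "cart": 2.0, "purchase": 3.0}
# Hypothetical per-indicator scores for one product pair.
per_type = {"view": 0.2, "cart": 0.5, "purchase": 0.8}

# A plain dict stands in for the "first data storage".
store = {}
store[("product_1", "product_2")] = final_correlation(per_type, weights)
```

Normalizing by the total weight keeps the final score on the same scale as the per-indicator scores, even when some indicator types are missing for a pair.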
Abstract:
Systems and methods provide distantly supervised wrapper induction for semi-structured documents, including automatically generating and annotating training documents for the wrapper. Training of the wrapper may occur in two phases using the training documents. An example method includes identifying a training set of semi-structured web pages having a subject entity that exists in a knowledge base and, for each training page, identifying target objects, identifying predicates in the knowledge base that connect the subject entity to the target objects identified in the training page, and annotating the training page. Annotating a training page includes generating a feature set for a mention of the target object, generating predicate-target object pairs for the mention, and labeling each predicate-target object pair with a corresponding example type and weight. The annotated training pages are used to train the wrapper to extract new subject entities and new facts from the set of semi-structured web pages.
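The labeling step of the annotation could be sketched as below. The abstract does not define the example types or how weights are assigned, so this sketch assumes two types ("positive" when the knowledge base contains the triple, "negative" otherwise) and a hypothetical reduced weight for negatives. Feature-set generation is omitted, and the function name, knowledge-base layout, and sample entities are illustrative assumptions.

```python
def annotate_mention(subject, target, predicates, kb, neg_weight=0.5):
    """Label each predicate-target object pair for one mention.

    Assumptions: kb is a set of (subject, predicate, object) triples;
    a pair is "positive" if the triple is in kb, else "negative";
    neg_weight is a hypothetical down-weighting of negative examples.
    """
    annotations = []
    for predicate in predicates:
        is_positive = (subject, predicate, target) in kb
        annotations.append({
            "predicate": predicate,
            "target": target,
            "example_type": "positive" if is_positive else "negative",
            "weight": 1.0 if is_positive else neg_weight,
        })
    return annotations


# Hypothetical knowledge base and page mention.
kb = {("Film X", "directed_by", "Jane Doe")}
labels = annotate_mention(
    "Film X", "Jane Doe", ["directed_by", "produced_by"], kb
)
```

Because the labels come from the knowledge base rather than from manual annotation, each one is only a distant signal, which is one reason a per-example weight is useful.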
Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for combining authentication and application shortcut. An example method includes, responsive to a user request identifying an entity: identifying a first time period associated with the entity based at least on a type of the entity; determining, within the first time period, a plurality of first candidate entities associated with the entity; selecting first entities in the plurality of first candidate entities according to one or more selection criteria; and providing, for presentation to the user, first user-selectable graphical elements on a first graphical user-interactive timeline. Each first user-selectable graphical element identifies a corresponding first entity in the first entities.