Abstract:
Grammatical parsing is used to parse structured layouts that are modeled as grammars. This type of parsing yields an optimal parse tree for the structured layout, based on a grammatical cost function associated with a global search. Machine learning techniques facilitate discriminatively selecting features and setting parameters in the grammatical parsing process. In one instance, labeled examples are parsed and a chart is generated. The chart is then converted into a subsequent set of labeled learning examples. Classifiers are then trained using conventional machine learning and the subsequent example set. The classifiers are then employed to score subsequent sub-parses. A global reference grammar can also be established to facilitate completing varying tasks without requiring additional grammar learning, substantially increasing the efficiency of the structured layout analysis techniques.
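The chart-based search for a lowest-cost parse can be illustrated with a minimal CKY-style sketch. Everything in it is an assumption made for illustration: the toy grammar, the rule names, and the fixed integer costs are hypothetical, whereas the abstract describes costs that come from trained classifiers.

```python
# Hypothetical toy grammar. Costs are fixed integers here; in the abstract
# they would be produced by learned classifiers.
TERMINAL_RULES = {("Name", "alice"): 1, ("Name", "bob"): 2, ("Sep", ","): 0}
BINARY_RULES = {
    ("List", "Name", "Tail"): 5,
    ("Tail", "Sep", "Name"): 3,
    ("Tail", "Sep", "List"): 4,
}


def cky_min_cost(tokens):
    """Fill a chart mapping (start, end, nonterminal) -> (cost, tree),
    keeping only the cheapest derivation for each entry."""
    n = len(tokens)
    chart = {}
    # Spans of length 1: apply terminal rules.
    for i, tok in enumerate(tokens):
        for (lhs, word), cost in TERMINAL_RULES.items():
            if word == tok:
                key = (i, i + 1, lhs)
                if key not in chart or cost < chart[key][0]:
                    chart[key] = (cost, (lhs, tok))
    # Longer spans: combine two adjacent sub-parses with a binary rule.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (lhs, left, right), rule_cost in BINARY_RULES.items():
                    l = chart.get((i, k, left))
                    r = chart.get((k, j, right))
                    if l is None or r is None:
                        continue
                    total = rule_cost + l[0] + r[0]
                    key = (i, j, lhs)
                    if key not in chart or total < chart[key][0]:
                        chart[key] = (total, (lhs, l[1], r[1]))
    return chart


chart = cky_min_cost(["alice", ",", "bob"])
cost, tree = chart[(0, 3, "List")]
```

Because every chart entry keeps only its cheapest derivation, the entry for the full span holds a globally optimal parse under the given cost function.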
Abstract:
Dynamic inference is leveraged to provide online sequence data labeling. This provides real-time alternatives to current methods of inference for sequence data. Instances estimate the amount of uncertainty in a prediction of labels of sequence data and then dynamically predict a label when the uncertainty in the prediction is deemed acceptable. The techniques used to determine when the label can be generated are tunable and can be personalized for a given user and/or system. The decoding techniques employed can be dynamically adjusted to trade off system resources for accuracy. This allows a system to be fine-tuned based on available system resources. Instances also allow for online inference because the inference does not require knowledge of a complete set of sequence data.
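One way to picture this behavior is a labeler that holds back a label until its posterior is confident enough or a maximum delay is reached. The sketch below assumes a small hidden Markov model; the class name `OnlineLabeler`, the uncertainty measure (one minus the largest posterior probability), and the parameters `max_uncertainty` and `max_lag` are illustrative assumptions, not details from the abstract.

```python
def normalize(v):
    s = sum(v)
    return [x / s for x in v]


class OnlineLabeler:
    """Emit a label for a position once its smoothed posterior is confident
    enough, or once max_lag later observations have arrived (hypothetical rule)."""

    def __init__(self, init, trans, emit, max_uncertainty=0.2, max_lag=3):
        self.init, self.trans, self.emit = init, trans, emit
        self.max_uncertainty = max_uncertainty  # tunable accuracy threshold
        self.max_lag = max_lag                  # tunable latency/resource bound
        self.alphas = []   # filtered state distributions, one per observation
        self.obs = []
        self.next = 0      # first position that has not been labeled yet

    def _smoothed(self, s):
        """P(state at s | all observations so far), via a backward pass."""
        k = len(self.init)
        beta = [1.0] * k
        for u in range(len(self.obs) - 1, s, -1):
            beta = normalize([
                sum(self.trans[i][j] * self.emit[j][self.obs[u]] * beta[j]
                    for j in range(k))
                for i in range(k)
            ])
        return normalize([self.alphas[s][i] * beta[i] for i in range(k)])

    def push(self, observation):
        """Consume one observation; return newly emitted (position, label) pairs."""
        k = len(self.init)
        if self.alphas:
            prev = self.alphas[-1]
            prior = [sum(prev[i] * self.trans[i][j] for i in range(k))
                     for j in range(k)]
        else:
            prior = self.init
        self.alphas.append(
            normalize([prior[j] * self.emit[j][observation] for j in range(k)]))
        self.obs.append(observation)
        t = len(self.obs) - 1
        emitted = []
        while self.next <= t:
            post = self._smoothed(self.next)
            uncertainty = 1.0 - max(post)
            if uncertainty <= self.max_uncertainty or t - self.next >= self.max_lag:
                emitted.append((self.next, post.index(max(post))))
                self.next += 1
            else:
                break  # wait for more data before committing to this label
        return emitted
```

Lowering `max_uncertainty` favors accuracy at the cost of delay, while lowering `max_lag` bounds delay and the length of the backward pass. Each label here is a per-position marginal decision, which is a simplification.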
Abstract:
A discriminative grammar framework that uses a machine learning algorithm is employed to facilitate learning scoring functions for parsing unstructured information. The framework includes a discriminative context-free grammar that is trained based on features of an example input. The flexibility of the framework allows information features and/or features output by arbitrary processes to be used as the example input as well. Myopic inside scoring is circumvented in the parsing process because contextual information is used to facilitate scoring function training.
Abstract:
Computer-readable media, computer systems, and computing devices facilitate generating binary classifier and entity extractor training data. Seed URLs are selected and URL patterns within the seed URLs are identified. Matching URLs in a data structure are identified and corresponding queries and their associated weights are added to a potential training data set from which training data is selected.
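The flow from seed URLs to candidate training queries could look roughly like the sketch below. The pattern rule (treating every run of digits in the path as a wildcard) and the `click_log` structure (URL mapped to query/weight pairs) are hypothetical choices made for illustration; the abstract does not specify either.

```python
import re
from urllib.parse import urlparse


def url_pattern(seed_url):
    """Generalize a seed URL into a regex by turning digit runs in the path
    into wildcards (a hypothetical generalization rule)."""
    parsed = urlparse(seed_url)
    parts = re.split(r"(\d+)", parsed.path)
    path_regex = "".join(
        r"\d+" if part.isdigit() else re.escape(part) for part in parts)
    return re.compile(r"^https?://" + re.escape(parsed.netloc) + path_regex + "$")


def collect_candidates(seed_urls, click_log):
    """Sum the weights of all queries whose URLs match any seed pattern.

    click_log is an assumed structure: {url: [(query, weight), ...]}.
    """
    patterns = [url_pattern(u) for u in seed_urls]
    candidates = {}
    for url, query_weights in click_log.items():
        if any(p.match(url) for p in patterns):
            for query, weight in query_weights:
                candidates[query] = candidates.get(query, 0) + weight
    return candidates


click_log = {
    "https://example.com/movie/456": [("inception cast", 3)],
    "https://example.com/actor/9": [("tom hanks", 5)],
    "https://example.com/movie/789": [("inception cast", 2), ("heat 1995", 1)],
}
candidates = collect_candidates(["https://example.com/movie/123"], click_log)
```

Training data would then be selected from `candidates`, for example by keeping the highest-weighted queries.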
Abstract:
In one embodiment, a method includes receiving a request for a webpage from a mobile-client system of a user, where the request includes an http-header, accessing information describing the user, determining the attributes of the mobile-client system based on the http-header and the information describing the user, and transmitting the webpage to the mobile-client system in response to the request, where the webpage has been customized based on the determined attributes of the mobile-client system.
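As a rough sketch of the customization step, the following combines request headers with stored user information to pick page attributes. The attribute names, the `last_known_screen_width` field, the 768-pixel breakpoint, and the substring checks on `User-Agent` are all illustrative assumptions.

```python
def determine_attributes(headers, user_info):
    """Derive device attributes from request headers plus stored user info."""
    user_agent = headers.get("User-Agent", "").lower()
    return {
        # Crude touch detection from the User-Agent string (assumption).
        "touch": "iphone" in user_agent or "android" in user_agent,
        # First language tag listed in Accept-Language, defaulting to English.
        "language": headers.get("Accept-Language", "en").split(",")[0].strip(),
        # Hypothetical field remembered from an earlier session of this user.
        "screen_width": user_info.get("last_known_screen_width", 320),
    }


def customize_page(title, attrs):
    """Render a minimal page whose markup depends on the device attributes."""
    classes = ["wide" if attrs["screen_width"] >= 768 else "narrow"]
    if attrs["touch"]:
        classes.append("touch")
    return '<html lang="{}" class="{}"><title>{}</title></html>'.format(
        attrs["language"], " ".join(classes), title)


headers = {
    "User-Agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X)",
    "Accept-Language": "fr-FR,fr;q=0.9",
}
page = customize_page(
    "Home", determine_attributes(headers, {"last_known_screen_width": 390}))
```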
Abstract:
A social networking system leverages users' social information to evaluate content submitted for inclusion in objects. If an evaluated submission is accepted, it is added to the content of the object. Accepted submissions are also used to predict associations between metadata and objects. Metadata is used to predict which objects will match user searches for information. The social networking system also provides a user interface configured to prompt users to submit information to objects. When a user completes a submission to an object, the user is offered other groups of objects to contribute to. The objects offered are chosen to increase the likelihood that the user will provide a submission to one of them.
Abstract:
In one embodiment, a method includes, by one or more server computing devices, receiving state data of a client computing device. The state data includes event data indicating events generated by or occurring at the client computing device and context data associated with the event data. The context data indicates device states of the client computing device that each coincide with one or more of the events and indicate a context of the one or more of the events. The method also includes, by one or more server computing devices, ordering the events and the device states in the event and context data into a state-data-review structure and analyzing the state-data-review structure to generate one or more recommendations on operation of the client computing device.
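The ordering and analysis steps might be sketched as follows. The record fields (`ts`, `name`), the five-unit coincidence window, and the single crash/low-memory rule are hypothetical; they only show how events and device states can be merged into one ordered structure and then scanned for a recommendation.

```python
def build_review_structure(events, states):
    """Merge events and device states into one timeline ordered by timestamp."""
    timeline = [(e["ts"], "event", e["name"]) for e in events]
    timeline += [(s["ts"], "state", s["name"]) for s in states]
    return sorted(timeline)


def recommend(timeline, window=5):
    """Apply one illustrative rule: a crash that coincides with a low-memory
    state (within `window` time units) yields a memory recommendation."""
    recommendations = set()
    for ts, kind, name in timeline:
        if kind == "event" and name == "app_crash":
            for ts2, kind2, name2 in timeline:
                if (kind2 == "state" and name2 == "low_memory"
                        and abs(ts - ts2) <= window):
                    recommendations.add("close background apps to free memory")
    return sorted(recommendations)


events = [{"ts": 12, "name": "app_crash"}, {"ts": 3, "name": "app_open"}]
states = [{"ts": 10, "name": "low_memory"}]
timeline = build_review_structure(events, states)
advice = recommend(timeline)
```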
Abstract:
In one embodiment, one or more computing systems receive a request for a location prediction for a user from a service. The computing systems access one or more real-time location signals and one or more aggregated location signals, generate one or more location predictions from the one or more real-time location signals and the one or more aggregated location signals, and calculate a single location prediction for the user from the one or more location predictions. The computing systems then transmit, in response to the request, the single location prediction for the user to the requesting service.
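Reducing several location predictions to a single one could be done in many ways; the sketch below simply picks the prediction with the highest source-weighted confidence. The source names, the weights in `SOURCE_WEIGHT`, and the selection rule are assumptions made for illustration, not details from the abstract.

```python
from dataclasses import dataclass


@dataclass
class LocationPrediction:
    source: str        # which signal produced the prediction
    lat: float
    lon: float
    confidence: float  # in [0, 1]


# Hypothetical trust weights: real-time signals above aggregated history.
SOURCE_WEIGHT = {"gps": 1.0, "wifi": 0.8, "checkin_history": 0.5}


def single_prediction(predictions):
    """Return the prediction with the highest weighted confidence, or None."""
    if not predictions:
        return None
    return max(
        predictions,
        key=lambda p: p.confidence * SOURCE_WEIGHT.get(p.source, 0.3))


predictions = [
    LocationPrediction("gps", 37.48, -122.15, 0.4),
    LocationPrediction("wifi", 37.49, -122.14, 0.5),
    LocationPrediction("checkin_history", 37.77, -122.42, 0.9),
]
best = single_prediction(predictions)
```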
Abstract:
A technique increases the efficiency of inference of structure variables (e.g., an inference problem) by using a priority-driven algorithm rather than conventional dynamic programming. The technique employs a probable approximate underestimate which, when used as a priority function (“a probable approximate underestimate function”) for a more computationally complex classification function, can be used to compute a probable approximate solution to the inference problem. The probable approximate underestimate function can have the functional form of a simpler, easier-to-decode model. The model can be learned from unlabeled data by solving a linear/quadratic optimization problem. The priority function can be computed quickly and can result in solutions that are substantially optimal. Using the priority function, the computational efficiency of a classification function (e.g., a discriminative classifier) can be increased using a generalization of the A* algorithm.
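To make the priority-driven idea concrete, here is a minimal A* decoder for a chain-structured labeling problem. It swaps in a simpler technique than the one described: the priority uses an exact underestimate (the sum of the cheapest per-position costs, valid when pairwise costs are nonnegative) rather than a learned probable approximate underestimate. The cost tables and function name are hypothetical.

```python
import heapq


def astar_decode(unary, pairwise):
    """Find the minimum-cost label sequence with A*.

    unary[t][y]    : cost of label y at position t
    pairwise[a][b] : nonnegative cost of label a followed by label b
    """
    n = len(unary)
    # remaining[t] underestimates the cost of positions t..n-1 by ignoring
    # pairwise costs, i.e. by decoding a simpler, independent model.
    remaining = [0.0] * (n + 1)
    for t in range(n - 1, -1, -1):
        remaining[t] = remaining[t + 1] + min(unary[t])

    heap = [(remaining[0], 0.0, ())]  # (priority, cost so far, labels so far)
    expanded = {}                     # (position, last label) -> best cost seen
    while heap:
        _, cost, labels = heapq.heappop(heap)
        t = len(labels)
        if t == n:
            return list(labels), cost  # first complete sequence popped is optimal
        state = (t, labels[-1] if labels else None)
        if state in expanded and expanded[state] <= cost:
            continue
        expanded[state] = cost
        for y, u in enumerate(unary[t]):
            step = u + (pairwise[labels[-1]][y] if labels else 0.0)
            new_cost = cost + step
            heapq.heappush(
                heap, (new_cost + remaining[t + 1], new_cost, labels + (y,)))
    return None


unary = [[1, 4], [2, 1], [4, 1]]
pairwise = [[0, 2], [2, 0]]
labels, cost = astar_decode(unary, pairwise)
```

The tighter the underestimate, the fewer partial sequences are expanded before the optimal one is found, which is where a learned priority function would aim to help.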