Mining training data for training dependency model

    Publication No.: US11816636B2

    Publication date: 2023-11-14

    Application No.: US17412753

    Filing date: 2021-08-26

    CPC classification number: G06Q10/1053 G06N5/01 G06N20/20 G06Q10/063112

    Abstract: Techniques for mining training data for use in training a dependency model are disclosed herein. In some embodiments, a computer-implemented method comprises: obtaining training data comprising a plurality of reference skill pairs, each reference skill pair comprising a corresponding first reference skill and a corresponding second reference skill, the plurality of reference skill pairs being included in the training data based on a co-occurrence of the corresponding first and second reference skills for each reference skill pair in the plurality of reference skill pairs, the co-occurrence comprising the corresponding first and second reference skills co-occurring for a same entity; and training a dependency model with a machine learning algorithm using the training data, the dependency model comprising a logistic regression model or a data gradient boosted decision tree (GBDT) model. The dependency model may then be used to identify corresponding dependency relations for a plurality of target skill pairs.
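The co-occurrence mining step described in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the function name `mine_skill_pairs`, the `min_count` cutoff, and the toy data are all assumptions. The abstract only states that pairs are selected because both skills co-occur for the same entity. Labeling the pairs and training the logistic regression or GBDT model is not shown.

```python
from collections import Counter
from itertools import combinations


def mine_skill_pairs(entity_skills, min_count=2):
    """Return skill pairs that co-occur for at least `min_count` entities.

    `min_count` is a hypothetical cutoff; the abstract does not specify one.
    """
    counts = Counter()
    for skills in entity_skills.values():
        # Sort and de-duplicate so each unordered pair has one canonical form.
        for pair in combinations(sorted(set(skills)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Hypothetical toy data: entity id -> skills listed for that entity.
entity_skills = {
    "member_1": ["python", "machine learning", "sql"],
    "member_2": ["python", "machine learning"],
    "member_3": ["sql", "excel"],
}
pairs = mine_skill_pairs(entity_skills)
```

Here only the pair ("machine learning", "python") survives, because it is the only pair that co-occurs for two different entities.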

    Multi-task learning framework for multi-context machine learning

    Publication No.: US11604990B2

    Publication date: 2023-03-14

    Application No.: US16902587

    Filing date: 2020-06-16

    Abstract: In an example embodiment, a framework to infer a user's value for a particular attribute based upon a multi-task machine learning process with uncertainty weighting that incorporates signals from multiple contexts is provided. In an example embodiment, the framework aims to measure a level of a user attribute under a certain context. Rather than attempting to devise a universal, one-size-fits-all value for the attribute, the framework acknowledges that the user's value for that attribute can vary depending on context and factors in the context under which the user's attribute levels are measured. Multiple contexts are defined depending on different situations where users and entities such as companies and organizations need to evaluate user attribute levels. Signals for attribute levels are then collected for each context. Machine learning models are utilized to estimate attribute values for different contexts. Multi-task deep learning is used to level attributes from different contexts.
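The abstract mentions uncertainty weighting but does not give a formula. The sketch below shows one commonly used form of uncertainty-weighted multi-task loss, in which each context's loss L_i is scaled by exp(-s_i) and regularized by s_i, where s_i is a learned log-variance. Both the formula and the name `uncertainty_weighted_loss` are assumptions for illustration and may differ from what the patent actually claims.

```python
import math


def uncertainty_weighted_loss(task_losses, log_variances):
    """Combine per-context losses as sum_i exp(-s_i) * L_i + s_i.

    Assumed formulation: s_i is a learned log-variance for context i.
    A larger s_i down-weights a noisy context, and the additive s_i
    term keeps s_i from growing without bound.
    """
    if len(task_losses) != len(log_variances):
        raise ValueError("need one log-variance per task loss")
    return sum(
        math.exp(-s) * loss + s
        for loss, s in zip(task_losses, log_variances)
    )


# With all log-variances at zero this reduces to a plain sum of the losses.
total = uncertainty_weighted_loss([0.8, 1.5], [0.0, 0.0])
```

In a real training loop the log-variances would be trainable parameters optimized together with the network weights; here they are plain numbers to keep the example self-contained.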

    NEURAL NETWORK PREDICTION USING TRAJECTORY MODELING

    Publication No.: US20230075600A1

    Publication date: 2023-03-09

    Application No.: US17468028

    Filing date: 2021-09-07

    Abstract: Techniques for training and using a neural network to make predictions using trajectory modelling are disclosed herein. In some embodiments, a computer-implemented method comprises: training a first neural network with a first machine learning algorithm using training data, the first neural network being a recurrent neural network, the training data including a plurality of reference career trajectories, each reference career trajectory in the plurality of reference career trajectories comprising a sequence of reference career segments, each reference career segment in the sequence of reference career segments comprising reference profile data and reference time data indicating a position of the reference career segment within the sequence of reference career segments, the training data also including a corresponding set of reference skills for each reference career segment.

    Machine learning techniques for analyzing textual content

    Publication No.: US11487947B2

    Publication date: 2022-11-01

    Application No.: US16716402

    Filing date: 2019-12-16

    Abstract: Techniques are provided for using machine learning techniques to analyze textual content. In one technique, a potential item is identified within a document. An analysis of the potential item is performed at multiple levels of granularity that includes two or more of a sentence level, a segment level, or a document level. The analysis produces multiple outputs, one for each level of granularity in the multiple levels of granularity. The outputs are input into a machine-learned model to generate a score for the potential item. Based on the score, the potential item is presented on a computing device. In response to user selection of the potential item, an association between the potential item and the document is created. The association may be used later to identify a set of users to which the document (or data thereof) is to be presented.

    SEMANTIC MATCHING AND RETRIEVAL OF STANDARDIZED ENTITIES

    Publication No.: US20210303638A1

    Publication date: 2021-09-30

    Application No.: US16836546

    Filing date: 2020-03-31

    Abstract: The disclosed embodiments provide a system for processing user-generated input. During operation, the system obtains a first embedding produced by an embedding model from an input string representing an entity and a hierarchy of clusters of embeddings generated by the embedding model from a set of standardized entities. Next, the system searches the hierarchy of clusters for a subset of the embeddings that are within a threshold proximity to the first embedding in a vector space. The system then calculates embedding match scores between the input string and a first subset of the standardized entities represented by the subset of the embeddings based on distances between the subset of the embeddings and the first embedding in the vector space. Finally, the system modifies, based on the embedding match scores, content outputted in response to the input string within a user interface of an online system.
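The search described in this abstract can be sketched with a two-level hierarchy: clusters are pruned by their centroids first, then the surviving members are compared to the query embedding. This is a rough sketch under several assumptions: the `(centroid, radius, members)` layout, Euclidean distance, and the `1 / (1 + d)` score mapping are all illustrative choices that the abstract does not specify.

```python
import math


def search_clusters(query, clusters, threshold):
    """Return {entity: score} for entities within `threshold` of `query`.

    `clusters` is a list of (centroid, radius, members) tuples, where
    `radius` is the largest member-to-centroid distance and `members`
    maps a standardized entity name to its embedding.
    """
    matches = {}
    for centroid, radius, members in clusters:
        # Triangle inequality: a cluster this far away cannot hold a match.
        if math.dist(query, centroid) > threshold + radius:
            continue
        for name, embedding in members.items():
            d = math.dist(query, embedding)
            if d <= threshold:
                # Assumed mapping from distance to an embedding match score.
                matches[name] = 1.0 / (1.0 + d)
    return matches


# Hypothetical 2-D embeddings of standardized job titles.
clusters = [
    ((0.0, 0.0), 1.0, {"software engineer": (0.1, 0.2),
                       "software developer": (0.5, 0.5)}),
    ((10.0, 10.0), 1.0, {"nurse": (10.0, 10.5)}),
]
scores = search_clusters((0.0, 0.0), clusters, threshold=0.5)
```

The second cluster is skipped without examining its members, which is the point of organizing the embeddings hierarchically.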

    MULTI-TIERED SYSTEM FOR SCALABLE ENTITY REPRESENTATION LEARNING

    Publication No.: US20210065047A1

    Publication date: 2021-03-04

    Application No.: US16556097

    Filing date: 2019-08-29

    Abstract: Techniques for learning entity representations in a scalable manner are provided. A graph that comprises a plurality of nodes representing a set of entities is stored. A first subset of the set of entities and a second subset of the set of entities are identified. For each entity in the first subset of the set of entities, one or more machine learning techniques are used to generate a machine-learned embedding for the entity. For each entity in the second subset of the set of entities, a subset of entities in the first subset that are associated with the entity is identified. One or more embeddings are identified for the subset of entities. Based on the one or more embeddings, an inferred embedding is generated for the entity.
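The inference step for the second subset can be sketched as follows. The abstract says only that an inferred embedding is generated "based on" the embeddings of associated first-subset entities. Averaging them, as done here, is an assumption, as are the names `infer_embedding`, `neighbors`, and `learned`.

```python
def infer_embedding(entity, neighbors, learned):
    """Average the learned embeddings of first-subset entities linked to `entity`.

    Returns None when the entity has no neighbor with a learned embedding.
    """
    vectors = [learned[n] for n in neighbors.get(entity, []) if n in learned]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


# Hypothetical data: two entities with machine-learned embeddings,
# and a third entity that is only connected to them in the graph.
learned = {"company_a": [1.0, 0.0], "company_b": [0.0, 1.0]}
neighbors = {"company_c": ["company_a", "company_b"]}
inferred = infer_embedding("company_c", neighbors, learned)
```

The design trade-off is that only the first subset incurs the cost of training, while the remaining entities get embeddings from a cheap aggregation over the graph.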

    NEXT CAREER MOVE PREDICTION WITH CONTEXTUAL LONG SHORT-TERM MEMORY NETWORKS

    Publication No.: US20190130281A1

    Publication date: 2019-05-02

    Application No.: US15799396

    Filing date: 2017-10-31

    Abstract: Techniques for predicting a next company and next title of a user are disclosed herein. In some embodiments, an encoder is used for encoding a representation of the user's profile. The encoding includes accessing discrete entities comprising context information included in the user's profile, constructing a plurality of embedding vectors from the context information, and generating a context vector from the plurality of embedding vectors. The plurality of embedding vectors includes a skill embedding vector, a school embedding vector, and a location embedding vector. A decoder is used for decoding a career path from the context vector. The decoding includes applying a long short-term memory (LSTM) model to the context vector to perform the prediction of the user's next company and next title for presentation in a user interface.
