MACHINE LEARNING TECHNIQUES FOR AUTOMATIC VALIDATION OF EVENTS

    Publication number: US20200097605A1

    Publication date: 2020-03-26

    Application number: US16141853

    Filing date: 2018-09-25

    Abstract: A system and method are provided for automatic identification, extraction, and validation of data pertaining to receiving entity events (REE). Feature (or attribute) values associated with web content are identified. The web content may contain news and features on current/past affairs. The identified feature values are considered by a rule-based or a machine-learned model and, based upon output of the model, a determination as to whether the set of data comprises a REE is made. If the determination is positive, then multiple data items are extracted from the set of data and, optionally, from other data from the source.

    Secure storage and processing of data for generating training data

    Publication number: US12197539B2

    Publication date: 2025-01-14

    Application number: US17169161

    Filing date: 2021-02-05

    Abstract: Techniques for securely storing and processing data for training data generation are provided. In one technique, multiple encrypted records are retrieved from a first persistent storage. For each encrypted record, that record is decrypted in memory to generate a decrypted record that comprises multiple attribute values. Then, based on the attribute values and a definition of multiple features of a machine-learned model, multiple feature values are generated and stored, along with a label, in a training instance, which is then stored in a second persistent storage. One or more machine learning techniques are used to train the machine-learned model based on training data that includes the training instances that are stored in the second persistent storage.
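
The flow in this abstract (retrieve encrypted records, decrypt in memory, derive feature values, store labeled training instances) can be sketched roughly as follows. This is only an illustration under stated assumptions: the feature definitions, the record fields, and the in-memory lists standing in for the two persistent stores are all hypothetical, and the XOR routine is a toy placeholder for a real cipher, not something the patent describes.

```python
import json
from typing import Callable, Dict, List

# Hypothetical feature definitions: feature name -> function of decrypted attributes.
FEATURE_DEFS: Dict[str, Callable[[dict], float]] = {
    "title_length": lambda rec: float(len(rec["title"])),
    "has_company": lambda rec: 1.0 if rec.get("company") else 0.0,
}


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher so the sketch runs. NOT secure; a stand-in only."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def build_training_instances(
    encrypted_records: List[bytes],
    decrypt: Callable[[bytes], str],
    feature_defs: Dict[str, Callable[[dict], float]],
    label_key: str = "label",
) -> List[dict]:
    """Decrypt each record in memory and turn it into a labeled training instance."""
    instances = []
    for blob in encrypted_records:
        record = json.loads(decrypt(blob))  # plaintext exists only in memory
        features = {name: fn(record) for name, fn in feature_defs.items()}
        instances.append({"features": features, "label": record[label_key]})
    return instances


KEY = b"demo-key"  # assumption: key management is out of scope here

# Stand-in for the first persistent storage (encrypted records).
first_store = [
    xor_cipher(json.dumps(rec).encode("utf-8"), KEY)
    for rec in [
        {"title": "Engineer", "company": "Acme", "label": 1},
        {"title": "N/A", "company": "", "label": 0},
    ]
]

# Stand-in for the second persistent storage (training instances only).
second_store = build_training_instances(
    first_store,
    decrypt=lambda blob: xor_cipher(blob, KEY).decode("utf-8"),
    feature_defs=FEATURE_DEFS,
)
```

Passing `decrypt` as a parameter keeps the sketch independent of any particular cipher; only derived feature values and the label reach the second store, never the raw attribute values.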

    IDENTIFYING RELATIONSHIPS BETWEEN ENTITIES USING MACHINE LEARNING

    Publication number: US20190197176A1

    Publication date: 2019-06-27

    Application number: US15851142

    Filing date: 2017-12-21

    CPC classification number: G06F16/285 G06N5/022 G06N20/00 H04L63/102

    Abstract: Techniques for identifying relationships between entities using machine learning are disclosed herein. In some embodiments, a computer-implemented method comprises: ingesting natural language text comprising a first target entity and a second target entity; identifying a relationship between the first target entity and the second target entity using at least one model; and performing a function using the identified relationship between the first target entity and the second target entity based on the identifying of the relationship, the function comprising a database modification operation or a relationship verification operation, the database modification operation comprising modifying at least one of a graph, a corresponding profile of the first target entity, and a corresponding profile of the second target entity stored in the database of the online service to indicate the identified relationship, and the relationship verification operation comprising causing the identified relationship to be displayed on a computing device.

    Identifying duplicate entities
    Type: Granted patent

    Publication number: US11436532B2

    Publication date: 2022-09-06

    Application number: US16703386

    Filing date: 2019-12-04

    Abstract: The disclosed embodiments provide a system that identifies duplicate entities. During operation, the system selects training data for a first machine learning model based on confidence scores representing likelihoods that pairs of entities in an online system are duplicates. Next, the system updates parameters of the first machine learning model based on features and labels in the training data. The system then identifies a first subset of additional pairs of the entities as duplicate entities based on scores generated by the first machine learning model from values of the features for the additional pairs and a first threshold associated with the scores. The system also determines a canonical entity in each of the duplicate entities based on additional features. Finally, the system updates content outputted in a user interface of the online system based on the identified first subset of the additional pairs.
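
The steps in this abstract (select training pairs by confidence score, fit a model, threshold its scores on further pairs, pick a canonical entity) might look roughly like the sketch below. Everything concrete here is an assumption made for illustration: the single name-similarity feature, the confidence cutoffs, the tiny logistic regression, the 0.5 threshold, and the use of profile completeness as the "additional feature" for choosing the canonical entity.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def select_training_data(candidates, high=0.9, low=0.1):
    """Keep only pairs with a decisive confidence score and derive a label from it."""
    data = []
    for similarity, confidence in candidates:
        if confidence >= high:
            data.append((similarity, 1))
        elif confidence <= low:
            data.append((similarity, 0))
    return data


def train(data, lr=1.0, epochs=3000):
    """Fit a one-feature logistic regression with batch gradient descent."""
    w = b = 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / len(data)
        b -= lr * grad_b / len(data)
    return w, b


# (name similarity, confidence that the pair is a duplicate); hypothetical values.
candidates = [(0.95, 0.97), (0.90, 0.92), (0.50, 0.50), (0.20, 0.03), (0.10, 0.05)]
w, b = train(select_training_data(candidates))

THRESHOLD = 0.5  # assumed score threshold
additional_pairs = [("acme-1", "acme-2", 0.92), ("acme-1", "zenith-9", 0.15)]
duplicates = [
    (left, right)
    for left, right, similarity in additional_pairs
    if sigmoid(w * similarity + b) >= THRESHOLD
]

# Assumed "additional feature": the more complete profile becomes canonical.
completeness = {"acme-1": 0.4, "acme-2": 0.9, "zenith-9": 0.7}
canonical = {pair: max(pair, key=completeness.get) for pair in duplicates}
```

Pairs with middling confidence (such as the 0.50 example) are left out of training, which mirrors the idea of selecting training data by confidence score rather than labeling every candidate.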

    IDENTIFYING DUPLICATE ENTITIES
    Type: Patent application

    Publication number: US20210173825A1

    Publication date: 2021-06-10

    Application number: US16703386

    Filing date: 2019-12-04

    Abstract: The disclosed embodiments provide a system that identifies duplicate entities. During operation, the system selects training data for a first machine learning model based on confidence scores representing likelihoods that pairs of entities in an online system are duplicates. Next, the system updates parameters of the first machine learning model based on features and labels in the training data. The system then identifies a first subset of additional pairs of the entities as duplicate entities based on scores generated by the first machine learning model from values of the features for the additional pairs and a first threshold associated with the scores. The system also determines a canonical entity in each of the duplicate entities based on additional features. Finally, the system updates content outputted in a user interface of the online system based on the identified first subset of the additional pairs.

    SEARCH-BASED URL-INFERENCE MODEL
    Type: Patent application

    Publication number: US20200311156A1

    Publication date: 2020-10-01

    Application number: US16370642

    Filing date: 2019-03-29

    Abstract: Techniques of inferring an organic website URL of an organization based on a web search result are provided. A query that includes an organization name is sent to a search engine and a set of search results is received from the search engine as a result of the query. Each search result in the set of search results includes a URL for the organization website address. For each search result in the set of search results, a set of feature values that is associated with each search result is identified. The set of feature values is inputted to a prediction model that generates a prediction, and based on the prediction, a determination of whether to associate the URL of each search result with the organization name is made.
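
As a rough illustration of the per-result step in this abstract (identify feature values for each search result, feed them to a prediction model, decide whether to associate the URL), consider the sketch below. The three features, the hand-set weights standing in for a trained prediction model, the 0.5 cutoff, and the hard-coded result list are all assumptions; no search engine is actually queried.

```python
from urllib.parse import urlparse

# Assumed list of sites that host profiles about organizations rather than
# being an organization's own website.
AGGREGATOR_DOMAINS = {"wikipedia.org", "linkedin.com", "facebook.com"}


def feature_values(org_name: str, url: str, rank: int) -> dict:
    """Compute hypothetical feature values for one search result."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    token = "".join(ch for ch in org_name.lower() if ch.isalnum())
    squashed_host = host.replace("-", "").replace(".", "")
    is_aggregator = any(host == d or host.endswith("." + d) for d in AGGREGATOR_DOMAINS)
    return {
        "rank": float(rank),
        "name_in_host": 1.0 if token and token in squashed_host else 0.0,
        "is_aggregator": 1.0 if is_aggregator else 0.0,
    }


def predict(features: dict) -> float:
    """Stand-in for the prediction model: a hand-weighted score, not a trained one."""
    return (
        0.6 * features["name_in_host"]
        - 0.5 * features["is_aggregator"]
        + 0.4 / features["rank"]
    )


org_name = "Example"
# Hard-coded stand-in for the search engine's ranked results.
search_results = [
    "https://en.wikipedia.org/wiki/Example",
    "https://www.example.com/",
    "https://www.linkedin.com/company/example",
]

accepted = [
    url
    for rank, url in enumerate(search_results, start=1)
    if predict(feature_values(org_name, url, rank)) >= 0.5
]
```

In this toy run the top-ranked result is rejected because it sits on an aggregator domain, while the second result is accepted because the organization name appears in its hostname.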
