PROFILE-DRIVEN DATA VALIDATION
    1.
    Invention application

    Publication number: US20200210389A1

    Publication date: 2020-07-02

    Application number: US16235441

    Application date: 2018-12-28

    Abstract: The disclosed embodiments provide a system for performing profile-driven data validation. During operation, the system obtains a validation configuration containing declarative specifications of fields in a data set and validation rules to be applied to the data set. Next, the system analyzes the data set based on the validation configuration to produce a set of metrics related to the data set and stores the metrics in a profile for the data set. The system also matches a metric in the profile to the type of validation associated with a validation rule in the validation configuration. Finally, the system applies the validation rule to a value of the metric in the profile to produce a validation result for the validation rule.
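The flow in this abstract (declarative configuration, metrics stored in a profile, rules matched to metrics by validation type) can be illustrated with a minimal Python sketch. It is not the patented implementation: the configuration layout, the metric names (`null_fraction`, `distinct_count`, `mean`), and the helpers `build_profile` and `validate` are all hypothetical names chosen for illustration.

```python
from statistics import mean

# Hypothetical declarative configuration: field specs plus validation rules.
# The key names and metric names are assumptions, not taken from the patent.
CONFIG = {
    "fields": {"age": "int", "country": "str"},
    "rules": [
        {"field": "age", "type": "null_fraction", "max": 0.25},
        {"field": "age", "type": "mean", "min": 18, "max": 65},
        {"field": "country", "type": "distinct_count", "min": 2},
    ],
}


def build_profile(rows, config):
    """Analyze the data set and store per-field metrics in a profile dict."""
    profile = {}
    for field in config["fields"]:
        values = [row.get(field) for row in rows]
        present = [v for v in values if v is not None]
        metrics = {
            "null_fraction": (len(values) - len(present)) / len(values) if values else 0.0,
            "distinct_count": len(set(present)),
        }
        # A mean is only computed when every present value is numeric.
        if present and all(isinstance(v, (int, float)) for v in present):
            metrics["mean"] = mean(present)
        profile[field] = metrics
    return profile


def validate(profile, config):
    """Match each rule's validation type to a profile metric and apply its bounds."""
    results = []
    for rule in config["rules"]:
        value = profile.get(rule["field"], {}).get(rule["type"])
        passed = (
            value is not None
            and rule.get("min", float("-inf")) <= value <= rule.get("max", float("inf"))
        )
        results.append({"rule": rule, "value": value, "passed": passed})
    return results


rows = [
    {"age": 30, "country": "US"},
    {"age": 40, "country": "DE"},
    {"age": None, "country": "US"},
    {"age": 50, "country": "US"},
]
profile = build_profile(rows, CONFIG)
results = validate(profile, CONFIG)
```

In this sketch the rules never touch the raw rows. They are evaluated only against values already stored in the profile, which is the separation the abstract describes.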

    PROACTIVE AUTOMATED DATA VALIDATION
    3.
    Invention application

    Publication number: US20200210401A1

    Publication date: 2020-07-02

    Application number: US16235347

    Application date: 2018-12-28

    Abstract: The disclosed embodiments provide a system for processing data. During operation, the system obtains a validation configuration containing declarative specifications of fields in a data set and validation rules to be applied to the data set, wherein the validation rules include a field in the data set, a type of validation to be applied to the field, and a parameter for managing a validation failure during evaluation of the validation rules with the data set. Next, the system automatically applies the validation rules to the data set within a workflow for generating the data set to produce validation results indicating passing or failing of the validation rules by the data set. The system then outputs the validation results for use in managing the data set.
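As a rough illustration of the rule structure named in this abstract (a field, a type of validation, and a parameter for managing a validation failure), the following Python sketch applies rules inside a toy workflow. The `on_failure` key, its values `abort` and `warn`, and the functions `check`, `generate_data_set`, and `run_workflow` are hypothetical and are not taken from the patent.

```python
class ValidationError(Exception):
    """Raised when a rule marked 'abort' fails (hypothetical failure policy)."""


# Each rule names a field, a validation type, and a failure-handling parameter.
RULES = [
    {"field": "member_id", "type": "not_null", "on_failure": "abort"},
    {"field": "score", "type": "range", "min": 0, "max": 100, "on_failure": "warn"},
]


def check(rule, rows):
    """Return True if every row satisfies the rule."""
    values = [row.get(rule["field"]) for row in rows]
    if rule["type"] == "not_null":
        return all(v is not None for v in values)
    if rule["type"] == "range":
        return all(v is not None and rule["min"] <= v <= rule["max"] for v in values)
    raise ValueError(f"unknown validation type: {rule['type']}")


def generate_data_set():
    """Stand-in for the workflow step that produces the data set."""
    return [{"member_id": 1, "score": 88}, {"member_id": 2, "score": 140}]


def run_workflow(rules):
    """Generate the data set, then validate it before it is handed downstream."""
    rows = generate_data_set()
    results = []
    for rule in rules:
        passed = check(rule, rows)
        results.append({"field": rule["field"], "type": rule["type"], "passed": passed})
        # The failure parameter decides whether the workflow stops or continues.
        if not passed and rule["on_failure"] == "abort":
            raise ValidationError(f"{rule['type']} failed for {rule['field']}")
    return rows, results


rows, results = run_workflow(RULES)
```

Here the out-of-range `score` is recorded as a failure but does not stop the workflow, because its rule is marked `warn`. A failing `abort` rule would raise before the data set is returned.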

    TRANSFORMER FOR ENCODING TEXT FOR USE IN RANKING ONLINE JOB POSTINGS

    Publication number: US20220284028A1

    Publication date: 2022-09-08

    Application number: US17195261

    Application date: 2021-03-08

    Abstract: Described herein is a machine learning model comprising a neural network that is trained to generate a ranking score for an online job posting. The neural network takes as input a variety of input features, including at least a first input feature that is an encoded representation of a search query as generated by a first Transformer encoder, an encoded representation of a job title as generated by a second Transformer encoder, and an encoded representation of a company name as generated by a third Transformer encoder. Once a plurality of online job postings are ranked, some subset of the plurality are presented in a user interface, ordered based on their respective ranking scores.

    AUTOMATIC LABELING OF LARGE DATASETS
    6.
    Invention publication

    Publication number: US20230418841A1

    Publication date: 2023-12-28

    Application number: US17847755

    Application date: 2022-06-23

    Inventor: Sriram Vasudevan

    Abstract: Methods, systems, and computer programs are presented for labeling datasets. An example method can include generating rules for labeling data records within a first dataset. The rules can indicate an extent to which a data record matches query criteria. The method can further include generating an aggregated label for the corresponding data record based on the rules and training a machine learning model using the first dataset and the aggregated label. The method can include receiving an indication of user engagement and combining the indication of user engagement with the aggregated label to generate a score.
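The labeling steps in this abstract can be sketched in Python as follows. This is an illustration under stated assumptions only: the two rules (`title_overlap`, `location_match`), their weights, the weighted-sum aggregation, and the linear blend in `combined_score` are hypothetical choices, and the model-training step is omitted.

```python
def title_overlap(query, record):
    """Rule: fraction of query terms that appear in the record title."""
    q = set(query.lower().split())
    t = set(record["title"].lower().split())
    return len(q & t) / len(q) if q else 0.0


def location_match(query, record):
    """Rule: 1.0 if the record location is mentioned in the query, else 0.0."""
    return 1.0 if record["location"].lower() in query.lower() else 0.0


# Hypothetical rule weights; each rule reports how well a record matches the query.
RULES = [(title_overlap, 0.75), (location_match, 0.25)]


def aggregated_label(query, record):
    """Combine the per-rule match extents into one label (weighted sum, assumed)."""
    return sum(weight * rule(query, record) for rule, weight in RULES)


def combined_score(label, engagement, engagement_weight=0.5):
    """Blend the aggregated label with a user-engagement signal in [0, 1]."""
    return (1 - engagement_weight) * label + engagement_weight * engagement


query = "data engineer berlin"
record = {"title": "Senior Data Engineer", "location": "Berlin"}
label = aggregated_label(query, record)
score = combined_score(label, engagement=1.0)
```

In a fuller pipeline, the aggregated labels would serve as training targets for a machine learning model over the first dataset. The blend with engagement shown here is only one of many possible ways to combine the two signals.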
