Systems and methods for machine learning dataset generation

    Publication number: US11657292B1

    Publication date: 2023-05-23

    Application number: US16743977

    Filing date: 2020-01-15

    CPC classification numbers: G06N3/088, G06N3/0454

    Abstract: Disclosed herein are embodiments of systems, methods, and products comprising an analytic server that automates training dataset generation for different application areas. The server may perform an automated, iterative refinement process to build a collection of dataset generator models over time. The server may receive a set of seed examples in a domain and generate candidate examples based on the features of the seed examples using data synthesis techniques. The server may execute a pre-trained label discriminator (LD) and a pre-trained domain discriminator (D2) on the candidate examples. The LD may identify and reject mislabeled data. The D2 may identify and reject out-of-domain data. The analytic server may regenerate labeled data based on the feedback of the LD and D2. The analytic server may train a dataset generator by iteratively performing these steps until the regenerated candidate examples reach a pass rate threshold.
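The refinement loop described in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the one-dimensional toy examples, the Gaussian-noise synthesis step, the rule-based stand-ins for the LD and D2, and the use of noise reduction as the "feedback" mechanism. The names `synthesize`, `label_discriminator`, `domain_discriminator`, and `refine` are hypothetical.

```python
import random
from typing import List, Tuple

Example = Tuple[float, int]  # (feature, label): a toy stand-in for real data


def synthesize(seeds: List[Example], n: int, rng: random.Random,
               noise: float) -> List[Example]:
    """Assumed synthesis step: perturb seed features with Gaussian noise."""
    out = []
    for _ in range(n):
        x, y = rng.choice(seeds)
        out.append((x + rng.gauss(0.0, noise), y))
    return out


def label_discriminator(ex: Example) -> bool:
    """Toy LD: accept only if the label agrees with the sign of the feature."""
    x, y = ex
    return y == (1 if x >= 0 else 0)


def domain_discriminator(ex: Example) -> bool:
    """Toy D2: accept only features inside an assumed in-domain range."""
    x, _ = ex
    return -2.0 <= x <= 2.0


def refine(seeds: List[Example], pass_rate_threshold: float = 0.9,
           batch: int = 200, max_rounds: int = 20, seed: int = 0):
    """Regenerate candidates until the pass rate reaches the threshold."""
    rng = random.Random(seed)
    noise = 2.0  # deliberately too noisy at first, so early rounds fail
    history: List[float] = []
    accepted: List[Example] = []
    for _ in range(max_rounds):
        candidates = synthesize(seeds, batch, rng, noise)
        accepted = [c for c in candidates
                    if label_discriminator(c) and domain_discriminator(c)]
        rate = len(accepted) / batch
        history.append(rate)
        if rate >= pass_rate_threshold:
            break
        noise *= 0.5  # assumed feedback: rejections tighten the generator
    return accepted, history


if __name__ == "__main__":
    accepted, history = refine([(-1.0, 0), (1.0, 1)])
    print("pass rate per round:", history)
    print("accepted examples in final round:", len(accepted))
```

In this sketch the generator is a single noise parameter; the abstract instead describes training dataset generator models, so the loop structure (generate, filter with LD and D2, regenerate, stop at a pass rate threshold) is the only part meant to carry over.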

Adaptive data processing system and method

    Publication number: US10990611B1

    Publication date: 2021-04-27

    Application number: US15802570

    Filing date: 2017-11-03

    Abstract: A method for adaptively providing processed data to elements of a distributed network includes: a processor partitioning data from a plurality of data sources, including big data from a plurality of big data sources, based on defined needs of the elements; the processor storing the partitioned data in a central data source and a subset of the partitioned data in one or more cache memories in proximity to the elements; receiving a data request from a network element; determining a time-sensitivity of data responsive to the data request; supplying a response to the data request for non-time-sensitive data; and supplying the response to the data request for time-sensitive data.
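The storage and request-handling steps in this abstract can be sketched as follows. The abstract does not state which store answers which kind of request, so the routing below is an assumption: time-sensitive requests are answered from the requesting element's nearby cache when possible, and all other requests are answered from the central data source. The class `AdaptiveStore`, its methods, and the caller-supplied `time_sensitive` flag are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Set, Tuple


@dataclass
class AdaptiveStore:
    """Toy model of a central data source plus per-element caches."""
    central: Dict[str, Any] = field(default_factory=dict)
    caches: Dict[str, Dict[str, Any]] = field(default_factory=dict)

    def partition(self, records: Dict[str, Any],
                  needs: Dict[str, Set[str]]) -> None:
        """Store all records centrally and cache each element's needed subset."""
        self.central.update(records)
        for element, keys in needs.items():
            self.caches[element] = {k: records[k] for k in keys if k in records}

    def respond(self, element: str, key: str,
                time_sensitive: bool) -> Tuple[Any, str]:
        """Return (value, source). Routing by time-sensitivity is an assumption."""
        if time_sensitive:
            cache = self.caches.get(element, {})
            if key in cache:
                return cache[key], "cache"
        # Non-time-sensitive data, or a cache miss, falls back to the central store.
        return self.central[key], "central"


if __name__ == "__main__":
    store = AdaptiveStore()
    store.partition({"a": 1, "b": 2}, {"e1": {"a"}})
    print(store.respond("e1", "a", time_sensitive=True))
    print(store.respond("e1", "a", time_sensitive=False))
```

How time-sensitivity is actually determined is left open by the abstract; here it is simply passed in by the caller.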
