Test suite for different kinds of biases in data

    Publication No.: US11610079B2

    Publication date: 2023-03-21

    Application No.: US16777912

    Filing date: 2020-01-31

    Inventor: Michael Yang

    IPC classification: G06K9/62, G06N20/00

    Abstract: There is provided a computer-implemented method for detecting and reducing or removing bias for generating a machine learning model, comprising: prior to generating the machine learning model: receiving a training dataset comprising target inputs, each comprising parameters and labelled with a corresponding target output, wherein at least one of the parameters of at least one of the target inputs comprises a sensitive parameter indicating that the corresponding target input is assigned to a sensitive group that is potentially biased against other target inputs that are excluded from the sensitive group; analyzing the training dataset to identify target inputs affected by label bias when a statistically significant difference is detected between target inputs assigned to the sensitive group and target inputs excluded from the sensitive group; correcting labels of the target inputs affected by label bias; and generating the machine learning model using the corrected labels.
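
    The detection step can be illustrated with a minimal sketch, assuming the statistical test is a two-proportion z-test on the rate of positive labels. The abstract does not name a test, so the test itself, the field names `sensitive` and `label`, the function names, and the 0.05 significance level are all hypothetical. Label correction is not sketched because the abstract does not say how labels are corrected.

```python
import math


def two_proportion_z_test(pos_a, n_a, pos_b, n_b):
    """Return (z, two-sided p-value) for H0: both groups share one positive-label rate."""
    pooled = (pos_a + pos_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        # All labels identical in both groups: no detectable difference.
        return 0.0, 1.0
    z = (pos_a / n_a - pos_b / n_b) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def detect_label_bias(rows, alpha=0.05):
    """Flag label bias when positive-label rates differ significantly between groups.

    `rows` is a list of dicts with a boolean `sensitive` key and a 0/1 `label` key
    (hypothetical schema, chosen only for this illustration).
    """
    sensitive = [r for r in rows if r["sensitive"]]
    others = [r for r in rows if not r["sensitive"]]
    if not sensitive or not others:
        return False, 1.0
    _, p_value = two_proportion_z_test(
        sum(r["label"] for r in sensitive), len(sensitive),
        sum(r["label"] for r in others), len(others),
    )
    return p_value < alpha, p_value
```

    Calling `detect_label_bias(rows)` returns a flag and the p-value; a dataset in which both groups have the same positive-label rate yields a p-value of 1.0 and no flag.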

    Systems and methods of determining target database for replication of tenant data

    Publication No.: US11609928B2

    Publication date: 2023-03-21

    Application No.: US16548937

    Filing date: 2019-08-23

    Inventor: Swaroop Jayanthi

    Abstract: Systems and methods are provided for retrieving a source database replication configuration profile that is used to determine one or more databases of a plurality of target databases to store selected tenant data of a source database to be replicated, retrieving from each of the plurality of target databases a target database replication configuration profile and transforming the profiles to persist in a management platform database, comparing the retrieved source database replication configuration profile and the target database replication configuration profiles to determine which target databases are usable to replicate the selected tenant data to, classifying the target database replication configuration profiles based on results of the comparison, and generating a list of one or more target databases of the plurality of target databases for the selected tenant data of the source database to be replicated to based on the classification of the target database replication configuration profiles.
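
    The compare, classify, and list steps might look like the following minimal sketch. The profile fields (`engine`, `tenant_data_gb`, `allowed_regions`, `free_gb`, `region`), the two class names, and the "usable"/"unusable" classification are assumptions made for illustration; the abstract does not say what a replication configuration profile contains or how profiles are classified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceProfile:
    # Hypothetical fields describing what the selected tenant data needs.
    engine: str
    tenant_data_gb: int
    allowed_regions: frozenset


@dataclass(frozen=True)
class TargetProfile:
    # Hypothetical fields describing what a target database offers.
    name: str
    engine: str
    free_gb: int
    region: str


def classify_targets(source, targets):
    """Compare each target profile with the source profile and classify it."""
    classes = {"usable": [], "unusable": []}
    for target in targets:
        usable = (
            target.engine == source.engine
            and target.free_gb >= source.tenant_data_gb
            and target.region in source.allowed_regions
        )
        classes["usable" if usable else "unusable"].append(target.name)
    return classes


def replication_candidates(source, targets):
    """Return the list of target databases the tenant data could be replicated to."""
    return classify_targets(source, targets)["usable"]
```

    Keeping the classification separate from the final list mirrors the order of the steps in the abstract and leaves room for more than two classes.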

    TECHNIQUES FOR DATA RETENTION
    Patent application

    Publication No.: US20230084317A1

    Publication date: 2023-03-16

    Application No.: US18049117

    Filing date: 2022-10-24

    Abstract: Systems and techniques for managing data in a relational database environment and a non-relational database environment. Data in the relational database environment that is static and to be maintained beyond a preselected threshold length of time is identified. The data is copied from the relational database and stored in the non-relational database. Access to the data is provided from the non-relational database via a user interface that accesses both the relational database and the non-relational database.
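
    A minimal sketch of the identify, copy, and access steps follows, with an in-memory SQLite database standing in for the relational environment and a plain dictionary standing in for the non-relational store. The `records` table, its `is_static` and `retain_days` columns, and both function names are hypothetical; the abstract does not describe a schema or how static data is recognized.

```python
import json
import sqlite3


def archive_long_lived_static_rows(conn, doc_store, threshold_days):
    """Copy static rows whose retention period exceeds the threshold into doc_store."""
    rows = conn.execute(
        "SELECT id, payload, retain_days FROM records "
        "WHERE is_static = 1 AND retain_days > ? ORDER BY id",
        (threshold_days,),
    ).fetchall()
    for record_id, payload, retain_days in rows:
        # Store each row as a JSON document keyed by its relational id.
        doc_store[record_id] = json.dumps(
            {"payload": payload, "retain_days": retain_days}
        )
    return [record_id for record_id, _, _ in rows]


def read_record(conn, doc_store, record_id):
    """Single access path that consults the relational store, then the document store."""
    row = conn.execute(
        "SELECT payload FROM records WHERE id = ?", (record_id,)
    ).fetchone()
    if row is not None:
        return row[0]
    doc = doc_store.get(record_id)
    return json.loads(doc)["payload"] if doc is not None else None


def make_demo_database():
    """Build a small in-memory table used only to exercise the sketch."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT, "
        "is_static INTEGER, retain_days INTEGER)"
    )
    conn.executemany(
        "INSERT INTO records VALUES (?, ?, ?, ?)",
        [
            (1, "closed case", 1, 3650),
            (2, "open case", 0, 3650),
            (3, "short-lived log", 1, 30),
        ],
    )
    return conn
```

    With a 365-day threshold only record 1 is copied; `read_record` keeps working for that record even if it is later removed from the relational table.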

    Sequence-to-sequence prediction using a neural network model

    Publication No.: US11604956B2

    Publication date: 2023-03-14

    Application No.: US15885576

    Filing date: 2018-01-31

    Abstract: A method for sequence-to-sequence prediction using a neural network model includes generating an encoded representation based on an input sequence using an encoder of the neural network model, predicting a fertility sequence based on the input sequence, generating an output template based on the input sequence and the fertility sequence, and predicting an output sequence based on the encoded representation and the output template using a decoder of the neural network model. The neural network model includes a plurality of model parameters learned according to a machine learning process. Each item of the fertility sequence includes a fertility count associated with a corresponding item of the input sequence.
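
    One plausible reading of the template step, assumed here rather than stated in the abstract, is that each input item is copied into the template as many times as its fertility count. The sketch below shows only that step; the encoder, the fertility predictor, and the decoder are out of scope, and the function name is hypothetical.

```python
def build_output_template(input_seq, fertilities):
    """Repeat each input item according to its fertility count.

    Assumption: the template is formed by copying; the abstract only says the
    template is based on the input sequence and the fertility sequence.
    """
    if len(input_seq) != len(fertilities):
        raise ValueError("one fertility count is required per input item")
    template = []
    for item, count in zip(input_seq, fertilities):
        if count < 0:
            raise ValueError("fertility counts must be non-negative")
        # A count of 0 drops the item; a count of 2 reserves two output slots.
        template.extend([item] * count)
    return template
```

    Under this reading the template length equals the sum of the fertility counts, so the output length is fixed before decoding starts.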

    Interest groups based on network feed items

    Publication No.: US11604814B2

    Publication date: 2023-03-14

    Application No.: US17249162

    Filing date: 2021-02-22

    Inventor: Ashok Gadamsetty

    Abstract: Disclosed are some examples of systems, apparatus, methods and storage media for creating groups in a social networking database system, and more specifically, to creating groups based on network feed items. In some implementations, a database system is capable of maintaining a database including data associated with a plurality of users and groups to which the users can be subscribed. The system is configurable to provide a feed for display to a first user, and to receive input entered in a publication field by the first user. The system is configurable to create a feed item for display to the first user and to at least one second user based on the received input. The system is configurable to receive second input associated with the feed item from the second user. The system is additionally configurable to provide a selectable user interface (UI) element for display to the first user. Responsive to the selection of the UI element, the system is further configurable to create a new group based on the feed item, and to subscribe the first and the second user to the new group without additional input.
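
    The final step, creating a group from a feed item and subscribing its participants, could be modeled roughly as below. The `FeedItem` and `Group` types, their fields, and the choice to name the group after the start of the feed item's text are all hypothetical; the abstract does not describe a data model.

```python
from dataclasses import dataclass, field


@dataclass
class FeedItem:
    author: str                 # the first user, who published the item
    text: str
    responders: list = field(default_factory=list)  # users who added input


@dataclass
class Group:
    name: str
    members: set


def create_group_from_feed_item(item, groups):
    """Create a group from a feed item and subscribe everyone involved in it."""
    members = {item.author, *item.responders}
    # Naming the group after the item's text is an assumption for illustration.
    group = Group(name=item.text[:40], members=members)
    groups.append(group)
    return group
```

    Because membership is derived from the feed item itself, no further input from either user is needed once the UI element is selected.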

    WORKFLOWS FOR AUTOMATED OPERATIONS MANAGEMENT

    Publication No.: US20230073909A1

    Publication date: 2023-03-09

    Application No.: US18055489

    Filing date: 2022-11-15

    Inventor: Mark F. Wilding

    Abstract: Techniques are disclosed relating to automated operations management. In various embodiments, a computer system accesses operational information that defines commands for an operational scenario and accesses blueprints that describe operational entities in a target computer environment related to the operational scenario. The computer system implements the operational scenario for the target computer environment. The implementing may include executing a hierarchy of controller modules that include an orchestrator controller module at the top level of the hierarchy that is executable to carry out the commands by issuing instructions to controller modules at a next level. The controller modules may be executable to manage the operational entities according to the blueprints to complete the operational scenario. In various embodiments, the computer system includes additional features such as an application programming interface (API), a remote routing engine, a workflow engine, a reasoning engine, a security engine, and a testing engine.
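
    The controller hierarchy can be pictured with a small sketch in which the orchestrator passes each command down to the next level and leaf controllers act on the entities they manage. The `Controller` class, the `run_scenario` function, and the shape of the blueprints (a dictionary keyed by entity name) are assumptions for illustration only.

```python
class Controller:
    """A node in the hierarchy; the top-level node plays the orchestrator role."""

    def __init__(self, name, entities=(), children=()):
        self.name = name
        self.entities = list(entities)   # operational entities managed directly
        self.children = list(children)   # controller modules at the next level

    def execute(self, instruction, blueprints, log):
        # Manage this controller's own entities according to their blueprints.
        for entity in self.entities:
            log.append((self.name, instruction, entity, blueprints[entity]))
        # Issue the same instruction to every controller one level down.
        for child in self.children:
            child.execute(instruction, blueprints, log)


def run_scenario(orchestrator, commands, blueprints):
    """Carry out the scenario's commands in order and return an action log."""
    log = []
    for command in commands:
        orchestrator.execute(command, blueprints, log)
    return log
```

    The log stands in for real side effects, so the order in which instructions reach each entity can be inspected directly.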

    SYSTEMS AND METHODS FOR SEQUENTIAL RECOMMENDATION

    Publication No.: US20230073754A1

    Publication date: 2023-03-09

    Application No.: US17586451

    Filing date: 2022-01-27

    IPC classification: G06K9/62

    Abstract: Embodiments described herein provide an intent prototypical contrastive learning framework that leverages intent similarities between users with different behavior sequences. Specifically, user behavior sequences are encoded into a plurality of user interest representations. The user interest representations are clustered into a plurality of clusters based on mutual distances among the user interest representations in a representation space. Intention prototypes are determined based on centroids of the clusters. A set of augmented views for user behavior sequences are created and encoded into a set of view representations. A contrastive loss is determined based on the set of augmented views and the plurality of intention prototypes. Model parameters are updated based at least in part on the contrastive loss.
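
    The clustering and loss steps can be sketched in plain Python under two assumptions that the abstract does not confirm: that the clustering is k-means, and that the contrastive loss is a softmax cross-entropy over dot-product similarities between a view representation and every prototype. The function names, the temperature value, and the deterministic initialization are likewise hypothetical, and the encoder and augmentation steps are omitted.

```python
import math


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=10):
    """Lloyd's algorithm with the first k points as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        assignment = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = mean(members)
    return centroids, assignment


def prototype_contrastive_loss(view, prototypes, positive, temperature=0.1):
    """Negative log-probability of the positive prototype given the view."""
    logits = [
        sum(v * c for v, c in zip(view, proto)) / temperature
        for proto in prototypes
    ]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    top = max(logits)
    log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
    return log_denominator - logits[positive]
```

    The centroids returned by `kmeans` serve as the intention prototypes; the loss is small when a view lies close to its own prototype and large when it lies closer to another one.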

    SYSTEMS AND METHODS FOR EXPLAINABLE AND FACTUAL MULTI-DOCUMENT SUMMARIZATION

    Publication No.: US20230070497A1

    Publication date: 2023-03-09

    Application No.: US17589675

    Filing date: 2022-01-31

    Abstract: Embodiments described herein provide methods and systems for summarizing multiple documents. A system receives a plurality of documents and generates embeddings of the sentences from the plurality of documents. The embedded sentences are clustered in a representation space. Sentences from a reference summary are embedded and aligned with the closest cluster. Sentences from each cluster are summarized with the aligned reference sentences as a target. A loss is computed based on the summarized sentences and the aligned references, and the natural language processing model is updated based on the loss. Sentences may be masked from being used in the summarization by identifying sentences that are contradicted by other sentences within the plurality of documents.
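
    The alignment step, matching each reference-summary sentence to its closest cluster, can be sketched as follows. The sketch assumes that "closest" means highest cosine similarity to the cluster centroid and that embeddings and centroids are already available as plain lists of floats; neither detail is given in the abstract, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def align_references(reference_embeddings, centroids):
    """Return, for each reference sentence, the index of its closest cluster."""
    return [
        max(range(len(centroids)),
            key=lambda j: cosine_similarity(ref, centroids[j]))
        for ref in reference_embeddings
    ]


def targets_per_cluster(reference_sentences, alignment, cluster_count):
    """Group reference sentences by cluster so each cluster has its targets."""
    targets = [[] for _ in range(cluster_count)]
    for sentence, cluster in zip(reference_sentences, alignment):
        targets[cluster].append(sentence)
    return targets
```

    The grouped targets pair each cluster of source sentences with the reference sentences it is expected to produce, which is what the loss in the abstract compares against.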