METHOD AND APPARATUS FOR ANALYZING COVERAGE, BIAS, AND MODEL EXPLANATIONS IN LARGE DIMENSIONAL MODELING DATA

    Publication No.: US20220358111A1

    Publication date: 2022-11-10

    Application No.: US17741324

    Filing date: 2022-05-10

    IPC classification: G06F16/22 G06F16/28

    Abstract: A system and method for analyzing coverage, bias, and model explanations in large dimensional modeling data includes discretizing three or more variables of a dataset to generate a discretized phase space represented as a grid of a plurality of cells, the dataset comprising a plurality of records, each record of the plurality of records having a value and a unique identifier (ID). A grid transformation is applied to each record in the dataset to assign each record to a cell of the plurality of cells of the grid. A grid index is generated to reference each cell using a discretized feature vector. A grid storage for storing the records assigned to each cell of the grid is then created. The grid storage uses the ID of each record as a reference to each record and the discretized feature vector as a key to each cell.
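
    The grid construction described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the equal-width binning, the function names `discretize` and `build_grid_storage`, the sample records, and the choice of four bins per variable. Each record is keyed by its ID, and each cell is keyed by the tuple of bin indexes (the discretized feature vector).

```python
from collections import defaultdict


def discretize(value, lo, hi, bins):
    """Map a value in [lo, hi] to an equal-width bin index in 0..bins-1."""
    if hi <= lo:
        raise ValueError("hi must exceed lo")
    idx = int((value - lo) / (hi - lo) * bins)
    return min(max(idx, 0), bins - 1)  # clamp out-of-range values to the edge bins


def build_grid_storage(records, ranges, bins):
    """Assign each record ID to the grid cell named by its discretized feature vector.

    records: {record_id: (x1, x2, x3, ...)}  (hypothetical layout)
    ranges:  [(lo, hi), ...], one pair per variable
    """
    storage = defaultdict(list)
    for record_id, features in records.items():
        key = tuple(discretize(v, lo, hi, bins) for v, (lo, hi) in zip(features, ranges))
        storage[key].append(record_id)  # store the ID as a reference, not the record
    return dict(storage)


# Hypothetical three-variable dataset.
records = {
    "r1": (0.05, 0.10, 0.90),
    "r2": (0.07, 0.12, 0.95),
    "r3": (0.80, 0.50, 0.20),
}
ranges = [(0.0, 1.0)] * 3
grid = build_grid_storage(records, ranges, bins=4)

# Fraction of the 4**3 cells that hold at least one record.
coverage = len(grid) / 4 ** 3
```

    Storing only IDs in each cell keeps the grid small, and the share of occupied cells gives a simple coverage measure over the discretized phase space.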

    DATA DISTILLERY FOR SIGNAL DETECTION

    Publication No.: US20220245556A1

    Publication date: 2022-08-04

    Application No.: US17659643

    Filing date: 2022-04-18

    Abstract: Computer-implemented methods, systems and products for analytics and discovery of patterns or signals. The method includes a set of operations or steps, including collecting data from a plurality of data sources, the data having a plurality of associated data types, and filtering the collected data based on identifying viable data sources from which the data is collected. The method further includes prioritizing discovery objectives based on analyzing the filtering results, and enriching the filtered collected data from viable data sources according to the prioritized discovery objectives. The method further includes extracting one or more signals from the enriched data using one or more machine learning mechanisms in combination with qualified subject matter expertise input, and graphically displaying the extracted signals to a human operator in a way that enables the operator to understand the importance of the extracted signals.

    Visualization for payment card transaction fraud analysis

    Publication No.: US11380171B2

    Publication date: 2022-07-05

    Application No.: US16357311

    Filing date: 2019-03-18

    Abstract: A computer-implemented method and system for visualizing card transaction fraud analysis is presented. Transaction data and account data related to one or more payment card accounts are stored in a database. The transaction data includes a fraud score. A computer processor generates one or more of a plurality of visualizations of activity of at least one suspicious account from the one or more payment card accounts for display in a graphical user interface. Each of the plurality of visualizations provides at least a graphical representation of the transaction data and is selectable from a menu provided by the computer processor in the graphical user interface. The visualizations assist in case judgment of the one or more payment cards.

    LATENT FEATURE DIMENSIONALITY BOUNDS FOR ROBUST MACHINE LEARNING ON HIGH DIMENSIONAL DATASETS

    Publication No.: US20210406724A1

    Publication date: 2021-12-30

    Application No.: US16917603

    Filing date: 2020-06-30

    IPC classification: G06N5/04 G06N20/00

    Abstract: Computer-implemented methods and systems for quantifying appropriate machine learning model complexity corresponding to a training dataset are provided. The method comprises monitoring, using one or more processors, N observed variables, v_1 through v_N, of a training dataset for a machine learning model; translating the N observed variables into m equisized bin indexes which generate m^N possible equisized hypercells to estimate a fundamental dimensionality for the dataset; generating one or more samples by assigning a record in the dataset with numbers j through k as set id; generating a merged sample S_i, for one or more values of the set id i, where i goes from j to k; and computing a fractal dimension of the equisized hypercube phase space based on a count of cells with data coverage of at least one data point.
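
    The idea of estimating dimensionality from the count of occupied hypercells can be sketched with a standard box-counting estimate. This is an assumption made for illustration: the abstract does not give the formula, and the sampling into merged sets S_i is omitted here. The function names, the unit-cube value range, and the test data are all hypothetical. With m bins per axis, the estimate is log(occupied cells) / log(m).

```python
import math


def occupied_cells(points, m):
    """Count hypercells (m equal bins per axis, values assumed in [0, 1)) with >= 1 point."""
    cells = {tuple(min(int(x * m), m - 1) for x in p) for p in points}
    return len(cells)


def box_counting_dimension(points, m):
    """Single-scale box-counting estimate: log(occupied cells) / log(m)."""
    return math.log(occupied_cells(points, m)) / math.log(m)


m = 8

# N = 3 observed variables that all move together: the data lie on a line.
line = [(t / 1000, t / 1000, t / 1000) for t in range(1000)]
line_dim = box_counting_dimension(line, m)

# N = 3 independent variables that fill the whole cube.
cube = [
    ((i + 0.5) / m, (j + 0.5) / m, (k + 0.5) / m)
    for i in range(m)
    for j in range(m)
    for k in range(m)
]
cube_dim = box_counting_dimension(cube, m)
```

    In this sketch the line occupies only 8 of the 8^3 = 512 cells, giving an estimate of about 1, while the filled cube occupies all 512 cells, giving an estimate of about 3. An estimate well below N suggests the data have fewer effective degrees of freedom than observed variables.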

    Distributed data processing framework

    Publication No.: US11210271B1

    Publication date: 2021-12-28

    Application No.: US16998909

    Filing date: 2020-08-20

    Abstract: In one aspect, there is provided a system. The system may store instructions that result in operations when executed by at least one data processor. The operations may include receiving, collating, and reading raw transactional data from a plurality of data sources. The operations may further include randomly sampling the raw transactional data. The operations may further include transforming the raw transactional data into at least one resilient distributed dataset. The operations may further include mapping the at least one resilient distributed dataset with a corresponding unique key. The operations may further include aggregating the at least one resilient distributed dataset on a key field. The operations may further include iterating over a lookup table. The operations may further include aggregating the data lines corresponding to the unique key associated with the at least one resilient distributed dataset. The operations may further include appending in-memory data lines serially to form a consumer level data string.
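
    The sample, key, aggregate, and append sequence in this abstract can be illustrated on a single machine. The sketch below is not the patented framework: a plain Python dictionary stands in for the resilient distributed dataset, the lookup-table step is omitted, and the line format (`key,payload`), the `|` separator, and the function name `build_consumer_strings` are assumptions made for illustration.

```python
import random
from collections import defaultdict


def build_consumer_strings(raw_lines, sample_rate=1.0, seed=0, sep="|"):
    """Sample raw lines, key them by consumer, and append payloads serially per key.

    Each raw line is assumed to look like "<consumer_key>,<payload>".
    """
    rng = random.Random(seed)  # seeded so the random sample is reproducible
    grouped = defaultdict(list)
    for line in raw_lines:
        if rng.random() >= sample_rate:
            continue  # line falls outside the random sample
        key, _, payload = line.partition(",")
        grouped[key].append(payload)  # aggregate on the key field
    # Append the in-memory data lines serially into one string per consumer.
    return {key: sep.join(payloads) for key, payloads in grouped.items()}


raw = [
    "c1,2020-01-01:12.50",
    "c2,2020-01-01:3.00",
    "c1,2020-01-02:7.25",
]
consumer_strings = build_consumer_strings(raw)
```

    With the default `sample_rate=1.0` every line is kept, so `consumer_strings["c1"]` holds both of that consumer's transactions in input order.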

    TRAINING ARTIFICIAL NEURAL NETWORKS WITH CONSTRAINTS

    Publication No.: US20210295175A1

    Publication date: 2021-09-23

    Application No.: US16823193

    Filing date: 2020-03-18

    Abstract: Systems and methods for training a machine learning model implemented over a network configured to represent the machine learning model are provided. One or more directed edges connect the one or more nodes, an edge representing a connection between a first node and a second node, the second node computing an activation depending on the values of activations on first nodes and values associated with the connections, the connection being either conforming or non-conforming. The machine learning model may be trained by iteratively adjusting parameters w and b, respectively associated with weights and biases associated with edges connecting computational nodes. Connections between nodes may be sparsified by adjusting the parameter w to a first value for non-conforming connections during the training phase to reduce complexity of the connections among the plurality of nodes, or to ensure the input-output function of the network adheres to additional constraints.
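
    The sparsification step, in which w is set to a first value for non-conforming connections during training, can be sketched on a single linear node. This is a simplified illustration, not the patented training procedure: the use of plain stochastic gradient descent, the choice of 0.0 as the first value, the `conforming` mask, and the toy data are all assumptions.

```python
def train_with_constraints(samples, conforming, lr=0.1, epochs=500, first_value=0.0):
    """Train one linear node, pinning non-conforming weights to first_value.

    samples:    list of (inputs, target) pairs
    conforming: one bool per input connection (hypothetical constraint mask)
    """
    w = [0.0] * len(conforming)
    b = 0.0
    for _ in range(epochs):
        for x, y in samples:
            pred = sum(wi * xi for wi, xi in zip(w, x)) + b
            err = pred - y
            # Gradient step on squared error for weights and bias.
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
            # Constraint step: reset every non-conforming connection.
            w = [wi if ok else first_value for wi, ok in zip(w, conforming)]
    return w, b


# Toy data: the target depends only on the first input (y = 2 * x0 + 1).
samples = [
    ((0.0, 1.0), 1.0),
    ((1.0, 0.0), 3.0),
    ((0.5, 0.5), 2.0),
    ((1.0, 1.0), 3.0),
]
w, b = train_with_constraints(samples, conforming=[True, False])
```

    Because the constraint is applied after every update, the second connection stays at exactly 0.0 throughout training, and the node fits the data using only the conforming connection and the bias.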

    CONFIGURATION PACKAGES FOR SOFTWARE PRODUCTS

    Publication No.: US20210279053A1

    Publication date: 2021-09-09

    Application No.: US17206637

    Filing date: 2021-03-19

    IPC classification: G06F8/76 G06F8/60

    Abstract: A configuration package receives user-generated input that configures a decision service to generate decision data. The configuration package includes artifacts and the user-generated input selects the artifacts from an artifact library in a configuration database. A configured decision service is generated, where the generating includes receiving, by a decision service factory, the configuration package. Also, the decision service factory receives a decision template including configurable decision elements and non-configurable decision elements. Further, the decision service factory receives a user configuration specifying a parameter in the corresponding artifact. The artifact from the configuration package, the user configuration and the decision template are combined to generate the configured decision service. The configured decision service receives, from a client computer, input for each of the configurable decision elements. Based on the received input, the decision data is generated by the configured decision service. The generated decision data is transmitted to the client computer.

    Fraud score manipulation in self-defense of adversarial artificial intelligence learning

    Publication No.: US11100506B2

    Publication date: 2021-08-24

    Application No.: US15590921

    Filing date: 2017-05-09

    Abstract: A system and method for programmatically revealing misleading confidence values in Fraud Score is presented to protect artificial intelligence models from adversarial neural networks. The method is used to reduce the effectiveness of an adversarial learning neural network model. With the score manipulation implemented, the adversary models are shown to systematically become less successful in predicting the true behavior of the Fraud detection artificial intelligence model and what it will flag as fraudulent transactions, thus reducing the true fraud dollars penetrated or taken by adversaries.

    Data-driven product grouping
    Type: Granted invention patent

    Publication No.: US11087339B2

    Publication date: 2021-08-10

    Application No.: US15727949

    Filing date: 2017-10-09

    IPC classification: G06Q30/02 G06N20/00

    Abstract: Data for a plurality of entities that can be offered a plurality of products can be obtained. The data can include categorical data and numeric data. Based on business constraints, some or all of the data can be selected. The selected data can be converted to another set of numeric data, wherein the categorical values are converted to numeric values. Dimensions of the converted data can be reduced to generate another set of data. Based on this reduced set of data, clusters of entities can be formed. The products can be grouped by assigning a unique product identifier of each product to a corresponding cluster. This grouping of products can be used by a predictive model to predict a likelihood of an entity to purchase a particular product in a future time period. Related methods, apparatus, systems, techniques and articles are also described.

    FACIAL RECOGNITION FOR USER AUTHENTICATION

    Publication No.: US20210240808A1

    Publication date: 2021-08-05

    Application No.: US16781846

    Filing date: 2020-02-04

    Abstract: Systems and methods for utilizing an image capture device to scan facial features of a user, responsive to recognition of a plurality of beam projection points on the face of the user. The first data captured from scanning the facial features may be authenticated against a facial depth map stored as a data structure in a data storage medium. In response to successful authentication, the facial features of the user may be continually scanned to detect facial movements indicative of the user's liveness. Access may be granted to the user in response to verifying the user's liveness.