HIDDEN MARKOV MODEL BASED DATA RANKING FOR ENHANCEMENT OF CLASSIFIER PERFORMANCE TO CLASSIFY IMBALANCED DATASET

    Publication number: US20230128462A1

    Publication date: 2023-04-27

    Application number: US17510458

    Application date: 2021-10-26

    Abstract: A hybrid Hidden Markov Model (HMM) and Machine Learning (ML) system and apparatus for classifying data instances with an imbalanced class distribution, including an HMM that generates a log-likelihood score for each data instance. Implementations of the hybrid system and method detect fraudulent activity and classify documents with accuracy that surpasses conventional classifiers. In one implementation, the HMM generates a log-likelihood score based on an attribute value vector for a set of keyword features characterizing a Web page. In another implementation, the HMM generates a log-likelihood score based on an attribute value vector for page layout characterizing a document image. The resulting attribute value vectors are ranked by log-likelihood score and divided into bins, each covering an equal range of scores. Various machine learning models are then trained on the balanced set of vectors accumulated from all the bins.
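
The pipeline summarized in the abstract (HMM scoring, ranking, equal-range binning, balanced accumulation) can be illustrated with a minimal sketch. This is not the patented implementation: the discrete-HMM parameters, the function names, and in particular the balancing rule (undersampling every class in a bin to the size of that bin's smallest class, and skipping bins that lack a class) are assumptions made for illustration, since the abstract does not say how the balanced vectors are drawn.

```python
import math
import random
from collections import defaultdict


def hmm_log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: returns log P(obs | HMM) for a discrete HMM."""
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transition matrix, then emit obs[t].
            alpha = [sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                     for j in range(n)]
        scale = sum(alpha)  # P(obs[t] | obs[:t]); assumes nonzero probabilities
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


def equal_range_bins(scores, n_bins):
    """Map each score to one of n_bins intervals of equal width."""
    lo, hi = min(scores), max(scores)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all scores are equal
    return [min(int((s - lo) / width), n_bins - 1) for s in scores]


def balanced_from_bins(vectors, labels, scores, n_bins=4, seed=0):
    """Rank by score, bin by equal score ranges, then undersample per bin."""
    rng = random.Random(seed)
    classes = sorted(set(labels))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    bin_of = equal_range_bins(scores, n_bins)
    bins = defaultdict(lambda: defaultdict(list))
    for i in ranked:
        bins[bin_of[i]][labels[i]].append(i)
    picked = []
    for b in sorted(bins):
        # ASSUMPTION: keep as many instances of each class as the rarest class
        # in this bin has; a bin missing any class contributes nothing.
        k = min(len(bins[b][c]) for c in classes)
        for c in classes:
            picked.extend(rng.sample(bins[b][c], k))
    return [vectors[i] for i in picked], [labels[i] for i in picked]
```

In use, each attribute value vector would first be scored with `hmm_log_likelihood`, and the vectors and labels returned by `balanced_from_bins` would then be passed to whichever classifier is being trained.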

    SYSTEM FOR DETECTING WEB PAGE FRAUD BASED ON WORDLIST CATEGORIZATION

    Publication number: US20240364719A1

    Publication date: 2024-10-31

    Application number: US18768090

    Application date: 2024-07-10

    CPC classification number: H04L63/1416 G06N20/00

    Abstract: A hybrid Hidden Markov Model (HMM) and Machine Learning (ML) system and apparatus for classifying data instances with an imbalanced class distribution, including an HMM that generates a log-likelihood score for each data instance. Implementations of the hybrid system and method detect fraudulent activity and classify documents with accuracy that surpasses conventional classifiers. In one implementation, the HMM generates a log-likelihood score based on an attribute value vector for a set of keyword features characterizing a Web page. In another implementation, the HMM generates a log-likelihood score based on an attribute value vector for page layout characterizing a document image. The resulting attribute value vectors are ranked by log-likelihood score and divided into bins, each covering an equal range of scores. Various machine learning models are then trained on the balanced set of vectors accumulated from all the bins.

    DOCUMENT IMAGE CLASSIFYING SYSTEM
    Published invention application

    Publication number: US20240364718A1

    Publication date: 2024-10-31

    Application number: US18768082

    Application date: 2024-07-10

    CPC classification number: H04L63/1416 G06N20/00

    Abstract: A hybrid Hidden Markov Model (HMM) and Machine Learning (ML) system and apparatus for classifying data instances with an imbalanced class distribution, including an HMM that generates a log-likelihood score for each data instance. Implementations of the hybrid system and method detect fraudulent activity and classify documents with accuracy that surpasses conventional classifiers. In one implementation, the HMM generates a log-likelihood score based on an attribute value vector for a set of keyword features characterizing a Web page. In another implementation, the HMM generates a log-likelihood score based on an attribute value vector for page layout characterizing a document image. The resulting attribute value vectors are ranked by log-likelihood score and divided into bins, each covering an equal range of scores. Various machine learning models are then trained on the balanced set of vectors accumulated from all the bins.

    METHOD FOR WEB PAGE FRAUD ACTIVITY
    Published invention application

    Publication number: US20240364717A1

    Publication date: 2024-10-31

    Application number: US18768079

    Application date: 2024-07-10

    CPC classification number: H04L63/1416 G06N20/00

    Abstract: A hybrid Hidden Markov Model (HMM) and Machine Learning (ML) system and apparatus for classifying data instances with an imbalanced class distribution, including an HMM that generates a log-likelihood score for each data instance. Implementations of the hybrid system and method detect fraudulent activity and classify documents with accuracy that surpasses conventional classifiers. In one implementation, the HMM generates a log-likelihood score based on an attribute value vector for a set of keyword features characterizing a Web page. In another implementation, the HMM generates a log-likelihood score based on an attribute value vector for page layout characterizing a document image. The resulting attribute value vectors are ranked by log-likelihood score and divided into bins, each covering an equal range of scores. Various machine learning models are then trained on the balanced set of vectors accumulated from all the bins.

    GEOLOGICAL FORMATION PERMEABILITY PREDICTION SYSTEM

    Publication number: US20230186126A1

    Publication date: 2023-06-15

    Application number: US16791571

    Application date: 2020-02-14

    CPC classification number: G06N7/01 G06N3/086 G01N15/08 E21B49/00 G01N33/246

    Abstract: Systems, methods, and apparatuses are provided for permeability prediction. In training mode, the method acquires data associated with one or more geological formations, calculates log-likelihood values using processing circuitry and a trained Hidden Markov model to group the data into a plurality of clusters, and trains an artificial neural network for each of the clusters. In forecasting mode, the method acquires one or more formation properties corresponding to a geological formation, determines, using the trained Hidden Markov model, a log-likelihood score associated with those formation properties, identifies the cluster associated with the formation properties as a function of the log-likelihood score, and predicts a permeability based at least in part on the formation properties and the trained artificial neural network associated with the identified cluster.
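
The two modes described in this abstract, training one model per log-likelihood cluster and then routing a new sample to its cluster's model, can be sketched as follows. Several things here are assumptions rather than details from the patent: the log-likelihood scores are taken as already computed by a trained HMM, clusters are defined as equal-width score intervals, a single hypothetical formation property is used as input, and a one-variable least-squares line stands in for the per-cluster artificial neural network to keep the example dependency-free.

```python
from bisect import bisect_right


def make_edges(scores, n_clusters):
    """Interior boundaries that split the score range into equal-width clusters."""
    lo, hi = min(scores), max(scores)
    step = (hi - lo) / n_clusters
    return [lo + step * k for k in range(1, n_clusters)]


def cluster_of(score, edges):
    """Cluster index for a score; out-of-range scores fall into the end clusters."""
    return bisect_right(edges, score)


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b (stand-in for the per-cluster ANN)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    var = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var if var else 0.0
    return a, my - a * mx


def train(scores, formation_prop, permeability, n_clusters=2):
    """Training mode: group samples by score cluster and fit one model per cluster."""
    edges = make_edges(scores, n_clusters)
    groups = {}
    for s, x, y in zip(scores, formation_prop, permeability):
        xs, ys = groups.setdefault(cluster_of(s, edges), ([], []))
        xs.append(x)
        ys.append(y)
    models = {c: fit_line(xs, ys) for c, (xs, ys) in groups.items()}
    return edges, models


def forecast(score, formation_prop, edges, models):
    """Forecasting mode: pick the cluster from the score, apply that cluster's model."""
    a, b = models[cluster_of(score, edges)]
    return a * formation_prop + b
```

With this sketch, `forecast` raises `KeyError` if the selected cluster received no training samples; a fuller version would need a fallback for that case.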
