Method, System, and Computer Program Product for Embedding Compression and Regularization

    Publication number: US20240289613A1

    Publication date: 2024-08-29

    Application number: US18656024

    Application date: 2024-05-06

    CPC classification number: G06N3/08 G06N3/0455

    Abstract: A method, system, and computer program product are provided for embedding compression and reconstruction. The method includes receiving embedding vector data comprising a plurality of embedding vectors. A beta-variational autoencoder is trained based on the embedding vector data and a loss equation. The method includes determining a respective entropy of a respective mean and a respective variance of each respective dimension of a plurality of dimensions. A first subset of the plurality of dimensions is determined based on the respective entropy of the respective mean and the respective variance for each respective dimension of the plurality of dimensions. A second subset of the plurality of dimensions is discarded based on the respective entropy of the respective mean and the respective variance for each respective dimension of the plurality of dimensions. The method includes generating a compressed representation of the embedding vector data based on the first subset of dimensions.
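The dimension-selection step in this abstract can be illustrated with a minimal sketch. It assumes a beta-variational autoencoder has already been trained and has produced a mean and a variance per latent dimension for each embedding. The abstract does not say how the entropy is estimated or how the two subsets are chosen, so the histogram estimator, the bin ranges, the `threshold` value, and the function names `histogram_entropy`, `select_dimensions`, and `compress` are all assumptions made for illustration.

```python
import math
from collections import Counter


def histogram_entropy(values, lo, hi, bins=10):
    """Shannon entropy (in nats) of values binned into equal-width bins on [lo, hi].

    Assumption: a fixed-range histogram estimator; the abstract names no estimator.
    Values outside [lo, hi] are clamped into the first or last bin.
    """
    width = (hi - lo) / bins
    counts = Counter(min(bins - 1, max(0, int((v - lo) / width))) for v in values)
    n = len(values)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def select_dimensions(means, variances, threshold=0.5):
    """Split latent dimensions into a kept subset and a discarded subset.

    means, variances: one row per embedding, one column per latent dimension.
    Assumption: a dimension is kept when either its means or its variances
    vary enough across inputs (entropy above `threshold`).
    """
    kept, discarded = [], []
    for d in range(len(means[0])):
        h_mean = histogram_entropy([row[d] for row in means], -3.0, 3.0)
        h_var = histogram_entropy([row[d] for row in variances], 0.0, 2.0)
        (kept if max(h_mean, h_var) > threshold else discarded).append(d)
    return kept, discarded


def compress(means, kept):
    """Compressed representation: the encoder means restricted to kept dimensions."""
    return [[row[d] for d in kept] for row in means]


# Toy encoder outputs: dimension 0 varies across inputs, dimension 1 does not.
means = [[-2.0, 0.01], [-1.0, 0.02], [1.0, 0.03], [2.0, 0.02]]
variances = [[0.10, 0.95], [0.20, 0.97], [0.10, 0.96], [0.30, 0.98]]
kept, discarded = select_dimensions(means, variances)
compressed = compress(means, kept)
```

In this toy input the second dimension has nearly constant mean and variance, which is the pattern a collapsed latent dimension would show, so it lands in the discarded subset.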

    TIME SERIES PREDICTIVE MODEL FOR ESTIMATING METRIC FOR A GIVEN ENTITY

    Publication number: US20240127035A1

    Publication date: 2024-04-18

    Application number: US18275598

    Application date: 2022-02-01

    CPC classification number: G06N3/0455

    Abstract: A method performed by a computer is disclosed. The method comprises receiving interaction data between electronic devices of a plurality of entities. The interaction data is used to form an entity interaction vector containing a number of interactions between the electronic devices of a chosen entity and an entity time series containing a plurality of metrics per unit time of the interactions. An interaction encoder of the computer can generate an interaction hidden representation of the entity interaction vector using embeddings of the plurality of entities. A temporal encoder of the computer can generate a temporal hidden representation of the entity time series. The interaction hidden representation and the temporal hidden representation can be used to generate a predicted scale and a shape estimation of a target interaction metric. The computer can then generate an estimated interaction metric of a time period using the predicted scale and the shape estimation.
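The final step of this abstract, combining a predicted scale with a shape estimation, can be sketched as follows. The abstract does not state how the two are combined, so this sketch assumes the shape is a normalized profile over the time steps of the period (obtained here with a softmax over hypothetical `shape_logits`) and that the estimate is the scale multiplied by that profile. The function name `combine_scale_and_shape` is likewise illustrative.

```python
import math


def combine_scale_and_shape(predicted_scale, shape_logits):
    """Return a per-time-step estimate as scale * normalized shape.

    Assumption: the shape is a softmax over per-step scores, so the
    estimates sum to predicted_scale across the period.
    """
    peak = max(shape_logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in shape_logits]
    total = sum(exps)
    return [predicted_scale * e / total for e in exps]


# A flat shape spreads the scale evenly over four time steps.
flat_estimate = combine_scale_and_shape(100.0, [0.0, 0.0, 0.0, 0.0])
# A shape that weights the second step twice as heavily as the first.
skewed_estimate = combine_scale_and_shape(120.0, [0.0, math.log(2.0)])
```

Separating magnitude from profile in this way lets entities of very different sizes share one shape model, which is one plausible reason for predicting the two quantities separately.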

    ERROR-BOUNDED APPROXIMATE TIME SERIES JOIN USING COMPACT DICTIONARY REPRESENTATION OF TIME SERIES

    Publication number: US20240273095A1

    Publication date: 2024-08-15

    Application number: US18567717

    Application date: 2022-06-01

    CPC classification number: G06F16/24537 G06F16/2465 G06F16/2477

    Abstract: A method is disclosed. The method comprises determining a time series and a subsequence length. The length of the time series may then be determined, and an initial matrix profile may then be computed. The method may then form a processed matrix profile for a first subsequence of the subsequence length by applying the first subsequence to the initial matrix profile. A second subsequence may then be determined from the processed matrix profile. The method may then include comparing the second subsequence to other subsequences in a dictionary and adding it to the dictionary. The subsequences in the dictionary may be used to generate a plurality of subsequence matrix profiles. The method may then include forming an approximate matrix profile using the plurality of subsequence matrix profiles and then determining one or more anomalies in the time series or another time series using the approximate matrix profile.
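The last two steps of this abstract, forming an approximate matrix profile from a dictionary and using it to find anomalies, can be sketched as follows. The sketch assumes the dictionary has already been built and skips the dictionary-construction loop entirely. It also assumes z-normalized Euclidean distance, an element-wise minimum across the per-subsequence profiles, and that the anomaly is the window with the largest profile value; the abstract specifies none of these. All function names are illustrative.

```python
import math


def znorm(seq):
    """Z-normalize a subsequence; a constant subsequence maps to all zeros."""
    mean = sum(seq) / len(seq)
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / len(seq))
    if std == 0:
        return [0.0] * len(seq)
    return [(x - mean) / std for x in seq]


def distance_profile(query, series):
    """Distance from `query` to every window of the same length in `series`."""
    m = len(query)
    q = znorm(query)
    return [math.dist(q, znorm(series[i:i + m])) for i in range(len(series) - m + 1)]


def approximate_matrix_profile(dictionary, series):
    """Element-wise minimum over the distance profiles of all dictionary entries."""
    profiles = [distance_profile(q, series) for q in dictionary]
    return [min(column) for column in zip(*profiles)]


def top_anomaly(profile):
    """Start index of the window that is farthest from every dictionary entry."""
    return max(range(len(profile)), key=profile.__getitem__)


# A periodic series with two corrupted samples at positions 9 and 10.
series = [0, 1, 0, -1, 0, 1, 0, -1, 0, 5, 5, -1, 0, 1, 0, -1, 0, 1, 0, -1]
# A hand-picked dictionary holding every phase of the normal pattern.
dictionary = [[0, 1, 0, -1], [1, 0, -1, 0], [0, -1, 0, 1], [-1, 0, 1, 0]]
profile = approximate_matrix_profile(dictionary, series)
anomaly_start = top_anomaly(profile)
```

Windows that match a dictionary entry get a profile value of zero, so only windows overlapping the corrupted samples stand out. This brute-force version costs time proportional to the dictionary size times the series length times the window length, which is acceptable only for small inputs.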

    System, Method, and Computer Program Product for Debiasing Embedding Vectors of Machine Learning Models

    Publication number: US20240160854A1

    Publication date: 2024-05-16

    Application number: US18280792

    Application date: 2022-03-30

    CPC classification number: G06F40/40

    Abstract: Described are a system, method, and computer program product for debiasing embedding vectors of machine learning models. The method includes receiving embedding vectors and generating two clusters thereof. The method includes determining a first mean vector of the first cluster and a second mean vector of the second cluster. The method includes determining a bias associated with each of a plurality of first candidate vectors and replacing the first mean vector with a first candidate vector based on the bias. The method includes determining a bias associated with each of a plurality of second candidate vectors and replacing the second mean vector with a second candidate vector based on the bias. The method includes repeatedly replacing the first and second mean vectors until an extremum of the bias score is reached, and debiasing the embedding vectors by linear projection using a direction defined by the first and second mean vectors.
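The final debiasing step in this abstract, linear projection along a direction defined by two mean vectors, can be sketched as follows. The sketch assumes the two clusters are already given and uses their plain means; it does not reproduce the iterative replacement of the mean vectors by candidate vectors or the bias score. It also assumes the direction is the difference of the two means and that debiasing means removing each embedding's component along it. The names `mean_vector` and `debias` are illustrative.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def debias(embeddings, cluster_a, cluster_b):
    """Remove from each embedding its component along the bias direction.

    Assumption: the bias direction is mean(cluster_a) - mean(cluster_b).
    """
    mean_a = mean_vector(cluster_a)
    mean_b = mean_vector(cluster_b)
    direction = [a - b for a, b in zip(mean_a, mean_b)]
    norm = math.sqrt(sum(x * x for x in direction))
    if norm == 0:
        raise ValueError("cluster means coincide; no bias direction is defined")
    unit = [x / norm for x in direction]

    debiased = []
    for v in embeddings:
        coeff = sum(x * u for x, u in zip(v, unit))  # scalar projection onto unit
        debiased.append([x - coeff * u for x, u in zip(v, unit)])
    return debiased


# Cluster means are [1, 1] and [-1, 1], so the bias direction is the first axis.
cluster_a = [[1.0, 0.0], [1.0, 2.0]]
cluster_b = [[-1.0, 1.0]]
result = debias([[3.0, 4.0]], cluster_a, cluster_b)
```

After the projection every embedding is orthogonal to the bias direction, so in this toy case the first coordinate is zeroed while the second is left unchanged.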
