Method and system for calculating minwise hash signatures from weighted sets

    Publication number: US12061878B2

    Publication date: 2024-08-13

    Application number: US18141506

    Filing date: 2023-05-01

    Applicant: Dynatrace LLC

    Inventor: Otmar Ertl

    CPC classification number: G06F7/582 G06F16/152 G06F18/2113 G06F18/22

    Abstract: A system and method for the creation of locality sensitive hash signatures using weighted feature sets is disclosed. The disclosed methodology takes advantage of discretization mechanisms commonly used in computer systems to model the influence of the feature weights on the calculated hash signature. Pseudo random numbers required for the signature calculation are created in ascending order, which enables the signature generation mechanism to identify and avoid the unnecessary creation of pseudo random numbers to improve the performance of the signature calculation process. Further, hierarchic, tree-search like algorithms are used during the processing of signature weights to further decrease the number of required random numbers. The features of the Poisson Process model, like its ability to provide random numbers in ascending order and the split- and mergeability of Poisson Processes are used to further improve the performance of the signature calculation process.
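
The role of ascending random numbers can be illustrated with a deliberately simplified sketch. It is not the disclosed algorithm: it omits the weight discretization and the tree-search steps, and no claim is made that its collision probability equals the weighted Jaccard similarity. It only shows how points of a Poisson process (cumulative sums of exponential variates) arrive in ascending order, so generation for an element can stop once a point exceeds every value already in the signature. The function name `ascending_signature`, the per-element seeding through `random.Random`, and the choice to store the owning element per slot are assumptions made for illustration.

```python
import random


def ascending_signature(weighted_set, m):
    """Simplified illustration, not the patented method.

    weighted_set: dict mapping element -> strictly positive weight.
    Returns (owners, generated): the element that holds each of the m
    signature slots and the number of random points that were generated.
    """
    values = [float("inf")] * m   # smallest point seen so far per slot
    owners = [None] * m           # element that produced that point
    generated = 0
    for element, weight in weighted_set.items():
        # Seeding with the element makes its point sequence reproducible,
        # so the same element yields the same points in every set.
        rng = random.Random(str(element))
        point = 0.0
        while True:
            # Next point of a Poisson process with rate `weight`;
            # points are produced in ascending order by construction.
            point += rng.expovariate(weight)
            generated += 1
            if point >= max(values):
                # Every later point is even larger and cannot lower any
                # slot, so no further random numbers are needed.
                break
            slot = rng.randrange(m)
            if point < values[slot]:
                values[slot] = point
                owners[slot] = element
    return owners, generated


weights = {"GET /api": 3.0, "POST /login": 1.0, "timeout": 0.5}
owners, generated = ascending_signature(weights, 16)
print(owners, generated)
```

Because skipped points could never have lowered a slot, the resulting signature does not depend on the order in which elements are processed; only the number of generated points does.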

    Method and system for calculating minwise hash signatures from weighted sets

    Publication number: US11645043B2

    Publication date: 2023-05-09

    Application number: US16786992

    Filing date: 2020-02-10

    Applicant: Dynatrace LLC

    Inventor: Otmar Ertl

    CPC classification number: G06F16/152 G06F7/582 G06K9/623 G06K9/6215

    Abstract: A system and method for the creation of locality sensitive hash signatures using weighted feature sets is disclosed. The disclosed methodology takes advantage of discretization mechanisms commonly used in computer systems to model the influence of the feature weights on the calculated hash signature. Pseudo random numbers required for the signature calculation are created in ascending order, which enables the signature generation mechanism to identify and avoid the unnecessary creation of pseudo random numbers to improve the performance of the signature calculation process. Further, hierarchic, tree-search like algorithms are used during the processing of signature weights to further decrease the number of required random numbers. The features of the Poisson Process model, like its ability to provide random numbers in ascending order and the split- and mergeability of Poisson Processes are used to further improve the performance of the signature calculation process.

    Identification Of Primary And Foreign Keys

    Publication number: US20250156416A1

    Publication date: 2025-05-15

    Application number: US18938570

    Filing date: 2024-11-06

    Applicant: Dynatrace LLC

    Abstract: A computer-implemented method is presented for determining primary keys in a table of a database system. The method includes: determining a number of rows in the table; for a given column of the table, generating a probabilistic data structure for the given column, where the probabilistic data structure is partitioned into a plurality of registers and the configuration parameters for the probabilistic data structure include a first recording parameter, base, that controls recording of data into the probabilistic data structure; computing a cardinality estimate for the given column using the probabilistic data structure; computing a ratio between the cardinality estimate for the given column and the number of rows in the table; comparing the ratio to a threshold; and designating the given column as a primary key for the table in response to the ratio being greater than the threshold.
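
The ratio test described in this abstract can be sketched as follows. The sketch swaps in a classic HyperLogLog structure with its standard raw estimator and small-range correction; the `base` recording parameter named in the abstract is not modeled. The function names, the precision `p=12`, the BLAKE2b hash, and the threshold of 0.9 are all assumptions chosen for illustration.

```python
import hashlib
import math


def hll_estimate(values, p=12):
    """Classic HyperLogLog cardinality estimate with 2**p registers."""
    m = 1 << p
    registers = [0] * m
    for v in values:
        digest = hashlib.blake2b(str(v).encode("utf-8"), digest_size=8).digest()
        x = int.from_bytes(digest, "big")          # 64-bit hash value
        idx = x >> (64 - p)                        # first p bits select the register
        rest = x & ((1 << (64 - p)) - 1)           # remaining 64 - p bits
        rank = (64 - p) - rest.bit_length() + 1    # position of the leftmost 1-bit
        registers[idx] = max(registers[idx], rank)
    alpha = 0.7213 / (1 + 1.079 / m)               # bias constant, valid for m >= 128
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros:
        return m * math.log(m / zeros)             # small-range (linear counting) correction
    return raw


def looks_like_primary_key(column, threshold=0.9, p=12):
    """Designate the column as a key candidate if estimate / rows > threshold."""
    rows = len(column)
    if rows == 0:
        return False
    return hll_estimate(column, p) / rows > threshold


order_ids = list(range(20000))                 # every row holds a distinct value
status = [n % 7 for n in range(20000)]         # only 7 distinct values
print(looks_like_primary_key(order_ids), looks_like_primary_key(status))
```

Because the estimate carries a relative error on the order of 1/sqrt(m), the threshold has to sit below 1 by a margin larger than that error, which is why a value such as 0.9 is used here rather than an exact equality check.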

    Method and system for the on-demand generation of graph-like models out of multidimensional observation data

    Publication number: US12204431B2

    Publication date: 2025-01-21

    Application number: US17733105

    Filing date: 2022-04-29

    Applicant: Dynatrace LLC

    Abstract: Technologies are disclosed for the automated, rule-based generation of models from arbitrary, semi-structured observation data. Context data of received observation data, like data describing the location at which a phenomenon was observed, is used to identify related observations, to generate entities in a model describing the observed data, and to assign observations to model data. Mapping rules may be used for the on-demand generation of models, and different sets of mapping rules may be used to generate different models out of the same observation data for different purposes. Further, observation time data may be used to observe the temporal evolution of the generated model. Possible use cases of the generated models include interpreting observation data that describes unexpected operating conditions in view of the generated model, or determining how a monitored system reacts to changing conditions, like increased load.

    Method and system to estimate the cardinality of sets and set operation results from single and multiple HyperLogLog sketches

    Publication number: US11561954B2

    Publication date: 2023-01-24

    Application number: US17358170

    Filing date: 2021-06-25

    Applicant: Dynatrace LLC

    Inventor: Otmar Ertl

    Abstract: A system and method for the estimation of the cardinality of large sets of transaction trace data is disclosed. The estimation is based on HyperLogLog data sketches that are capable of storing cardinality-relevant data of large sets with low and fixed memory requirements. The disclosure contains improvements to the known analysis methods for HyperLogLog data sketches that provide improved relative error behavior by eliminating a cardinality-range-dependent bias of the relative error. A new analysis method for HyperLogLog data structures is shown that uses maximum likelihood analysis methods on a Poisson-based approximated probability model. In addition, a variant of the new analysis model is disclosed that uses multiple HyperLogLog data structures to provide estimation results for set operations, like intersections or relative complements, directly from the HyperLogLog input data.
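
A Poisson-based probability model of the kind named in this abstract can be written down as a short derivation. The notation is an assumption introduced here, not taken from the abstract: the sketch has m registers with values K in {0, ..., q+1}, C_k is the number of registers holding the value k, and the cardinality is replaced by a Poisson-distributed count with mean λ, so that each register independently receives a Poisson(λ/m) number of elements.

```latex
% Probability that a register does not exceed k: no element with a
% rank larger than k arrived, and such elements arrive at rate
% \lambda / (m 2^k).
P(K \le k \mid \lambda) = e^{-\lambda/(m 2^{k})}, \qquad 0 \le k \le q

% Register value distribution, obtained from differences of the line above.
P(K = k \mid \lambda) =
\begin{cases}
  e^{-\lambda/m} & k = 0 \\
  e^{-\lambda/(m 2^{k})}\left(1 - e^{-\lambda/(m 2^{k})}\right) & 1 \le k \le q \\
  1 - e^{-\lambda/(m 2^{q})} & k = q + 1
\end{cases}

% Log-likelihood of the observed register counts C_0, ..., C_{q+1}.
\ln \mathcal{L}(\lambda) =
  -\frac{\lambda}{m} \sum_{k=0}^{q} \frac{C_k}{2^{k}}
  + \sum_{k=1}^{q} C_k \ln\!\left(1 - e^{-\lambda/(m 2^{k})}\right)
  + C_{q+1} \ln\!\left(1 - e^{-\lambda/(m 2^{q})}\right)
```

A maximum likelihood estimate is the value of λ that maximizes the last expression. Because the registers are independent under the Poisson approximation, the likelihood depends on the sketch only through the counts C_k.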

    Method and system for log data analytics based on SuperMinHash signatures

    Publication number: US12160503B2

    Publication date: 2024-12-03

    Application number: US18376882

    Filing date: 2023-10-05

    Applicant: Dynatrace LLC

    Abstract: A system and method for the analysis of log data is presented. The system uses SuperMinHash-based locality sensitive hash signatures to describe the similarity between log lines. Signatures are created for incoming log lines and stored in signature indexes. Later similarity queries use those indexes to improve the query performance. The SuperMinHash algorithm uses a two-stage approach to determine signature values: one stage uses a first random number to calculate the index of the signature value that is to be updated. The two-stage approach improves the accuracy of the produced similarity estimation data for small signatures. The two-stage approach may further be used to produce random numbers that are related, e.g. each created random number may be larger than its predecessors. This relation is used to optimize the algorithm by determining when further created random numbers have no influence on the created signature and terminating at that point.
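
The two-stage scheme and its early termination can be sketched along the lines of the published SuperMinHash algorithm. This is an illustrative reading, not the patented implementation: the function names, the per-element seeding through Python's `random.Random`, and the use of whole log lines as set elements are assumptions.

```python
import random


def superminhash(elements, m):
    """Signature of m values for a collection of distinct elements."""
    h = [float("inf")] * m    # signature values
    p = [0] * m               # permutation, reset lazily per element
    q = [-1] * m              # q[j] == i marks p[j] as valid for element i
    b = [0] * m               # histogram of floor(h), capped at m - 1
    b[m - 1] = m
    a = m - 1                 # largest index with b[a] > 0
    for i, element in enumerate(elements):
        rng = random.Random(str(element))    # reproducible per-element stream
        j = 0
        while j <= a:                        # beyond a, r + j cannot lower any value
            r = rng.random()                 # value stage: fractional part
            k = rng.randrange(j, m)          # index stage: Fisher-Yates pick
            if q[j] != i:
                q[j], p[j] = i, j
            if q[k] != i:
                q[k], p[k] = i, k
            p[j], p[k] = p[k], p[j]
            rj = r + j                       # candidates ascend with j
            if rj < h[p[j]]:
                j_old = int(min(h[p[j]], m - 1))
                h[p[j]] = rj
                if j < j_old:                # keep histogram and bound up to date
                    b[j_old] -= 1
                    b[j] += 1
                    while b[a] == 0:
                        a -= 1
            j += 1
    return h


def estimate_similarity(sig_a, sig_b):
    """Fraction of equal signature components estimates the Jaccard similarity."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


lines_a = [f"log line {n}" for n in range(1000)]
lines_b = [f"log line {n}" for n in range(500, 1500)]
print(estimate_similarity(superminhash(lines_a, 256), superminhash(lines_b, 256)))
```

The two example sets share 500 of 1500 distinct lines, so the printed estimate should scatter around a Jaccard similarity of 1/3. The bound `a` is what lets the loop stop early: once every signature value is smaller than `j`, no further candidate `r + j` for the current element can change the signature.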

    Method and system for log data analytics based on SuperMinHash signatures

    Publication number: US11804952B2

    Publication date: 2023-10-31

    Application number: US17887079

    Filing date: 2022-08-12

    Applicant: Dynatrace LLC

    Abstract: A system and method for the analysis of log data is presented. The system uses SuperMinHash-based locality sensitive hash signatures to describe the similarity between log lines. Signatures are created for incoming log lines and stored in signature indexes. Later similarity queries use those indexes to improve the query performance. The SuperMinHash algorithm uses a two-stage approach to determine signature values: one stage uses a first random number to calculate the index of the signature value that is to be updated. The two-stage approach improves the accuracy of the produced similarity estimation data for small signatures. The two-stage approach may further be used to produce random numbers that are related, e.g. each created random number may be larger than its predecessors. This relation is used to optimize the algorithm by determining when further created random numbers have no influence on the created signature and terminating at that point.

    Method and system for log data analytics based on SuperMinHash signatures

    Publication number: US11431475B2

    Publication date: 2022-08-30

    Application number: US16440439

    Filing date: 2019-06-13

    Applicant: Dynatrace LLC

    Abstract: A system and method for the analysis of log data is presented. The system uses SuperMinHash-based locality sensitive hash signatures to describe the similarity between log lines. Signatures are created for incoming log lines and stored in signature indexes. Later similarity queries use those indexes to improve the query performance. The SuperMinHash algorithm uses a two-stage approach to determine signature values: one stage uses a first random number to calculate the index of the signature value that is to be updated. The two-stage approach improves the accuracy of the produced similarity estimation data for small signatures. The two-stage approach may further be used to produce random numbers that are related, e.g. each created random number may be larger than its predecessors. This relation is used to optimize the algorithm by determining when further created random numbers have no influence on the created signature and terminating at that point.
