ENHANCED MODEL EXPLANATIONS USING DYNAMIC TOKENIZATION FOR ENTITY MATCHING MODELS

    Publication number: US20240177053A1

    Publication date: 2024-05-30

    Application number: US18070598

    Filing date: 2022-11-29

    Applicant: SAP SE

    CPC classification number: G06N20/00

    Abstract: Methods, systems, and computer-readable storage media for: receiving query data representative of query entities and target data representative of target entities; determining, by an attention machine learning (ML) model, a set of character-level embeddings; providing, by a sub-word-level tokenizer, a set of sub-word-level tokens, each sub-word-level token including a string of multiple characters; generating, by the attention ML model, a set of sub-word-level embeddings based on the set of sub-word-level tokens; providing, by the attention ML model, at least one attention matrix including attention scores, each attention score representative of a relative importance of a respective sub-word-level token in a predicted match, the predicted match including a match between a query entity and a target entity; and outputting an explanation based on the at least one attention matrix.
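
    Illustrative sketch (not taken from the patent): the final step of the abstract, turning an attention matrix into an explanation, might look roughly like the Python below. It assumes a single square attention matrix over the concatenated query and target sub-word tokens, and it assumes that a token's importance can be approximated by the average attention it receives. The function names `token_importance` and `explain`, the example tokens, and the attention values are all hypothetical; the patent does not specify this aggregation.

```python
from typing import List, Tuple


def token_importance(attention: List[List[float]]) -> List[float]:
    """Average attention received by each token (column-wise mean).

    Assumption: attention[i][j] is the score that token i assigns to token j.
    """
    n = len(attention)
    if n == 0 or any(len(row) != n for row in attention):
        raise ValueError("attention must be a non-empty square matrix")
    return [sum(attention[i][j] for i in range(n)) / n for j in range(n)]


def explain(
    tokens: List[str], attention: List[List[float]], top_k: int = 2
) -> List[Tuple[int, str, float]]:
    """Return the top_k (position, token, importance) triples, highest first."""
    if len(tokens) != len(attention):
        raise ValueError("need exactly one attention row per token")
    scores = token_importance(attention)
    ranked = sorted(
        ((pos, tok, scores[pos]) for pos, tok in enumerate(tokens)),
        key=lambda item: item[2],
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical sub-word tokens: a query entity ("acme corp") followed by a
# target entity ("acme inc"). The attention values are made up for illustration.
tokens = ["acme", "corp", "acme", "inc"]
attention = [
    [0.5, 0.1, 0.3, 0.1],
    [0.4, 0.2, 0.3, 0.1],
    [0.3, 0.1, 0.5, 0.1],
    [0.4, 0.1, 0.3, 0.2],
]

explanation = explain(tokens, attention, top_k=2)
for position, token, score in explanation:
    print(f"token {position} ({token!r}): importance {score:.3f}")
```

    In this toy input the two "acme" tokens receive the most attention, so an explanation built this way would point to the shared name as the main reason for the predicted match.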
