Publication No.: US20240386002A1
Publication Date: 2024-11-21
Application No.: US18319748
Application Date: 2023-05-18
Applicant: Adobe Inc.
Inventor: Raunak Shah, Koyel MUKHERJEE, Subrata MITRA, Dhruv JOSHI, Sai KARNAM, Shivam Pravin BHOSALE
IPC: G06F16/215, G06F16/28, G06F40/284
Abstract: A dataset comprising tables is received. Embeddings are generated for the column titles of each table. Based on the embeddings, similar tables are clustered. The tables in each cluster are then organized into smaller clusters based on statistical similarities. Similarity scores are calculated for pairs of tables within the same cluster. A relatedness graph is created based on the similarity scores; similar tables are represented by nodes connected by edges. If the similarity score for a pair of tables exceeds a threshold, one table of the pair is deleted.
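The pipeline in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not name an embedding model, a clustering algorithm, a statistical measure, or a scoring function, so every one of those choices below is an assumption made only for illustration: character-trigram counts stand in for learned embeddings, a greedy threshold grouping stands in for clustering, a row-count ratio stands in for "statistical similarities", and Jaccard overlap of rows stands in for the similarity score. All function names, thresholds, and the sample tables are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def embed(title):
    """Stand-in embedding (assumption): character-trigram counts, not a learned model."""
    padded = f"  {title.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def table_embedding(table):
    """Sum the column-title embeddings of one table."""
    total = Counter()
    for column in table["columns"]:
        total.update(embed(column))
    return total


def cluster(names, sim, threshold):
    """Greedy grouping (assumption): join the first cluster whose seed is similar enough."""
    clusters = []
    for name in names:
        for members in clusters:
            if sim(members[0], name) >= threshold:
                members.append(name)
                break
        else:
            clusters.append([name])
    return clusters


def row_overlap(t1, t2):
    """Jaccard overlap of row sets, used here as the pairwise similarity score."""
    a, b = set(t1["rows"]), set(t2["rows"])
    return len(a & b) / len(a | b) if a | b else 0.0


def deduplicate(tables, schema_thr=0.8, size_thr=0.5, dup_thr=0.9):
    """Return (relatedness graph, names of tables kept). Thresholds are illustrative."""
    emb = {name: table_embedding(t) for name, t in tables.items()}

    # Step 1: coarse clusters from column-title embeddings.
    coarse = cluster(list(tables), lambda a, b: cosine(emb[a], emb[b]), schema_thr)

    # Step 2: smaller clusters from a simple statistic (row-count ratio).
    def size_sim(a, b):
        la, lb = len(tables[a]["rows"]), len(tables[b]["rows"])
        return min(la, lb) / max(la, lb) if max(la, lb) else 1.0

    fine = [sub for members in coarse for sub in cluster(members, size_sim, size_thr)]

    # Step 3: score pairs inside each small cluster and record them as graph edges.
    graph = {name: {} for name in tables}
    for members in fine:
        for a, b in combinations(members, 2):
            score = row_overlap(tables[a], tables[b])
            if score > 0:
                graph[a][b] = graph[b][a] = score

    # Step 4: for each pair scoring above the threshold, delete the later table.
    kept = set(tables)
    for a in tables:
        for b, score in graph[a].items():
            if a in kept and b in kept and score > dup_thr:
                kept.discard(b)
    return graph, kept


# Hypothetical sample data: two identical order tables and one unrelated table.
order_rows = [(1, "ann", 10), (2, "bob", 20), (3, "cy", 30)]
tables = {
    "orders_2023": {"columns": ["order_id", "customer", "amount"], "rows": order_rows},
    "orders_copy": {"columns": ["order_id", "customer", "amount"], "rows": list(order_rows)},
    "inventory": {"columns": ["sku", "warehouse", "quantity"], "rows": [("a", "x", 5), ("b", "y", 7)]},
}
graph, kept = deduplicate(tables)
print(sorted(kept))
```

Scoring only pairs that share a small cluster avoids comparing every table against every other table, which appears to be the motivation for the two clustering stages. Which table of a near-duplicate pair is removed is a policy choice; this sketch simply keeps the one encountered first.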