-
Publication No.: US20240320538A1
Publication date: 2024-09-26
Application No.: US18123673
Filing date: 2023-03-20
Applicant: ADOBE INC.
Inventor: Ramasuri NARAYANAM , Shiv Kumar SAINI , Koyel MUKHERJEE , Manisha PADALA , Keshav VADREVU , Gautam CHOUDHARY , Atharv TYAGI
IPC: G06N20/00
CPC classification number: G06N20/00
Abstract: Systems and methods identify anomalous data in tabular data. A set of tabular data records is received. Each tabular data record includes data elements for a number of attributes, with each data element providing a value for a corresponding attribute. An anomaly score is generated for each data element of each tabular data record. Additionally, an evidence set is defined for each attribute and each tabular data record based on the anomaly scores for the data elements. An anomaly score is generated for each attribute and each tabular data record using the evidence sets. An output is provided that identifies one or more anomalous data subsets determined based on the anomaly scores for the attributes and tabular data records. Each anomalous data subset identifies a subset of attributes and a subset of tabular data records.
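Illustration: the abstract does not say how the scores or evidence sets are computed, so the following is only a minimal sketch under stated assumptions. It assumes numeric attributes, uses an absolute z-score as the per-element anomaly score, and treats an evidence set as the elements whose score exceeds a threshold. The function names and the threshold are hypothetical and not taken from the patent.

```python
from statistics import mean, pstdev

def element_scores(rows):
    """Absolute z-score of each value within its column (assumed scoring rule)."""
    columns = list(zip(*rows))
    stats = [(mean(c), pstdev(c)) for c in columns]
    return [[abs(v - m) / s if s else 0.0 for v, (m, s) in zip(row, stats)]
            for row in rows]

def find_anomalous_subset(rows, threshold=2.0):
    """Return (record indices, attribute indices) flagged as anomalous."""
    scores = element_scores(rows)
    n_rows, n_cols = len(rows), len(rows[0])
    # Assumed evidence set of a record: attributes whose element score is high.
    record_evidence = [{j for j in range(n_cols) if scores[i][j] > threshold}
                       for i in range(n_rows)]
    # Assumed evidence set of an attribute: records whose element score is high.
    attribute_evidence = [{i for i in range(n_rows) if scores[i][j] > threshold}
                          for j in range(n_cols)]
    # Record and attribute scores: sum of element scores over the evidence set.
    record_score = [sum(scores[i][j] for j in ev)
                    for i, ev in enumerate(record_evidence)]
    attribute_score = [sum(scores[i][j] for i in ev)
                       for j, ev in enumerate(attribute_evidence)]
    records = [i for i, s in enumerate(record_score) if s > 0]
    attributes = [j for j, s in enumerate(attribute_score) if s > 0]
    return records, attributes
```

Calling `find_anomalous_subset(rows)` on a list of equal-length numeric rows returns one subset of record indices and one subset of attribute indices, mirroring the "anomalous data subset" output described above.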
-
Publication No.: US20240386002A1
Publication date: 2024-11-21
Application No.: US18319748
Filing date: 2023-05-18
Applicant: Adobe Inc.
Inventor: Raunak Shah , Koyel MUKHERJEE , Subrata MITRA , Dhruv JOSHI , Sai KARNAM , Shivam Pravin BHOSALE
IPC: G06F16/215 , G06F16/28 , G06F40/284
Abstract: A dataset comprising tables is received. Embeddings are generated for column titles of a table. Based on the embeddings, similar tables are clustered. The tables are organized into smaller clusters based on statistical similarities. Similarity scores are calculated for tables within the same cluster. A relatedness graph is created based on the similarity scores; similar tables are represented by nodes connected by edges. If the similarity score for a pair of tables exceeds a threshold, a table is deleted.
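Illustration: a minimal sketch of the similarity-graph and deletion steps, under stated assumptions. Jaccard similarity over column-title tokens stands in for the embedding-based similarity, the clustering steps are skipped, and the choice of which table in a pair to delete is arbitrary. All function names and the threshold are hypothetical.

```python
from itertools import combinations

def title_tokens(columns):
    """Lower-cased word tokens drawn from a table's column titles."""
    return {tok for col in columns
            for tok in col.lower().replace("_", " ").split()}

def jaccard(a, b):
    """Jaccard similarity of two token sets (stand-in for embedding similarity)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def relatedness_graph(tables):
    """Edges keyed by table-name pair, weighted by similarity; zero edges dropped."""
    tokens = {name: title_tokens(cols) for name, cols in tables.items()}
    edges = {}
    for a, b in combinations(sorted(tables), 2):
        score = jaccard(tokens[a], tokens[b])
        if score > 0:
            edges[(a, b)] = score
    return edges

def tables_to_delete(tables, threshold=0.8):
    """For each pair above the threshold, mark the second table for deletion."""
    doomed = set()
    for (a, b), score in relatedness_graph(tables).items():
        if score > threshold and a not in doomed and b not in doomed:
            doomed.add(b)
    return doomed
```

Here `tables` is assumed to be a dict mapping a table name to its list of column titles; a real system would also compare cell statistics, as the abstract indicates.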
-
Publication No.: US20250103912A1
Publication date: 2025-03-27
Application No.: US18471996
Filing date: 2023-09-21
Applicant: Adobe Inc.
Inventor: Raunak SHAH , Vibhor PORWAL , Koyel MUKHERJEE , Iftikhar Ahamath BURHANUDDIN , Saurabh MAHAPATRA , Annamalai ANNAMALAI , Fan DU
IPC: G06N5/022
Abstract: A data insight generation system generates facts from a dataset. Importance scores are determined for the facts. Facts having the highest importance scores are generated for display at a user interface. A selection of a displayed fact is received. Based on the selection, dependent facts are generated by adding subspaces to the selected fact. The dependent facts are generated for display at the user interface.
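Illustration: a minimal sketch of fact generation and drill-down, under stated assumptions. A "fact" is assumed to be the mean of one measure over a subspace (a tuple of column/value filters), and its importance is assumed to be the absolute deviation of that mean from the overall mean. Neither definition comes from the patent, and all names are hypothetical.

```python
from statistics import mean

def matches(row, subspace):
    """True if the row satisfies every (column, value) filter in the subspace."""
    return all(row[col] == val for col, val in subspace)

def make_fact(rows, subspace, measure):
    """Fact = mean of `measure` in the subspace, scored by deviation from overall."""
    overall = mean(r[measure] for r in rows)
    value = mean(r[measure] for r in rows if matches(r, subspace))
    return {"subspace": subspace, "value": value,
            "importance": abs(value - overall)}

def extend(rows, subspace, dimensions, measure):
    """Facts obtained by adding one more filter to `subspace`, best first."""
    used = {col for col, _ in subspace}
    facts = []
    for dim in dimensions:
        if dim in used:
            continue
        # Only values that occur inside the current subspace are considered.
        for val in sorted({r[dim] for r in rows if matches(r, subspace)}):
            facts.append(make_fact(rows, subspace + ((dim, val),), measure))
    return sorted(facts, key=lambda f: f["importance"], reverse=True)
```

Top-level facts would come from `extend(rows, (), dimensions, measure)`; after a user selects one, calling `extend` again with that fact's `subspace` yields the dependent facts.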
-
Publication No.: US20250005075A1
Publication date: 2025-01-02
Application No.: US18342474
Filing date: 2023-06-27
Applicant: Adobe Inc.
Inventor: Sachin Kumar Chauhan , Subrata MITRA , Sunav CHOUDHARY , Ramasuri NARAYANAM , Koyel MUKHERJEE , Gautam Pratap KOWSHIK
IPC: G06F16/901
Abstract: Tabular data is received. A graph is created based on the tabular data. The graph comprises nodes corresponding to key-value pairs of the tabular data. Weights are assigned to the nodes and to edges that connect the nodes. The node and edge weights are updated using a message-passing neural network (MPNN) framework. The resulting graph is sampled based on the updated weights.
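Illustration: a minimal sketch of the graph construction, weight update, and sampling steps, under stated assumptions. Node weights are assumed to be key-value frequencies and edge weights co-occurrence counts. A single fixed averaging rule stands in for the learned MPNN update and only node weights are updated here, so this is not the patented method. All names are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations

def build_graph(rows):
    """Nodes are (key, value) pairs; edges link pairs that share a row."""
    node_w, edge_w = Counter(), Counter()
    for row in rows:
        pairs = sorted(row.items())               # canonical order within a row
        node_w.update(pairs)                      # weight = frequency
        edge_w.update(combinations(pairs, 2))     # weight = co-occurrence count
    return node_w, edge_w

def message_pass(node_w, edge_w, mix=0.5):
    """One round: blend each node weight with the edge-weighted neighbour mean."""
    incoming, total = Counter(), Counter()
    for (a, b), w in edge_w.items():
        incoming[a] += w * node_w[b]
        incoming[b] += w * node_w[a]
        total[a] += w
        total[b] += w
    return {n: (1 - mix) * w + mix * (incoming[n] / total[n] if total[n] else w)
            for n, w in node_w.items()}

def sample_nodes(weights, k, seed=0):
    """Draw k nodes with replacement, with probability proportional to weight."""
    rng = random.Random(seed)
    nodes = list(weights)
    return rng.choices(nodes, weights=[weights[n] for n in nodes], k=k)
```

Here `rows` is assumed to be a list of dicts, one per table row. A learned MPNN would replace `message_pass` with trained message and update functions.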