-
Publication No.: US12271353B2
Publication Date: 2025-04-08
Application No.: US17809606
Filing Date: 2022-06-29
Applicant: ByteDance Inc.
Inventor: David Alan Johnston, Andrew James, Pradhee Tandon, Sivaramakrishnan Natarajan
IPC: G06F16/215, G06F3/0481, G06F16/28, G06F16/951
Abstract: In general, embodiments of the present invention provide systems and computer readable media for implementing a single data integration platform that supports multiple data access interfaces to a single corpus of stored dynamic data collected from multiple data sources. In embodiments, the data integration platform includes a record tables layer that stores a group of data records and supports a CRUD interface for accessing the data records; a resolution mapping layer that stores a set of entities generated by a many-to-one mapping of data records to entities using entity resolution; and an entities layer that stores resolved entities which may be accessed via either a search interface based on search criteria or a hybrid search interface that supports “get via record id” queries.
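The three layers named in this abstract can be pictured with a minimal in-memory sketch. It is an illustration under stated assumptions, not the patented implementation: the class `IntegrationPlatform`, its method names, the record fields `name` and `source`, and the rule that a normalized name identifies an entity are all hypothetical. The abstract does not specify how entity resolution is performed.

```python
from dataclasses import dataclass, field


@dataclass
class IntegrationPlatform:
    """Toy three-layer store: records, record-to-entity mapping, entities."""

    records: dict = field(default_factory=dict)           # record tables layer
    record_to_entity: dict = field(default_factory=dict)  # resolution mapping layer
    entities: dict = field(default_factory=dict)          # entities layer

    @staticmethod
    def _entity_key(record):
        # Hypothetical resolution rule: a normalized name identifies an entity.
        return record["name"].strip().lower()

    def _rebuild_entity(self, key):
        # Recompute one resolved entity from the records currently mapped to it.
        members = sorted(r for r, k in self.record_to_entity.items() if k == key)
        if members:
            sources = sorted({self.records[r]["source"] for r in members})
            self.entities[key] = {
                "entity_id": key,
                "record_ids": members,
                "sources": sources,
            }
        else:
            self.entities.pop(key, None)

    # CRUD interface over data records
    def upsert(self, record_id, record):
        old_key = self.record_to_entity.get(record_id)
        self.records[record_id] = dict(record)
        new_key = self._entity_key(record)
        self.record_to_entity[record_id] = new_key  # many records -> one entity
        if old_key is not None and old_key != new_key:
            self._rebuild_entity(old_key)
        self._rebuild_entity(new_key)

    def read(self, record_id):
        return self.records.get(record_id)

    def delete(self, record_id):
        self.records.pop(record_id, None)
        key = self.record_to_entity.pop(record_id, None)
        if key is not None:
            self._rebuild_entity(key)

    # Search interface over resolved entities
    def search(self, predicate):
        return [e for e in self.entities.values() if predicate(e)]

    # Hybrid interface: look up an entity by the id of one of its records
    def get_via_record_id(self, record_id):
        return self.entities.get(self.record_to_entity.get(record_id))
```

In this sketch, upserting two records whose names differ only in case and whitespace yields a single entity, and `get_via_record_id` returns that entity for either record id.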
-
Publication No.: US20240419981A1
Publication Date: 2024-12-19
Application No.: US18750363
Filing Date: 2024-06-21
Applicant: ByteDance Inc.
Inventor: Mark Thomas Daly, Shawn Ryan Jeffery, Matthew DeLand, Nick Pendar, Andrew James, David Johnston
IPC: G06N5/02, G06F16/215, G06F16/23, G06N20/00
Abstract: In general, embodiments of the present invention provide systems, methods and computer readable media for automated dynamic data quality assessment. One aspect of the subject matter described in this specification includes the actions of receiving a data quality job including a new data sample; and, if the new data sample is determined to be added to a reservoir of data samples, sending a quality verification request to an oracle; receiving a new data sample quality estimate from the oracle; and adding the new data sample and estimate to the reservoir. A second aspect of the subject matter includes the actions of receiving, from a predictive model, a judgment associated with a new data sample; analyzing the new data sample based in part on the judgment to determine whether to send a new data sample quality verification request to an oracle; and, if a new data sample quality estimate is received from the oracle, determining whether to add the new data sample and the judgment to the reservoir.
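The first aspect above (decide whether a sample enters the reservoir, and only then query the oracle) can be sketched as follows. The abstract does not say how admission is decided, so this sketch substitutes classic reservoir sampling (Algorithm R) as an assumption. The class `QualityReservoir`, its attributes, and the callable `oracle` are hypothetical names introduced for illustration.

```python
import random


class QualityReservoir:
    """Fixed-size sample reservoir that consults an oracle only on admission."""

    def __init__(self, capacity, oracle, seed=None):
        self.capacity = capacity
        self.oracle = oracle      # callable: sample -> quality estimate
        self.items = []           # list of (sample, estimate) pairs
        self.seen = 0             # number of samples offered so far
        self.oracle_calls = 0     # number of quality verification requests
        self._rng = random.Random(seed)

    def offer(self, sample):
        """Process one data quality job; return True if the sample was stored."""
        self.seen += 1
        if len(self.items) < self.capacity:
            slot = len(self.items)            # still filling: always admit
        else:
            # Algorithm R (assumed policy): admit with probability capacity / seen.
            slot = self._rng.randrange(self.seen)
            if slot >= self.capacity:
                return False                  # rejected: oracle never consulted

        # Admitted: send the quality verification request, then store the result.
        self.oracle_calls += 1
        entry = (sample, self.oracle(sample))
        if slot == len(self.items):
            self.items.append(entry)
        else:
            self.items[slot] = entry          # evict the previous occupant
        return True

    def mean_quality(self):
        """Average oracle estimate over the reservoir, or None if it is empty."""
        if not self.items:
            return None
        return sum(est for _, est in self.items) / len(self.items)
```

Deferring the oracle call until after the admission decision keeps the number of verification requests equal to the number of admitted samples, which matters when the oracle is a human reviewer or another costly service.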
-
Publication No.: US12045732B2
Publication Date: 2024-07-23
Application No.: US17684935
Filing Date: 2022-03-02
Applicant: ByteDance Inc.
Inventor: Mark Thomas Daly, Shawn Ryan Jeffery, Matthew DeLand, Nick Pendar, Andrew James, David Johnston
IPC: G06F16/00, G06F16/215, G06F16/23, G06N5/02, G06N20/00
CPC classification number: G06N5/02, G06F16/215, G06F16/2358, G06F16/2365, G06N20/00
Abstract: In general, embodiments of the present invention provide systems, methods and computer readable media for automated dynamic data quality assessment. One aspect of the subject matter described in this specification includes the actions of receiving a data quality job including a new data sample; and, if the new data sample is determined to be added to a reservoir of data samples, sending a quality verification request to an oracle; receiving a new data sample quality estimate from the oracle; and adding the new data sample and estimate to the reservoir. A second aspect of the subject matter includes the actions of receiving, from a predictive model, a judgment associated with a new data sample; analyzing the new data sample based in part on the judgment to determine whether to send a new data sample quality verification request to an oracle; and, if a new data sample quality estimate is received from the oracle, determining whether to add the new data sample and the judgment to the reservoir.
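The second aspect above (use a predictive model's judgment to decide whether to query the oracle, then decide whether to keep the result) can be sketched as a three-step gate. Both decision rules here are assumptions chosen for illustration: the abstract does not state them. The names `Judgment`, `process_sample`, `threshold`, and `capacity` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Judgment:
    label: str         # the predictive model's output for the sample
    confidence: float  # model confidence in [0, 1]


def process_sample(sample, judgment, oracle, reservoir, capacity, threshold=0.8):
    """Return one of 'skipped', 'no_estimate', 'discarded', or 'added'."""
    # Step 1: analyze the sample using the judgment.
    # Assumed rule: only low-confidence judgments are sent for verification.
    if judgment.confidence >= threshold:
        return "skipped"

    # Step 2: send the verification request.
    # A return value of None models a request that yields no estimate.
    estimate = oracle(sample, judgment)
    if estimate is None:
        return "no_estimate"

    # Step 3: decide whether to add the sample and judgment to the reservoir.
    # Assumed rule: keep the sample only while the reservoir has room.
    if len(reservoir) >= capacity:
        return "discarded"
    reservoir.append((sample, judgment, estimate))
    return "added"
```

A caller would pass each new sample together with the model's judgment, and the reservoir list then holds only samples that the oracle has actually scored.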
-