Deriving and augmenting access control for data lakes

    Publication No.: US12235988B2

    Publication Date: 2025-02-25

    Application No.: US18209714

    Application Date: 2023-06-14

    Applicant: SAP SE

    Abstract: In an example embodiment, access to a data set in a data lake can be specified using several approaches, based on the metadata and information attached. The metadata may be replicated from the original data source of the underlying data, and additional metadata may be modeled and stored to construct linkage information between data types. This linkage information may be used to automatically grant access to users to additional objects that are linked to objects that the user has explicit access to.
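
The linkage idea in this abstract can be illustrated with a minimal sketch. It assumes the linkage information is a simple mapping from an object to the objects linked to it, and that access propagates transitively along those links; the abstract does not state either detail, and the names `derive_access` and `links` are hypothetical.

```python
from collections import deque


def derive_access(explicit: set[str], links: dict[str, set[str]]) -> set[str]:
    """Return the explicit grants plus every object reachable via linkage metadata.

    Transitive propagation is an assumption made for this sketch.
    """
    granted = set(explicit)
    queue = deque(explicit)
    while queue:
        obj = queue.popleft()
        for linked in links.get(obj, ()):
            if linked not in granted:
                granted.add(linked)
                queue.append(linked)
    return granted


# Hypothetical linkage metadata between data types in the lake.
links = {
    "sales_order": {"order_item"},
    "order_item": {"product"},
    "payroll": {"employee"},
}
access = derive_access({"sales_order"}, links)
```

In this example a user with explicit access to `sales_order` also receives `order_item` and `product`, while `payroll` and `employee` stay out of reach.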

    Rebalancing engine for use in rebalancing files in a distributed storage systems

    Publication No.: US12229084B2

    Publication Date: 2025-02-18

    Application No.: US18194860

    Application Date: 2023-04-03

    Applicant: NetApp, Inc.

    Abstract: Redistribution of files in a containerized distributed file system is disclosed. Containers each have an engine and a scanner and each of the containers stores files and parameters for characteristics of files stored on the container. A first engine in a first container monitors characteristics of files stored on the first container and, responsive to determining that the parameters for files on the first container exceed one or more predetermined thresholds, communicates with a second engine in a second container to determine a destination container for one or more files from the first container. The second engine in the second container indicates to the first engine in the first container whether the second container is available to receive one or more files from the first container. The first engine triggers file system scanning by the scanner of the first container to identify files to be moved to the second container.
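
The monitor, negotiate, and scan sequence described above might look roughly like the following sketch. The single "used space" parameter, the threshold expressed as a fraction of capacity, and the largest-first scan order are all assumptions chosen for illustration; the abstract does not specify them, and the class and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    name: str
    capacity: int
    threshold: float  # assumed: fraction of capacity that triggers rebalancing
    files: dict[str, int] = field(default_factory=dict)  # file name -> size

    def used(self) -> int:
        return sum(self.files.values())

    def over_threshold(self) -> bool:
        return self.used() > self.threshold * self.capacity

    def can_accept(self, size: int) -> bool:
        # The second engine's answer to "are you available for this file?"
        return self.used() + size <= self.threshold * self.capacity

    def scan(self) -> list[str]:
        # Scanner: list candidate files, largest first (an assumed policy).
        return sorted(self.files, key=self.files.get, reverse=True)


def rebalance(src: Container, dst: Container) -> list[str]:
    """Move files from src to dst until src drops back under its threshold."""
    moved = []
    if not src.over_threshold():
        return moved
    for name in src.scan():
        if not src.over_threshold():
            break
        if dst.can_accept(src.files[name]):
            dst.files[name] = src.files.pop(name)
            moved.append(name)
    return moved


a = Container("a", capacity=100, threshold=0.8, files={"x": 50, "y": 40, "z": 5})
b = Container("b", capacity=100, threshold=0.8)
moved = rebalance(a, b)
```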

    METADATA CONTROL IN A LOAD-BALANCED DISTRIBUTED STORAGE SYSTEM

    Publication No.: US20250053545A1

    Publication Date: 2025-02-13

    Application No.: US18928476

    Application Date: 2024-10-28

    Applicant: Weka.IO LTD

    Abstract: A plurality of computing devices are communicatively coupled to each other via a network, and each of the plurality of computing devices is operably coupled to one or more of a plurality of storage devices. A plurality of failure resilient address spaces are distributed across the plurality of storage devices such that each of the plurality of failure resilient address spaces spans a plurality of the storage devices. The plurality of computing devices maintains metadata that maps each failure resilient address space to one of the plurality of computing devices. The metadata is grouped into buckets. Each bucket is stored in a group of computing devices. However, only the leader of the group is able to directly access a particular bucket at any given time.

    Revisions to smart files

    Publication No.: US12222902B1

    Publication Date: 2025-02-11

    Application No.: US18480287

    Application Date: 2023-10-03

    Abstract: Disclosed implementations include systems and methods to efficiently determine a block or blocks of a smart file that have changed, without accessing the block or blocks. For example, the disclosed implementations generate and maintain a nested hash value tree that includes block hash values for at least some blocks of a smart file, and element hash values for at least some elements of the blocks of the smart file for which a block hash value is maintained. The nested hash value tree may be traversed to determine elements and/or blocks of the smart file that have changed without having to access or process the smart file. Still further, the elements and/or blocks of the smart file that have changed may be processed without processing other elements and/or blocks of the smart file to determine the actual change(s) to the smart file.
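
A two-level hash tree of the kind this abstract describes can be sketched as follows. The sketch assumes SHA-256, a block hash computed over the concatenated element hashes, and two versions with the same block and element layout; none of these choices comes from the abstract, and the function names are hypothetical.

```python
import hashlib


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def build_tree(blocks: list[list[bytes]]) -> list[dict]:
    """Build one node per block: a block hash plus the hashes of its elements."""
    tree = []
    for elements in blocks:
        element_hashes = [digest(e) for e in elements]
        block_hash = digest("".join(element_hashes).encode())
        tree.append({"block": block_hash, "elements": element_hashes})
    return tree


def changed(old: list[dict], new: list[dict]) -> list[tuple[int, int]]:
    """Return (block index, element index) pairs whose hashes differ.

    Only the hash trees are compared; the file contents are never read here.
    """
    diffs = []
    for b, (o, n) in enumerate(zip(old, new)):
        if o["block"] == n["block"]:
            continue  # unchanged block: skip all of its elements
        for e, (oe, ne) in enumerate(zip(o["elements"], n["elements"])):
            if oe != ne:
                diffs.append((b, e))
    return diffs


v1 = build_tree([[b"a", b"b"], [b"c", b"d"]])
v2 = build_tree([[b"a", b"b"], [b"c", b"X"]])
diffs = changed(v1, v2)
```

Here only the second element of the second block is reported, so a caller would need to process just that element to find the actual change.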

    File storage location determining method and apparatus, and terminal

    Publication No.: US12222897B2

    Publication Date: 2025-02-11

    Application No.: US18017019

    Application Date: 2021-07-26

    Abstract: Embodiments of this application are applicable to the field of terminal technologies, and provide a file storage location determining method and apparatus, and a terminal. The method includes: A first terminal determines a plurality of candidate paths corresponding to a target file. Each candidate path points to one file storage location. The first terminal obtains file information of the target file. The first terminal determines, based on the file information and the plurality of candidate paths, a file storage location pointed to by a first candidate path in the plurality of candidate paths as a storage location of the target file. A matching degree of the first candidate path is a highest matching degree in matching degrees of the plurality of candidate paths. According to the foregoing method, accuracy of determining the file storage location is improved.
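
The selection step above reduces to picking the candidate path with the highest matching degree. The abstract does not say how the matching degree is computed, so the sketch below substitutes a stand-in score: the number of file-information values that appear as components of the path. The functions `matching_degree` and `choose_location` are hypothetical.

```python
from pathlib import PurePosixPath


def matching_degree(file_info: dict[str, str], path: str) -> int:
    """Stand-in score: how many file-info values appear as path components."""
    tokens = {value.lower() for value in file_info.values()}
    parts = {part.lower() for part in PurePosixPath(path).parts}
    return len(tokens & parts)


def choose_location(file_info: dict[str, str], candidates: list[str]) -> str:
    """Return the candidate path with the highest matching degree."""
    return max(candidates, key=lambda p: matching_degree(file_info, p))


file_info = {"type": "pdf", "source": "work"}
candidates = ["/docs/pdf", "/docs/work/pdf", "/media/photos"]
best = choose_location(file_info, candidates)
```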

    System and method of filtering consumer data

    Publication No.: US12216795B2

    Publication Date: 2025-02-04

    Application No.: US18500773

    Application Date: 2023-11-02

    Inventor: Michael Cook

    Abstract: A system may include an interface configured to couple to a network, and includes a processor and a memory accessible to the processor. The memory may be configured to store instructions that, when executed, cause the processor to process search results corresponding to multiple data owners to selectively filter personally identifiable information (PII) associated with one or more consumers from the set of search results according to data sharing permissions for each of the data owners to produce filtered results. The instructions may further cause the processor to provide the filtered results to a user device through the network.
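
Per-owner PII filtering as described above could be sketched like this. The set of PII field names, the record layout with an `owner` key, and the representation of data sharing permissions as a set of shareable fields per owner are assumptions made for illustration.

```python
# Assumed PII field names; the abstract does not enumerate them.
PII_FIELDS = {"name", "email", "phone"}


def filter_results(results: list[dict], permissions: dict[str, set[str]]) -> list[dict]:
    """Drop PII fields unless the record's data owner permits sharing them."""
    filtered = []
    for record in results:
        allowed = permissions.get(record["owner"], set())
        filtered.append(
            {k: v for k, v in record.items() if k not in PII_FIELDS or k in allowed}
        )
    return filtered


results = [
    {"owner": "dealer_a", "name": "Ann", "email": "ann@example.com", "vehicle": "sedan"},
    {"owner": "dealer_b", "name": "Bob", "email": "bob@example.com", "vehicle": "truck"},
]
permissions = {"dealer_a": {"name"}}  # dealer_b shares no PII
filtered = filter_results(results, permissions)
```

Owners with no entry in `permissions` default to sharing no PII, which keeps the filter conservative.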

    Distributing data amongst storage components using data sensitivity classifications

    Publication No.: US12216778B2

    Publication Date: 2025-02-04

    Application No.: US16743863

    Application Date: 2020-01-15

    Abstract: Described is a system for distributing data amongst storage components using data sensitivity (or security) classifications. The system may define categories for classifying data files and assign a sensitivity (or security) rating to each of the defined categories. The categories and/or associated sensitivity ratings may be determined using machine learning components that may leverage industry-specific information or data sensitivity information used by other clients. The system may then continuously reevaluate (or reclassify) data files to determine whether they are stored on a storage component that meets the necessary data sensitivity requirements. If the system determines particular data files are stored on a corresponding storage component that does not meet certain data sensitivity requirements, the system may perform an action to secure the particular data files.
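
The reevaluation step above amounts to comparing each file's required sensitivity rating with the rating its storage component supports. The sketch below assumes integer ratings where a higher number means stricter protection, and it leaves out the machine learning classification entirely; all names and values are hypothetical.

```python
def find_misplaced(
    files: dict[str, tuple[str, str]],
    category_rating: dict[str, int],
    storage_rating: dict[str, int],
) -> list[str]:
    """Return files stored on a component rated below their category's rating."""
    return [
        name
        for name, (category, storage) in files.items()
        if storage_rating[storage] < category_rating[category]
    ]


category_rating = {"public": 1, "financial": 3}
storage_rating = {"cloud_tier": 1, "encrypted_vault": 3}
files = {
    "brochure.pdf": ("public", "cloud_tier"),
    "ledger.xlsx": ("financial", "cloud_tier"),
    "audit.docx": ("financial", "encrypted_vault"),
}
misplaced = find_misplaced(files, category_rating, storage_rating)
```

A system like the one described would then take some securing action on each misplaced file, for example moving it to a component with a sufficient rating.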

    File storage method, terminal, and storage medium

    Publication No.: US12212696B2

    Publication Date: 2025-01-28

    Application No.: US17772206

    Application Date: 2020-03-04

    Inventor: Penghui Chai, Gui Fu

    Abstract: Embodiments of the present disclosure disclose a file storage method, terminal, and storage medium. The file storage method includes: obtaining a to-be-stored file, performing splitting processing on the to-be-stored file to obtain N sub-files corresponding to the to-be-stored file, wherein N is an integer greater than or equal to 1; sending the N sub-files to an IPFS, and receiving M pieces of address information corresponding to the N sub-files returned by the IPFS, wherein M is an integer greater than or equal to 1 and less than or equal to N; generating an address set corresponding to the to-be-stored file according to the M pieces of address information, and encrypting the address set to obtain an address set ciphertext; sending the address set ciphertext to a blockchain network and receiving a target index value returned by the blockchain network, wherein the target index value is used to identify the address set ciphertext.
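
The splitting and addressing steps above can be sketched with in-memory stand-ins. The sketch uses a plain dictionary in place of IPFS and a SHA-256 hex digest in place of a real IPFS address, and it stops before the encryption and blockchain steps. It also assumes that M can be smaller than N because identical sub-files map to the same content address; the abstract does not give the reason.

```python
import hashlib


def split(data: bytes, size: int) -> list[bytes]:
    """Split data into N sub-files of at most `size` bytes (N >= 1)."""
    return [data[i:i + size] for i in range(0, len(data), size)] or [b""]


def store(sub_files: list[bytes], storage: dict[str, bytes]) -> list[str]:
    """Store sub-files by content hash and return the M distinct addresses."""
    addresses = []
    for chunk in sub_files:
        address = hashlib.sha256(chunk).hexdigest()  # stand-in for an IPFS address
        storage[address] = chunk
        if address not in addresses:
            addresses.append(address)
    return addresses


storage: dict[str, bytes] = {}  # stand-in for the IPFS network
sub_files = split(b"aaaabbbbaaaa", 4)
address_set = store(sub_files, storage)
```

With this input, N is 3 and M is 2, since the first and third sub-files are identical.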

    Document tracking through version hash linked graphs

    Publication No.: US12210822B2

    Publication Date: 2025-01-28

    Application No.: US18058149

    Application Date: 2022-11-22

    Applicant: Autodesk, Inc.

    Abstract: Embodiments of the invention provide the ability to track document versioning. Before executing an open operation on a first document version, a first before-hash is generated. After executing the open operation, a first after-hash is generated. Before executing a save operation, the first before-hash is acquired, and after execution (resulting in a second document version), a second after-hash of the second document version is generated. A version hash linked graph (VHLG) is generated and includes document nodes for the different document versions where each node includes a hash of that document version, a user-application node corresponding to the user or application that executed the operations, and edges connecting the nodes (e.g., that identify the operation and/or the document lineage). Based on the VHLG, a full history of a document is provided.
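
A much-reduced version of such a graph can be sketched as follows. It models only save operations, identifies document nodes by a SHA-256 hash of their content, and assumes each version has a single parent; the class layout and edge labels are hypothetical and are not taken from the patent.

```python
import hashlib
from dataclasses import dataclass, field


def digest(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()


@dataclass
class VHLG:
    nodes: set = field(default_factory=set)
    edges: list = field(default_factory=list)  # (source, target, label)

    def record_save(self, actor: str, before: bytes, after: bytes) -> None:
        """Link the before-hash to the after-hash and note who saved it."""
        b, a = digest(before), digest(after)
        self.nodes |= {b, a, actor}
        self.edges.append((b, a, "save"))        # document lineage edge
        self.edges.append((actor, a, "executed"))  # user-application edge

    def history(self, version_hash: str) -> list[str]:
        """Walk lineage edges back from a version, returning oldest first."""
        parents = {t: s for s, t, label in self.edges if label == "save"}
        chain = [version_hash]
        while chain[-1] in parents and parents[chain[-1]] not in chain:
            chain.append(parents[chain[-1]])
        return chain[::-1]


graph = VHLG()
graph.record_save("alice", b"v1", b"v2")
graph.record_save("bob", b"v2", b"v3")
lineage = graph.history(digest(b"v3"))
```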
