DATA MODIFICATION METHOD AND INFORMATION PROCESSING APPARATUS

    Publication No.: US20230237036A1

    Publication date: 2023-07-27

    Application No.: US18059173

    Filing date: 2022-11-28

    CPC classification number: G06F16/2272 G06F16/23

    Abstract: A computer-readable recording medium having stored therein a data modification program executable by one or more computers, the data modification program includes: an instruction for specifying, from a plurality of attributes included in training data, a first attribute having a causal relation with a second attribute included in the plurality of attributes; and an instruction for modifying values of the first attribute in the training data in accordance with a condition for reducing a difference between distributions of the values of the first attribute corresponding to each value of the second attribute.
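As a rough illustration of the modification step only, the sketch below shifts the values of one attribute so that every group defined by a second attribute ends up with the same mean. Mean alignment is an assumed stand-in for the abstract's unspecified "condition"; the function name `align_group_means` and the sample rows are hypothetical, and the causal-relation step that selects the first attribute is taken as already done.

```python
from collections import defaultdict
from statistics import mean


def align_group_means(rows, first_attr, second_attr):
    """Return copies of rows with first_attr shifted per group of second_attr.

    Assumption: "reducing a difference between distributions" is approximated
    by making every group's mean equal to the overall mean.
    """
    overall = mean(r[first_attr] for r in rows)

    # Collect the first attribute's values for each value of the second attribute.
    groups = defaultdict(list)
    for r in rows:
        groups[r[second_attr]].append(r[first_attr])

    # One additive offset per group moves that group's mean onto the overall mean.
    offsets = {g: overall - mean(vals) for g, vals in groups.items()}
    return [{**r, first_attr: r[first_attr] + offsets[r[second_attr]]} for r in rows]


# Hypothetical training data: "score" plays the first attribute, "group" the second.
rows = [
    {"group": "a", "score": 10.0},
    {"group": "a", "score": 20.0},
    {"group": "b", "score": 30.0},
    {"group": "b", "score": 40.0},
]
modified = align_group_means(rows, "score", "group")
```

After the shift, both groups have a mean of 25.0 while the spread within each group is unchanged; a fuller treatment might match whole distributions rather than means.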

    BRANCHING FOR TREE STRUCTURE IN DATABASE SYSTEM

    Publication No.: US20230195705A1

    Publication date: 2023-06-22

    Application No.: US17555979

    Filing date: 2021-12-20

    Applicant: SAP SE

    CPC classification number: G06F16/2246 G06F16/2272 G06F16/245

    Abstract: In some embodiments, a method determines a query distinction bit (D-bit) slice for a query key using values at D-bit positions that are associated with a node in the data structure. D-bit positions are determined based on branches in the data structure. The method selects a D-bit slice for a key in the set of keys for the node based on the D-bit slice of the query key and compares a key value for the key to a query key value for the query key to determine a first D-bit position value. A D-bit position that has a second D-bit position value that is smaller in value than the first D-bit position value is selected. The D-bit position is used to determine a result for the query key.
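The general idea of distinction bits can be sketched as follows, assuming fixed-width integer keys held in sorted order in a single node. This is not the claimed method: the helper names, the 8-bit width, and the sample keys are hypothetical, and the sketch only shows how D-bit positions and D-bit slices might be derived and then used to pick a candidate key that is confirmed by a full comparison.

```python
def first_diff_bit(a, b, width):
    """Position (0 = most significant bit) of the first bit where a and b differ."""
    return width - (a ^ b).bit_length()


def d_bit_positions(sorted_keys, width):
    """Distinction bit positions: where each pair of adjacent sorted keys first differs."""
    pairs = zip(sorted_keys, sorted_keys[1:])
    return sorted({first_diff_bit(a, b, width) for a, b in pairs})


def d_bit_slice(key, positions, width):
    """Concatenate the key's bits at the given D-bit positions into one small integer."""
    bits = 0
    for p in positions:
        bits = (bits << 1) | ((key >> (width - 1 - p)) & 1)
    return bits


WIDTH = 8  # assumed key width in bits
keys = [0b00010010, 0b00010110, 0b01000001, 0b01001000]  # hypothetical node keys, sorted
positions = d_bit_positions(keys, WIDTH)
slices = [d_bit_slice(k, positions, WIDTH) for k in keys]

# A query key is reduced to its own slice, matched against the stored slices,
# and the candidate is then checked against the full key value.
query = 0b00010111
candidate = keys[slices.index(d_bit_slice(query, positions, WIDTH))]
mismatch_position = first_diff_bit(candidate, query, WIDTH)
```

Here the query shares a slice with the second key but differs from it at bit 7, which is not a D-bit position. The abstract describes using such a mismatch position to select a smaller D-bit position and resolve the result; that step is not reproduced here.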

    Compression/decompression using index correlating uncompressed/compressed content

    Publication No.: US11675768B2

    Publication date: 2023-06-13

    Application No.: US16876990

    Filing date: 2020-05-18

    CPC classification number: G06F16/2272 G06F16/2365

    Abstract: Compression of data that permits direct reconstruction of arbitrary portions of the uncompressed data. Also, the direct reconstruction of arbitrary portions of the uncompressed data. Conventional compression is done such that decompression has to begin either at the very beginning of the data, or at particular intervals (e.g., at block boundaries—every 64 kilobytes) within the data. However, the principles described herein permit decompression to begin at any point within the compressed data, without having to decompress any prior portion of the file. Thus, the principles described herein permit random access of the compressed data. In accordance with the principles described herein, this is accomplished by using an index that correlates positions within the uncompressed data with positions within the compressed data.
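The role of an index that correlates uncompressed and compressed positions can be illustrated with a deliberately coarse sketch. It compresses independent fixed-size chunks with Python's `zlib` and records one index entry per chunk, so it only gives chunk-granular random access rather than the arbitrary-point decompression the abstract describes. The function names and the chunk size are assumptions made for illustration.

```python
import bisect
import zlib


def compress_with_index(data: bytes, chunk_size: int = 1024):
    """Compress data chunk by chunk; return (blob, index).

    index holds (uncompressed_offset, compressed_offset) pairs, one per chunk.
    """
    blob = bytearray()
    index = []
    for start in range(0, len(data), chunk_size):
        index.append((start, len(blob)))
        blob += zlib.compress(data[start:start + chunk_size])
    return bytes(blob), index


def read_range(blob: bytes, index, start: int, length: int) -> bytes:
    """Return uncompressed bytes [start, start + length) without touching earlier chunks."""
    if not index:
        return b""
    # Find the last chunk whose uncompressed offset is <= start.
    first = bisect.bisect_right([u for u, _ in index], start) - 1
    end = start + length
    out = bytearray()
    i = first
    while i < len(index) and index[i][0] < end:
        c_begin = index[i][1]
        c_end = index[i + 1][1] if i + 1 < len(index) else len(blob)
        out += zlib.decompress(blob[c_begin:c_end])
        i += 1
    skip = start - index[first][0]
    return bytes(out[skip:skip + length])


data = bytes(range(256)) * 20  # 5120 bytes of sample content
blob, index = compress_with_index(data)
piece = read_range(blob, index, 1500, 700)
```

Reading 700 bytes at offset 1500 decompresses only the second and third chunks; the first chunk and everything after the third are never touched.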

    Hash based rollup with passthrough
    Granted invention patent

    Publication No.: US11675767B1

    Publication date: 2023-06-13

    Application No.: US17099467

    Filing date: 2020-11-16

    Abstract: A system includes a plurality of computing units. A first computing unit of the plurality of computing units comprises: a communication interface configured to receive an indication to roll up data in a data table; and a processor coupled to the communication interface and configured to: build a preaggregation hash table based at least in part on a set of columns and the data table by aggregating input rows of the data table; for each preaggregated hash table entry of the preaggregated hash table: provide the preaggregated hash table entry to a second computing unit of the plurality of computing units based at least in part on a distribution hash value; receive a set of received entries from computing units of the plurality of computing units; and build an aggregation hash table based at least in part on the set of received entries by aggregating the set of received entries.
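The two-phase aggregation in this abstract can be simulated in a single process as a minimal sketch. Each "computing unit" is modeled as a list of `(key, value)` rows, the aggregate is assumed to be a sum, and CRC32 of the key stands in for the distribution hash; the function names and the sample partitions are hypothetical.

```python
import zlib
from collections import defaultdict


def preaggregate(rows):
    """Build a hash table that sums values per key (used for both phases)."""
    table = defaultdict(int)
    for key, value in rows:
        table[key] += value
    return table


def distribution_hash(key, n_units):
    """Deterministically pick the unit responsible for a key (assumed: CRC32)."""
    return zlib.crc32(repr(key).encode()) % n_units


def rollup(partitions):
    """Simulate the exchange: local preaggregation, routing, final aggregation."""
    n = len(partitions)
    inboxes = [[] for _ in range(n)]
    for rows in partitions:
        # Phase 1: each unit collapses its own rows before sending anything.
        for key, partial in preaggregate(rows).items():
            inboxes[distribution_hash(key, n)].append((key, partial))
    # Phase 2: each unit aggregates the partial results it received.
    return [dict(preaggregate(inbox)) for inbox in inboxes]


# Hypothetical input: two units, rows keyed by a one-column grouping tuple.
partitions = [
    [(("us",), 1), (("eu",), 2), (("us",), 3)],
    [(("eu",), 4), (("us",), 5)],
]
per_unit = rollup(partitions)
merged = {k: v for unit in per_unit for k, v in unit.items()}
```

Because every partial result for a given key is routed to the same unit, each key's final total appears in exactly one unit's aggregation table.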

    ADMINISTRATION OF SERVICES EXECUTING IN CLOUD PLATFORM BASED DATACENTERS USING TOKEN WITH DATA STRUCTURE

    Publication No.: US20230171244A1

    Publication date: 2023-06-01

    Application No.: US17537234

    Filing date: 2021-11-29

    CPC classification number: H04L63/083 H04L9/3247 H04L67/40 G06F16/2272

    Abstract: A cloud infrastructure is configured and deployed for managing services executed on a cloud platform. The cloud infrastructure includes a control datacenter configured to communicate with one or more service datacenters. The service datacenter deploys one or more application programming interfaces (API's) associated with a service. The service datacenter also deploys an administration agent. The control datacenter hosts an engine that receives requests from users to perform administration operations by invoking the administration API's. In this manner, the control datacenter functions as a centralized control mechanism that effectively distributes administration operation requests as they are received from users to service datacenters that can service the requests. The cloud infrastructure provides an auditable, compliant and secure management system for administering services for distributed systems running in the cloud.

    LOAD BALANCING FOR DISTRIBUTED PROCESSING OF DETERMINISTICALLY ASSIGNED DATA USING STATISTICAL ANALYSIS OF BLOCK DATA

    Publication No.: US20190236474A1

    Publication date: 2019-08-01

    Application No.: US15881191

    Filing date: 2018-01-26

    CPC classification number: G06N7/005 G06F16/2272 G06N7/02

    Abstract: Dynamic generation and implementation of assignment mappings of data items in large data files to distributed processors to achieve objectives such as reduced overall processing time and the like. Any appropriate key (e.g., character string) can be identified or obtained for each data item in a data file and the file can be segmented into sequential data blocks, where each data block includes a set of data items. The data items in each of a first plurality of the blocks (e.g., sampled block set) may be initially sorted into one of a plurality of key ranges of a search space (each corresponding to a different respective processor) and analyses conducted on the data item totals in each key range. The key range boundaries can be adjusted by accounting for uncertainty in the sample estimates to more evenly distribute data items from all blocks sent to each processor and thereby achieve the objective.
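A minimal sketch of the key-range idea: boundaries are taken at the quantiles of a sample of keys so that each processor receives roughly the same number of items, and every key is then mapped to a processor deterministically. The uncertainty adjustment mentioned in the abstract is omitted, and the function names and sample keys are hypothetical.

```python
import bisect


def boundaries_from_sample(sample_keys, n_processors):
    """Pick n_processors - 1 boundary keys at the quantiles of the sampled keys."""
    ordered = sorted(sample_keys)
    return [ordered[len(ordered) * i // n_processors] for i in range(1, n_processors)]


def assign(key, boundaries):
    """Map a key to the index of the processor whose key range contains it."""
    return bisect.bisect_right(boundaries, key)


# Hypothetical keys drawn from the sampled blocks.
sample = [
    "apple", "avocado", "banana", "blueberry", "cherry", "citrus",
    "date", "dragonfruit", "elderberry", "fig", "grape", "guava",
]
boundaries = boundaries_from_sample(sample, 3)

# Count how many sampled items each of the three processors would receive.
counts = [0, 0, 0]
for key in sample:
    counts[assign(key, boundaries)] += 1
```

Because the mapping depends only on the key and the boundaries, items with the same key from any block always land on the same processor.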
