Data annotation method and system for unstructured data integrating with data catalog

    Publication No.: US12038899B1

    Publication date: 2024-07-16

    Application No.: US18213303

    Filing date: 2023-06-23

    Applicant: HITACHI, Ltd.

    Inventor: Satoru Watanabe

    IPC classification: G06F16/22 G06F16/31

    CPC classification: G06F16/2291 G06F16/313

    Abstract: A method for performing data management that includes generating, using a processor of an agent server, terms for structured data files using structured data profiling and terms for unstructured data files using unstructured data profiling, wherein the structured data files and the unstructured data files are stored in a storage; and managing a term list, wherein the term list stores terms generated by the processor, wherein the processor utilizes terms generated through structured data profiling in deriving terms generated through unstructured data profiling.

    STORAGE SYSTEM AND DATA CACHE METHOD
    Published invention application

    Publication No.: US20230297575A1

    Publication date: 2023-09-21

    Application No.: US17893570

    Filing date: 2022-08-23

    Applicant: Hitachi, Ltd.

    Abstract: A database management system identifies a required column which is required for executing the query, reads out data of the identified required column from a storage device, and executes the query based on the data of the required column. When reading out the data of the required column, the database management system preferentially reads out the data of the required column from a high-speed storage device storing the data of the required column among a memory, a second storage, and a first storage, stores, in the memory, data of the second data size unit including the data of the required column used for executing the query, and, when the data of the required column is read out from the first storage, stores the data of the second data size unit in the memory and stores the read-out data of the first data size unit in the second storage.
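
The tiered read path described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class `TieredColumnStore`, the method `read_column`, and the sample columns are hypothetical names chosen for illustration, and the sketch ignores the distinction between the first and second data size units by caching whole columns.

```python
from dataclasses import dataclass, field


@dataclass
class TieredColumnStore:
    """Hypothetical three-tier store: memory, second storage, first storage."""
    first_storage: dict                      # slowest tier: column name -> values
    second_storage: dict = field(default_factory=dict)
    memory: dict = field(default_factory=dict)

    def read_column(self, name):
        """Return (values, tier that served the read), preferring faster tiers."""
        if name in self.memory:
            return self.memory[name], "memory"
        if name in self.second_storage:
            values = self.second_storage[name]
            self.memory[name] = values       # keep the data used by the query in memory
            return values, "second_storage"
        values = self.first_storage[name]
        self.second_storage[name] = values   # cache the read-out data in second storage
        self.memory[name] = values           # and keep the query's data in memory
        return values, "first_storage"


store = TieredColumnStore(first_storage={"price": [5, 12, 7], "name": ["a", "b", "c"]})

# A query that only needs "price" never touches the "name" column.
first_read = store.read_column("price")
second_read = store.read_column("price")
print(first_read[1], second_read[1])
```

Only the required column is promoted, so an unrelated column such as `name` stays in the slowest tier until a query asks for it.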

    DATA PROCESSING DEVICE AND DATA PROCESSING METHOD

    Publication No.: US20200265052A1

    Publication date: 2020-08-20

    Application No.: US16549447

    Filing date: 2019-08-23

    Applicant: HITACHI, LTD.

    Abstract: It is possible to process large-scale data and improve the processing efficiency while suppressing the complexity of a hardware circuit. A data processing device includes a processor and an FPGA connected to the processor. The processor is configured to acquire a query plan including target identification information identifying data to be processed and a processing detail for the data to be processed, generate, based on the query plan, a plurality of FPGA commands to process a plurality of row group data items constituting the data identified by the target identification information and to be processed, and transmit the FPGA commands to the FPGA. The FPGA is configured to execute processing on the row group data items based on the transmitted FPGA commands and return results of executing the processing to the processor.

    INFORMATION PROCESSOR, INFORMATION PROCESSING SYSTEM, AND METHOD OF PROCESSING INFORMATION

    Publication No.: US20200285520A1

    Publication date: 2020-09-10

    Application No.: US16552146

    Filing date: 2019-08-27

    Applicant: HITACHI, LTD.

    Abstract: Processing performance is improved through introduction of accelerators and availability of the system is enhanced during introduction of the accelerators. A worker node includes a processor such as a CPU, an accelerator that executes accelerator processing on a command, and a software model that operates on the CPU and executes software model processing of the command. In the worker node, the CPU breaks down an accelerator operator included in a query plan into a plurality of accelerator commands, sends each of the accelerator commands to the accelerator or the software model, and switches the destination of the accelerator command from the accelerator to the software model when a switching condition for changing the processing component of the accelerator command is satisfied.
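
The command splitting and destination switching in this abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented design: the abstract does not say what the switching condition is, so the sketch assumes a simple failure count, and `Dispatcher`, `split_operator`, `make_flaky_accelerator`, and `software_model` are all hypothetical names.

```python
def split_operator(rows, chunk_size):
    """Break one operator's input into several smaller commands."""
    return [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]


class Dispatcher:
    """Sends commands to the accelerator until the (assumed) switching condition holds."""

    def __init__(self, accelerator, software_model, max_failures=1):
        self.accelerator = accelerator
        self.software_model = software_model
        self.max_failures = max_failures
        self.failures = 0

    def use_software(self):
        # Assumed switching condition: too many accelerator failures.
        return self.failures >= self.max_failures

    def run(self, command):
        if not self.use_software():
            try:
                return self.accelerator(command)
            except RuntimeError:
                self.failures += 1           # fall through to the software model
        return self.software_model(command)


def make_flaky_accelerator(fail_after):
    """Stand-in accelerator that stops working after `fail_after` calls."""
    state = {"calls": 0}

    def accelerator(chunk):
        state["calls"] += 1
        if state["calls"] > fail_after:
            raise RuntimeError("accelerator unavailable")
        return sum(chunk)

    return accelerator


def software_model(chunk):
    """CPU implementation that produces the same result as the accelerator."""
    total = 0
    for value in chunk:
        total += value
    return total


dispatcher = Dispatcher(make_flaky_accelerator(fail_after=1), software_model)
commands = split_operator(list(range(10)), chunk_size=4)
results = [dispatcher.run(command) for command in commands]
print(results)  # [6, 22, 17]
```

Because the software model computes the same result for each command, the query completes even though the accelerator fails partway through.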

    Computer system and query processing method

    Publication No.: US12056129B2

    Publication date: 2024-08-06

    Application No.: US18113154

    Filing date: 2023-02-23

    Applicant: Hitachi, Ltd.

    Abstract: The processing load for joining a plurality of tables by hash join is reduced for a computer system in which the CPU of a node creates a partial bloom filter that manages a first table hash value of a joining key of a row corresponding to a query in an assigned row of a build table. An integrated bloom filter is created from a plurality of partial bloom filters, and a second table hash value of the joining key of the row corresponding to the condition of the query among the rows of a probe table is calculated. The row of the probe table is transmitted to the node containing a row of the build table of the join hash value for that row when the integrated bloom filter includes an identical first table hash value, and an integrated joined table is created and returned to the query request source.
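
The general technique named in this abstract, a hash join that uses per-node partial Bloom filters merged into one integrated filter, can be sketched in Python on a single machine. The sketch makes several assumptions that the abstract does not state: build rows are partitioned across nodes by a hash of the join key, the filter is a 256-bit mask with three hash positions derived from SHA-256, and every function name is hypothetical.

```python
import hashlib

NUM_BITS = 256     # assumed filter size
NUM_HASHES = 3     # assumed number of hash positions per key


def _digest(value):
    return hashlib.sha256(repr(value).encode()).digest()


def _positions(value):
    d = _digest(value)
    return [int.from_bytes(d[4 * i:4 * i + 4], "big") % NUM_BITS for i in range(NUM_HASHES)]


def node_for(value, num_nodes):
    """Assumed placement rule: the node is chosen by a hash of the join key."""
    return int.from_bytes(_digest(value)[-4:], "big") % num_nodes


def build_partial_filter(rows, key):
    bits = 0
    for row in rows:
        for p in _positions(row[key]):
            bits |= 1 << p
    return bits


def might_contain(bits, value):
    return all((bits >> p) & 1 for p in _positions(value))


def distributed_hash_join(build_rows, probe_rows, key, num_nodes=2):
    # Each node holds its assigned build rows and creates a partial filter.
    partitions = [[] for _ in range(num_nodes)]
    for row in build_rows:
        partitions[node_for(row[key], num_nodes)].append(row)
    integrated = 0
    for part in partitions:
        integrated |= build_partial_filter(part, key)   # OR-merge the partial filters

    # Probe rows are transmitted only if the integrated filter may contain the key.
    routed = [[] for _ in range(num_nodes)]
    for row in probe_rows:
        if might_contain(integrated, row[key]):
            routed[node_for(row[key], num_nodes)].append(row)

    # Each node joins locally; the results are concatenated into one table.
    joined = []
    for part, probes in zip(partitions, routed):
        table = {}
        for row in part:
            table.setdefault(row[key], []).append(row)
        for probe in probes:
            for match in table.get(probe[key], []):
                joined.append({**match, **probe})
    return joined


build = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
probe = [{"id": 2, "qty": 5}, {"id": 3, "qty": 9}]
joined = distributed_hash_join(build, probe, "id")
print(joined)
```

A Bloom filter can report false positives, so a probe row that passes the filter is still checked against the local hash table; the filter only reduces how many probe rows are transmitted.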

    Computing system and server
    Granted invention patent

    Publication No.: US10789253B2

    Publication date: 2020-09-29

    Application No.: US16081023

    Filing date: 2016-04-27

    Applicant: Hitachi, Ltd.

    IPC classification: G06F16/245 G06F3/06 G06F9/38

    Abstract: A storage device, connected to a computer including a processor and first memory, and executing a program, stores data processed under the program. The computer includes a protocol processing unit that accesses data in the storage device, an accelerator that includes an arithmetic unit executing a part of a process of the program, and a second memory storing data, and executes the part of the process. The first memory receives a processing request for processing data, and causes the accelerator to execute a command to process data, corresponding to the processing request for the processing request including a process to be executed by the arithmetic unit. The accelerator requests the protocol processing unit to provide target data indicated by a command received from the program, reads data from the storage device via the protocol processing unit, and stores the data in the second memory. The arithmetic unit executes the command.

    Storage device, computer system, and control method for storage device

    Publication No.: US10803035B2

    Publication date: 2020-10-13

    Application No.: US15508019

    Filing date: 2015-03-27

    Applicant: Hitachi, Ltd.

    Abstract: A storage device for storing a column store database, the storage device comprising: a column read unit which reads page data to be searched that have been read from the column store database, acquires a leading row number included in the page data, and reads each column of data in the page data, sequentially from the leading row number to the last row in the column of data; a data search unit which compares each row in each read column of data with first search criteria, from the first row to the last row, and outputs a comparison result; and a search result aggregation unit which, when a comparison result for a range of columns specified by a search request has been output, compares each row in the comparison result with second search criteria, and determines one or more rows in the comparison result that satisfy the second search criteria.
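
The two-stage search in this abstract (a per-column comparison followed by a per-row aggregation) can be sketched in Python. The page layout, the function `search_page`, and the interpretation of the second search criteria as a Boolean combination of the per-column results are assumptions made for illustration; the abstract does not specify them.

```python
def search_page(page, first_criteria, second_criteria):
    """Return absolute row numbers in one page that satisfy both stages."""
    start = page["leading_row"]              # row number of the first row in the page
    columns = page["columns"]
    num_rows = len(next(iter(columns.values())))

    # Stage 1: compare every row of each requested column with its own criterion.
    comparison = {
        name: [check(value) for value in columns[name]]
        for name, check in first_criteria.items()
    }

    # Stage 2: once all requested columns are compared, evaluate each row's
    # per-column results against the second criteria.
    hits = []
    for i in range(num_rows):
        row_result = {name: flags[i] for name, flags in comparison.items()}
        if second_criteria(row_result):
            hits.append(start + i)
    return hits


page = {
    "leading_row": 100,
    "columns": {
        "age": [25, 40, 31],
        "city": ["Tokyo", "Osaka", "Tokyo"],
    },
}
first_criteria = {
    "age": lambda v: v >= 30,
    "city": lambda v: v == "Tokyo",
}
hits = search_page(page, first_criteria, lambda r: r["age"] and r["city"])
print(hits)  # [102]
```

Scanning column by column keeps each comparison over contiguous values of one type, and the leading row number lets the page-local index be converted back to a table-wide row number.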

    Computer system and computer system control method

    Publication No.: US10353768B2

    Publication date: 2019-07-16

    Application No.: US15571050

    Filing date: 2015-06-29

    Applicant: Hitachi, Ltd.

    Abstract: A computer including a processor and a memory and a storage device that is connected to the computer and stores data has an FPGA that acquires data and an operation command from a control unit that controls reading and writing with respect to a non-volatile semiconductor storage unit to perform a data operation. The computer generates and transmits the operation command from an access request that has been received to the storage device. The computer receives execution results for the operation command from the storage device, and when the number of execution results for the operation command reaches a prescribed value, instructs the FPGA to detect a soft error, receives all execution results with respect to the generated operation command, and if there is no soft error, transmits the execution results.