Data processing system, computing node, and data processing method

    Publication No.: US10567494B2

    Publication date: 2020-02-18

    Application No.: US15667634

    Filing date: 2017-08-03

    Abstract: A data processing system, a computing node, and a data processing method are provided. The data processing system includes a management node and a first class of computing nodes. The management node is configured to allocate first processing tasks to the first class of computing nodes. At least two computing nodes in the first class of computing nodes concurrently perform the first processing tasks allocated by the management node. A computing node performs a combine2 operation and a reduce2 operation on a data block Mx and a data block V1x, to obtain a first intermediate result. Then, the management node obtains a processing result for a to-be-processed dataset according to first intermediate results obtained by the first class of computing nodes. According to the data processing system, when a combine operation and a reduce operation are being performed on data blocks, memory space occupied by computation can be reduced.
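
The abstract does not define combine2 and reduce2. As a minimal sketch, assume that combine2 pairs one element of Mx with the matching element of V1x and that reduce2 folds each combined value into a running result as soon as it is produced. Under that assumption no full list of combined values is stored, which is one way the memory saving described above could arise. The function names and the multiply/add defaults are illustrative, not taken from the patent.

```python
from typing import Callable, List, Sequence


def combine_reduce_row(
    row: Sequence[float],
    vec: Sequence[float],
    combine2: Callable[[float, float], float],
    reduce2: Callable[[float, float], float],
    initial: float,
) -> float:
    """Alternate combine2 and reduce2 so only one accumulator is kept per row."""
    acc = initial
    for m, v in zip(row, vec):
        # Each combined value is folded in immediately and never stored.
        acc = reduce2(acc, combine2(m, v))
    return acc


def process_block(
    Mx: Sequence[Sequence[float]],
    V1x: Sequence[float],
    combine2: Callable[[float, float], float] = lambda a, b: a * b,  # assumed
    reduce2: Callable[[float, float], float] = lambda a, b: a + b,   # assumed
    initial: float = 0,
) -> List[float]:
    """Compute one node's first intermediate result for block Mx and block V1x."""
    return [combine_reduce_row(row, V1x, combine2, reduce2, initial) for row in Mx]


if __name__ == "__main__":
    Mx = [[1, 2], [3, 4]]
    V1x = [10, 20]
    print(process_block(Mx, V1x))  # [50, 110]
```

With multiply and add as the two operations this reduces to an ordinary block matrix-vector product. Passing `min` as `reduce2` and addition as `combine2` (with `initial=float("inf")`) would give a shortest-path style update instead.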

    Mapreduce job resource sizing using assessment models

    Publication No.: US11003507B2

    Publication date: 2021-05-11

    Application No.: US16369564

    Filing date: 2019-03-29

    Abstract: The present disclosure relates to computing resource allocation methods, devices, and systems. One example system includes a management node and a target computing node. The management node is configured to obtain M computing tasks and establish a resource assessment model, and send one or more computing tasks of the M computing tasks and information about the resource assessment model to the target computing node. The target computing node is configured to receive the one or more computing tasks and the information about the resource assessment model, substitute input data of a particular computing stage of a target task into the resource assessment model to compute a resource size required for the particular computing stage, and compute the input data by using a computing resource that is of the resource size and that is in a preset resource pool.
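
The flow on the target computing node can be sketched as follows. The abstract does not state the form of the resource assessment model, so this example assumes a simple linear model of memory versus input size. `LinearAssessmentModel`, `ResourcePool`, and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearAssessmentModel:
    """Assumed model form: required memory grows linearly with input size."""
    base_mb: float
    mb_per_input_mb: float

    def required_mb(self, input_mb: float) -> float:
        # "Substitute input data into the model" to get a resource size.
        return self.base_mb + self.mb_per_input_mb * input_mb


class ResourcePool:
    """Toy stand-in for the preset resource pool on the computing node."""

    def __init__(self, capacity_mb: float) -> None:
        self.free_mb = capacity_mb

    def acquire(self, size_mb: float) -> float:
        if size_mb > self.free_mb:
            raise RuntimeError("resource pool exhausted")
        self.free_mb -= size_mb
        return size_mb

    def release(self, size_mb: float) -> None:
        self.free_mb += size_mb


def run_stage(model: LinearAssessmentModel, pool: ResourcePool, input_mb: float) -> float:
    """Size one computing stage, reserve that much from the pool, then release it."""
    size = model.required_mb(input_mb)
    granted = pool.acquire(size)
    try:
        # The stage's real computation would run here with `granted` MB.
        return granted
    finally:
        pool.release(granted)


if __name__ == "__main__":
    model = LinearAssessmentModel(base_mb=128, mb_per_input_mb=1.5)
    pool = ResourcePool(capacity_mb=1024)
    print(run_stage(model, pool, input_mb=200))  # 428.0
```

Sizing each stage separately, rather than the whole task once, lets a stage with small input hold a small reservation even if a later stage of the same task needs much more.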

    Resource allocation method and apparatus for gene analysis

    Publication No.: US10853135B2

    Publication date: 2020-12-01

    Application No.: US16153099

    Filing date: 2018-10-05

    Abstract: The present disclosure relates to a resource allocation method for gene analysis. In one example method, a parameter value that is used for resource allocation is obtained for a target chromosome region according to a sequenced read in that region. A computing resource is then allocated, according to that parameter value, to an operation in a cleansing and variant calling task that is part of the gene analysis and that is performed on the sequenced read in the target chromosome region.

    Graph data query method and apparatus

    Publication No.: US10068033B2

    Publication date: 2018-09-04

    Application No.: US15196802

    Filing date: 2016-06-29

    Abstract: A graph data query method and apparatus are disclosed, where the method includes: acquiring a partition number and a layer number of a query vertex; determining, based on the partition number and the layer number of the query vertex, a partition number and a layer number of a candidate vertex indicated by a query condition, and using the partition number and the layer number of the candidate vertex respectively as a candidate partition number and a candidate layer number; forming a candidate set using a vertex whose partition number and layer number satisfy any group of a candidate partition number and a candidate layer number; and performing graph data query in the candidate set according to the query condition.
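
A minimal sketch of the candidate-set step follows. The abstract does not say how candidate partition and layer numbers are derived from the query condition, so this example assumes a hypothetical "within max_hops layers, same partition" rule. The `Vertex` type and every function name are illustrative.

```python
from typing import Callable, List, NamedTuple, Set, Tuple


class Vertex(NamedTuple):
    vid: str
    partition: int
    layer: int


def candidate_pairs(query_vertex: Vertex, max_hops: int) -> Set[Tuple[int, int]]:
    """Assumed rule: same partition, layer within max_hops of the query vertex."""
    lowest = max(0, query_vertex.layer - max_hops)
    highest = query_vertex.layer + max_hops
    return {(query_vertex.partition, layer) for layer in range(lowest, highest + 1)}


def build_candidate_set(vertices: List[Vertex], pairs: Set[Tuple[int, int]]) -> List[Vertex]:
    """Keep vertices whose (partition, layer) matches any candidate pair."""
    return [v for v in vertices if (v.partition, v.layer) in pairs]


def run_query(
    vertices: List[Vertex],
    query_vertex: Vertex,
    max_hops: int,
    condition: Callable[[Vertex], bool],
) -> List[Vertex]:
    """Evaluate the query condition only inside the candidate set."""
    pairs = candidate_pairs(query_vertex, max_hops)
    return [v for v in build_candidate_set(vertices, pairs) if condition(v)]


if __name__ == "__main__":
    a = Vertex("a", partition=0, layer=2)
    graph = [a, Vertex("b", 0, 3), Vertex("c", 1, 3), Vertex("d", 0, 5)]
    print(run_query(graph, a, max_hops=1, condition=lambda v: v.vid != "a"))
```

The point of the two-number filter is that vertices "c" (wrong partition) and "d" (layer too far away) are discarded by a cheap set lookup before the query condition is ever evaluated on them.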

    GRAPH DATA QUERY METHOD AND APPARATUS
    Patent application (status: under examination, published)

    Publication No.: US20160306897A1

    Publication date: 2016-10-20

    Application No.: US15196802

    Filing date: 2016-06-29

    CPC classification number: G06F17/30958 G06F17/30486

    Abstract: A graph data query method and apparatus are disclosed, where the method includes: acquiring a partition number and a layer number of a query vertex; determining, based on the partition number and the layer number of the query vertex, a partition number and a layer number of a candidate vertex indicated by a query condition, and using the partition number and the layer number of the candidate vertex respectively as a candidate partition number and a candidate layer number; forming a candidate set using a vertex whose partition number and layer number satisfy any group of a candidate partition number and a candidate layer number; and performing graph data query in the candidate set according to the query condition.

    Data Processing Method and Apparatus
    Patent application

    Publication No.: US20190156917A1

    Publication date: 2019-05-23

    Application No.: US16251920

    Filing date: 2019-01-18

    Abstract: A data processing method includes traversing all sample fragments in a first sample set and collecting statistics about a first statistic of each basic element in a reference sample and included in the sample fragments, determining that a position of a basic element in the reference sample whose first statistic is less than a first threshold is a spacing position, dividing the reference sample into at least two reference sub-samples, traversing all the sample fragments in the first sample set and collecting statistics about a second statistic of each reference sub-sample of the reference sample and including the sample fragments, and combining adjacent reference sub-samples when a sum of second statistics of the adjacent reference sub-samples is less than a second threshold.
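
The steps in this abstract can be sketched under some loudly stated assumptions: the reference sample is a sequence of positions, each sample fragment is a half-open `(start, end)` interval on it, the first statistic is per-position coverage depth, and the second statistic is the number of fragments starting in a sub-sample. Spacing positions are dropped from the sub-samples here. None of these choices is confirmed by the abstract, and all names are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end)


def coverage(ref_len: int, fragments: List[Interval]) -> List[int]:
    """First statistic (assumed): how many fragments cover each position."""
    depth = [0] * ref_len
    for start, end in fragments:
        for i in range(start, end):
            depth[i] += 1
    return depth


def split_at_spacing(depth: List[int], first_threshold: int) -> List[Interval]:
    """Positions with depth below the threshold are spacing positions that cut the reference."""
    regions, start = [], None
    for i, d in enumerate(depth):
        if d >= first_threshold:
            if start is None:
                start = i
        elif start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        regions.append((start, len(depth)))
    return regions


def fragment_counts(regions: List[Interval], fragments: List[Interval]) -> List[int]:
    """Second statistic (assumed): fragments whose start lies in each sub-sample."""
    return [sum(1 for s, _ in fragments if lo <= s < hi) for lo, hi in regions]


def merge_small(regions: List[Interval], counts: List[int], second_threshold: int):
    """Combine adjacent sub-samples while their summed statistic stays below the threshold."""
    merged: List[Tuple[Interval, int]] = []
    for region, count in zip(regions, counts):
        if merged and merged[-1][1] + count < second_threshold:
            (lo, _), prev = merged[-1]
            merged[-1] = ((lo, region[1]), prev + count)
        else:
            merged.append((region, count))
    return merged


if __name__ == "__main__":
    frags = [(0, 4), (1, 5), (8, 11), (9, 12), (15, 19), (15, 20), (16, 20)]
    regions = split_at_spacing(coverage(20, frags), first_threshold=1)
    print(regions)  # [(0, 5), (8, 12), (15, 20)]
    print(merge_small(regions, fragment_counts(regions, frags), second_threshold=5))
    # [((0, 12), 4), ((15, 20), 3)]
```

Merging keeps the number of sub-samples from growing needlessly: the two lightly covered regions are processed as one unit, while the heavily covered region stays separate.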

    Data Processing Method and Apparatus, and Computing Node

    Publication No.: US20190156916A1

    Publication date: 2019-05-23

    Application No.: US16251829

    Filing date: 2019-01-18

    Abstract: A data processing method includes distributing, by a computing node, a pasting back result sequence corresponding to a to-be-pasted-back deoxyribonucleic acid (DNA) read string to a pasting back result sequence set corresponding to a target chromosome region, when the quantity of the pasting back result sequences included in the pasting back result sequence set is greater than or equal to the pre-determined quantity threshold, dividing the pasting back result sequence set into k pasting back result sequence subsets according to a preset division rule, and dividing the target chromosome region into k chromosome subregions in a one-to-one correspondence to the k pasting back result sequence subsets, and further dividing a gene analysis task of the pasting back result sequence set into k gene analysis subtasks, and executing in parallel the k gene analysis subtasks.
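
The split-and-parallelize step can be sketched as below. The abstract leaves the "preset division rule" unspecified, so this example assumes the pasting back results are reduced to integer positions and are divided into k contiguous, near-equal subsets after sorting. `analyze` is a placeholder for the real gene analysis subtask, and all names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def split_region(positions: List[int], k: int) -> Tuple[List[List[int]], List[Tuple[int, int]]]:
    """Divide results into k subsets and derive one chromosome subregion per subset."""
    if k < 1 or len(positions) < k:
        raise ValueError("need at least k results to form k non-empty subsets")
    ordered = sorted(positions)
    base, extra = divmod(len(ordered), k)
    subsets, start = [], 0
    for i in range(k):
        end = start + base + (1 if i < extra else 0)
        subsets.append(ordered[start:end])
        start = end
    # One half-open subregion per subset, in one-to-one correspondence.
    subregions = [(s[0], s[-1] + 1) for s in subsets]
    return subsets, subregions


def analyze(subset: List[int]) -> int:
    """Placeholder subtask: the real analysis would run on this subset."""
    return len(subset)


def run_region(positions: List[int], quantity_threshold: int, k: int) -> List[int]:
    """Split only when the result set has reached the quantity threshold."""
    if len(positions) < quantity_threshold:
        return [analyze(sorted(positions))]
    subsets, _ = split_region(positions, k)
    with ThreadPoolExecutor(max_workers=k) as pool:
        # map() preserves subset order in the returned results.
        return list(pool.map(analyze, subsets))


if __name__ == "__main__":
    hits = [5, 1, 9, 3, 7, 11, 13]
    print(split_region(hits, 3)[1])                      # [(1, 6), (7, 10), (11, 14)]
    print(run_region(hits, quantity_threshold=6, k=3))   # [3, 2, 2]
```

Threads are used here only to keep the sketch self-contained. CPU-bound analysis in Python would more plausibly use processes or separate computing nodes.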
