Storing and querying multidimensional data using first and second indicies

    Publication number: US10318512B2

    Publication date: 2019-06-11

    Application number: US14747318

    Filing date: 2015-06-23

    Abstract: The present disclosure relates to methods and systems for storing and querying data. According to the embodiments of the present invention, two-layer indexes are created for multi-dimension data, wherein the primary index is created based on two or more dimensions to retrieve respective data units of the data, while the secondary index is created based on specific dimensions to retrieve respective data blocks in the data unit. Correspondingly, when receiving a multi-dimension query request for data, the primary retrieval first determines a data unit including the target data based on a primary index, and then the secondary retrieval quickly locates a data block including the target data based on the secondary index. In this way, the multi-dimension retrieval can be efficiently performed. Moreover, by appropriately setting the size of a smallest data block, the I/O efficiency of data access will be significantly enhanced.
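
As an illustration of the two-layer retrieval described in this abstract, here is a minimal sketch, not the patented implementation. It assumes records of the form (sensor_id, timestamp, value), a primary index keyed on two dimensions (sensor and time bucket), and a secondary index holding the first timestamp of each data block. `BLOCK_SIZE`, `TIME_BUCKET`, `build_indexes`, and `query` are hypothetical names chosen for the example.

```python
from bisect import bisect_right
from collections import defaultdict

BLOCK_SIZE = 4      # records per data block (hypothetical; would be tuned to I/O size)
TIME_BUCKET = 100   # width of a primary-index time bucket (hypothetical)


def build_indexes(records):
    """Build both index layers from (sensor_id, timestamp, value) tuples."""
    # Group records into data units keyed on two dimensions.
    units = defaultdict(list)
    for sensor_id, ts, value in records:
        units[(sensor_id, ts // TIME_BUCKET)].append((ts, value))

    primary = {}
    for key, rows in units.items():
        rows.sort()
        # Split each data unit into fixed-size data blocks.
        blocks = [rows[i:i + BLOCK_SIZE] for i in range(0, len(rows), BLOCK_SIZE)]
        # Secondary index: first timestamp of each block, ascending.
        secondary = [block[0][0] for block in blocks]
        primary[key] = (secondary, blocks)
    return primary


def query(primary, sensor_id, ts):
    """Return the value stored for (sensor_id, ts), or None if absent."""
    unit = primary.get((sensor_id, ts // TIME_BUCKET))  # primary retrieval
    if unit is None:
        return None
    secondary, blocks = unit
    pos = bisect_right(secondary, ts) - 1               # secondary retrieval
    if pos < 0:
        return None
    for block_ts, value in blocks[pos]:                 # scan a single block
        if block_ts == ts:
            return value
    return None
```

Only one block is scanned per lookup, which is why the block size matters for I/O cost in a disk-backed variant of this idea.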

    Searching in a database
    4.
    Granted patent

    Publication number: US10042888B2

    Publication date: 2018-08-07

    Application number: US14931718

    Filing date: 2015-11-03

    Abstract: A computer-implemented method for searching in a database is provided according to one embodiment. The method includes, in response to receiving a search request to search in a database, extracting from the search request a condition for searching in the database. The method further includes selecting a search algorithm matching the condition from a plurality of search algorithms registered to the database, based on historical statistic data of historical search conducted on the database. Moreover, the method includes obtaining a search result from the database by using the search algorithm. The database is a time series database.
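
The selection step in this abstract can be sketched roughly as follows. This is an assumption-laden illustration: the condition labels ("narrow_range", "wide_range"), the use of mean latency as the historical statistic, and the names `AlgorithmRegistry`, `extract_condition`, and `search` are all hypothetical and do not come from the patent.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def linear_scan(series, start, end):
    return [(t, v) for t, v in series if start <= t <= end]


def binary_search(series, start, end):
    # Assumes the series is sorted by timestamp.
    times = [t for t, _ in series]
    return series[bisect_left(times, start):bisect_right(times, end)]


class AlgorithmRegistry:
    def __init__(self):
        self.algorithms = {}
        # (condition, algorithm name) -> latencies observed in past searches
        self.history = defaultdict(list)

    def register(self, name, fn):
        self.algorithms[name] = fn

    def record(self, condition, name, latency):
        self.history[(condition, name)].append(latency)

    def select(self, condition):
        """Pick the registered algorithm with the lowest mean historical latency."""
        def mean_latency(name):
            samples = self.history.get((condition, name))
            return sum(samples) / len(samples) if samples else float("inf")
        return min(self.algorithms, key=mean_latency)


def extract_condition(request):
    # Hypothetical condition: classify the request by the width of its time range.
    span = request["end"] - request["start"]
    return "narrow_range" if span <= 10 else "wide_range"


def search(registry, series, request):
    condition = extract_condition(request)
    name = registry.select(condition)
    rows = registry.algorithms[name](series, request["start"], request["end"])
    return name, rows
```

A real system would presumably also feed the measured latency of each search back through `record`, so that the statistics keep tracking the workload.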

    DATA CUBE GENERATION
    5.
    Patent application
    Status: in force

    Publication number: US20170046370A1

    Publication date: 2017-02-16

    Application number: US14825132

    Filing date: 2015-08-12

    CPC classification number: G06F17/30333 G06F17/30572

    Abstract: Disclosed are a computer-implemented method for generating a data cube from data, a system and a computer program product. The method comprises selecting a candidate granularity from a plurality of candidate granularities determined for a dimension of the data cube, where a data distribution obtained in the selected candidate granularity satisfies a predetermined condition; and generating the data cube based on the selected candidate granularity for the dimension.
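
To make the granularity selection concrete, the following sketch picks a bucket width for a time dimension and then aggregates a small cube. The abstract does not say what the "predetermined condition" is; the one used here (the finest granularity whose average record count per non-empty cell reaches a threshold) is an assumption, as are the names `select_granularity` and `build_cube`.

```python
from collections import Counter, defaultdict


def distribution(values, width):
    """Count how many values fall into each bucket of the given width."""
    return Counter(v // width for v in values)


def select_granularity(values, candidates, min_avg_per_cell):
    """Return the finest candidate width whose distribution meets the condition."""
    for width in sorted(candidates):  # finest granularity first
        cells = distribution(values, width)
        if cells and len(values) / len(cells) >= min_avg_per_cell:
            return width
    return max(candidates)  # fall back to the coarsest candidate


def build_cube(records, width):
    """Aggregate (timestamp, region, amount) records into (bucket, region) cells."""
    cube = defaultdict(float)
    for ts, region, amount in records:
        cube[(ts // width, region)] += amount
    return dict(cube)
```

With 120 evenly spaced records and candidates of 1, 10, and 60, a threshold of 5 records per cell rejects width 1 and accepts width 10.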

    Detecting an abnormal subsequence in a data sequence
    6.
    Granted patent
    Status: in force

    Publication number: US09552243B2

    Publication date: 2017-01-24

    Application number: US14598843

    Filing date: 2015-01-16

    CPC classification number: G06F11/0751 G05B23/0232

    Abstract: A method for detecting abnormal subsequences in data sequence includes constructing a hierarchical data structure of a target subsequence, each node in a bottommost layer of the data structure storing corresponding data of the target subsequence, and each node in a layer above the bottommost layer storing values based on data stored in corresponding nodes in a lower layer next to the layer above the bottommost layer; determining a second number of neighbors of the target subsequence based on the data structure of the target subsequence and of the first number of reference subsequences constructed in advance, the second number of neighbors having minimum Euclidean distances from the target subsequence; determining a third number of neighbors of each reference subsequence in the second number of reference subsequences, which have minimum Euclidean distances from each reference subsequence and determining whether the target subsequence is an abnormal subsequence.
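
The steps in this abstract can be sketched as below. Several details are assumptions, since the abstract leaves them open: upper-layer nodes store the mean of two child nodes, the coarse layer is used only to prune exact distance computations, the same `k` is used for the target's neighbors and for each neighbor's neighbors, and the final decision compares mean neighbor distances against a hypothetical `ratio` threshold.

```python
import heapq
import math


def build_pyramid(seq):
    """Bottom layer is the raw data; each upper node is the mean of two children.

    Assumes len(seq) is a power of two.
    """
    layers = [list(seq)]
    while len(layers[-1]) > 1:
        prev = layers[-1]
        layers.append([(prev[i] + prev[i + 1]) / 2 for i in range(0, len(prev), 2)])
    return layers


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def lower_bound(pa, pb, level):
    # Each node at `level` summarizes 2**level raw points, so this value never
    # exceeds the true Euclidean distance between the bottom layers.
    return math.sqrt(2 ** level) * euclidean(pa[level], pb[level])


def k_nearest(target, references, k, level=1, skip=None):
    """Return the k smallest (distance, index) pairs, pruning with a coarse layer."""
    best = []  # max-heap of the current k best, stored as negated distances
    for idx, ref in enumerate(references):
        if idx == skip:
            continue
        if len(best) == k and lower_bound(target, ref, level) >= -best[0][0]:
            continue  # cannot beat the current k-th neighbor
        d = euclidean(target[0], ref[0])
        if len(best) < k:
            heapq.heappush(best, (-d, idx))
        elif d < -best[0][0]:
            heapq.heapreplace(best, (-d, idx))
    return sorted((-nd, idx) for nd, idx in best)


def is_abnormal(target_seq, reference_seqs, k=2, ratio=2.0):
    refs = [build_pyramid(r) for r in reference_seqs]  # constructed in advance
    target = build_pyramid(target_seq)
    neighbors = k_nearest(target, refs, k)
    target_score = sum(d for d, _ in neighbors) / k
    neighbor_scores = []
    for _, idx in neighbors:
        nn = k_nearest(refs[idx], refs, k, skip=idx)
        neighbor_scores.append(sum(d for d, _ in nn) / k)
    baseline = sum(neighbor_scores) / len(neighbor_scores)
    # Flag the target when it sits much farther from its neighbors than they
    # sit from their own neighbors.
    return target_score > ratio * max(baseline, 1e-12)
```

The pruning test is safe because the coarse-layer value is a lower bound on the exact distance, so a skipped reference could not have entered the top k.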

    SEARCHING IN A DATABASE
    7.
    Patent application
    Status: under examination (published)

    Publication number: US20160154852A1

    Publication date: 2016-06-02

    Application number: US14931718

    Filing date: 2015-11-03

    Abstract: A computer-implemented method for searching in a database is provided according to one embodiment. The method includes, in response to receiving a search request to search in a database, extracting from the search request a condition for searching in the database. The method further includes selecting a search algorithm matching the condition from a plurality of search algorithms registered to the database, based on historical statistic data of historical search conducted on the database. Moreover, the method includes obtaining a search result from the database by using the search algorithm. The database is a time series database.

    DATA PROCESSING DEVICE AND METHOD
    8.
    Patent application
    Status: under examination (published)

    Publication number: US20160124932A1

    Publication date: 2016-05-05

    Application number: US14918786

    Filing date: 2015-10-21

    CPC classification number: G06F17/246 G06F16/2423 G06F16/2452 G06F16/248

    Abstract: Data processing device and method. The device includes: a spreadsheet of data displaying row for displaying a part of data retrieved from a database and a hyper row for expressing the remaining data; a data processor configured to calculate the value of the formula based on the data retrieved from the database. According to the device and method of the present invention, it is possible to eliminate overhead for loading data from the database to the spreadsheet when there are massive data records, continuously update the resulting data, and minimize users' development and migration cost.
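
One way to picture the arrangement in this abstract is a sheet that loads only a few rows, shows a single placeholder ("hyper") row for everything else, and evaluates formulas inside the database. The sketch below uses Python's standard `sqlite3` module with an in-memory table; the table name `sales`, the constant `VISIBLE_ROWS`, and the functions `render_sheet` and `evaluate_formula` are hypothetical and are not taken from the patent.

```python
import sqlite3

VISIBLE_ROWS = 3  # hypothetical number of rows loaded into the sheet


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL)")
    conn.executemany(
        "INSERT INTO sales (amount) VALUES (?)",
        [(float(i),) for i in range(1, 1001)],
    )
    return conn


def render_sheet(conn):
    """Load only the visible rows; one hyper row stands in for the rest."""
    rows = conn.execute(
        "SELECT id, amount FROM sales ORDER BY id LIMIT ?", (VISIBLE_ROWS,)
    ).fetchall()
    total = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
    sheet = [list(r) for r in rows]
    hidden = total - len(rows)
    if hidden > 0:
        sheet.append([f"... {hidden} more rows ..."])  # the hyper row
    return sheet


def evaluate_formula(conn, func, column):
    """Evaluate an aggregate formula in the database, not in the sheet."""
    if func not in {"SUM", "AVG", "MIN", "MAX", "COUNT"}:
        raise ValueError(f"unsupported function: {func}")
    if column not in {"id", "amount"}:
        raise ValueError(f"unknown column: {column}")
    return conn.execute(f"SELECT {func}({column}) FROM sales").fetchone()[0]
```

Because the formula runs against the database, re-evaluating it after new records arrive reflects them without reloading any rows into the sheet.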

    DETECTING AN ABNORMAL SUBSEQUENCE IN A DATA SEQUENCE
    9.
    Patent application
    Status: in force

    Publication number: US20150286516A1

    Publication date: 2015-10-08

    Application number: US14741819

    Filing date: 2015-06-17

    CPC classification number: G06F11/0751 G05B23/0232

    Abstract: A method for detecting abnormal subsequences in data sequence includes constructing a hierarchical data structure of a target subsequence, each node in a bottommost layer of the data structure storing corresponding data of the target subsequence, and each node in a layer above the bottommost layer storing values based on data stored in corresponding nodes in a lower layer next to the layer above the bottommost layer; determining a second number of neighbors of the target subsequence based on the data structure of the target subsequence and of the first number of reference subsequences constructed in advance, the second number of neighbors having minimum Euclidean distances from the target subsequence; determining a third number of neighbors of each reference subsequence in the second number of reference subsequences, which have minimum Euclidean distances from each reference subsequence and determining whether the target subsequence is an abnormal subsequence.

    METHOD AND APPARATUS FOR MANAGING TIME SERIES DATABASE
    10.
    Patent application
    Status: under examination (published)

    Publication number: US20150095381A1

    Publication date: 2015-04-02

    Application number: US14492423

    Filing date: 2014-09-22

    Abstract: A method for managing a time series database, includes: monitoring multiple operations that access the time series database, so as to identify types of the multiple operations, the types of the multiple operations comprising at least one of the query types or insert types; with respect to a storage mode among multiple storage modes, obtaining costs that the multiple operations access the time series database based on the types, respectively; selecting a storage mode with the minimum cost from the multiple storage modes; and during a predetermined time period, storing into the time series database data values that are collected from multiple measurement points according to the selected storage mode. In one embodiment, there is provided an apparatus for managing a time series database. By means of the method and apparatus of the present invention, the storage and query efficiency with respect to the time series database can be increased.
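
The cost-based choice described in this abstract might look like the sketch below. The two storage modes, the three operation types, and every number in `COST_TABLE` are invented for illustration; the patent abstract does not name specific modes or costs.

```python
from collections import Counter

# Hypothetical per-operation costs for each storage mode (arbitrary units).
COST_TABLE = {
    "by_timestamp": {
        "insert": 1.0,
        "query_point_history": 8.0,
        "query_snapshot": 1.0,
    },
    "by_measurement_point": {
        "insert": 3.0,
        "query_point_history": 1.0,
        "query_snapshot": 6.0,
    },
}


def classify(operation):
    """Map a monitored operation to one of the operation types above."""
    if operation["kind"] == "insert":
        return "insert"
    # A query naming one measurement point reads that point's history;
    # otherwise it reads all points at one time (a snapshot).
    return "query_point_history" if "point" in operation else "query_snapshot"


def total_cost(mode, type_counts):
    return sum(COST_TABLE[mode][op_type] * n for op_type, n in type_counts.items())


def select_storage_mode(operations):
    """Return the storage mode with the minimum total cost for the workload."""
    type_counts = Counter(classify(op) for op in operations)
    return min(COST_TABLE, key=lambda mode: total_cost(mode, type_counts))
```

In this toy cost model, an insert-heavy workload favors storing by timestamp, while a workload dominated by per-point history queries favors storing by measurement point.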
