-
Publication No.: US10318512B2
Publication date: 2019-06-11
Application No.: US14747318
Filing date: 2015-06-23
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Xiao Yan Chen , Yao Liang Chen , Sheng Huang , Kai Liu , Wei Lu , Xiao Min Xu
IPC: G06F16/00 , G06F16/22 , G06F16/50 , G06F16/2455 , G06F16/51
Abstract: The present disclosure relates to methods and systems for storing and querying data. According to the embodiments of the present invention, two-layer indexes are created for multi-dimension data, wherein the primary index is created based on two or more dimensions to retrieve respective data units of the data, while the secondary index is created based on specific dimensions to retrieve respective data blocks in the data unit. Correspondingly, when receiving a multi-dimension query request for data, the primary retrieval first determines a data unit including the target data based on a primary index, and then the secondary retrieval quickly locates a data block including the target data based on the secondary index. In this way, the multi-dimension retrieval can be efficiently performed. Moreover, by appropriately setting the size of a smallest data block, the I/O efficiency of data access will be significantly enhanced.
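A minimal sketch of the two-layer lookup described in this abstract, not the patented implementation. It assumes a primary index keyed on two dimensions (a hypothetical `sensor` id and a coarse time bucket) and a secondary index on the `t` dimension that locates a block inside the data unit. The names `DataUnit`, `BLOCK_SIZE`, `build_primary_index`, and `query` are illustrative only.

```python
from bisect import bisect_right
from collections import defaultdict

BLOCK_SIZE = 4  # hypothetical smallest block size, in records per block


class DataUnit:
    """Records that share one primary key, split into time-sorted blocks."""

    def __init__(self, records):
        records = sorted(records, key=lambda r: r["t"])
        self.blocks = [records[i:i + BLOCK_SIZE]
                       for i in range(0, len(records), BLOCK_SIZE)]
        # Secondary index: the first timestamp stored in each block.
        self.block_starts = [block[0]["t"] for block in self.blocks]

    def find_block(self, t):
        """Secondary retrieval: locate the block whose range may contain t."""
        i = bisect_right(self.block_starts, t) - 1
        return self.blocks[i] if i >= 0 else None


def build_primary_index(records, bucket):
    """Primary index: (sensor, time bucket) -> data unit."""
    groups = defaultdict(list)
    for r in records:
        groups[(r["sensor"], r["t"] // bucket)].append(r)
    return {key: DataUnit(rs) for key, rs in groups.items()}


def query(index, sensor, t, bucket):
    """Primary retrieval picks the unit, secondary retrieval picks the block."""
    unit = index.get((sensor, t // bucket))
    if unit is None:
        return None
    block = unit.find_block(t)
    if block is None:
        return None
    return next((r for r in block if r["t"] == t), None)
```

Only one block is scanned per point query, so the block size chosen here bounds how much data each lookup has to read.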
-
Publication No.: US09483533B2
Publication date: 2016-11-01
Application No.: US13955473
Filing date: 2013-07-31
Applicant: International Business Machines Corporation
Inventor: Xiao Yan Chen , Sheng Huang , Kai Liu , Chen Wang
IPC: G06F17/30
CPC classification number: G06F17/30551 , G06F17/30339
Abstract: The present invention relates to processing of time series data. There is disclosed a method and apparatus for processing time series data, the method comprising: receiving a time series data set, wherein each element of the time series data set contains a timestamp and an original value associated with the timestamp, and times represented by all timestamps constitute a time series having fixed time intervals; converting each original value into a coded value occupying a smaller storage space, according to a predetermined monotone numerical compression coding scheme; dividing the times represented by all timestamps into a plurality of time intervals having a predetermined length; assembling coded values corresponding to all timestamps within each time interval into a data package such that the data package contains coded values arranged in an order of timestamps; and storing in a database record each data package and its associated identification of a time interval.
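The steps in this abstract can be sketched as follows, under stated assumptions rather than as the patented method. The coding scheme here is a simple monotone quantizer to one byte per sample; the abstract does not specify which scheme is used. Timestamps are assumed to be consecutive integers, and a plain dict stands in for the database records. The names `encode`, `pack`, and `unpack` and the parameters `lo` and `step` are hypothetical.

```python
import math


def encode(value, lo=0.0, step=0.5):
    """Monotone quantizer: a larger value never receives a smaller code."""
    code = math.floor((value - lo) / step)
    return max(0, min(255, code))  # clamp so each sample fits in one byte


def pack(series, interval_len):
    """Group coded values by time interval, in timestamp order.

    series: iterable of (timestamp, value) with fixed spacing of 1.
    Returns {interval_id: bytes}, one entry per stand-in database record.
    """
    packages = {}
    for ts, value in sorted(series):
        packages.setdefault(ts // interval_len, bytearray()).append(encode(value))
    return {interval_id: bytes(buf) for interval_id, buf in packages.items()}


def unpack(packages, interval_len, lo=0.0, step=0.5):
    """Rebuild (timestamp, approximate value) pairs from the packages.

    Timestamps are not stored: a sample's position inside its package,
    plus the interval id, determines its timestamp.
    """
    out = []
    for interval_id in sorted(packages):
        for i, code in enumerate(packages[interval_id]):
            out.append((interval_id * interval_len + i, lo + code * step))
    return out
```

Because the spacing is fixed, each package needs only its interval identifier and the ordered codes, which is where the storage saving in this sketch comes from.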
-
Publication No.: US10969233B2
Publication date: 2021-04-06
Application No.: US16116423
Filing date: 2018-08-29
Applicant: International Business Machines Corporation
Inventor: Xiao Yan Chen , Raghu K. Ganti , Sheng Huang , Kai Liu , Ramya Raghavendra , Mudhakar Srivatsa
Abstract: A method, computer system, and computer readable product for trajectory data compression are disclosed. In embodiments, the method comprises generating spatial data for one or more moving objects; projecting the data onto a network comprised of a plurality of trajectories, the network constraining movement of the one or more moving objects; and storing the projected data in a data store. In embodiments of the invention, the method further comprises translating updates and queries to the spatial data, using specified data of the network, into links to the data store, and using the links to update and query the data store. In embodiments of the invention, the specified data of the network are stored in a network store. In embodiments of the invention, each of the trajectories includes one or more sub-trajectories, and the projecting the spatial data onto a network includes projecting the spatial data onto the sub-trajectories.
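As a rough illustration of projecting spatial data onto a constraining network, the sketch below snaps a 2D point to the nearest straight segment and stores a (segment id, offset) pair in place of raw coordinates. The planar geometry, the straight-line segments, and the names `project` and `snap` are all assumptions; the abstract does not fix any of them.

```python
import math


def project(point, a, b):
    """Project point onto segment a-b.

    Returns (offset along the segment from a, distance from point to segment).
    """
    (px, py), (ax, ay), (bx, by) = point, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        t = 0.0  # degenerate segment: a and b coincide
    else:
        t = ((px - ax) * dx + (py - ay) * dy) / length_sq
        t = max(0.0, min(1.0, t))  # stay within the segment's end points
    qx, qy = ax + t * dx, ay + t * dy
    return t * math.sqrt(length_sq), math.hypot(px - qx, py - qy)


def snap(point, network):
    """Map a point to (segment_id, offset) on its nearest segment.

    network: {segment_id: (a, b)}, a hypothetical stand-in for the network store.
    """
    seg_id, (a, b) = min(network.items(),
                         key=lambda item: project(point, *item[1])[1])
    return seg_id, project(point, a, b)[0]
```

A query for an object's position would then resolve the stored segment id against the network store and walk the offset along that segment.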
-
Publication No.: US10366095B2
Publication date: 2019-07-30
Application No.: US14748295
Filing date: 2015-06-24
Applicant: International Business Machines Corporation
Inventor: Xiao Yan Chen , Yao Liang Chen , Sheng Huang , Kai Liu , Wei Lu , Lin Hao Xu , Xiao Min Xu
IPC: G06F16/248 , G06F16/28 , G06F16/2458 , G06F17/18 , G06K9/00
Abstract: A method for processing a time series includes dividing, with a processing device, the time series into a plurality of windows by time; extracting at least one group of similar subsequences from a current window among the plurality of windows; and updating a candidate list on the basis of comparison between similar subsequences in each group of the at least one group with k characteristic subsequences in the candidate list; wherein the k characteristic subsequences are k characteristic subsequences with a greatest number of occurrences in at least processed parts of the time series.
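The windowed procedure in this abstract can be approximated with a much simpler stand-in, shown below. It treats two subsequences as similar when their values round to the same grid cell, which is a crude substitute for whatever similarity measure the patent uses, and it prunes the candidate list to k entries after every window, so the counts are approximate. The function names and the parameters `m` (subsequence length) and `step` (grid size) are hypothetical.

```python
from collections import Counter


def windows(series, size):
    """Split the series into consecutive, non-overlapping windows."""
    for i in range(0, len(series), size):
        yield series[i:i + size]


def similar_groups(window, m, step):
    """Count length-m subsequences in one window, grouped by rounded shape."""
    groups = Counter()
    for i in range(len(window) - m + 1):
        signature = tuple(round(v / step) for v in window[i:i + m])
        groups[signature] += 1
    return groups


def top_k_subsequences(series, window_size, m, k, step=1.0):
    """Maintain a bounded candidate list of the most frequent subsequences."""
    candidates = Counter()
    for window in windows(series, window_size):
        # Merge this window's groups into the running candidate list.
        candidates.update(similar_groups(window, m, step))
        # Keep only the k most frequent candidates seen so far.
        candidates = Counter(dict(candidates.most_common(k)))
    return candidates.most_common(k)
```

Subsequences that straddle a window boundary are ignored in this sketch, and a pattern that is pruned early loses its earlier count if it reappears later.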
-
Publication No.: US20190011272A1
Publication date: 2019-01-10
Application No.: US16116423
Filing date: 2018-08-29
Applicant: International Business Machines Corporation
Inventor: Xiao Yan Chen , Raghu K. Ganti , Sheng Huang , Kai Liu , Ramya Raghavendra , Mudhakar Srivatsa
Abstract: A method, computer system, and computer readable product for trajectory data compression are disclosed. In embodiments, the method comprises generating spatial data for one or more moving objects; projecting the data onto a network comprised of a plurality of trajectories, the network constraining movement of the one or more moving objects; and storing the projected data in a data store. In embodiments of the invention, the method further comprises translating updates and queries to the spatial data, using specified data of the network, into links to the data store, and using the links to update and query the data store. In embodiments of the invention, the specified data of the network are stored in a network store. In embodiments of the invention, each of the trajectories includes one or more sub-trajectories, and the projecting the spatial data onto a network includes projecting the spatial data onto the sub-trajectories.
-
Publication No.: US10176208B2
Publication date: 2019-01-08
Application No.: US15450606
Filing date: 2017-03-06
Applicant: International Business Machines Corporation
Inventor: Xiao Yan Chen , Sheng Huang , Kai Liu , Chen Wang
IPC: G06F17/30
Abstract: Processing time sequence data for multiple sensors, wherein the multiple sensors are divided into multiple sensor groups and each data comprises a time stamp and a value associated with the timestamp. The method comprises: receiving time series data from each sensor; assigning the time series data received to a sensor group to which the sensor belongs; storing time series data in a first database of a first memory, such that multiple time series data assigned to the same sensor group in the multiple sensor groups are stored in at least one database record of the first database; obtaining the time series data of each sensor among the multiple sensors from the first database; storing time series data in a second database of a second memory, such that the multiple time series data from the same sensor are stored in at least one database record of the second database.
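The two storage layouts in this abstract can be sketched with in-memory dicts standing in for the two databases. This is an illustration under assumptions, not the patented design: the record keys, the `group_of` mapping, and the function names `ingest` and `reorganize` are all hypothetical.

```python
from collections import defaultdict


def ingest(readings, group_of):
    """First store: one record per (sensor group, timestamp).

    readings: iterable of (sensor, timestamp, value).
    group_of: {sensor: group}, assigning each sensor to its group.
    Each record holds the values of every group member for that timestamp.
    """
    first = defaultdict(dict)
    for sensor, ts, value in readings:
        first[(group_of[sensor], ts)][sensor] = value
    return first


def reorganize(first):
    """Second store: one record per sensor, with its values in time order."""
    second = defaultdict(list)
    for (_, ts), values in sorted(first.items(), key=lambda item: item[0]):
        for sensor, value in values.items():
            second[sensor].append((ts, value))
    return second
```

The first layout suits ingestion, since readings that arrive together from one group land in one record; the second suits reading back the history of a single sensor.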
-
Publication No.: US20140122022A1
Publication date: 2014-05-01
Application No.: US14068559
Filing date: 2013-10-31
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Xiao Yan Chen , Sheng Huang , Kai Liu , Chen Wang
IPC: G06F11/34
CPC classification number: G06F17/30353 , G06F3/06 , G06F17/30312 , G06F17/30424 , G06F17/30551 , G06Q40/04 , H04L67/12
Abstract: Processing time sequence data for multiple sensors, wherein the multiple sensors are divided into multiple sensor groups and each data comprises a time stamp and a value associated with the timestamp. The method comprises: receiving time series data from each sensor; assigning the time series data received to a sensor group to which the sensor belongs; storing time series data in a first database of a first memory, such that multiple time series data assigned to the same sensor group in the multiple sensor groups are stored in at least one database record of the first database; obtaining the time series data of each sensor among the multiple sensors from the first database; storing time series data in a second database of a second memory, such that the multiple time series data from the same sensor are stored in at least one database record of the second database.
-
Publication No.: US20140040276A1
Publication date: 2014-02-06
Application No.: US13955473
Filing date: 2013-07-31
Applicant: International Business Machines Corporation
Inventor: Xiao Yan Chen , Sheng Huang , Kai Liu , Chen Wang
IPC: G06F17/30
CPC classification number: G06F17/30551 , G06F17/30339
Abstract: The present invention relates to processing of time series data. There is disclosed a method and apparatus for processing time series data, the method comprising: receiving a time series data set, wherein each element of the time series data set contains a timestamp and an original value associated with the timestamp, and times represented by all timestamps constitute a time series having fixed time intervals; converting each original value into a coded value occupying a smaller storage space, according to a predetermined monotone numerical compression coding scheme; dividing the times represented by all timestamps into a plurality of time intervals having a predetermined length; assembling coded values corresponding to all timestamps within each time interval into a data package such that the data package contains coded values arranged in an order of timestamps; and storing in a database record each data package and its associated identification of a time interval.
-
Publication No.: US10423635B2
Publication date: 2019-09-24
Application No.: US14721042
Filing date: 2015-05-26
Applicant: International Business Machines Corporation
Inventor: Xiao Yan Chen , Yao Liang Chen , Sheng Huang , Kai Liu , Wei Lu , Lin Hao Xu , Xiao Min Xu
IPC: G06F16/248 , G06F16/28 , G06F16/2458 , G06F17/18 , G06K9/00
Abstract: A method for processing a time series includes dividing, with a processing device, the time series into a plurality of windows by time; extracting at least one group of similar subsequences from a current window among the plurality of windows; and updating a candidate list on the basis of comparison between similar subsequences in each group of the at least one group with k characteristic subsequences in the candidate list; wherein the k characteristic subsequences are k characteristic subsequences with a greatest number of occurrences in at least processed parts of the time series.
-
Publication No.: US10282439B2
Publication date: 2019-05-07
Application No.: US14713635
Filing date: 2015-05-15
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Xiao Yan Chen , Yao Liang Chen , Sheng Huang , Kai Liu , Wei Lu , Xiao Min Xu
IPC: G06F17/30
Abstract: The present disclosure relates to methods and systems for storing and querying data. According to the embodiments of the present invention, two-layer indexes are created for multi-dimension data, wherein the primary index is created based on two or more dimensions to retrieve respective data units of the data, while the secondary index is created based on specific dimensions to retrieve respective data blocks in the data unit. Correspondingly, when receiving a multi-dimension query request for data, the primary retrieval first determines a data unit including the target data based on a primary index, and then the secondary retrieval quickly locates a data block including the target data based on the secondary index. In this way, the multi-dimension retrieval can be efficiently performed. Moreover, by appropriately setting the size of a smallest data block, the I/O efficiency of data access will be significantly enhanced.