SYSTEMS AND METHODS FOR INDEXING AND SEARCHING

    Publication No.: EP4421650A2

    Publication date: 2024-08-28

    Application No.: EP24174839.1

    Filing date: 2018-06-08

    IPC classification: G06F16/17

    CPC classification: G06F16/24568 G06F16/1734

    Abstract: A method, performed by one or more processors, is disclosed, the method comprising receiving a stream of log data from one or more applications and indexing a plurality of different portions of the received stream to respective locations of a cold storage system. The method may also comprise storing in an index catalog pointers to the respective locations of the indexed portions in the cold storage system. One or more requests for log data may be received, and the method may also comprise subsequently identifying from the index catalog one or more pointers to respective indexed portions appropriate to at least part of the one or more requests, and sending the identified one or more indexed portions to one or more hot storage systems, each associated with a respective search node, for processing of one or more search requests.
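
The flow in this abstract can be illustrated with a minimal sketch. It is not the claimed implementation: the fixed-size portioning, the `cold://` location strings, and every class and function name below are assumptions made for illustration, and plain dictionaries stand in for the cold and hot storage systems.

```python
from dataclasses import dataclass, field


@dataclass
class IndexCatalog:
    """Maps a portion key to its location in cold storage (hypothetical layout)."""
    pointers: dict[str, str] = field(default_factory=dict)

    def record(self, portion_key: str, location: str) -> None:
        self.pointers[portion_key] = location

    def lookup(self, wanted_keys: list[str]) -> list[str]:
        # Return cold-storage locations for the portions a request needs.
        return [self.pointers[k] for k in wanted_keys if k in self.pointers]


def index_stream(lines, catalog, cold_storage, portion_size=2):
    """Split the stream into fixed-size portions and store each in cold storage."""
    for start in range(0, len(lines), portion_size):
        key = f"portion-{start // portion_size}"
        location = f"cold://logs/{key}"  # assumed naming scheme
        cold_storage[location] = lines[start:start + portion_size]
        catalog.record(key, location)


def load_into_hot_storage(request_keys, catalog, cold_storage, hot_storage):
    """Copy the requested indexed portions from cold storage to hot storage."""
    for location in catalog.lookup(request_keys):
        hot_storage[location] = cold_storage[location]


catalog, cold, hot = IndexCatalog(), {}, {}
index_stream(["a", "b", "c", "d", "e"], catalog, cold)
load_into_hot_storage(["portion-1"], catalog, cold, hot)
print(hot)  # {'cold://logs/portion-1': ['c', 'd']}
```

Only the portion named in the request reaches hot storage; the other portions stay in cold storage and remain reachable through the catalog.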

    TECHNIQUES FOR DATA EXTRACTION
    Type: Invention publication
    Status: Under examination (published)

    Publication No.: EP3279812A1

    Publication date: 2018-02-07

    Application No.: EP17183768.5

    Filing date: 2017-07-28

    IPC classification: G06F17/30

    Abstract: Computer-implemented techniques for data extraction are described. The techniques include a method and system for retrieving an extraction job specification, wherein the extraction job specification comprises a source repository identifier that identifies a source repository comprising a plurality of data records; a data recipient identifier that identifies a data recipient; and a schedule that indicates a timing of when to retrieve the plurality of data records. The method and system further include retrieving the plurality of data records from the source repository based on the schedule, creating an extraction transaction from the plurality of data records, wherein the extraction transaction comprises a subset of the plurality of data records and metadata, and sending the extraction transaction to the data recipient.
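
As a rough illustration of the data shapes named in this abstract, the sketch below models a job specification and an extraction transaction. All field names, the `batch_size` parameter, and the in-memory `repositories` and `recipients` dictionaries are hypothetical. The schedule is stored but not enforced, since the abstract does not specify its format.

```python
from dataclasses import dataclass


@dataclass
class ExtractionJobSpec:
    source_repository_id: str  # identifies the source repository
    data_recipient_id: str     # identifies the data recipient
    schedule: str              # when to retrieve records (format is an assumption)


@dataclass
class ExtractionTransaction:
    records: list              # a subset of the retrieved data records
    metadata: dict


def run_extraction(spec, repositories, recipients, batch_size=2):
    """Retrieve records, package a subset with metadata, and send it on."""
    records = repositories[spec.source_repository_id]
    subset = records[:batch_size]
    metadata = {
        "source": spec.source_repository_id,
        "record_count": len(subset),
    }
    txn = ExtractionTransaction(records=subset, metadata=metadata)
    # "Sending" is modeled here as appending to the recipient's inbox.
    recipients[spec.data_recipient_id].append(txn)
    return txn


repos = {"crm-db": [{"id": 1}, {"id": 2}, {"id": 3}]}
inboxes = {"analytics-server": []}
spec = ExtractionJobSpec("crm-db", "analytics-server", "daily")
txn = run_extraction(spec, repos, inboxes)
print(txn.metadata["record_count"])  # 2
```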


    DATA REVISION CONTROL IN LARGE-SCALE DATA ANALYTIC SYSTEMS
    Type: Invention publication
    Status: Under examination (published)

    Publication No.: EP3258393A1

    Publication date: 2017-12-20

    Application No.: EP16194936.7

    Filing date: 2016-10-20

    Inventor: FINK, Robert

    IPC classification: G06F17/30

    Abstract: Computer-implemented techniques for data revision control in large-scale data analytic systems. In one embodiment, for example, the techniques encompass a method for data revision control in a large-scale data analytic system that comprises the steps of storing a first version of a dataset that is derived by executing a first version of a driver program associated with the dataset; and storing a first build catalog entry comprising an identifier of the first version of the dataset and comprising an identifier of the first version of the driver program.
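
The pairing of a dataset version with the driver program version that produced it can be sketched as follows. The identifier scheme, the in-memory `store`, and the function and class names are assumptions for illustration only; the abstract does not describe how identifiers are generated or persisted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildCatalogEntry:
    dataset_version_id: str  # identifier of the stored dataset version
    driver_version_id: str   # identifier of the driver program version used


def build_dataset(driver, driver_version_id, inputs, store, catalog):
    """Run a driver program version, store its output, and record provenance."""
    dataset = driver(inputs)
    dataset_version_id = f"dataset-v{len(store) + 1}"  # assumed naming scheme
    store[dataset_version_id] = dataset
    entry = BuildCatalogEntry(dataset_version_id, driver_version_id)
    catalog.append(entry)
    return entry


store, catalog = {}, []
entry = build_dataset(
    lambda rows: [r * 2 for r in rows], "driver-v1", [1, 2, 3], store, catalog
)
print(entry.dataset_version_id, entry.driver_version_id)  # dataset-v1 driver-v1
```

Because each entry records both identifiers, a later reader can tell which driver version produced any stored dataset version.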


    ORCHESTRATION SYSTEM FOR STREAM STORAGE AND PROCESSING

    Publication No.: EP3907610A1

    Publication date: 2021-11-10

    Application No.: EP20193205.0

    Filing date: 2020-08-27

    Inventor: FINK, Robert

    IPC classification: G06F9/54

    Abstract: A method, system and computer program product for orchestrating stream storage and processing is disclosed. The method, performed by one or more processors, may comprise receiving a query specifying a change to a first transformer of a data processing pipeline, the first transformer receiving an existing data stream of data records from a first data source and providing transformed data records to a first data sink. The method may also comprise identifying at least a first or second type of change to be implemented by the query and implementing the change specified in the query dependent on the identified first or second change type. The implementing may comprise one of: changing the first data transformer in accordance with the query and providing transformed output to the first data sink; and deploying a second transformer, being a changed version of the first transformer as specified in the query, for operating in parallel with the first transformer, and providing its transformed output to a second data sink.
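
The two ways of implementing a change described in this abstract can be contrasted in a small sketch. The change-type labels `"in_place"` and `"parallel"`, the names `t1`, `t2`, `sink1`, `sink2`, and the `Pipeline` class are all hypothetical; the abstract does not say how a change type is identified from the query.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Pipeline:
    transformers: dict[str, Callable] = field(default_factory=dict)
    sinks: dict[str, list] = field(default_factory=dict)
    routes: dict[str, str] = field(default_factory=dict)  # transformer -> sink

    def process(self, record):
        # Every deployed transformer sees each record from the data source.
        for name, fn in self.transformers.items():
            self.sinks[self.routes[name]].append(fn(record))


def apply_change(pipeline, change_type, new_fn):
    """Implement a transformer change according to its (assumed) change type."""
    if change_type == "in_place":
        # First type: change the transformer; output still goes to the first sink.
        pipeline.transformers["t1"] = new_fn
    elif change_type == "parallel":
        # Second type: deploy a changed copy in parallel, writing to a second sink.
        pipeline.transformers["t2"] = new_fn
        pipeline.sinks["sink2"] = []
        pipeline.routes["t2"] = "sink2"
    else:
        raise ValueError(f"unknown change type: {change_type}")


p = Pipeline({"t1": str.upper}, {"sink1": []}, {"t1": "sink1"})
p.process("a")
apply_change(p, "parallel", lambda s: s + "!")
p.process("b")
print(p.sinks)  # {'sink1': ['A', 'B'], 'sink2': ['b!']}
```

In the parallel case the first sink keeps receiving output from the unchanged transformer, so existing consumers are not disturbed while the changed version is evaluated.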

    AUTOMATIC CONFIGURATION OF LOGGING INFRASTRUCTURE FOR SOFTWARE DEPLOYMENTS USING SOURCE CODE

    Publication No.: EP4287017A2

    Publication date: 2023-12-06

    Application No.: EP23205372.8

    Filing date: 2020-05-06

    IPC classification: G06F8/75

    Abstract: One or more processors examine source code of one or more software packages that produce output messages and identify, in the source code, one or more call expressions that each represent a logging call. The one or more processors generate a number of search patterns for parsing output messages produced by the one or more software packages, wherein each of the search patterns is based on one or more arguments of a corresponding call expression of the one or more call expressions. The one or more processors further reduce the number of search patterns to be applied to the output messages produced by the one or more software packages to identify log entries among the output messages.
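
One way to picture the idea is to walk Python source with the standard `ast` module, collect the format-string argument of each logging call, and turn it into a regular expression. This is a sketch under assumptions, not the patented method: the `LOG_METHODS` set, the `%s`-only placeholder handling, and the use of simple deduplication as the "reduce" step are all choices made for illustration.

```python
import ast
import re

SOURCE = '''
import logging
log = logging.getLogger("app")
log.info("user %s logged in from %s", user, ip)
log.error("disk %s is full", disk)
log.info("user %s logged in from %s", other, addr)
print("not a log call")
'''

LOG_METHODS = {"debug", "info", "warning", "error", "critical"}


def find_format_strings(source):
    """Collect the literal first argument of each call that looks like a logging call."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr in LOG_METHODS
                and node.args
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, str)):
            found.append(node.args[0].value)
    return found


def to_pattern(fmt):
    """Build a regex from a format string, with one capture group per %s."""
    return "^" + r"(\S+)".join(re.escape(part) for part in fmt.split("%s")) + "$"


# Reduce the pattern set by dropping duplicates (a stand-in for the reduction step).
patterns = sorted({to_pattern(f) for f in find_format_strings(SOURCE)})
print(len(patterns))  # 2

for line in ["disk sda1 is full", "random stdout noise"]:
    print(line, "->", any(re.match(p, line) for p in patterns))
```

The source is parsed but never executed, so undefined names such as `user` or `disk` in the sample are harmless.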

    AUTOMATIC CONFIGURATION OF LOGGING INFRASTRUCTURE FOR SOFTWARE DEPLOYMENTS USING SOURCE CODE

    Publication No.: EP3789882A1

    Publication date: 2021-03-10

    Application No.: EP20173279.9

    Filing date: 2020-05-06

    Abstract: One or more processors examine source code of one or more software packages that produce output messages and identify, in the source code, one or more call expressions that each represent a logging call. The one or more processors generate a number of search patterns for parsing output messages produced by the one or more software packages, wherein each of the search patterns is based on one or more arguments of a corresponding call expression of the one or more call expressions. The one or more processors further reduce the number of search patterns to be applied to the output messages produced by the one or more software packages to identify log entries among the output messages.

    AUTOMATIC CONFIGURATION OF LOGGING INFRASTRUCTURE FOR SOFTWARE DEPLOYMENTS USING SOURCE CODE

    Publication No.: EP4287017A3

    Publication date: 2024-02-14

    Application No.: EP23205372.8

    Filing date: 2020-05-06

    Abstract: One or more processors examine source code of one or more software packages that produce output messages and identify, in the source code, one or more call expressions that each represent a logging call. The one or more processors generate a number of search patterns for parsing output messages produced by the one or more software packages, wherein each of the search patterns is based on one or more arguments of a corresponding call expression of the one or more call expressions. The one or more processors further reduce the number of search patterns to be applied to the output messages produced by the one or more software packages to identify log entries among the output messages.