Automatically Determining Whether an Activation Cluster Contains Poisonous Data

    Publication No.: US20210081708A1

    Publication date: 2021-03-18

    Application No.: US16571321

    Filing date: 2019-09-16

    IPC classification: G06K9/62 G06N3/08 G06N3/04

    Abstract: Embodiments relate to a system, program product, and method for automatically determining which activation data points in a neural model have been poisoned to erroneously indicate association with a particular label or labels. A neural network is trained using potentially poisoned training data. Each of the training data points is classified using the network, the activations of the last hidden layer are retained, and those activations are segmented by the label of the corresponding training data. Clustering is applied to the retained activations of each segment, and a cluster assessment is conducted for each cluster associated with each label to distinguish clusters with potentially poisoned activations from clusters populated with legitimate activations. The assessment includes analyzing, for each cluster, a distance of a median of the activations therein to medians of the activations in the labels.
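The cluster assessment described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the last-hidden-layer activations have already been collected and segmented by label, uses a tiny deterministic 2-means as a stand-in for whatever clustering the embodiments use, and applies one assumed decision rule: a cluster is marked suspicious when its median lies closer to another label's median than to the median of its own label. The function names (`componentwise_median`, `two_means`, `assess_clusters`) are hypothetical.

```python
from math import dist
from statistics import median


def componentwise_median(points):
    """Median of each coordinate across a list of equal-length vectors."""
    return [median(coord) for coord in zip(*points)]


def two_means(points, iterations=20):
    """Tiny deterministic 2-means (stand-in clustering, an assumption).

    Seeds: the first point and the point farthest from it.
    """
    first = points[0]
    farthest = max(points, key=lambda p: dist(p, first))
    centers = [list(first), list(farthest)]
    clusters = [list(points), []]
    for _ in range(iterations):
        clusters = [[], []]
        for p in points:
            idx = 0 if dist(p, centers[0]) <= dist(p, centers[1]) else 1
            clusters[idx].append(p)
        new_centers = [
            [sum(c) / len(members) for c in zip(*members)] if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return [members for members in clusters if members]


def assess_clusters(activations_by_label):
    """Compare each cluster's median with the median of every label.

    Assumed rule: a cluster is suspicious when the nearest label median
    belongs to a different label than the one the cluster was filed under.
    """
    label_medians = {
        label: componentwise_median(points)
        for label, points in activations_by_label.items()
    }
    report = []
    for label, points in activations_by_label.items():
        for cluster in two_means(points):
            cluster_median = componentwise_median(cluster)
            nearest = min(
                label_medians,
                key=lambda other: dist(cluster_median, label_medians[other]),
            )
            report.append({
                "label": label,
                "size": len(cluster),
                "nearest_label": nearest,
                "suspicious": nearest != label,
            })
    return report


# Toy 2-D "activations": two points labeled "cat" sit among the "dog" points.
toy_activations = {
    "cat": [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (-0.1, 0.2),
            (0.3, -0.1), (0.0, 0.2), (9.8, 10.1), (10.1, 9.9)],
    "dog": [(10.0, 10.0), (10.2, 9.8), (9.9, 10.1),
            (10.1, 10.2), (9.7, 9.9), (10.0, 10.3)],
}
suspicious_clusters = [r for r in assess_clusters(toy_activations) if r["suspicious"]]
```

In the toy data, the small "cat" cluster whose median sits next to the "dog" median is the only one flagged. A real assessment would work on high-dimensional activations and would likely combine the median distances with other signals such as cluster size.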

    DETECTING AND MITIGATING POISON ATTACKS USING DATA PROVENANCE

    Publication No.: US20200019821A1

    Publication date: 2020-01-16

    Application No.: US16031953

    Filing date: 2018-07-10

    IPC classification: G06K9/62 G06F15/18 H04L29/06

    Abstract: Computer-implemented methods, program products, and systems for provenance-based defense against poison attacks are disclosed. In one approach, a method includes: receiving observations and corresponding provenance data from data sources; determining whether the observations are poisoned based on the corresponding provenance data; and removing the poisoned observation(s) from a final training dataset used to train a final prediction model. Another implementation involves provenance-based defense against poison attacks in a fully untrusted data environment. Untrusted data points are grouped according to provenance signature, and the groups are used to train learning algorithms and generate complete and filtered prediction models. The results of applying the prediction models to an evaluation dataset are compared, and poisoned data points are identified where the performance of the filtered prediction model exceeds the performance of the complete prediction model. Poisoned data points are removed from the set to generate a final prediction model.
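The comparison of complete and filtered models described in this abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method claimed in the patent: the learner is a one-dimensional nearest-centroid classifier chosen only for brevity, the record fields (`x`, `label`, `source`) are hypothetical, the `source` field stands in for a provenance signature, and the evaluation set is simply taken as given.

```python
from collections import defaultdict


def train_centroids(points):
    """Nearest-centroid learner (an assumed stand-in): mean feature per label."""
    totals = defaultdict(lambda: [0.0, 0])
    for p in points:
        entry = totals[p["label"]]
        entry[0] += p["x"]
        entry[1] += 1
    return {label: total / n for label, (total, n) in totals.items()}


def accuracy(model, dataset):
    """Fraction of records whose nearest centroid carries the right label."""
    correct = 0
    for p in dataset:
        predicted = min(model, key=lambda label: abs(p["x"] - model[label]))
        correct += predicted == p["label"]
    return correct / len(dataset)


def find_poisoned_sources(points, evaluation):
    """Flag provenance groups whose removal improves evaluation accuracy."""
    baseline = accuracy(train_centroids(points), evaluation)  # complete model
    sources = []
    for p in points:
        if p["source"] not in sources:
            sources.append(p["source"])
    flagged = []
    for source in sources:
        remaining = [p for p in points if p["source"] != source]
        if not remaining:
            continue
        filtered_model = train_centroids(remaining)  # model without this group
        if accuracy(filtered_model, evaluation) > baseline:
            flagged.append(source)
    return flagged


def record(x, label, source):
    return {"x": x, "label": label, "source": source}


training = (
    [record(x, 0, "sensor_a") for x in (0.0, 1.0)]
    + [record(x, 1, "sensor_a") for x in (9.0, 10.0)]
    + [record(x, 0, "sensor_b") for x in (0.5, 1.5)]
    + [record(x, 1, "sensor_b") for x in (9.5, 10.5)]
    # Hypothetical poisoned source: far-off values mislabeled as class 0.
    + [record(30.0, 0, "sensor_c") for _ in range(4)]
)
evaluation = [record(0.2, 0, "eval"), record(1.2, 0, "eval"),
              record(9.8, 1, "eval"), record(10.2, 1, "eval")]

poisoned_sources = find_poisoned_sources(training, evaluation)
final_training = [p for p in training if p["source"] not in poisoned_sources]
```

Here only `sensor_c` is flagged, because dropping it is the only removal that raises accuracy above the complete model's. The last line mirrors the final step in the abstract, where flagged points are removed before the final prediction model is trained.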

    Semantic-aware and user-aware admission control for performance management in data analytics and data storage systems

    Publication No.: US10241826B2

    Publication date: 2019-03-26

    Application No.: US15792643

    Filing date: 2017-10-24

    IPC classification: G06F9/46 G06F9/48

    Abstract: In one embodiment, a computer program product includes a computer-readable storage medium having program instructions embodied therewith. The embodied program instructions are executable by a processor to cause the processor to receive, by the processor, a first job request. The embodied program instructions are also executable by the processor to cause the processor to analyze, by the processor, the first job request to determine a user skill level of a user that submitted the first job request. Moreover, the embodied program instructions are executable by the processor to cause the processor to admit, by the processor, the first job request to a data analytics system and/or a data storage system in a specified order with respect to other received job requests based on at least the user skill level of the user that submitted the first job request. Other systems and methods are described in accordance with more embodiments.

    SEMANTIC-AWARE AND USER-AWARE ADMISSION CONTROL FOR PERFORMANCE MANAGEMENT IN DATA ANALYTICS AND DATA STORAGE SYSTEMS

    Publication No.: US20170090975A1

    Publication date: 2017-03-30

    Application No.: US14869798

    Filing date: 2015-09-29

    IPC classification: G06F9/46

    CPC classification: G06F9/46 G06F9/4843

    Abstract: In one embodiment, a computer program product includes a computer readable storage medium having program instructions embodied therewith. The embodied program instructions are executable by a processor to cause the processor to receive, by the processor, a first job request, and analyze, by the processor, the first job request to determine: an estimated complexity of the first job request based on one or more attributes of the first job request and a user skill level of a user that submitted the first job request. Moreover, the embodied program instructions are executable by the processor to admit, by the processor, the first job request to a data analytics system and/or a data storage system in a specified order with respect to other received job requests based on at least: the estimated complexity of the first job request, and the user skill level of the user that submitted the first job request.
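The admission ordering described in this abstract can be pictured as a priority queue keyed on estimated complexity and user skill. The Python sketch below is illustrative only. The abstract does not say which job attributes feed the complexity estimate or how skill affects the order, so the attributes (`num_joins`, `input_gb`), the 1-to-3 skill scale, the weighting in `estimate_complexity`, and the rule that simpler jobs from more skilled users are admitted first are all assumptions.

```python
import heapq
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class JobRequest:
    job_id: str
    num_joins: int    # hypothetical attribute feeding the complexity estimate
    input_gb: float   # hypothetical attribute feeding the complexity estimate
    user_skill: int   # hypothetical scale: 1 (novice) to 3 (expert)


def estimate_complexity(job):
    """Assumed heuristic: joins dominate, input size adds a smaller term."""
    return job.num_joins * 2.0 + job.input_gb / 10.0


class AdmissionQueue:
    """Admits jobs in ascending order of complexity divided by user skill."""

    def __init__(self):
        self._heap = []
        self._arrival = count()  # tie-breaker that keeps arrival order stable

    def submit(self, job):
        score = estimate_complexity(job) / job.user_skill
        heapq.heappush(self._heap, ((score, next(self._arrival)), job))

    def admit_next(self):
        """Pop and return the job with the lowest score."""
        return heapq.heappop(self._heap)[1]

    def __len__(self):
        return len(self._heap)


queue = AdmissionQueue()
queue.submit(JobRequest("heavy-novice", num_joins=4, input_gb=100.0, user_skill=1))
queue.submit(JobRequest("light-novice", num_joins=1, input_gb=10.0, user_skill=1))
queue.submit(JobRequest("heavy-expert", num_joins=4, input_gb=100.0, user_skill=3))

admission_order = []
while len(queue):
    admission_order.append(queue.admit_next().job_id)
```

With these assumed weights, the light job is admitted first and the heavy job from the expert user is admitted ahead of the identical job from the novice. A different policy, for example one that delays expert jobs to protect novices from long waits, would only require changing the score in `submit`.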