Abstract:
A bit vector for a Bloom filter is determined by performing one or more hash function operations on a set of ternary content addressable memory (TCAM) words. A TCAM array is partitioned into a first portion to store the bit vector for the Bloom filter and a second portion to store the set of TCAM words. The TCAM array can be searched using a search word by performing the one or more hash function operations on the search word to generate a hashed search word and determining whether bits at specified positions of the hashed search word match bits at corresponding positions of the bit vector stored in the first portion of the TCAM array before searching the second portion of the TCAM array with the search word.
Abstract:
Systems and methods are provided for incorporating an optimized dispatcher with an FaaS infrastructure to permit and restrict access to resources. For example, the dispatcher may assign requests to “warm” resources and initiate a fault process if the resource is overloaded or a cache-miss is identified (e.g., by restarting or rebooting the resource). The warm instances or accelerators associated with the allocation size that are identified may be commensurate to the demand and help dynamically route requests to faster accelerators.
Abstract:
A computer system includes multiple memory array components that include respective analog memory arrays which are sequenced to implement a multi-layer process. An error array data structure is obtained for at least a first memory array component, and from which a determination is made as to whether individual nodes (or cells) of the error array data structure are significant. A determination can be made as to any remedial operations that can be performed to mitigate errors of significance.
Abstract:
In some examples, with respect to adaptive multi-level checkpointing, a transfer parameter associated with transfer of checkpoint data from a node-local storage to a parallel file system may be ascertained for the checkpoint data stored in the node-local storage. The transfer parameter may be compared to a specified transfer parameter threshold. A determination may be made, based on the comparison of the transfer parameter to the specified transfer parameter threshold, as to whether to transfer the checkpoint data from the node-local storage to the parallel file system.
Abstract:
Computer-implemented methods, systems, and devices to perform lossless compression of floating point format time-series data are disclosed. A first data value may be obtained in floating point format representative of an initial time-series parameter. For example, an output checkpoint of a computer simulation of a real-world event such as weather prediction or nuclear reaction simulation. A first predicted value may be determined representing the parameter at a first checkpoint time. A second data value may be obtained from the simulation. A prediction error may be calculated. Another predicted value may be generated for a next point in time and may be adjusted by the previously determined prediction error (e.g., to increase accuracy of the subsequent prediction). When a third data value is obtained, the adjusted prediction value may be used to generate a difference (e.g., XOR) for storing in a compressed data store to represent the third data value.
Abstract:
Techniques for reallocating a memory pending queue based on stalls are provided. In one aspect, it may be determined at a memory stop of a memory fabric that at least one class of memory access is stalled. It may also be determined at the memory stop of the memory fabric that there is at least one class of memory access that is not stalled. At least a portion of a memory pending queue may be reallocated from the class of memory access that is not stalled to the class of memory access that is stalled.
Abstract:
Examples described herein include receiving an operation pipeline for a computing system and building a graph that comprises a model for a number of potential memory side accelerator thread assignments to carry out the operation pipeline. The computing system may comprise at least two memories and a number of memory side accelerators. Each model may comprise a number of steps and at least one step out of the number of steps in each model may comprise a function performed at one memory side accelerator out of the number of memory side accelerators. Examples described herein also include determining a cost of at least one model.
Abstract:
Techniques for memory device writes based on mapping are provided. In one aspect, a block of data to be written to a line in a rank of memory may be received. The rank of memory may comprise multiple memory devices. The block of data may be written to a number of memory devices determined by the size of the block of data. A memory device mapping for the line may be retrieved. The mapping may determine the order in which the block of data is written to the memory devices within the rank. The block of data may be written to the memory devices based on the mapping.
Abstract:
Examples described herein relate to caching in a system with multiple nodes sharing a globally addressable memory. The globally addressable memory includes multiple windows that each include multiple chunks. Each node of a set of the nodes includes a cache that is associated with one of the windows. One of the nodes includes write access to one of the chunks of the window. The other nodes include read access to the chunk. The node with write access further includes a copy of the chunk in its cache and modifies multiple lines of the chunk copy. After a first line of the chunk copy is modified, a notification is sent to the other nodes that the chunk should be marked dirty. After multiple lines are modified, an invalidation message is sent for each of the modified lines of the set of the nodes.
Abstract:
Techniques for reallocating a memory pending queue based on stalls are provided. In one aspect, it may be determined at a memory stop of a memory fabric that at least one class of memory access is stalled. It may also be determined at the memory stop of the memory fabric that there is at least one class of memory access that is not stalled. At least a portion of a memory pending queue may be reallocated from the class of memory access that is not stalled to the class of memory access that is stalled.