Abstract:
A resource unit has a request interface to allow the unit to receive a request and associated data. The resource unit also has a hashing engine to create a hash of the associated data, a modulo engine to create a modulus result, a read engine to perform a memory read, and a results interface to allow the unit to return results.
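A minimal software sketch of the described request flow, assuming the memory is modeled as a Python list and SHA-256 stands in for the hashing engine (both assumptions; the abstract does not specify a hash function):

```python
import hashlib

# Hypothetical model: hash the associated data, reduce the hash modulo the
# table size, read the selected memory entry, and return it as the result.
def handle_request(data: bytes, memory: list, table_size: int):
    digest = hashlib.sha256(data).digest()                    # hashing engine
    index = int.from_bytes(digest[:8], "big") % table_size    # modulo engine
    return memory[index]                                      # read engine -> results interface

# Example with a 16-entry table
mem = [f"entry-{i}" for i in range(16)]
print(handle_request(b"lookup-key", mem, 16))
```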
Abstract:
A processing device employs a stack memory in a region of an external memory. The processing device has a stack pointer register to store a current top address for the stack memory. One of several techniques is used to determine which portion or portions of the external memory correspond to the stack region. A more efficient memory policy is implemented, whereby pushes to the stack do not have to read data from the external memory into a cache, and whereby pops from the stack do not cause stale stack data to be written back from the cache to the external memory.
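A toy model of the stack-aware cache policy, assuming a downward-growing stack and a dictionary standing in for cache lines (illustrative names only, not the device's actual cache organization):

```python
# Sketch: pushes allocate a cache line without fetching from external memory,
# and pops drop the line so stale stack data is never written back.
class StackAwareCache:
    def __init__(self):
        self.lines = {}          # address -> value held in cache

    def push(self, sp, value):
        sp -= 1                  # stack grows downward in this sketch
        self.lines[sp] = value   # allocate without reading external memory
        return sp

    def pop(self, sp):
        value = self.lines.pop(sp, None)   # invalidate: no write-back of stale data
        return value, sp + 1

cache, sp = StackAwareCache(), 0x1000
sp = cache.push(sp, 42)
value, sp = cache.pop(sp)
print(value, hex(sp))
```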
Abstract:
An apparatus having a first circuit and a second circuit is disclosed. The first circuit may be configured to (i) read data from a region of a memory circuit during a read scrub of the region and (ii) generate a plurality of statistics based on (a) the data and (b) one or more bit flips performed during an error correction of the data. The memory circuit is generally configured to store the data in a nonvolatile condition. One or more reference voltages may be used to read the data. The second circuit may be configured to (i) update a plurality of parameters of the region based on the statistics and (ii) compute updated values of the reference voltages based on the parameters.
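A hedged sketch of the two-stage flow: the first stage gathers bit-flip statistics during a read scrub, and the second stage derives updated reference voltages from them. The linear step adjustment below is an assumed heuristic, not the claimed computation:

```python
# Stage 1: compare raw reads with error-corrected data and count the bit flips
# performed in each direction during error correction.
def scrub_region(raw_words, corrected_words):
    flips_0_to_1 = flips_1_to_0 = 0
    for raw, fixed in zip(raw_words, corrected_words):
        diff = raw ^ fixed
        flips_0_to_1 += bin(diff & fixed).count("1")    # bits corrected from 0 to 1
        flips_1_to_0 += bin(diff & ~fixed).count("1")   # bits corrected from 1 to 0
    return {"flips_0_to_1": flips_0_to_1, "flips_1_to_0": flips_1_to_0}

# Stage 2: nudge the reference voltage toward the side producing more
# correction flips (illustrative heuristic and step size).
def update_reference_voltage(vref_mv, stats, step_mv=5):
    if stats["flips_0_to_1"] > stats["flips_1_to_0"]:
        return vref_mv - step_mv
    if stats["flips_1_to_0"] > stats["flips_0_to_1"]:
        return vref_mv + step_mv
    return vref_mv

stats = scrub_region([0b1011, 0b0110], [0b1010, 0b0111])
print(stats, update_reference_voltage(1200, stats))
```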
Abstract:
A distributed database system has multiple compute nodes each running an instance of a database management system (DBMS) program that accesses database records in a local buffer cache. Records are persistently stored in distributed flash memory on multiple storage nodes. A Sharing Data Fabric (SDF) is a middleware layer between the DBMS programs and the storage nodes and has API functions called by the DBMS programs when a requested record is not present in the local buffer cache. The SDF fetches the requested record from flash memory and loads a copy into the local buffer cache. The SDF has threads on a home storage node that locate database records using a node map. A global cache directory locks and pins records to local buffer caches for updating by a node's DBMS program. DBMS operations are grouped into transactions that are committed or aborted together as a unit.
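An illustrative sketch of the cache-miss path: on a local buffer-cache miss, the SDF consults a node map, fetches the record from the home storage node's flash, and loads a copy into the local cache. All names (node_map, flash, sdf_get) are hypothetical stand-ins for the API:

```python
# Hypothetical structures for one compute node.
local_buffer_cache = {}
node_map = {"rec-17": "storage-node-2"}                 # record key -> home storage node
flash = {"storage-node-2": {"rec-17": b"row bytes"}}    # distributed flash contents

def sdf_get(key):
    if key in local_buffer_cache:        # hit: serve from the local buffer cache
        return local_buffer_cache[key]
    home = node_map[key]                 # SDF thread locates the home storage node
    record = flash[home][key]            # fetch the record from flash memory
    local_buffer_cache[key] = record     # load a copy into the local buffer cache
    return record

print(sdf_get("rec-17"))
```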
Abstract:
An SSD controller maintains a zero count and a one count, and/or in some embodiments a zero/one disparity count, for each read unit read from an SLC NVM (or the lower pages of an MLC). In the event that the read unit is uncorrectable, in part due to a shift in the threshold voltage distributions away from their nominal distributions, the maintained counts enable a determination of a direction and/or a magnitude to adjust a read threshold to track the threshold voltage shift and restore the read data zero/one balance. In various embodiments, the adjusted read threshold is determined in a variety of described ways (counts, percentages) that are based on a number of described factors (determined threshold voltage distributions, known stored values, past NVM operating events). Extensions of the foregoing techniques are described for MLC memories.
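A sketch of the zero/one balance idea: counting zeros and ones in a read unit yields a disparity whose sign suggests which direction to move the read threshold. The step size and direction convention are assumptions for illustration:

```python
# Count zeros and ones in a read unit (modeled as bytes).
def count_balance(read_unit: bytes):
    ones = sum(bin(b).count("1") for b in read_unit)
    zeros = 8 * len(read_unit) - ones
    return zeros, ones

# If the disparity shows an excess of zeros or ones, shift the read threshold
# in the corresponding direction (illustrative step size and polarity).
def adjust_read_threshold(threshold_mv, read_unit, step_mv=10):
    zeros, ones = count_balance(read_unit)
    disparity = zeros - ones
    if disparity > 0:
        return threshold_mv - step_mv
    if disparity < 0:
        return threshold_mv + step_mv
    return threshold_mv

print(adjust_read_threshold(2500, bytes([0x0F, 0x00, 0xFF, 0x00])))
```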
Abstract:
A networking device employs memory buffering in which a first memory is logically configured into blocks and the blocks are logically configured into particles. A second memory is configured to mirror the first memory, with a fixed number of bits in the second memory allocated for each particle in the first memory, so that scheduling information and datagram lengths of packets stored in the first memory may be stored in the second memory. Other embodiments are described and claimed.
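A small layout sketch of the mirroring arrangement, with illustrative block, particle, and per-particle bit sizes (the abstract does not fix these values):

```python
# Illustrative sizes only.
BLOCK_BYTES = 2048
PARTICLE_BYTES = 128
BITS_PER_PARTICLE = 16          # scheduling + datagram-length metadata per particle

# Total size of the second (mirror) memory for a given first-memory size.
def mirror_size_bits(first_memory_bytes: int) -> int:
    particles = first_memory_bytes // PARTICLE_BYTES
    return particles * BITS_PER_PARTICLE

# Bit offset in the second memory of the metadata for a given particle.
def particle_to_mirror_offset(block: int, particle: int) -> int:
    particles_per_block = BLOCK_BYTES // PARTICLE_BYTES
    index = block * particles_per_block + particle
    return index * BITS_PER_PARTICLE

print(mirror_size_bits(1 << 20), particle_to_mirror_offset(block=3, particle=5))
```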
Abstract:
A Sharing Data Fabric (SDF) causes flash memory attached to multiple compute nodes to appear to be a single large memory space that is global yet shared by many applications running on the many compute nodes. Flash objects stored in flash memory of a home node are copied to an object cache in DRAM at an action node by SDF threads executing on the nodes. The home node has a flash object map locating flash objects in the home node's flash memory, and a global cache directory that locates copies of the object in other sharing nodes. Application programs use an application programming interface (API) into the SDF to transparently get and put objects without regard to the object's location on any of the many compute nodes. SDF threads and tables control coherency of objects in flash and DRAM.
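A sketch of the home-node bookkeeping: a flash object map locating objects in local flash, and a global cache directory recording which action nodes hold DRAM copies. The structures and names are illustrative, not the SDF's actual tables:

```python
# Hypothetical home-node tables.
flash_object_map = {"obj-9": 0x4000}    # object key -> flash address on the home node
global_cache_directory = {}             # object key -> set of nodes holding DRAM copies

def sdf_get_object(key, action_node, read_flash):
    addr = flash_object_map[key]         # locate the object in home-node flash
    obj = read_flash(addr)               # read it from flash memory
    global_cache_directory.setdefault(key, set()).add(action_node)  # record the copy
    return obj

obj = sdf_get_object("obj-9", "node-A", read_flash=lambda addr: f"object@{hex(addr)}")
print(obj, global_cache_directory)
```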
Abstract:
Rate computations are performed for use in scheduling activities such as, but not limited to, packets, processes, and traffic flows. One implementation identifies an approximated inverse rate, a fix-up adjustment value, and a quantum. An activity measurement value is maintained based on a measure of activity, and a rate control value is maintained based on the measure of activity and the approximated inverse rate. The fix-up adjustment value is applied once each quantum to the rate control value to maintain rate accuracy of the activity. In one implementation, the control value is a scheduling value used for determining when to perform a next part of the activity (e.g., send one or more packets). Scheduling rates are efficiently and compactly stored in an inverse form, which may have advantages in terms of rate granularity, accuracy, and the ability to deliver service smoothly.
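A worked sketch of the inverse-rate bookkeeping: each unit of activity advances the rate control (scheduling) value by the approximated inverse rate, and the fix-up adjustment is applied once each quantum to correct the accumulated approximation error. The specific numbers are assumptions for illustration:

```python
# Illustrative constants.
approx_inverse_rate = 3      # approximated time units per unit of activity
fixup_adjustment = -2        # correction applied once per quantum
quantum = 10                 # units of activity between fix-ups

activity = 0                 # activity measurement value
rate_control = 0             # scheduling value: when the next part of the activity may run

def record_activity(units):
    global activity, rate_control
    for _ in range(units):
        activity += 1
        rate_control += approx_inverse_rate
        if activity % quantum == 0:          # apply the fix-up once each quantum
            rate_control += fixup_adjustment

record_activity(25)
print(activity, rate_control)    # 25 units -> 25*3 plus 2 fix-ups = 71
```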
Abstract:
The present invention provides a multithreaded processor, such as a network processor, including a thread interleaver that implements fine-grained thread decisions to avoid underutilization of instruction execution resources in spite of large communication latencies. In an upper pipeline, an instruction unit determines an instruction fetch sequence responsive to an instruction queue depth on a per-thread basis. In a lower pipeline, a thread interleaver determines a thread interleave sequence responsive to thread conditions including thread latency conditions. The thread interleaver selects threads using a two-level round robin arbitration. Thread latency signals are asserted responsive to thread latencies such as thread stalls, cache misses, and interlocks. During the subsequent one or more clock cycles, the thread is ineligible for arbitration. In one embodiment, other thread conditions affect selection decisions, such as local priority, global stalls, and late stalls.
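A hedged sketch of one interleave decision: threads flagged by a latency signal are ineligible, and the remaining threads are arbitrated with a two-level round robin. The data layout is an assumption made for illustration:

```python
# threads: list of dicts {"id", "priority", "latency_stalled"}; priority 0 = high.
def pick_thread(threads, last_pick):
    n = len(threads)
    order = [(last_pick + 1 + k) % n for k in range(n)]   # round-robin order after last pick
    for level in (0, 1):                                   # two arbitration levels
        for i in order:
            t = threads[i]
            if t["priority"] == level and not t["latency_stalled"]:
                return t["id"]
    return None                                            # all threads ineligible this cycle

threads = [
    {"id": 0, "priority": 0, "latency_stalled": True},    # stalled: ineligible
    {"id": 1, "priority": 1, "latency_stalled": False},
    {"id": 2, "priority": 0, "latency_stalled": False},
]
print(pick_thread(threads, last_pick=0))   # -> 2 (high-priority, not stalled)
```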
Abstract:
A method and apparatus for computation is provided. A main cluster crossbar is connected to a plurality of statically scheduled routing processors. A first sub-cluster crossbar is associated with a first one of the plurality of statically scheduled routing processors where the first sub-cluster crossbar is connected to a first plurality of execution processors. A second sub-cluster crossbar is associated with a second one of the plurality of statically scheduled routing processors where the second sub-cluster crossbar is connected to a second plurality of execution processors.
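A minimal topology sketch of the described hierarchy, with illustrative counts of routing and execution processors:

```python
# Build a main cluster crossbar fanning out to routing processors, each owning
# a sub-cluster crossbar that connects a group of execution processors.
def build_cluster(num_routing=2, execs_per_sub=4):
    topology = {"main_crossbar": []}
    for r in range(num_routing):
        sub = {
            "routing_processor": f"rp{r}",
            "sub_crossbar": [f"rp{r}-ep{e}" for e in range(execs_per_sub)],
        }
        topology["main_crossbar"].append(sub)
    return topology

print(build_cluster())
```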