Low-latency accelerator
    Granted invention patent

    Publication No.: US10437739B2

    Publication date: 2019-10-08

    Application No.: US15715594

    Filing date: 2017-09-26

    Inventor: Vinodh Gopal

    Abstract: Methods, apparatus, and associated techniques and mechanisms for reducing latency in accelerators. The techniques and mechanisms are implemented in platform architectures supporting shared virtual memory (SVM) and include use of SVM-enabled accelerators, along with translation look-aside buffers (TLBs). A request descriptor defining a job to be performed by an accelerator and referencing virtual addresses (VAs) and sizes of one or more buffers is enqueued via execution of a thread on a processor core. Under one approach, the descriptor includes hints comprising physical addresses or virtual-address-to-physical-address (VA-PA) translations that are obtained from one or more TLBs associated with the core using the buffer VAs. Under another approach employing TLB snooping, the buffer VAs are used as lookups and matching TLB entries (VA-PA translations) are used as hints. The hints are used to speculatively pre-fetch buffer data and speculatively start processing the pre-fetched buffer data on the accelerator.
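
    The hint flow described in this abstract can be sketched as a small simulation. This is a minimal sketch, not the patented design: the names `Descriptor`, `make_descriptor`, and `run_job` are hypothetical, the core TLB and the authoritative page table are modeled as plain dictionaries from page number to frame number, and each buffer is assumed to fit inside one 4 KiB page.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

PAGE = 4096  # assumed page size for this sketch


@dataclass
class Descriptor:
    va: int                        # buffer virtual address
    size: int                      # buffer size in bytes
    hint_pa: Optional[int] = None  # speculative physical address, if any


def make_descriptor(va: int, size: int, core_tlb: Dict[int, int]) -> Descriptor:
    """Build a descriptor, attaching a PA hint when the core TLB has the page."""
    frame = core_tlb.get(va // PAGE)
    hint = None if frame is None else frame * PAGE + va % PAGE
    return Descriptor(va, size, hint)


def run_job(desc: Descriptor, page_table: Dict[int, int],
            memory: bytearray) -> Tuple[bytes, bool]:
    """Return (buffer data, whether the speculative pre-fetch was usable)."""
    speculative = None
    if desc.hint_pa is not None:
        # Speculatively pre-fetch before the authoritative translation is known.
        speculative = bytes(memory[desc.hint_pa:desc.hint_pa + desc.size])
    # Authoritative translation (stands in for the slower translation path).
    true_pa = page_table[desc.va // PAGE] * PAGE + desc.va % PAGE
    if speculative is not None and desc.hint_pa == true_pa:
        return speculative, True   # hint verified: keep the speculative work
    # Hint missing or stale: discard it and fetch from the verified address.
    return bytes(memory[true_pa:true_pa + desc.size]), False
```

    In this model a correct hint lets the data fetch overlap with translation, while a stale hint only costs the discarded pre-fetch; the result is the same either way.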

    Technologies for addressing data in a memory

    Publication No.: US10416900B2

    Publication date: 2019-09-17

    Application No.: US15198015

    Filing date: 2016-06-30

    Abstract: Technologies for addressing data in a memory include an apparatus that includes a memory and a controller. The memory is to store sub-blocks of data in a data table and a pointer table of locations of the sub-blocks in the data table. The controller is to manage the storage and lookup of data in the memory. Further, the controller is to store a sub-block pointer in the pointer table to a location of a sub-block in the data table and store a second pointer that references an entry where the sub-block pointer is stored in the pointer table.
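
    The two levels of indirection in this abstract can be illustrated with a toy in-memory model. This is a sketch under assumptions: the class `IndirectStore` and its methods are hypothetical names, both tables are Python lists, and the `relocate` method is an added illustration of one possible benefit of the indirection rather than something the abstract states.

```python
from typing import List, Optional


class IndirectStore:
    def __init__(self) -> None:
        self.data_table: List[Optional[bytes]] = []  # sub-blocks of data
        self.pointer_table: List[int] = []           # sub-block pointers

    def store(self, sub_block: bytes) -> int:
        """Store a sub-block and return the second pointer to it."""
        self.data_table.append(sub_block)
        # Sub-block pointer: location of the sub-block in the data table.
        self.pointer_table.append(len(self.data_table) - 1)
        # Second pointer: the pointer-table entry holding the sub-block pointer.
        return len(self.pointer_table) - 1

    def lookup(self, second_pointer: int) -> bytes:
        """Follow the second pointer, then the sub-block pointer."""
        block = self.data_table[self.pointer_table[second_pointer]]
        assert block is not None
        return block

    def relocate(self, second_pointer: int) -> None:
        """Move a sub-block; only its pointer-table entry has to change."""
        old = self.pointer_table[second_pointer]
        self.data_table.append(self.data_table[old])
        self.data_table[old] = None
        self.pointer_table[second_pointer] = len(self.data_table) - 1
```

    Holders of the second pointer keep a stable handle: after `relocate`, the same value still resolves to the same sub-block.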

    Hand held device to perform a bit range isolation instruction

    Publication No.: US10372455B2

    Publication date: 2019-08-06

    Application No.: US14568812

    Filing date: 2014-12-12

    Abstract: Receiving an instruction indicating a source operand and a destination operand. Storing a result in the destination operand in response to the instruction. The result may have: (1) a first range of bits, having a first end explicitly specified by the instruction, in which each bit is identical in value to a bit of the source operand in a corresponding position; and (2) a second range of bits that all have the same value regardless of the values of bits of the source operand in corresponding positions. Execution of the instruction may complete without moving the first range of the result relative to the bits of identical value in the corresponding positions of the source operand, regardless of the location of the first range of bits in the result. Execution units to execute such instructions, computer systems having processors to execute such instructions, and machine-readable media storing such an instruction are also disclosed.
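
    The effect of such an instruction on a result value can be modeled in a few lines. This is only a software sketch of one possible variant: it assumes the first range starts at bit 0 and ends just below an explicitly given index, that the second range is forced to zero, and that operands are 64 bits wide. The function name `isolate_low_bits` is hypothetical.

```python
def isolate_low_bits(source: int, end: int, width: int = 64) -> int:
    """Keep bits [0, end) of `source` in place and clear bits [end, width)."""
    if not 0 <= end <= width:
        raise ValueError("end must lie within the operand width")
    operand = source & ((1 << width) - 1)  # model a fixed-width register
    mask = (1 << end) - 1                  # ones over the first range only
    # A single AND: the kept bits are not shifted relative to the source.
    return operand & mask
```

    For example, `isolate_low_bits(0b10110110, 4)` keeps the low four bits `0110` at their original positions and zeroes everything above them.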

    Semi-dynamic, low latency compression

    Publication No.: US10367524B2

    Publication date: 2019-07-30

    Application No.: US15496562

    Filing date: 2017-04-25

    Abstract: Methods and apparatus are described by which data is compressed using semi-dynamic Huffman code generation. Embodiments generate symbol statistics over a portion of data. The symbol statistics are expanded to include all possible literals that could appear within the data. Any literal or reference added to the statistics may be given a frequency of one. The statistics are used to generate a semi-dynamic Huffman code. The entire data is then compressed using the semi-dynamic Huffman code.
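
    The steps in this abstract (sample statistics, expansion with frequency one, code generation, compression of the whole input) can be sketched as follows. This is a simplified illustration, not the patented method: it handles byte literals only and ignores references, it emits codes as strings of "0" and "1" for readability, and the names `semi_dynamic_code`, `encode`, and `decode` are hypothetical.

```python
import heapq
from collections import Counter
from itertools import count
from typing import Dict


def semi_dynamic_code(data: bytes, sample_size: int) -> Dict[int, str]:
    """Build a Huffman code from statistics over only the first sample_size bytes."""
    stats = Counter(data[:sample_size])
    for literal in range(256):
        if literal not in stats:
            stats[literal] = 1  # unseen literal: added with a frequency of one
    tiebreak = count()          # unique counter so the heap never compares dicts
    heap = [(freq, next(tiebreak), {sym: ""}) for sym, freq in sorted(stats.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]


def encode(data: bytes, code: Dict[int, str]) -> str:
    """Compress the entire input with the code built from the sample."""
    return "".join(code[b] for b in data)


def decode(bits: str, code: Dict[int, str]) -> bytes:
    inverse = {c: s for s, c in code.items()}
    out, current = bytearray(), ""
    for bit in bits:
        current += bit
        if current in inverse:  # prefix-free code: first match is the symbol
            out.append(inverse[current])
            current = ""
    return bytes(out)
```

    Because every possible byte receives a code, bytes that never occur in the sampled portion can still be encoded later in the data, at the cost of longer codes for them.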

    Systems and methods for guardband recovery using in situ characterization

    Publication No.: US10365708B2

    Publication date: 2019-07-30

    Application No.: US15379283

    Filing date: 2016-12-14

    Abstract: Methods and apparatuses related to guardband recovery using in situ characterization are disclosed. In one example, a system includes a target circuit, a voltage regulator to provide a variable voltage to the target circuit, a phase-locked loop (PLL) to provide a variable clock to the target circuit, a temperature sensor to sense a temperature of the target circuit, and a control circuit. The control circuit is to set up a characterization environment by setting a temperature, voltage, clock frequency, and workload of the target circuit, and to execute a plurality of tests on the target circuit. When the target circuit passes the plurality of tests, the control circuit is to adjust the variable voltage to increase a likelihood of the target circuit failing the plurality of tests and repeat the plurality of tests; when the target circuit fails the plurality of tests, it is to adjust the variable voltage to decrease a likelihood of the target circuit failing the plurality of tests.
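
    The control loop in this abstract can be sketched as a simple search. The sketch makes several assumptions not stated in the abstract: lowering the voltage is what makes failure more likely, temperature, clock frequency, and workload stay fixed during the search, and the test suite is modeled as a callback. The names `recover_guardband` and `passes_tests` are hypothetical.

```python
from typing import Callable


def recover_guardband(passes_tests: Callable[[int], bool],
                      start_mv: int, step_mv: int, floor_mv: int) -> int:
    """Return the lowest tested voltage (in mV) at which the tests still pass."""
    voltage = start_mv
    if not passes_tests(voltage):
        raise ValueError("target circuit fails at the starting voltage")
    while voltage - step_mv >= floor_mv:
        candidate = voltage - step_mv  # passing: make failure more likely
        if not passes_tests(candidate):
            break                      # failing: stay at the last passing voltage
        voltage = candidate
    return voltage
```

    The difference between `start_mv` and the returned value is the margin recovered under these conditions; a real system would presumably keep some safety offset on top of the result.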

    Methods and apparatus to parallelize data decompression

    Publication No.: US10320414B2

    Publication date: 2019-06-11

    Application No.: US15875836

    Filing date: 2018-01-19

    Abstract: This application sets forth methods and apparatus to parallelize data decompression. An example method includes selecting initial starting positions in a compressed data bitstream; adjusting a first one of the initial starting positions to determine a first adjusted starting position by decoding the bitstream starting at a training position in the bitstream, the decoding including traversing the bitstream from the training position as though first data located at the training position is a valid token; outputting first decoded data generated by decoding a first segment of the bitstream starting from the first adjusted starting position; and merging the first decoded data with second decoded data generated by decoding a second segment of the bitstream, the decoding of the second segment starting from a second position in the bitstream and being performed in parallel with the decoding of the first segment, and the second segment preceding the first segment in the bitstream.
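
    The idea of training a decoder to find a usable starting position can be shown on a toy prefix code. This sketch is heavily simplified and rests on assumptions: the three-symbol code in `CODE` is invented for illustration, the bitstream is a string of "0" and "1", only two segments are used, and a serial fallback is added for the case where the speculative boundary turns out to be wrong, since resynchronization is not guaranteed in general. Python threads are used only to show the structure, not to demonstrate a speedup.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Tuple

CODE = {"a": "0", "b": "10", "c": "11"}  # hypothetical toy prefix code
DECODE = {bits: sym for sym, bits in CODE.items()}


def decode(bits: str, start: int, stop: int) -> Tuple[str, int]:
    """Decode whole tokens from `start` until the position reaches `stop`."""
    out, pos = [], start
    while pos < stop:
        end = pos + 1
        while bits[pos:end] not in DECODE and end < len(bits):
            end += 1
        if bits[pos:end] not in DECODE:
            break  # trailing partial token
        out.append(DECODE[bits[pos:end]])
        pos = end
    return "".join(out), pos


def adjust_start(bits: str, training_pos: int, initial_start: int) -> int:
    """Decode from the training position as though it began a valid token."""
    _, boundary = decode(bits, training_pos, initial_start)
    return boundary  # first token boundary seen at or after initial_start


def parallel_decode(bits: str, initial_start: int, training_pos: int) -> str:
    adjusted = adjust_start(bits, training_pos, initial_start)
    with ThreadPoolExecutor(max_workers=2) as pool:
        second = pool.submit(decode, bits, 0, adjusted)         # earlier segment
        first = pool.submit(decode, bits, adjusted, len(bits))  # later segment
        (head, head_end), (tail, _) = second.result(), first.result()
    if head_end != adjusted:
        # The earlier segment did not end on the speculative boundary,
        # so the guess was wrong: decode serially instead.
        return decode(bits, 0, len(bits))[0]
    return head + tail  # merge: earlier segment's output precedes the later one
```

    The check on `head_end` is what keeps the sketch correct when the training decode never locks onto a true token boundary, for example on a stream consisting only of the two-bit symbol.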

    System for compressing floating point data

    Publication No.: US10305508B2

    Publication date: 2019-05-28

    Application No.: US15977720

    Filing date: 2018-05-11

    Abstract: A processor comprises a first memory to store data elements that are encoded according to a floating point format including a sign field, an exponent field, and a significand field; and a compression engine comprising circuitry, the compression engine to generate a compressed data block that is to include a tag type per data element, wherein responsive to a determination that a first data element includes a value in its exponent field that does not match a value of any entry in a dictionary, a first tag type and an uncompressed value of the data element are included in the compressed data block; and responsive to a determination that a second data element includes a value in its exponent field that matches a value of a first entry in the dictionary, a second tag type and a compressed value of the data element are included in the compressed data block.
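
    The per-element tagging described here can be sketched in software for 32-bit floats. This is an illustration under assumptions, not the claimed hardware: elements are IEEE 754 single-precision values, the dictionary is a short list of 8-bit exponent field values, tag 1 marks a compressed element (dictionary index, sign, significand) and tag 0 an uncompressed one (raw 32 bits). The tag values, tuple layout, and function names are hypothetical, and no bit packing is done.

```python
import struct
from typing import List, Tuple


def split_fields(value: float) -> Tuple[int, int, int, int]:
    """Return (sign, exponent field, significand field, raw bits) as float32."""
    raw = struct.unpack("<I", struct.pack("<f", value))[0]
    return raw >> 31, (raw >> 23) & 0xFF, raw & 0x7FFFFF, raw


def compress(values: List[float], dictionary: List[int]) -> List[tuple]:
    block = []
    for value in values:
        sign, exponent, significand, raw = split_fields(value)
        if exponent in dictionary:
            # Exponent matches a dictionary entry: store its index instead.
            block.append((1, dictionary.index(exponent), sign, significand))
        else:
            # No match: store the element uncompressed.
            block.append((0, raw))
    return block


def decompress(block: List[tuple], dictionary: List[int]) -> List[float]:
    out = []
    for entry in block:
        if entry[0] == 1:
            _, index, sign, significand = entry
            raw = (sign << 31) | (dictionary[index] << 23) | significand
        else:
            raw = entry[1]
        out.append(struct.unpack("<f", struct.pack("<I", raw))[0])
    return out
```

    With a dictionary of, say, four entries, a matching element would need a 2-bit index in place of the 8-bit exponent field, which is where any saving would come from when many elements share a few exponents.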
