SCOPED PERSISTENCE BARRIERS FOR NON-VOLATILE MEMORIES

    Publication Number: US20190286362A1

    Publication Date: 2019-09-19

    Application Number: US16432391

    Filing Date: 2019-06-05

    Abstract: A processing apparatus is provided that includes NVRAM and one or more processors configured to process a first set and a second set of instructions according to a hierarchical processing scope, and to process a scoped persistence barrier residing in a program after the first instruction set and before the second instruction set. The barrier includes an instruction to cause first data to persist in the NVRAM before second data persists in the NVRAM. The first data results from execution of each of the first set of instructions processed according to the hierarchical processing scope, and the second data results from execution of each of the second set of instructions processed according to that scope. The processing apparatus also includes a controller configured to cause, based on the scoped persistence barrier, the first data to persist in the NVRAM before the second data persists in the NVRAM.
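
    The ordering that such a barrier enforces can be pictured in software roughly as follows. This is a minimal sketch assuming a hypothetical scoped_persist_barrier() intrinsic and PersistScope levels (both invented for illustration); the placeholder body only emits a release fence and does not actually make data durable in NVRAM.

        #include <atomic>
        #include <cstdio>

        // Illustrative scope levels for a hierarchical processing scope.
        enum class PersistScope { WorkItem, WorkGroup, Device, System };

        // Hypothetical stand-in for a scoped persistence barrier: all NVRAM
        // stores issued before the call at the given scope would have to
        // become durable before any NVRAM store issued after it.
        inline void scoped_persist_barrier(PersistScope /*scope*/) {
            std::atomic_thread_fence(std::memory_order_release);  // placeholder only
        }

        int main() {
            static int log_entry = 0;   // imagine these variables live in NVRAM
            static int log_commit = 0;

            log_entry = 42;                                   // first instruction set
            scoped_persist_barrier(PersistScope::WorkGroup);  // order the persists
            log_commit = 1;                                   // second instruction set

            std::printf("entry=%d commit=%d\n", log_entry, log_commit);
            return 0;
        }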

    PER-INSTRUCTION ENERGY DEBUGGING USING INSTRUCTION SAMPLING HARDWARE

    Publication Number: US20190286209A1

    Publication Date: 2019-09-19

    Application Number: US15923153

    Filing Date: 2018-03-16

    Abstract: A processor utilizes instruction-based sampling to generate sampling data on a per-instruction basis during execution of an instruction. The sampling data indicates what processor hardware was used as a result of executing the instruction. Software receives the sampling data and generates an estimate of the energy used by the instruction. The sampling data may include microarchitectural events, in which case the energy estimate combines a base energy amount corresponding to the executed instruction with energy amounts corresponding to the microarchitectural events in the sampling data. The sampling data may also include switching events associated with hardware blocks that switched due to execution of the instruction, in which case the energy estimate for the instruction is based on the switching events and capacitance estimates associated with those hardware blocks.
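
    One way to read the estimation step is as a base cost per opcode plus costs accumulated per sampled event, with switching events charged against a capacitance model. The sketch below follows that reading; the event names, energy constants, and the C·V² per-toggle model are assumptions made for the example, not values from the patent.

        #include <cstdio>
        #include <map>
        #include <string>

        // One instruction's sample: which microarchitectural events fired and
        // how often hardware blocks toggled (names are illustrative).
        struct InstructionSample {
            std::string opcode;
            std::map<std::string, int> events;            // e.g. cache misses
            std::map<std::string, long> block_switches;   // toggles per block
        };

        double estimate_energy_pj(const InstructionSample& s,
                                  const std::map<std::string, double>& base_pj,
                                  const std::map<std::string, double>& event_pj,
                                  const std::map<std::string, double>& block_cap_pf,
                                  double vdd) {
            double e = base_pj.count(s.opcode) ? base_pj.at(s.opcode) : 0.0;
            for (const auto& [ev, n] : s.events)              // event-based component
                if (event_pj.count(ev)) e += n * event_pj.at(ev);
            for (const auto& [blk, toggles] : s.block_switches)
                if (block_cap_pf.count(blk))                  // coarse dynamic-energy model:
                    e += toggles * block_cap_pf.at(blk) * vdd * vdd;  // ~C*V^2 per toggle
            return e;  // picojoules (pF * V^2 = pJ)
        }

        int main() {
            InstructionSample s{"load", {{"l1_miss", 1}}, {{"lsu", 120}}};
            double e = estimate_energy_pj(s, {{"load", 15.0}}, {{"l1_miss", 40.0}},
                                          {{"lsu", 0.02}}, 0.9);
            std::printf("estimated energy: %.2f pJ\n", e);
            return 0;
        }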

    Streaming translation lookaside buffer

    Publication Number: US10417140B2

    Publication Date: 2019-09-17

    Application Number: US15442487

    Filing Date: 2017-02-24

    Abstract: Techniques are provided for using a translation lookaside buffer to provide low-latency memory address translations for data streams. Clients of a memory system first prepare the address translation cache hierarchy by requesting that a translation pre-fetch stream be initialized. After the translation pre-fetch stream is initialized, the cache hierarchy returns an acknowledgment of completion to the client, which then begins to access memory. Pre-fetch streams are specified in terms of address ranges and are performed for large contiguous portions of the virtual memory address space.
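
    The client-side handshake might look roughly like the sketch below. init_translation_prefetch_stream and wait_for_ack are invented placeholder names for whatever interface a real driver or hardware block would expose; the stubs here only model the control flow.

        #include <cstdint>
        #include <cstdio>

        // Hypothetical handle for a translation pre-fetch stream over an
        // address range; names and signatures are placeholders, not a real API.
        struct PrefetchStream { uint64_t base; uint64_t length; bool ready; };

        PrefetchStream init_translation_prefetch_stream(uint64_t base, uint64_t len) {
            // In hardware this would ask the address-translation cache hierarchy
            // to start fetching and caching translations for [base, base + len).
            return PrefetchStream{base, len, false};
        }

        void wait_for_ack(PrefetchStream& s) {
            // The cache hierarchy acknowledges once the stream is initialized.
            s.ready = true;
        }

        int main() {
            const uint64_t kBase = 0x100000000ull;  // start of a large contiguous range
            const uint64_t kLen  = 256ull << 20;    // 256 MiB of virtual address space

            PrefetchStream stream = init_translation_prefetch_stream(kBase, kLen);
            wait_for_ack(stream);                   // only then start streaming accesses
            if (stream.ready)
                std::printf("translations warmed for %llu MiB at 0x%llx\n",
                            (unsigned long long)(kLen >> 20),
                            (unsigned long long)kBase);
            return 0;
        }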

    Dynamic memory traffic optimization in multi-client systems

    Publication Number: US10409524B1

    Publication Date: 2019-09-10

    Application Number: US15962997

    Filing Date: 2018-04-25

    Abstract: Systems, apparatuses, and methods for dynamically optimizing memory traffic in multi-client systems are disclosed. A system includes a plurality of client devices, a memory subsystem, and a communication fabric coupled to the client devices and the memory subsystem. The system includes a first client which generates memory access requests targeting the memory subsystem. Prior to sending a given memory access request to the fabric, the first client analyzes metadata associated with the data targeted by the request. If the metadata indicates the targeted data is the same as, or can be derived from, previously retrieved data, the first client prevents the request from being sent out on the fabric on the data path to the memory subsystem. This reduces memory bandwidth consumption and allows the fabric and the memory subsystem to stay in a low-power state for longer periods of time.
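
    A minimal sketch of the request-filtering idea, assuming per-block metadata that records whether a block equals or is derivable from previously retrieved data; the enum values and the simple map are illustrative, not the patent's actual metadata encoding.

        #include <cstdint>
        #include <cstdio>
        #include <unordered_map>

        // A request can be satisfied locally if the block is known to equal,
        // or be derivable from, data that was already retrieved.
        enum class BlockMeta { Unknown, SameAsCached, DerivableFromCached };

        struct Client {
            std::unordered_map<uint64_t, BlockMeta> metadata;  // per-block metadata
            int requests_sent = 0;
            int requests_filtered = 0;

            void access(uint64_t block_addr) {
                auto it = metadata.find(block_addr);
                if (it != metadata.end() && it->second != BlockMeta::Unknown) {
                    ++requests_filtered;  // reuse/derive locally; fabric stays idle
                    return;
                }
                ++requests_sent;          // must go out on the fabric to memory
                metadata[block_addr] = BlockMeta::SameAsCached;  // remember for next time
            }
        };

        int main() {
            Client c;
            c.access(0x1000);  // miss: goes to the memory subsystem
            c.access(0x1000);  // filtered: metadata says the data is unchanged
            std::printf("sent=%d filtered=%d\n", c.requests_sent, c.requests_filtered);
            return 0;
        }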

    Network-aware cache coherence protocol enhancement

    Publication Number: US10402327B2

    Publication Date: 2019-09-03

    Application Number: US15358318

    Filing Date: 2016-11-22

    Abstract: A non-uniform memory access system includes several nodes that each have one or more processors, caches, local main memory, and a local bus that connects a node's processor(s) to its memory. The nodes are coupled to one another over a collection of point-to-point interconnects, permitting processors in one node to access data stored in another node. Memory access times for remote memory are longer than for local memory because remote accesses must travel across a communications network to reach the requesting processor. In some embodiments, inter-cache and main-memory-to-cache latencies are measured to determine whether it is more efficient to satisfy a memory access request from a cached copy stored in the cache of the owning node or from the main memory of the home node.
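
    The latency-based decision reduces to a comparison like the one below; the latency figures and field names are invented for the example, and a real implementation would measure them dynamically.

        #include <cstdio>

        // Measured latencies in nanoseconds (numbers are made up for the example).
        struct Latencies {
            double owner_cache_to_requester;  // cache-to-cache transfer from the owning node
            double home_memory_to_requester;  // fetch from the home node's main memory
        };

        enum class Source { OwnerCache, HomeMemory };

        // Pick the cheaper source for a remote read, per the abstract's idea.
        Source choose_source(const Latencies& l) {
            return l.owner_cache_to_requester <= l.home_memory_to_requester
                       ? Source::OwnerCache
                       : Source::HomeMemory;
        }

        int main() {
            Latencies l{180.0, 140.0};  // congested inter-cache path, faster home DRAM
            std::printf("serve from %s\n",
                        choose_source(l) == Source::OwnerCache ? "owner cache"
                                                               : "home memory");
            return 0;
        }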

    FLEXIBLE INTERFACES USING THROUGH-SILICON VIA TECHNOLOGY

    Publication Number: US20190268086A1

    Publication Date: 2019-08-29

    Application Number: US15903253

    Filing Date: 2018-02-23

    Abstract: An integrated circuit includes first and second through-silicon via (TSV) circuits and a steering logic circuit. The first TSV circuit has a first TSV and a first multiplexer for selecting between a first TSV data signal received from the first TSV and a first local data signal for transmission to a first TSV output terminal. The second TSV circuit includes a second TSV and a second multiplexer for selecting between a second TSV data signal received from the second TSV and the first local data signal for transmission to a second TSV output terminal. The steering logic circuit controls the first multiplexer to select the first local data signal and the second multiplexer to select the second TSV data signal in a first mode, and the first multiplexer to select the first TSV data signal and the second multiplexer to select the first local data signal in a second mode.
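
    A behavioral model of the two steering modes is sketched below; the struct and signal names are illustrative, and a real design would be RTL rather than C++.

        #include <cstdio>

        // Inputs seen by the two TSV circuits (names are illustrative).
        struct TsvPair {
            int first_tsv_data;    // data arriving on the first TSV
            int second_tsv_data;   // data arriving on the second TSV
            int local_data;        // the first local data signal
        };

        struct Outputs { int first_out; int second_out; };

        Outputs steer(const TsvPair& s, int mode) {
            if (mode == 1)  // first mode: first mux picks local data,
                            // second mux passes the second TSV's data through
                return {s.local_data, s.second_tsv_data};
            else            // second mode: first mux passes the first TSV's data,
                            // second mux picks the local data
                return {s.first_tsv_data, s.local_data};
        }

        int main() {
            TsvPair sig{0xA, 0xB, 0xC};
            Outputs m1 = steer(sig, 1), m2 = steer(sig, 2);
            std::printf("mode1: %X %X  mode2: %X %X\n",
                        m1.first_out, m1.second_out, m2.first_out, m2.second_out);
            return 0;
        }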

    Network of memory modules with logarithmic access

    Publication Number: US10394726B2

    Publication Date: 2019-08-27

    Application Number: US15229708

    Filing Date: 2016-08-05

    Inventor: Gabriel Loh

    Abstract: A memory network includes a plurality of memory nodes, each identifiable by an ordinal number m, and a set of links divided into N subsets, where each subset is identifiable by an ordinal number n. For each of the N subsets of links, each link in the subset connects two memory nodes whose ordinal numbers m differ by b^(n-1), where b is a positive number. Each of the memory nodes is communicatively coupled to a processor via at least two non-overlapping pathways through the links.
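
    As a worked example of the link structure, take b = 2: subset n = 1 links nodes whose indices differ by 1, subset n = 2 by 2, subset n = 3 by 4, and so on, so any node-to-node distance can be covered in a logarithmic number of hops. The greedy hop count below is a sketch under that assumption.

        #include <cstdio>
        #include <cstdlib>

        // With base b, subset n provides links between nodes whose ordinal numbers
        // differ by b^(n-1), so any distance can be covered in O(log_b(distance))
        // hops by taking the largest useful link at each step.
        int hops(int from, int to, int b) {
            int dist = std::abs(to - from);
            int count = 0;
            while (dist > 0) {
                long step = 1;                       // largest link length <= dist
                while (step * b <= dist) step *= b;
                dist -= static_cast<int>(step);
                ++count;
            }
            return count;
        }

        int main() {
            // Example: 64 memory nodes, b = 2, links of length 1, 2, 4, 8, 16, 32.
            std::printf("node 0 -> node 37 in %d hops\n", hops(0, 37, 2));  // 3 hops
            return 0;
        }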

    ACCURATE ON-CHIP TEMPERATURE SENSING USING THERMAL OSCILLATOR

    Publication Number: US20190257696A1

    Publication Date: 2019-08-22

    Application Number: US15902101

    Filing Date: 2018-02-22

    Abstract: A calibrated temperature sensor includes a power-on oscillator responsive to a calibration enable signal for providing a power-on clock signal, a temperature-dependent oscillator responsive to the calibration enable signal for providing a temperature-dependent clock signal, and a measurement logic circuit. The measurement logic circuit counts a first number of pulses of the temperature-dependent clock signal during a first calibration period using the power-on clock signal, a second number of pulses of the temperature-dependent clock signal during a second calibration period using a system clock signal, a third number of pulses of the power-on clock signal during a third calibration period using the system clock signal, and, during a normal operation mode, a fourth number of pulses of the temperature-dependent clock signal using the system clock signal. The first calibration period precedes both the second and third calibration periods.
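
    The abstract specifies which counts are taken but not how they are combined, so the sketch below is only one plausible arithmetic: the first and third counts chain f_td/f_po and f_po/f_sys into a calibration-time ratio, the second count measures the same ratio directly, and the fourth count tracks that ratio during normal operation. The calibration period lengths, reference temperature, and linear sensitivity are all assumed values.

        #include <cstdio>

        struct Counts {
            double n1;  // temp-dependent pulses over P1 power-on-clock cycles
            double n2;  // temp-dependent pulses over P2 system-clock cycles
            double n3;  // power-on-clock pulses over P3 system-clock cycles
            double n4;  // temp-dependent pulses over P4 system-clock cycles (normal mode)
        };

        double estimate_temperature_c(const Counts& c, double p1, double p2,
                                      double p3, double p4) {
            // f_td / f_sys at calibration, measured two ways and averaged.
            double ratio_cal_direct  = c.n2 / p2;                  // second calibration period
            double ratio_cal_chained = (c.n1 / p1) * (c.n3 / p3);  // (f_td/f_po) * (f_po/f_sys)
            double ratio_cal = 0.5 * (ratio_cal_direct + ratio_cal_chained);
            double ratio_now = c.n4 / p4;                          // f_td / f_sys right now

            const double t_cal_c       = 25.0;  // assumed calibration temperature (deg C)
            const double percent_per_c = 0.3;   // assumed oscillator sensitivity (%/deg C)
            double percent_change = 100.0 * (ratio_now - ratio_cal) / ratio_cal;
            return t_cal_c + percent_change / percent_per_c;
        }

        int main() {
            // Example: f_td = 50 MHz, f_po = 20 MHz, f_sys = 100 MHz at calibration,
            // then the thermal oscillator speeds up by 3% as the die warms.
            Counts c{2500, 500, 200, 515};
            std::printf("estimated temperature: %.1f C\n",
                        estimate_temperature_c(c, 1000, 1000, 1000, 1000));
            return 0;
        }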

    DYNAMIC VARIABLE PRECISION COMPUTATION
    Invention Application

    Publication Number: US20190235838A1

    Publication Date: 2019-08-01

    Application Number: US16378055

    Filing Date: 2019-04-08

    CPC classification number: G06F7/4824 G06F7/729

    Abstract: A conversion unit converts operands from a conventional number system, which represents each binary digit of an operand with one bit, to redundant number system (RNS) operands, which represent each binary digit with a plurality of bits. An arithmetic logic unit performs an arithmetic operation on the RNS operands in a direction from the most significant bit (MSB) to the least significant bit (LSB). The arithmetic logic unit stops the arithmetic operation before it reaches a target binary digit indicated by a dynamic precision associated with the RNS operands. In some cases, a power supply provides power to bit slices in the arithmetic logic unit and a clock signal generator provides clock signals to the bit slices. Gate logic is configured to gate the power or the clock signals provided to a subset of the bit slices.
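
    The point of the redundant representation is that each digit position can be summed without a carry from less significant positions, so the operation can proceed MSB-first and stop once the dynamically selected precision is reached. The sketch below illustrates that idea; the digit encoding (plain 0/1/2 digits in a vector) and the precision rule are simplifications chosen for the example, not the patent's exact encoding.

        #include <cstdio>
        #include <vector>

        // Redundant operand: each binary position holds a small digit that may
        // exceed one bit.  Index 0 is the most significant position.
        using Redundant = std::vector<int>;

        // Convert an unsigned value to a width-digit redundant operand (digits 0/1).
        Redundant to_redundant(unsigned value, int width) {
            Redundant r(width, 0);
            for (int i = 0; i < width; ++i)
                r[width - 1 - i] = (value >> i) & 1u;
            return r;
        }

        // Add MSB-first and stop once 'precision' positions have been produced;
        // the remaining low-order positions are simply never computed.
        Redundant add_msb_first(const Redundant& a, const Redundant& b, int precision) {
            Redundant sum(a.size(), 0);
            for (int i = 0; i < static_cast<int>(a.size()) && i < precision; ++i)
                sum[i] = a[i] + b[i];   // digits may be 0, 1, or 2 (redundant, no carry chain)
            return sum;
        }

        // Collapse the redundant digits back to an integer; skipped positions
        // contribute nothing, giving a reduced-precision result.
        unsigned to_integer(const Redundant& r) {
            unsigned v = 0;
            for (int d : r) v = 2u * v + static_cast<unsigned>(d);
            return v;
        }

        int main() {
            Redundant a = to_redundant(0b10110100u, 8);
            Redundant b = to_redundant(0b01011100u, 8);
            unsigned full    = to_integer(add_msb_first(a, b, 8));  // all 8 positions
            unsigned reduced = to_integer(add_msb_first(a, b, 4));  // top 4 only
            std::printf("full=%u reduced=%u exact=%u\n", full, reduced,
                        0b10110100u + 0b01011100u);
            return 0;
        }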
