NMONITOR instruction for monitoring a plurality of addresses

    Publication number: US10289516B2

    Publication date: 2019-05-14

    Application number: US15394271

    Filing date: 2016-12-29

    Abstract: A processor core includes a decode circuit to decode an instruction, where the instruction specifies an address to be monitored. The processor core further includes a monitor circuit, where the monitor circuit includes a data structure to store a plurality of entries for addresses that are being monitored by the monitor circuit and a triggered queue, where the monitor circuit is to enqueue an address being monitored by the monitor circuit into the triggered queue in response to a determination that a triggering event for the address being monitored by the monitor circuit occurred. The processor core further includes an execution circuit to execute the decoded instruction to add an entry for the specified address to be monitored into the data structure and ensure, using a cache coherence protocol, that a coherency status of a cache line corresponding to the specified address to be monitored is in a shared state.
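The monitor circuit in this abstract can be pictured with a small software model. The sketch below is illustrative only: the class and method names are hypothetical, the coherency states are reduced to the letters "S" and "I", and the triggering event is assumed to be a write by another agent, which the abstract does not specify.

```python
from collections import deque


class MonitorCircuit:
    """Toy software model of the monitor circuit (names are hypothetical)."""

    def __init__(self):
        self.monitored = {}        # address -> coherency state of its cache line
        self.triggered = deque()   # addresses whose triggering event occurred

    def nmonitor(self, address):
        """Model executing the decoded instruction for one address."""
        # Add an entry and record the line as being in the shared state ("S").
        self.monitored[address] = "S"

    def on_remote_write(self, address):
        """Assumed triggering event: another agent writes the monitored line."""
        if address in self.monitored:
            self.monitored[address] = "I"   # the write invalidates the local copy
            self.triggered.append(address)  # enqueue into the triggered queue


monitor = MonitorCircuit()
for addr in (0x1000, 0x2000, 0x3000):   # several addresses monitored at once
    monitor.nmonitor(addr)

monitor.on_remote_write(0x2000)
monitor.on_remote_write(0x4000)   # not monitored, so nothing is enqueued
```

Keeping the line in the shared state means a later write by another core has to go through the coherence protocol, which is what would let the monitor observe the event in this model.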

    INDEPENDENT TUNING OF MULTIPLE HARDWARE PREFETCHERS

    Publication number: US20190095333A1

    Publication date: 2019-03-28

    Application number: US15718845

    Filing date: 2017-09-28

    Abstract: Embodiments of apparatuses, methods, and systems for independent tuning of multiple hardware prefetchers are described. In an embodiment, an apparatus includes a processor core, a cache memory, a hardware prefetcher, and a prefetch tuner. The hardware prefetcher is to prefetch data for the processor core from a system memory to the cache memory. The prefetch tuner is to adjust a prefetch rate of the hardware prefetcher based on a fraction of late prefetches. The prefetch tuner includes a late prefetch counter to count a number of late prefetches for the hardware prefetcher, a prefetch counter to count a number of prefetches for the hardware prefetcher, and a late prefetch calculator to calculate the fraction of late prefetches based on the number of late prefetches and the number of prefetches.
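The two counters and the late-prefetch calculator can be sketched in software as follows. This is a minimal sketch, not the patented design: the thresholds, the rate limits, and the direction of adjustment (raise the rate when many prefetches arrive late, lower it when few do) are assumptions, as are all names.

```python
class PrefetchTuner:
    """Toy per-prefetcher tuner; thresholds and rate policy are assumptions."""

    def __init__(self, rate=4, min_rate=1, max_rate=16, high=0.25, low=0.05):
        self.rate = rate
        self.min_rate = min_rate
        self.max_rate = max_rate
        self.high = high            # above this fraction, raise the rate
        self.low = low              # below this fraction, lower the rate
        self.prefetches = 0         # prefetch counter
        self.late_prefetches = 0    # late prefetch counter

    def record(self, late):
        """Count one prefetch, and count it as late if it arrived too late."""
        self.prefetches += 1
        if late:
            self.late_prefetches += 1

    def late_fraction(self):
        """Late prefetch calculator: late prefetches / total prefetches."""
        if self.prefetches == 0:
            return 0.0
        return self.late_prefetches / self.prefetches

    def adjust(self):
        """Adjust the rate from the late fraction, then start a new interval."""
        fraction = self.late_fraction()
        if fraction > self.high:
            self.rate = min(self.rate * 2, self.max_rate)
        elif fraction < self.low:
            self.rate = max(self.rate // 2, self.min_rate)
        self.prefetches = 0
        self.late_prefetches = 0
        return self.rate


tuner = PrefetchTuner()
for i in range(10):
    tuner.record(late=(i < 4))   # 4 of 10 prefetches are late

fraction = tuner.late_fraction()
new_rate = tuner.adjust()
```

Giving each hardware prefetcher its own tuner instance is one way to read "independent tuning": every prefetcher keeps separate counters and a separate rate.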

    STORING CACHE LINES IN DEDICATED CACHE OF AN IDLE CORE

    Publication number: US20190303294A1

    Publication date: 2019-10-03

    Application number: US15940712

    Filing date: 2018-03-29

    Abstract: Embodiments of this disclosure provide a mechanism to store cache lines in a dedicated cache of an idle core. In one embodiment, a multi-core processor comprising a first core, a second core, a first cache, a second cache, a third cache, and a cache controller unit is provided. The cache controller is operatively coupled to at least the first cache, the second cache, and the third cache. The cache controller is to evict a first line from the first cache, wherein the first core is in an active state. Responsive to the evicting of the first line, the first line is stored in the third cache. Responsive to storing the first line, a second line is evicted from the third cache. Responsive to evicting the second line, the second line is stored in the second cache when the second core is in an idle state.
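The eviction chain in this abstract can be traced with a small model. The sketch below assumes that the first and second caches are dedicated to the first and second cores and that the third cache is shared between them. It also assumes a simple oldest-first replacement policy. None of these details, nor any of the names, come from the abstract.

```python
from collections import OrderedDict


class Cache:
    """Tiny fixed-capacity cache with oldest-first replacement (an assumption)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()   # tag -> data, oldest entry first

    def insert(self, tag, data):
        """Store a line; return the evicted (tag, data) pair, or None."""
        victim = None
        if tag not in self.lines and len(self.lines) >= self.capacity:
            victim = self.lines.popitem(last=False)
        self.lines[tag] = data
        return victim


def evict_from_active_core(first_cache, third_cache, second_cache, second_core_idle):
    """Follow one eviction from the active core's cache through the chain."""
    first_line = first_cache.lines.popitem(last=False)   # evict the first line
    second_line = third_cache.insert(*first_line)        # store it in the third cache
    if second_line is not None and second_core_idle:
        second_cache.insert(*second_line)                # park it in the idle core's cache
    return first_line, second_line


first = Cache(2)
first.insert("A", 1)
first.insert("B", 2)
third = Cache(1)
third.insert("X", 9)
second = Cache(2)   # dedicated cache of the idle second core

evicted, spilled = evict_from_active_core(first, third, second, second_core_idle=True)
```

In this run, line "A" moves from the first cache into the third cache, which displaces "X" into the idle core's cache instead of dropping it.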

    Independent tuning of multiple hardware prefetchers

    Publication number: US10303609B2

    Publication date: 2019-05-28

    Application number: US15718845

    Filing date: 2017-09-28

    Abstract: Embodiments of apparatuses, methods, and systems for independent tuning of multiple hardware prefetchers are described. In an embodiment, an apparatus includes a processor core, a cache memory, a hardware prefetcher, and a prefetch tuner. The hardware prefetcher is to prefetch data for the processor core from a system memory to the cache memory. The prefetch tuner is to adjust a prefetch rate of the hardware prefetcher based on a fraction of late prefetches. The prefetch tuner includes a late prefetch counter to count a number of late prefetches for the hardware prefetcher, a prefetch counter to count a number of prefetches for the hardware prefetcher, and a late prefetch calculator to calculate the fraction of late prefetches based on the number of late prefetches and the number of prefetches.

    TECHNOLOGIES FOR PROCESSOR SIMULATION MODELING WITH MACHINE LEARNING

    Publication number: US20190004920A1

    Publication date: 2019-01-03

    Application number: US15638727

    Filing date: 2017-06-30

    Abstract: Technologies for processor architecture simulation with machine learning include a computing device that simulates performance of a processor executing training programs with a simulation model. The computing device captures ground truth performance statistics of the processor executing the training programs, for example using a cycle-accurate simulator. The computing device collects training simulation statistics from the simulation model and trains an error model with the training simulation statistics as a feature vector and with the ground truth performance statistics. The computing device may simulate performance of the processor executing a test program, capture test simulation statistics from the simulation model, and predict a predicted error of the simulation model using the error model with the test simulation statistics as a feature vector. The computing device may adjust output of the simulation model or adapt execution of the simulation model based on the predicted error. Other embodiments are described and claimed.
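The train-then-correct flow in this abstract can be sketched end to end. The abstract does not name a learning algorithm, so a 1-nearest-neighbour lookup stands in for the error model here. The statistics, cycle counts, and function names are all made up for illustration, and the sketch treats the simulated cycle count as the output being corrected.

```python
import math


def train_error_model(sim_stats, sim_outputs, ground_truth):
    """Pair each training feature vector with its observed simulation error."""
    return [(features, truth - output)
            for features, output, truth in zip(sim_stats, sim_outputs, ground_truth)]


def predict_error(model, features):
    """Predict the error as that of the nearest training feature vector."""
    nearest = min(model, key=lambda pair: math.dist(pair[0], features))
    return nearest[1]


# Hypothetical training simulation statistics per program:
# (cache miss rate, branch misprediction rate)
train_stats = [(0.10, 0.02), (0.40, 0.05), (0.80, 0.10)]
train_sim_cycles = [1000, 2000, 4000]    # output of the fast simulation model
train_true_cycles = [1050, 2300, 5000]   # ground truth, e.g. cycle-accurate simulator

model = train_error_model(train_stats, train_sim_cycles, train_true_cycles)

# Test program: capture its simulation statistics, predict the error,
# then adjust the simulation model's output by that predicted error.
test_stats = (0.75, 0.09)
test_sim_cycles = 3800
predicted_error = predict_error(model, test_stats)
adjusted_cycles = test_sim_cycles + predicted_error
```

A real error model would more likely be a regression or tree-based learner trained on many more statistics. The nearest-neighbour stand-in only keeps the data flow easy to follow.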
