-
Publication number: US20240143502A1
Publication date: 2024-05-02
Application number: US17958338
Filing date: 2022-10-01
Applicant: INTEL CORPORATION
Inventor: Mark DECHENE, Thomas MULLINS, Ryan CARLSON, Paula PETRICA, Brendan WEST, Jonathan JOHNSON, Nikhil PATIL
IPC: G06F12/0802
CPC classification number: G06F12/0802, G06F2212/601
Abstract: An apparatus and method for implementing a Level-0 (L0) cache within a cache subsystem. For example, one embodiment of a processor comprises: a cache subsystem comprising an L0 cache; a scheduler to schedule a load operation indicating data to be loaded; and a load hit predictor to predict whether the data indicated by the load operation is stored in the L0 cache and to generate a wakeup signal to the scheduler in response to predicting that the data is stored in the L0 cache. Some implementations perform store forwarding in response to load operations using a multi-step approach: a partial linear address check is first performed to determine which load operations are eligible for store forwarding, and a full address check is then performed for the eligible load operations, in which the address of the load is compared against the address of the youngest older store operation. Mini-MOB implementations are also described, including a stale data watchdog function and a wakeup signal to schedule dependent operations.
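The two-step store-forwarding check described in the abstract can be illustrated with a minimal software sketch. This is not the patented hardware design. The names `Store`, `forward_from_store`, and `PARTIAL_BITS`, the 12-bit partial compare width, and the choice to take the youngest older store among the partial matches are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: the partial check compares only the low 12 address bits.
PARTIAL_BITS = 12


@dataclass
class Store:
    age: int      # program-order position; smaller means older (hypothetical field)
    address: int  # full linear address of the store
    data: int     # value written by the store


def forward_from_store(load_age: int, load_address: int,
                       stores: List[Store]) -> Optional[int]:
    """Return forwarded data, or None if the load cannot be forwarded."""
    mask = (1 << PARTIAL_BITS) - 1

    # Step 1: partial linear address check against older stores only.
    candidates = [s for s in stores
                  if s.age < load_age
                  and (s.address & mask) == (load_address & mask)]
    if not candidates:
        return None  # load is not eligible for store forwarding

    # Step 2: full address check against the youngest older store
    # (assumed here to be the youngest among the partial matches).
    youngest = max(candidates, key=lambda s: s.age)
    if youngest.address == load_address:
        return youngest.data
    return None  # partial match only; no forwarding in this sketch


stores = [Store(1, 0x1000, 11), Store(2, 0x2000, 22), Store(3, 0x1000, 33)]
result_full_match = forward_from_store(4, 0x1000, stores)
result_alias_only = forward_from_store(3, 0x1000, stores)
result_ineligible = forward_from_store(4, 0x1004, stores)
```

In this sketch the second load is not forwarded: the youngest older store that passes the partial check (address 0x2000) only aliases the load in its low bits and fails the full compare.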
-
Publication number: US20160275026A1
Publication date: 2016-09-22
Application number: US14663785
Filing date: 2015-03-20
Applicant: INTEL CORPORATION
Inventor: Niall MCDONNELL, Tomasz KANTECKI, Ryan CARLSON, Michael O'HANLON
Abstract: A weakly ordered doorbell at least reduces the cycle cost of talking to a device. This may manifest as a simple performance improvement, but it also allows a reduction in the number of jobs batched into a single doorbell: current DPDK (Data Plane Development Kit) code, for example, batches larger numbers of packets behind a single doorbell to amortize the per-packet doorbell cost. Reducing the number of packets at least provides a better latency profile.
-