Prefetch kernels on data-parallel processors

    Publication No.: US11500778B2

    Publication Date: 2022-11-15

    Application No.: US16813075

    Application Date: 2020-03-09

    Abstract: Embodiments include methods, systems, and non-transitory computer-readable media including instructions for executing a prefetch kernel with reduced intermediate-state storage resource requirements. These include executing a prefetch kernel on a graphics processing unit (GPU) such that the prefetch kernel begins executing before a processing kernel. The prefetch kernel performs memory operations that are based upon at least a subset of the memory operations in the processing kernel.
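
    The sketch below illustrates the general idea in CUDA under assumptions not taken from the patent: a lightweight prefetch_kernel issues a subset of the loads the later process_kernel will issue while holding almost no intermediate state, and it is enqueued ahead of the processing kernel so the touched pages and cache lines are resident when the real computation starts. All names, launch shapes, and the dummy-sink trick are illustrative.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Illustrative prefetch kernel: it issues the same loads the processing
// kernel will issue, but keeps almost no intermediate state, so its
// register/scratch footprint stays small. The per-thread store into `sink`
// only keeps the compiler from eliminating the loads.
__global__ void prefetch_kernel(const float* __restrict__ in, float* sink, size_t n) {
    size_t stride = (size_t)gridDim.x * blockDim.x;
    size_t tid = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    float acc = 0.0f;
    for (size_t i = tid; i < n; i += stride)
        acc += in[i];            // touch the addresses the processing kernel reads
    sink[tid] = acc;             // one element per launched thread; value unused
}

// Processing kernel: the full computation over the same data.
__global__ void process_kernel(const float* __restrict__ in, float* out, size_t n) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = in[i] * in[i] + 1.0f;   // placeholder computation
}

int main() {
    const size_t n = 1 << 24;
    const int threads = 256, prefetch_blocks = 64;
    float *in, *out, *sink;
    cudaMallocManaged(&in,  n * sizeof(float));            // demand-paged data
    cudaMallocManaged(&out, n * sizeof(float));
    cudaMalloc(&sink, prefetch_blocks * threads * sizeof(float));
    for (size_t i = 0; i < n; ++i) in[i] = 1.0f;

    // Enqueue the prefetch kernel ahead of the processing kernel on the same
    // stream, so it starts pulling in pages / cache lines for `in` first.
    prefetch_kernel<<<prefetch_blocks, threads>>>(in, sink, n);
    process_kernel<<<(int)((n + threads - 1) / threads), threads>>>(in, out, n);

    cudaDeviceSynchronize();
    printf("out[0] = %f\n", out[0]);
    cudaFree(in); cudaFree(out); cudaFree(sink);
    return 0;
}
```

    Because the prefetch kernel discards its results, it needs only a handful of registers and no scratch buffers, which is one way the intermediate-state storage can stay small relative to the processing kernel.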

    Method and system for asymmetrical processing with managed data affinity
    4.
    Granted Patent (In Force)

    Publication No.: US09244629B2

    Publication Date: 2016-01-26

    Application No.: US13926765

    Application Date: 2013-06-25

    Abstract: Methods, systems, and computer-readable storage media for more efficient and flexible scheduling of tasks on an asymmetric processing system having at least one host processor and one or more slave processors are disclosed. An example embodiment includes determining a data access requirement of a task, comparing the data access requirement to the respective local memories of the one or more slave processors, selecting a slave processor from the one or more slave processors based upon the comparing, and running the task on the selected slave processor.
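
    A minimal host-side sketch of this kind of affinity-aware selection follows, assuming a simple model in which each slave processor advertises its local-memory capacity and currently resident data sets. The structs, the scoring function, and the selection rule are illustrative assumptions, not the patent's actual criteria.

```cuda
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical model of one slave processor and its local memory contents.
struct SlaveProcessor {
    int id;
    size_t local_mem_bytes;                  // capacity of its local memory
    std::vector<std::string> resident_data;  // data sets currently resident
};

// Hypothetical task: the data sets it needs and a working-set size estimate.
struct Task {
    std::vector<std::string> needed_data;    // the task's data access requirement
    size_t working_set_bytes;
    std::function<void()> body;
};

// Score a slave by how much of the task's required data is already resident
// in its local memory (a simple stand-in for "data affinity").
static size_t affinity_score(const Task& t, const SlaveProcessor& s) {
    size_t hits = 0;
    for (const auto& d : t.needed_data)
        for (const auto& r : s.resident_data)
            if (d == r) { ++hits; break; }
    return hits;
}

// Compare the task's data access requirement to each slave's local memory,
// select the best-matching slave that can hold the working set, and run the
// task there. Returns the chosen slave id, or -1 if none fits.
int schedule_task(const Task& t, std::vector<SlaveProcessor>& slaves) {
    int best = -1;
    size_t best_score = 0;
    for (auto& s : slaves) {
        if (s.local_mem_bytes < t.working_set_bytes) continue;  // won't fit
        size_t score = affinity_score(t, s);
        if (best < 0 || score > best_score) { best = s.id; best_score = score; }
    }
    if (best >= 0) t.body();  // dispatch to the selected slave (simplified)
    return best;
}
```

    A scheduler built this way prefers the slave that already holds most of the task's data and falls back to reporting no placement when no local memory can hold the working set.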

    Method and System for Asymmetrical Processing With Managed Data Affinity
    5.
    Patent Application (In Force)

    Publication No.: US20140380003A1

    Publication Date: 2014-12-25

    Application No.: US13926765

    Application Date: 2013-06-25

    Abstract: Methods, systems, and computer-readable storage media for more efficient and flexible scheduling of tasks on an asymmetric processing system having at least one host processor and one or more slave processors are disclosed. An example embodiment includes determining a data access requirement of a task, comparing the data access requirement to the respective local memories of the one or more slave processors, selecting a slave processor from the one or more slave processors based upon the comparing, and running the task on the selected slave processor.

    System and method for page-conscious GPU instruction

    Publication No.: US11301256B2

    Publication Date: 2022-04-12

    Application No.: US14466080

    Application Date: 2014-08-22

    Abstract: Embodiments disclose a system and method for reducing virtual address translation latency in a wide execution engine that implements virtual memory. One example method comprises receiving a wavefront, classifying the wavefront into a subset based on classification criteria selected to reduce virtual address translation latency associated with a memory support structure, and scheduling the wavefront for processing based on the classifying.
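
    As a rough illustration, the sketch below classifies incoming wavefronts by the virtual-memory region their accesses fall in and then schedules one subset at a time, so back-to-back wavefronts reuse the same translations in the TLB (the "memory support structure"). The region-based key, kRegionBits, and the batch-draining policy are assumptions for illustration, not the patent's actual classification criteria.

```cuda
#include <cstdint>
#include <deque>
#include <unordered_map>
#include <vector>

// Hypothetical wavefront descriptor: the virtual pages its pending memory
// accesses are expected to touch (e.g., derived from a base address and stride).
struct Wavefront {
    int id;
    std::vector<uint64_t> pages;   // virtual page numbers it will access
};

// Wavefronts whose accesses fall in the same region of the address space tend
// to share TLB entries, so they are grouped and scheduled back to back.
constexpr int kRegionBits = 9;     // 512 pages per region (assumption)

class PageConsciousScheduler {
public:
    // Receive a wavefront and classify it into a subset keyed by the region
    // of its first page.
    void receive(const Wavefront& wf) {
        uint64_t key = wf.pages.empty() ? 0 : (wf.pages.front() >> kRegionBits);
        subsets_[key].push_back(wf);
    }

    // Drain one subset at a time so consecutive wavefronts reuse the same
    // address translations, reducing translation misses.
    std::vector<Wavefront> next_batch() {
        std::vector<Wavefront> batch;
        if (subsets_.empty()) return batch;
        auto it = subsets_.begin();
        for (auto& wf : it->second) batch.push_back(wf);
        subsets_.erase(it);
        return batch;
    }

private:
    std::unordered_map<uint64_t, std::deque<Wavefront>> subsets_;
};
```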

    PREFETCH KERNELS ON DATA-PARALLEL PROCESSORS

    Publication No.: US20200210341A1

    Publication Date: 2020-07-02

    Application No.: US16813075

    Application Date: 2020-03-09

    Abstract: Embodiments include methods, systems, and non-transitory computer-readable media including instructions for executing a prefetch kernel with reduced intermediate-state storage resource requirements. These include executing a prefetch kernel on a graphics processing unit (GPU) such that the prefetch kernel begins executing before a processing kernel. The prefetch kernel performs memory operations that are based upon at least a subset of the memory operations in the processing kernel.

    System and Method for Page-Conscious GPU Instruction
    9.
    Patent Application (Under Examination, Published)

    Publication No.: US20160055005A1

    Publication Date: 2016-02-25

    Application No.: US14466080

    Application Date: 2014-08-22

    CPC classification number: G06F9/3887 G06F12/1027 G06F2212/654

    Abstract: Embodiments disclose a system and method for reducing virtual address translation latency in a wide execution engine that implements virtual memory. One example method comprises receiving a wavefront, classifying the wavefront into a subset based on classification criteria selected to reduce virtual address translation latency associated with a memory support structure, and scheduling the wavefront for processing based on the classifying.
