UNIFIED MEMORY SYSTEMS AND METHODS
    Patent application
    Status: Under examination (published)

    Publication No.: US20150206277A1

    Publication date: 2015-07-23

    Application No.: US14601223

    Filing date: 2015-01-20

    CPC classification number: G06T1/20 G06F9/5016 G06F12/109 G06T1/60

    Abstract: The present invention facilitates efficient and effective utilization of unified virtual addresses across multiple components. In one embodiment, the approach uses Operating System (OS) allocation on the central processing unit (CPU) combined with graphics processing unit (GPU) driver mappings to provide a unified virtual address (VA) across both GPU and CPU. This helps ensure that a GPU VA pointer does not collide with a CPU pointer provided by OS CPU allocation (e.g., one returned by the C runtime “malloc” API).
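
The collision-avoidance idea in the abstract can be illustrated with a toy model. This is only a sketch under stated assumptions, not the patented implementation: `OsAllocator`, `GpuDriver`, `alloc_unified`, the page size, and the starting address are all hypothetical names and values chosen for illustration. The point it shows is that when the GPU driver maps an address the OS allocator chose, rather than choosing its own, a GPU pointer cannot overlap a CPU pointer.

```python
from dataclasses import dataclass, field

PAGE = 4096  # assumed page size for this sketch


@dataclass
class OsAllocator:
    """Toy stand-in for OS CPU allocation (the role malloc plays in the abstract)."""
    next_va: int = 0x1000_0000                 # hypothetical start of the VA range
    live: dict = field(default_factory=dict)   # base VA -> size in bytes

    def allocate(self, size: int) -> int:
        size = -(-size // PAGE) * PAGE         # round up to whole pages
        va = self.next_va
        self.next_va += size
        self.live[va] = size
        return va


@dataclass
class GpuDriver:
    """Toy GPU driver that maps OS-chosen addresses instead of inventing its own."""
    os_alloc: OsAllocator
    page_table: dict = field(default_factory=dict)  # GPU VA -> size in bytes

    def alloc_unified(self, size: int) -> int:
        va = self.os_alloc.allocate(size)            # the OS picks the address
        self.page_table[va] = self.os_alloc.live[va] # the driver maps the same VA
        return va


def overlaps(a: int, a_size: int, b: int, b_size: int) -> bool:
    """True if [a, a + a_size) and [b, b + b_size) share any address."""
    return a < b + b_size and b < a + a_size


os_alloc = OsAllocator()
driver = GpuDriver(os_alloc)

cpu_ptr = os_alloc.allocate(100)        # plain CPU allocation
gpu_ptr = driver.alloc_unified(10_000)  # unified allocation visible to the GPU
cpu_ptr2 = os_alloc.allocate(1)         # a later CPU allocation
```

Because every address comes from one allocator, the three ranges are disjoint by construction, and `gpu_ptr` is valid in both the OS bookkeeping and the driver's page table.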

    METHOD AND SYSTEM FOR SEPARATE COMPILATION OF DEVICE CODE EMBEDDED IN HOST CODE
    Patent application
    Status: Granted

    Publication No.: US20130305233A1

    Publication date: 2013-11-14

    Application No.: US13850207

    Filing date: 2013-03-25

    CPC classification number: G06F8/30 G06F8/41 G06F8/54

    Abstract: Embodiments of the present invention provide a novel solution that supports the separate compilation of host code and device code used within a heterogeneous programming environment. Embodiments of the present invention are operable to link device code embedded within multiple host object files using a separate device linking operation. Embodiments of the present invention may extract device code from the respective host object files and then link it together to form linked device code. This linked device code may then be embedded back into a host object generated by embodiments of the present invention, which may then be passed to a host linker to form a host executable file. As such, device code may be split into multiple files and then linked together to form a final executable file by embodiments of the present invention.
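
The extract, device-link, embed, host-link sequence described in the abstract can be sketched as a small simulation. It is a minimal model under stated assumptions, not the patented toolchain: object files are represented as in-memory records, and `HostObject`, `device_link`, `embed_linked_device_code`, `host_link`, and the file and symbol names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class HostObject:
    """Toy host object file; every field name here is hypothetical."""
    name: str
    host_symbols: dict                               # host symbol -> code
    device_defs: dict = field(default_factory=dict)  # device symbol -> code
    device_refs: set = field(default_factory=set)    # device symbols used here


def extract_device_code(obj):
    """Pull the embedded device code out of one host object."""
    return dict(obj.device_defs), set(obj.device_refs)


def device_link(objects):
    """Separate device link step: merge device code from all host objects."""
    linked, referenced = {}, set()
    for obj in objects:
        defs, refs = extract_device_code(obj)
        for symbol, code in defs.items():
            if symbol in linked:
                raise ValueError(f"duplicate device symbol: {symbol}")
            linked[symbol] = code
        referenced |= refs
    unresolved = referenced - set(linked)
    if unresolved:
        raise ValueError(f"unresolved device symbols: {sorted(unresolved)}")
    return linked


def embed_linked_device_code(linked):
    """Wrap the linked device code in a new host object."""
    return HostObject(name="device_link.o", host_symbols={}, device_defs=linked)


def host_link(host_objects, device_object):
    """Host link step: host symbols plus the single linked device image."""
    host = {}
    for obj in host_objects:
        host.update(obj.host_symbols)
    return {"host": host, "device": device_object.device_defs}


# Device code split across two files: a kernel in a.o calls a helper in b.o.
a = HostObject("a.o", {"main": "host code"}, {"kernel": "calls helper"}, {"helper"})
b = HostObject("b.o", {"util": "host code"}, {"helper": "device function"})

linked_obj = embed_linked_device_code(device_link([a, b]))
executable = host_link([a, b], linked_obj)
```

Linking `a.o` alone fails in this model because `helper` is unresolved, which is the situation the separate device link step is meant to handle.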

    RECONFIGURING REGISTER AND SHARED MEMORY USAGE IN THREAD ARRAYS

    Publication No.: US20230297426A1

    Publication date: 2023-09-21

    Application No.: US17698664

    Filing date: 2022-03-18

    CPC classification number: G06F9/5022 G06F9/30098 G06F9/3005 G06F2209/5011

    Abstract: Various embodiments include techniques for utilizing resources on a processing unit. Thread groups executing on a processor begin execution with specified resources, such as a number of registers and an amount of shared memory. During execution, one or more thread groups may determine that they hold more resources than needed to execute the current functions. Such thread groups can deallocate the excess resources to a free pool. Similarly, during execution, one or more thread groups may determine that they hold fewer resources than needed to execute the current functions. Such thread groups can allocate the needed resources from the free pool. Further, producer thread groups that generate data for consumer thread groups can deallocate excess resources prior to completion. The consumer thread groups can allocate the excess resources and initiate execution while the producer thread groups complete execution, thereby decreasing latency between producer and consumer thread groups.
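
The free-pool behaviour in the abstract can be sketched with a small bookkeeping model. This is an illustration under stated assumptions, not the hardware mechanism: `FreePool`, `ThreadGroup`, and the register and shared-memory amounts are hypothetical. It shows a consumer that cannot start until the producer returns excess registers to the pool before finishing.

```python
class FreePool:
    """Toy pool of registers and shared memory available to thread groups."""

    def __init__(self, registers: int, shared_bytes: int):
        self.free = {"registers": registers, "shared_bytes": shared_bytes}

    def take(self, **amounts) -> bool:
        # All-or-nothing: refuse if any requested resource is short.
        if any(self.free[kind] < n for kind, n in amounts.items()):
            return False
        for kind, n in amounts.items():
            self.free[kind] -= n
        return True

    def give(self, **amounts) -> None:
        for kind, n in amounts.items():
            self.free[kind] += n


class ThreadGroup:
    """Toy thread group that can grow or shrink its holdings while running."""

    def __init__(self, name: str, pool: FreePool):
        self.name = name
        self.pool = pool
        self.held = {"registers": 0, "shared_bytes": 0}

    def allocate(self, **amounts) -> bool:
        ok = self.pool.take(**amounts)
        if ok:
            for kind, n in amounts.items():
                self.held[kind] += n
        return ok

    def deallocate(self, **amounts) -> None:
        for kind, n in amounts.items():
            if n > self.held[kind]:
                raise ValueError(f"{self.name} does not hold {n} {kind}")
        for kind, n in amounts.items():
            self.held[kind] -= n
        self.pool.give(**amounts)


pool = FreePool(registers=256, shared_bytes=48 * 1024)

producer = ThreadGroup("producer", pool)
producer.allocate(registers=192, shared_bytes=32 * 1024)

consumer = ThreadGroup("consumer", pool)
started_early = consumer.allocate(registers=128, shared_bytes=16 * 1024)

# The producer no longer needs all its registers and releases the excess
# before it completes, so the consumer can start while the producer finishes.
producer.deallocate(registers=96)
started_after_release = consumer.allocate(registers=128, shared_bytes=16 * 1024)
```

The all-or-nothing `take` is a design choice of this sketch; it keeps a thread group from holding a partial allocation it cannot use.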
