Hardware acceleration for inline caches in dynamic languages

    Publication No.: US09740504B2

    Publication date: 2017-08-22

    Application No.: US14262871

    Filing date: 2014-04-28

    CPC classification number: G06F9/4491 G06F12/0802

    Abstract: Aspects include apparatuses, systems, and methods for hardware acceleration for inline caches in dynamic languages. An inline cache may be initialized for an instance of a dynamic software operation. A call of an initialized instance of the dynamic software operation may be executed by an inline cache hardware accelerator. The inline cache may be checked to determine that its data is current. When the data is current, the initialized instance of the dynamic software operation may be executed using the related inline cache data. When the data is not current, a new inline cache may be initialized for the instance of the dynamic software operation, including the not current data of a previously initialized instance of the dynamic software operation. The inline cache hardware accelerator may include an inline cache memory, a coprocessor, and/or a functional unit on an inline cache pipeline connected to a processor pipeline.
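    The check-then-reuse-or-reinitialize flow in this abstract can be illustrated with a software-only sketch. It does not model the hardware accelerator itself. The `Obj`, `SHAPES`, and `InlineCache` names, and the use of an object "shape" id as the currency check, are assumptions made for illustration and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Obj:
    """Hypothetical dynamic object: a shape id plus a flat slot array."""
    shape: int
    slots: list


# Hypothetical shape table: shape id -> {property name: slot index}.
SHAPES = {1: {"x": 0, "y": 1}, 2: {"y": 0, "x": 1}}


@dataclass
class InlineCache:
    """Per-call-site cache of the last seen shape and the resolved slot."""
    shape: Optional[int] = None
    slot: Optional[int] = None
    hits: int = 0
    misses: int = 0

    def load(self, obj: Obj, name: str):
        if self.shape == obj.shape:          # cached data is current
            self.hits += 1
            return obj.slots[self.slot]      # fast path, no lookup
        # Cached data is not current: do the slow lookup and re-initialize.
        self.misses += 1
        self.shape = obj.shape
        self.slot = SHAPES[obj.shape][name]
        return obj.slots[self.slot]


site = InlineCache()                         # one cache for one `obj.x` site
a, b, c = Obj(1, [10, 20]), Obj(1, [30, 40]), Obj(2, [50, 60])
results = [site.load(o, "x") for o in (a, b, c)]
```

    In this sketch the first and third loads miss (empty cache, then a changed shape) and the second load takes the fast path.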

    3. SYSTEM AND METHOD FOR ADAPTIVELY MANAGING REGISTERS IN AN INSTRUCTION PROCESSOR
    Patent application (pending, published)

    Publication No.: US20160216969A1

    Publication date: 2016-07-28

    Application No.: US14607270

    Filing date: 2015-01-28

    Abstract: Systems and methods for adaptively managing registers in an instruction processor are disclosed. The system identifies one or more registers with inoperable cells. An operand manager identifies a set of operable cells within the one or more registers with inoperable cells and determines if a present instruction will use an operand that can be supported by the set of operable cells. When the set of operable cells can support the operand, the operand manager generates an assignment which is communicated to a register file manager.
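    As a rough software illustration of the operand-fitting decision described above, the sketch below treats each register as a row of bit cells and marks inoperable cells in a bitmask. The bitmask encoding, the restriction to contiguous low-order cells, the non-negative operands, and the tightest-fit policy are all assumptions for illustration. The abstract does not specify them.

```python
def operable_run(faulty_mask: int, width: int = 32) -> int:
    """Return the number of contiguous operable low-order cells (bits)."""
    run = 0
    while run < width and not (faulty_mask >> run) & 1:
        run += 1
    return run


def assign_register(operand: int, faulty_masks: dict, width: int = 32):
    """Pick a register whose operable low-order cells can hold the operand.

    faulty_masks maps a register name to a bitmask in which a 1 marks an
    inoperable cell. Returns (register, usable_bits), or None if nothing fits.
    """
    needed = max(operand.bit_length(), 1)
    candidates = [(operable_run(mask, width), reg)
                  for reg, mask in faulty_masks.items()]
    # Prefer the tightest fit so fully working registers stay free
    # for wide operands.
    fitting = sorted((bits, reg) for bits, reg in candidates if bits >= needed)
    if not fitting:
        return None
    bits, reg = fitting[0]
    return reg, bits


# r0 is fully operable; r1 has a bad cell at bit 8; r2 has one at bit 20.
faulty = {"r0": 0, "r1": 1 << 8, "r2": 1 << 20}
small = assign_register(200, faulty)       # needs 8 bits
medium = assign_register(70000, faulty)    # needs 17 bits
wide = assign_register(2 ** 31, faulty)    # needs 32 bits
too_wide = assign_register(2 ** 32, faulty)
```

    The returned pair stands in for the assignment that the operand manager would pass to the register file manager.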

    Method for simplified task-based runtime for efficient parallel computing

    Publication No.: US10169105B2

    Publication date: 2019-01-01

    Application No.: US14992268

    Filing date: 2016-01-11

    Abstract: Aspects include computing devices, systems, and methods for implementing scheduling and execution of lightweight kernels as simple tasks directly by a thread without setting up a task structure. A computing device may determine whether a task pointer in a task queue is a simple task pointer for the lightweight kernel. The computing device may schedule a first simple task for the lightweight kernel for execution by the thread. The computing device may retrieve, from an entry of a simple task table, a kernel pointer for the lightweight kernel. The entry in the simple task table may be associated with the simple task pointer. The computing device may directly execute the lightweight kernel as the simple task.
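    The dispatch path in this abstract can be sketched in a few lines. The low-bit tag used to mark a simple task pointer, the `Task` class, and the use of Python callables as kernel pointers are hypothetical choices for illustration. The abstract does not say how a simple task pointer is encoded.

```python
from collections import deque

SIMPLE_TAG = 1  # hypothetical encoding: low bit set marks a simple task pointer

simple_task_table = []  # entry index -> kernel (a plain callable here)


def make_simple_task(kernel) -> int:
    """Register a lightweight kernel and return its tagged simple task pointer."""
    simple_task_table.append(kernel)
    return ((len(simple_task_table) - 1) << 1) | SIMPLE_TAG


class Task:
    """Stand-in for a full task structure with its own bookkeeping."""

    def __init__(self, fn):
        self.fn = fn
        self.done = False

    def run(self):
        result = self.fn()
        self.done = True
        return result


def worker(queue: deque) -> list:
    """Drain the queue the way a single thread would."""
    out = []
    while queue:
        ptr = queue.popleft()
        if isinstance(ptr, int) and ptr & SIMPLE_TAG:
            kernel = simple_task_table[ptr >> 1]  # look up the kernel pointer
            out.append(kernel())                  # run directly, no Task object
        else:
            out.append(ptr.run())                 # normal task path
    return out


queue = deque([
    make_simple_task(lambda: 2 + 2),
    Task(lambda: "full"),
    make_simple_task(lambda: 7),
])
outputs = worker(queue)
```

    The simple path skips constructing and updating a task structure, which is the saving the abstract describes for lightweight kernels.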

    5. Method For Simplified Task-based Runtime For Efficient Parallel Computing
    Patent application (pending, published)

    Publication No.: US20170031728A1

    Publication date: 2017-02-02

    Application No.: US14992268

    Filing date: 2016-01-11

    CPC classification number: G06F9/52 G06F9/4843

    Abstract: Aspects include computing devices, systems, and methods for implementing scheduling and execution of lightweight kernels as simple tasks directly by a thread without setting up a task structure. A computing device may determine whether a task pointer in a task queue is a simple task pointer for the lightweight kernel. The computing device may schedule a first simple task for the lightweight kernel for execution by the thread. The computing device may retrieve, from an entry of a simple task table, a kernel pointer for the lightweight kernel. The entry in the simple task table may be associated with the simple task pointer. The computing device may directly execute the lightweight kernel as the simple task.

    6. Method and system for accelerating task control flow
    Granted patent (in force)

    Publication No.: US09529643B2

    Publication date: 2016-12-27

    Application No.: US14604845

    Filing date: 2015-01-26

    CPC classification number: G06F9/52 G06F9/4806 G06F9/4881 Y02D10/24

    Abstract: A computing device (e.g., a mobile computing device, etc.) may be configured to better exploit the concurrency and parallelism enabled by modern multiprocessor architectures by identifying a sequence of tasks via a task dependency controller, commencing execution of a first task in the sequence of tasks, and setting a value of a register so that each remaining task in the sequence of tasks executes after its predecessor task finishes execution without transferring control to a runtime system of the computing device. The task dependency controller may be a hardware component that is shared by the processor cores and/or otherwise configured to transfer control between tasks executing on different processor cores independent of the runtime system and/or without performing the relatively slow and memory-based inter-task, inter-thread or inter-process communications required by conventional solutions.

    7. Hardware Acceleration For Inline Caches In Dynamic Languages
    Patent application (in force)

    Publication No.: US20150205720A1

    Publication date: 2015-07-23

    Application No.: US14262871

    Filing date: 2014-04-28

    CPC classification number: G06F9/4491 G06F12/0802

    Abstract: Aspects include apparatuses, systems, and methods for hardware acceleration for inline caches in dynamic languages. An inline cache may be initialized for an instance of a dynamic software operation. A call of an initialized instance of the dynamic software operation may be executed by an inline cache hardware accelerator. The inline cache may be checked to determine that its data is current. When the data is current, the initialized instance of the dynamic software operation may be executed using the related inline cache data. When the data is not current, a new inline cache may be initialized for the instance of the dynamic software operation, including the not current data of a previously initialized instance of the dynamic software operation. The inline cache hardware accelerator may include an inline cache memory, a coprocessor, and/or a functional unit on an inline cache pipeline connected to a processor pipeline.

    8. SYSTEMS AND METHODS FOR SELECTION OF SPECIALIZED FUNCTIONS IN DYNAMICALLY-TYPED LANGUAGES
    Patent application (pending, published)

    Publication No.: US20140173556A1

    Publication date: 2014-06-19

    Application No.: US14083264

    Filing date: 2013-11-18

    CPC classification number: G06F8/31 G06F9/45529

    Abstract: Systems, methods, and devices for executing a function in a dynamically-typed language are described herein. In one aspect, a method includes generating a function selection decision tree based on one or more specializations of a generic function and one or more function inputs via an electronic device. The method further includes selecting one of the specializations or the generic function based on an input type of at least one function input via the electronic device. The method further includes calling the selected specialization or generic function via the electronic device. Another aspect of the subject matter described in the disclosure provides a method of executing a function in a prototype-based dynamically-typed language. The method includes maintaining a list of calls to one or more specializations of the function via the electronic device. The method further includes creating or destroying a specialization of the function via the electronic device. The method further includes updating calls to the created or destroyed specialization via the electronic device. Advantageously in certain embodiments, selection logic overhead can be reduced using criteria that can utilize different weightages for one or more inputs based on heuristics or runtime information.
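    A minimal sketch of type-based selection between specializations and a generic function follows. The nested-dictionary tree, the exact-type matching, and the assumption that all specializations take the same number of arguments are illustrative simplifications. The weighting heuristics mentioned in the abstract are not modeled.

```python
def build_tree(specializations: dict) -> dict:
    """Build a nested dict keyed by argument type, one level per argument.

    `specializations` maps a tuple of argument types to a function.
    All tuples are assumed to have the same length.
    """
    tree = {}
    for types, fn in specializations.items():
        node = tree
        for t in types[:-1]:
            node = node.setdefault(t, {})
        node[types[-1]] = fn
    return tree


def select(tree: dict, generic, args):
    """Walk the tree by the exact type of each argument; fall back to generic."""
    node = tree
    for arg in args:
        if not isinstance(node, dict):
            return generic          # more arguments than the tree has levels
        node = node.get(type(arg))
        if node is None:
            return generic          # no specialization for this type
    return node if callable(node) else generic


def generic_add(a, b):
    return a + b


def add_ints(a, b):
    return a + b                    # imagine an unboxed integer fast path


def concat(a, b):
    return a + b                    # imagine a string-builder fast path


tree = build_tree({(int, int): add_ints, (str, str): concat})
chosen_int = select(tree, generic_add, (1, 2))
chosen_str = select(tree, generic_add, ("a", "b"))
chosen_mixed = select(tree, generic_add, (1, "b"))
```

    Calling the selected function is then an ordinary call, for example `select(tree, generic_add, (1, 2))(1, 2)`.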

    Data-Driven Accelerator For Machine Learning And Raw Data Analysis

    Publication No.: US20170083827A1

    Publication date: 2017-03-23

    Application No.: US14862408

    Filing date: 2015-09-23

    CPC classification number: G06N20/00 G06F15/8092

    Abstract: Embodiments include computing devices, apparatus, and methods implemented by the apparatus for accelerating machine learning on a computing device. Raw data may be received in the computing device from a raw data source device. The apparatus may identify key features as two dimensional matrices of the raw data such that the key features are mutually exclusive from each other. The key features may be translated into key feature vectors. The computing device may generate a feature vector from at least one of the key feature vectors. The computing device may receive a first partial output resulting from an execution of a basic linear algebra subprogram (BLAS) operation using the feature vector and a weight factor. The first partial output may be combined with a plurality of partial outputs to produce an output matrix. Receiving the raw data on the computing device may include receiving streaming raw data.
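    The data flow in this abstract (two-dimensional key features, feature vectors, a BLAS operation with weights, partial outputs combined into an output matrix) can be sketched in plain Python. The non-overlapping tiling, the row-major flattening, and the pure-Python `gemv` standing in for a real BLAS call are assumptions for illustration only.

```python
def tiles(raw, size):
    """Split a 2D list into non-overlapping size x size tiles (key features)."""
    rows, cols = len(raw), len(raw[0])
    return [
        [row[c:c + size] for row in raw[r:r + size]]
        for r in range(0, rows, size)
        for c in range(0, cols, size)
    ]


def to_vector(tile):
    """Flatten a 2D tile row by row into a feature vector."""
    return [v for row in tile for v in row]


def gemv(weights, x):
    """Matrix-vector product, standing in for a BLAS level-2 GEMV call."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


raw = [[1, 2, 3, 4],
       [5, 6, 7, 8]]
weights = [[1, 0, 0, 0],
           [1, 1, 1, 1]]

# One partial output per feature vector.
partials = [gemv(weights, to_vector(t)) for t in tiles(raw, 2)]
# Combine the partial outputs as the columns of the output matrix.
output = [list(col) for col in zip(*partials)]
```

    Because each tile is processed independently, the same loop could consume tiles from a stream as they arrive.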

    10. Method for exploiting parallelism in task-based systems using an iteration space splitter
    Granted patent (in force)

    Publication No.: US09501328B2

    Publication date: 2016-11-22

    Application No.: US14673857

    Filing date: 2015-03-30

    CPC classification number: G06F9/5066 G06F9/5027

    Abstract: Embodiments include computing devices, systems, and methods for task-based handling of repetitive processes in parallel. At least one processor of the computing device, or a specialized hardware controller, may be configured to partition iterations of a repetitive process and assign the partitions to initialized tasks to be executed in parallel by a plurality of processor cores. Upon completing a task, remaining divisible partitions of the repetitive process of ongoing tasks may be subpartitioned and assigned to the ongoing task, and the completed task or a newly initialized task. Information about the iteration space for a repetitive process may be stored in a descriptor table, and status information for all partitions of a repetitive process stored in a status table. Each processor core may have an associated local table that tracks iteration execution of each task, and is synchronized with the status table.
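    The two splitting steps described above (the initial partitioning, then subpartitioning of the remaining work when a task finishes early) can be sketched as follows. The half-open ranges, the near-equal split, and the rule of handing over half of the remaining iterations are assumptions for illustration. The descriptor, status, and local tables are not modeled.

```python
def partition(start: int, stop: int, parts: int) -> list:
    """Split [start, stop) into `parts` near-equal contiguous ranges."""
    total = stop - start
    bounds = [start + total * i // parts for i in range(parts + 1)]
    return [(bounds[i], bounds[i + 1]) for i in range(parts)]


def split_remaining(next_iter: int, stop: int):
    """Split the unexecuted part of a partition between its owner and an idle task.

    Returns (kept, given). `given` is None when fewer than two iterations
    remain, because the partition is no longer divisible.
    """
    remaining = stop - next_iter
    if remaining < 2:
        return (next_iter, stop), None
    mid = next_iter + (remaining + 1) // 2
    return (next_iter, mid), (mid, stop)


initial = partition(0, 100, 4)
uneven = partition(0, 10, 3)
# A task owning (50, 75) has reached iteration 60 when another task finishes.
kept, given = split_remaining(60, 75)
last_kept, last_given = split_remaining(74, 75)
```

    Here the ongoing task keeps the first half of its remaining range and the completed (or a newly initialized) task receives the second half.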
