Debugging multithreaded code by generating exception upon target address CAM search for variable and checking race condition
    1.
    Granted Patent (Active)

    Publication Number: US08838939B2

    Publication Date: 2014-09-16

    Application Number: US13439229

    Filing Date: 2012-04-04

    IPC Classification: G06F9/312 G06F9/38

    Abstract: Mechanisms are provided for debugging application code using a content addressable memory. The mechanisms receive an instruction in a hardware unit of a processor of the data processing system, the instruction having a target memory address that the instruction is attempting to access. A content addressable memory (CAM) associated with the hardware unit is searched for an entry in the CAM corresponding to the target memory address. In response to an entry in the CAM corresponding to the target memory address being found, a determination is made as to whether information in the entry identifies the instruction as an instruction of interest. In response to the entry identifying the instruction as an instruction of interest, an exception is generated and sent to one of an exception handler or a debugger application. In this way, debugging of multithreaded applications may be performed in an efficient manner.

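    For illustration only, the sketch below models the CAM lookup flow from the abstract in software: each memory access searches a table keyed by the target address, and a hit whose entry marks the access as of interest (or that looks like a cross-thread race on the watched variable) would raise the debug exception. The entry fields, flag names, and the race heuristic are assumptions, not details taken from the patent.

        // Software model of the CAM-based watchpoint check; names are illustrative.
        #include <cstdint>
        #include <unordered_map>

        struct CamEntry {
            bool watch_loads;      // treat loads of this address as "of interest"
            bool watch_stores;     // treat stores of this address as "of interest"
            int  last_writer_tid;  // thread id of the most recent store, -1 if none
        };

        class DebugCam {
        public:
            void add_watch(uint64_t addr, bool loads, bool stores) {
                cam_[addr] = CamEntry{loads, stores, -1};
            }

            // Called by the modeled load/store unit for every access; returns true
            // when an exception would be sent to the exception handler or debugger.
            bool on_access(uint64_t addr, bool is_store, int tid) {
                auto it = cam_.find(addr);           // CAM search on the target address
                if (it == cam_.end()) return false;  // miss: not an instruction of interest

                CamEntry& e = it->second;
                bool of_interest  = is_store ? e.watch_stores : e.watch_loads;
                bool race_suspect = (e.last_writer_tid != -1 && e.last_writer_tid != tid);
                if (is_store) e.last_writer_tid = tid;
                return of_interest || (is_store && race_suspect);
            }

        private:
            std::unordered_map<uint64_t, CamEntry> cam_;  // stand-in for the hardware CAM
        };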

    Single thread performance in an in-order multi-threaded processor
    2.
    Granted Patent (Expired)

    Publication Number: US08650554B2

    Publication Date: 2014-02-11

    Application Number: US12767886

    Filing Date: 2010-04-27

    IPC Classification: G06F9/44 G06F9/45

    CPC Classification: G06F8/456

    Abstract: A mechanism is provided for improving single-thread performance for a multi-threaded, in-order processor core. In a first phase, a compiler analyzes application code to identify instructions that can be executed in parallel with focus on instruction-level parallelism and removing any register interference between the threads. The compiler inserts as appropriate synchronization instructions supported by the apparatus to ensure that the resulting execution of the threads is equivalent to the execution of the application code in a single thread. In a second phase, an operating system schedules the threads produced in the first phase on the hardware threads of a single processor core such that they execute simultaneously. In a third phase, the microprocessor core executes the threads specified by the second phase such that there is one hardware thread executing an application thread.

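    As a rough illustration of the first phase, the sketch below splits two independent computations across software threads and joins them so the result matches single-threaded execution; std::thread and the joins stand in for the hardware threads and the inserted synchronization instructions, and the workload is an assumed example rather than actual compiler output.

        #include <array>
        #include <numeric>
        #include <thread>

        int sum_halves(const std::array<int, 1024>& data) {
            int lo = 0, hi = 0;
            // The two partial sums carry no register or data dependence on each other,
            // so they can run concurrently on threads of the same core.
            std::thread t1([&] { lo = std::accumulate(data.begin(), data.begin() + 512, 0); });
            std::thread t2([&] { hi = std::accumulate(data.begin() + 512, data.end(), 0); });
            t1.join();  // the joins play the role of the synchronization instructions
            t2.join();
            return lo + hi;  // same observable result as a single-threaded sum
        }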

    Architecture Support for Debugging Multithreaded Code
    3.
    Patent Application (Active)

    Publication Number: US20120203979A1

    Publication Date: 2012-08-09

    Application Number: US13439229

    Filing Date: 2012-04-04

    IPC Classification: G06F9/30 G06F12/00

    Abstract: Mechanisms are provided for debugging application code using a content addressable memory. The mechanisms receive an instruction in a hardware unit of a processor of the data processing system, the instruction having a target memory address that the instruction is attempting to access. A content addressable memory (CAM) associated with the hardware unit is searched for an entry in the CAM corresponding to the target memory address. In response to an entry in the CAM corresponding to the target memory address being found, a determination is made as to whether information in the entry identifies the instruction as an instruction of interest. In response to the entry identifying the instruction as an instruction of interest, an exception is generated and sent to one of an exception handler or a debugger application. In this way, debugging of multithreaded applications may be performed in an efficient manner.

    Single Thread Performance in an In-Order Multi-Threaded Processor
    4.
    Patent Application (Expired)

    Publication Number: US20110265068A1

    Publication Date: 2011-10-27

    Application Number: US12767886

    Filing Date: 2010-04-27

    IPC Classification: G06F9/45

    CPC Classification: G06F8/456

    Abstract: A mechanism is provided for improving single-thread performance for a multi-threaded, in-order processor core. In a first phase, a compiler analyzes application code to identify instructions that can be executed in parallel with focus on instruction-level parallelism and removing any register interference between the threads. The compiler inserts as appropriate synchronization instructions supported by the apparatus to ensure that the resulting execution of the threads is equivalent to the execution of the application code in a single thread. In a second phase, an operating system schedules the threads produced in the first phase on the hardware threads of a single processor core such that they execute simultaneously. In a third phase, the microprocessor core executes the threads specified by the second phase such that there is one hardware thread executing an application thread.

    Architecture Support for Debugging Multithreaded Code
    5.
    Patent Application (Pending, Published)

    Publication Number: US20110258421A1

    Publication Date: 2011-10-20

    Application Number: US12762817

    Filing Date: 2010-04-19

    IPC Classification: G06F9/44 G06F9/30

    Abstract: Mechanisms are provided for debugging application code using a content addressable memory. The mechanisms receive an instruction in a hardware unit of a processor of the data processing system, the instruction having a target memory address that the instruction is attempting to access. A content addressable memory (CAM) associated with the hardware unit is searched for an entry in the CAM corresponding to the target memory address. In response to an entry in the CAM corresponding to the target memory address being found, a determination is made as to whether information in the entry identifies the instruction as an instruction of interest. In response to the entry identifying the instruction as an instruction of interest, an exception is generated and sent to one of an exception handler or a debugger application. In this way, debugging of multithreaded applications may be performed in an efficient manner.

    Automatic use of large pages
    6.
    Granted Patent (Active)

    Publication Number: US08954707B2

    Publication Date: 2015-02-10

    Application Number: US13565985

    Filing Date: 2012-08-03

    IPC Classification: G06F12/10

    CPC Classification: G06F12/023

    Abstract: A mechanism is provided for automatic use of large pages. An operating system loader performs aggressive contiguous allocation followed by demand paging of small pages into a best-effort contiguous and naturally aligned physical address range sized for a large page. The operating system detects when the large page is fully populated and switches the mapping to use large pages. If the operating system runs low on memory, the operating system can free portions and degrade gracefully.

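    The sketch below is a loose model of the promotion step described above: small pages are demand-faulted into a naturally aligned large-page-sized region, and once every slot is populated the region's mapping is switched to a single large page; under memory pressure it can be demoted again. The 4 KB/2 MB sizes and the function names are assumptions made for illustration, not the actual OS interfaces.

        #include <bitset>
        #include <cstddef>
        #include <cstdint>

        constexpr uint64_t    kSmallPage = 4 * 1024;
        constexpr uint64_t    kLargePage = 2 * 1024 * 1024;
        constexpr std::size_t kSlots     = kLargePage / kSmallPage;  // small pages per large page

        struct LargePageRegion {
            uint64_t base;                   // naturally aligned to kLargePage
            std::bitset<kSlots> populated;   // which small pages have been faulted in
            bool mapped_large = false;

            // Called from the modeled page-fault path when a small page is filled in.
            void on_small_page_populated(uint64_t vaddr) {
                populated.set((vaddr - base) / kSmallPage);
                if (!mapped_large && populated.all())
                    mapped_large = true;     // switch the mapping to one large page
            }

            // Under memory pressure, degrade gracefully back to small pages.
            void demote() { mapped_large = false; }
        };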

    Multithreaded processor architecture with operational latency hiding
    7.
    Granted Patent (Active)

    Publication Number: US08230423B2

    Publication Date: 2012-07-24

    Application Number: US11101601

    Filing Date: 2005-04-07

    IPC Classification: G06F9/46 G06F9/40 G06F7/38

    Abstract: A method and processor architecture for achieving a high level of concurrency and latency hiding in an “infinite-thread processor architecture” with a limited number of hardware threads is disclosed. A preferred embodiment defines “fork” and “join” instructions for spawning new context-switched threads. Context switching is used to hide the latency of both memory-access operations (i.e., loads and stores) and arithmetic/logical operations. When an operation executing in a thread incurs a latency having the potential to delay the instruction pipeline, the latency is hidden by performing a context switch to a different thread. When the result of the operation becomes available, a context switch back to that thread is performed to allow the thread to continue.

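    As a toy illustration of the context-switching idea (not the patented pipeline), the loop below advances a pool of hardware-thread contexts one cycle at a time; a thread that issues a long-latency load is marked not-ready and another ready thread is selected until the result arrives. The instruction mix and latency value are assumptions.

        #include <cstdint>
        #include <vector>

        struct HwThread {
            uint64_t ready_at  = 0;  // cycle at which the thread's pending result is available
            uint64_t work_done = 0;  // instructions retired by this thread
        };

        void run(std::vector<HwThread>& pool, uint64_t cycles, uint64_t load_latency) {
            for (uint64_t now = 0; now < cycles; ++now) {
                for (HwThread& t : pool) {            // pick the first thread whose result is ready
                    if (t.ready_at <= now) {
                        ++t.work_done;                // issue one instruction this cycle
                        if (t.work_done % 4 == 0)     // assume every 4th instruction is a load
                            t.ready_at = now + load_latency;  // stall: switch away until it completes
                        break;                        // only one thread issues per cycle
                    }
                }
            }
        }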

    Extended register bank allocation based on status mask bits set by allocation instruction for respective code block
    8.
    Granted Patent (Expired)

    Publication Number: US07231509B2

    Publication Date: 2007-06-12

    Application Number: US11034559

    Filing Date: 2005-01-13

    IPC Classification: G06F9/34

    Abstract: An extended register processor includes a register file having a legacy register set and an extended register set. The extended register set includes a plurality of extended registers accessible only to extended register instructions. The processor maps extended register references to physical extended registers at run time. The processor includes a configurable extended register mapping unit to support this functionality. The mapping unit is accessible to an instruction decoder, which detects extended register references and forwards them to the mapping unit. The mapping unit returns a physical extended register corresponding to the extended register reference in the instruction. The mapping unit is configurable so that, for example, the mapping is specific to a code block. An extended register allocation instruction causes the processor to allocate a portion of the extended register set to the code block in which the declaration is located and to configure the mapping unit to reflect the allocation.

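    The sketch below mimics the mapping-unit behavior in software: an allocation call reserves a contiguous slice of the extended register set for one code block, and later extended-register references from that block are resolved to physical register indices through the mapping. The data structures and names are assumptions made for the example, not the hardware design.

        #include <cstddef>
        #include <optional>
        #include <unordered_map>

        class ExtRegMapper {
        public:
            explicit ExtRegMapper(std::size_t num_phys) : next_free_(0), total_(num_phys) {}

            // Models the extended register allocation instruction for one code block.
            bool allocate(int block_id, std::size_t count) {
                if (next_free_ + count > total_) return false;  // not enough physical registers
                base_[block_id] = next_free_;
                next_free_ += count;
                return true;
            }

            // Models the decoder forwarding an extended register reference to the
            // mapping unit; returns the corresponding physical extended register.
            std::optional<std::size_t> resolve(int block_id, std::size_t logical_reg) const {
                auto it = base_.find(block_id);
                if (it == base_.end()) return std::nullopt;     // block was never allocated
                return it->second + logical_reg;
            }

        private:
            std::unordered_map<int, std::size_t> base_;  // code block -> first physical register
            std::size_t next_free_;
            std::size_t total_;
        };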

    Managing data access in mobile devices
    9.
    Granted Patent

    Publication Number: US10469979B2

    Publication Date: 2019-11-05

    Application Number: US13459779

    Filing Date: 2012-04-30

    IPC Classification: G06F15/16 H04W4/00

    Abstract: A method for managing data access in a mobile device is provided in the illustrative embodiments. Using a data manager executing in the mobile device, a data item is configured in a data model. A value parameter of the data item is populated with data and a status parameter of the data item is populated with a status indication. A subscription to the data item is received from a mobile application executing in the mobile device. In response to the subscription, the data and the status of the data item are sent to the mobile application.
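
    A minimal publish/subscribe sketch of the flow in the abstract: a data item holds a value and a status, and a subscribing application is immediately handed both, then notified on later updates. The class and field names, and the callback-based delivery, are assumptions rather than the patented interface.

        #include <functional>
        #include <string>
        #include <unordered_map>
        #include <utility>
        #include <vector>

        struct DataItem {
            std::string value;   // value parameter of the data item
            std::string status;  // status indication, e.g. "fresh" or "stale"
        };

        class DataManager {
        public:
            using Subscriber = std::function<void(const DataItem&)>;

            // Populate (or update) a data item and notify existing subscribers.
            void set(const std::string& name, DataItem item) {
                items_[name] = std::move(item);
                for (auto& cb : subs_[name]) cb(items_[name]);
            }

            // A mobile application subscribes to a data item; the current value and
            // status are delivered right away, as the abstract describes.
            void subscribe(const std::string& name, Subscriber cb) {
                auto it = items_.find(name);
                if (it != items_.end()) cb(it->second);
                subs_[name].push_back(std::move(cb));
            }

        private:
            std::unordered_map<std::string, DataItem> items_;
            std::unordered_map<std::string, std::vector<Subscriber>> subs_;
        };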

    USING LARGE FRAME PAGES WITH VARIABLE GRANULARITY
    10.
    Patent Application (Active)

    Publication Number: US20140013073A1

    Publication Date: 2014-01-09

    Application Number: US13541055

    Filing Date: 2012-07-03

    IPC Classification: G06F12/10

    CPC Classification: G06F12/1009 G06F12/109

    Abstract: The page tables in existing art are modified to allow virtual address resolution by mapping to multiple overlapping entries, and resolving a physical address from the most specific entry. This enables more efficient use of system resources by allowing smaller frames to shadow larger frames. A page table is selected. When a virtual address in a request corresponds to an entry in the page table, which identifies a next page table associated with the large frame, a determination is made that the virtual address corresponds to an entry in the next page table, the entry in the next page table referencing a small frame overlay for the large frame. The virtual address is mapped to a physical address in the small frame overlay using data of the entry in the next page table. The physical address in a process-specific view of the large frame is returned.

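    The translation sketch below illustrates the overlapping-mapping idea: a large-frame page table entry may carry a small-frame overlay table, and address resolution prefers the more specific small-frame entry when one exists, falling back to the large frame otherwise. The page sizes and structure layout are illustrative assumptions, not the actual page-table format.

        #include <cstdint>
        #include <optional>
        #include <unordered_map>

        constexpr uint64_t kSmall = 4 * 1024;
        constexpr uint64_t kLarge = 2 * 1024 * 1024;

        struct LargeEntry {
            uint64_t large_frame_pa;                         // base of the large physical frame
            std::unordered_map<uint64_t, uint64_t> overlay;  // small-page index -> small frame PA
        };

        // Resolve a virtual address against one process-specific page table.
        std::optional<uint64_t> translate(const std::unordered_map<uint64_t, LargeEntry>& table,
                                          uint64_t va) {
            auto it = table.find(va / kLarge);               // entry covering the large frame
            if (it == table.end()) return std::nullopt;

            uint64_t small_index = (va % kLarge) / kSmall;
            auto ov = it->second.overlay.find(small_index);  // most specific mapping wins
            if (ov != it->second.overlay.end())
                return ov->second + (va % kSmall);           // small frame overlay
            return it->second.large_frame_pa + (va % kLarge);  // fall back to the large frame
        }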