Abstract:
Mechanisms are provided for debugging application code using a content addressable memory. The mechanisms receive an instruction in a hardware unit of a processor of a data processing system, the instruction having a target memory address that the instruction is attempting to access. A content addressable memory (CAM) associated with the hardware unit is searched for an entry corresponding to the target memory address. In response to such an entry being found, a determination is made as to whether information in the entry identifies the instruction as an instruction of interest. In response to the entry identifying the instruction as an instruction of interest, an exception is generated and sent to one of an exception handler or a debugger application. In this way, debugging of multithreaded applications may be performed efficiently.
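The following C sketch is a software model of the lookup described above; the structure layout, field names, and the printed exception path are assumptions for illustration, not details of the disclosed hardware. It shows a CAM of watched addresses being searched on each access and an exception being raised when the matching entry marks the access as an instruction of interest.

```c
/* Minimal sketch (hypothetical names): a software model of the CAM lookup
 * described above. Each entry pairs a target memory address with flags that
 * mark which accesses are "of interest" for debugging. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define CAM_ENTRIES 8

typedef struct {
    bool     valid;
    uint64_t target_addr;   /* address being watched                     */
    bool     trap_on_load;  /* treat loads as instructions of interest   */
    bool     trap_on_store; /* treat stores as instructions of interest  */
} cam_entry_t;

static cam_entry_t cam[CAM_ENTRIES];

/* Stands in for the hardware exception path; a real design would deliver
 * the exception to an exception handler or a debugger application. */
static void raise_debug_exception(uint64_t addr, bool is_store)
{
    printf("debug exception: %s at 0x%llx\n",
           is_store ? "store" : "load", (unsigned long long)addr);
}

/* Called for every memory access the modeled hardware unit sees. */
void check_access(uint64_t addr, bool is_store)
{
    for (int i = 0; i < CAM_ENTRIES; i++) {
        if (!cam[i].valid || cam[i].target_addr != addr)
            continue;                       /* no CAM hit for this entry */
        if ((is_store && cam[i].trap_on_store) ||
            (!is_store && cam[i].trap_on_load))
            raise_debug_exception(addr, is_store);  /* instruction of interest */
        return;                             /* CAM hit handled */
    }
}
```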
Abstract:
A mechanism is provided for improving single-thread performance on a multi-threaded, in-order processor core. In a first phase, a compiler analyzes application code to identify instructions that can be executed in parallel, with a focus on instruction-level parallelism and on removing any register interference between the threads. The compiler inserts, as appropriate, synchronization instructions supported by the apparatus to ensure that the resulting execution of the threads is equivalent to the execution of the application code in a single thread. In a second phase, an operating system schedules the threads produced in the first phase on the hardware threads of a single processor core such that they execute simultaneously. In a third phase, the microprocessor core executes the threads specified by the second phase such that each hardware thread executes one application thread.
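As a rough software analogue of the three phases, the sketch below (POSIX threads, hypothetical names) splits one thread's work into two threads that operate on disjoint data and joins them so the combined result matches the single-threaded execution; the join stands in for the synchronization instructions the compiler would insert.

```c
/* Illustrative sketch only: a software analogue of splitting one
 * application thread's work into two parallel threads whose combined
 * result equals the single-threaded execution. Names are hypothetical. */
#include <pthread.h>
#include <stdio.h>

#define N 1000

static int  data[N];
static long partial_sum[2];

/* Each split thread works on a disjoint half, so there is no register or
 * memory interference between the two. */
static void *split_worker(void *arg)
{
    int id = *(int *)arg;
    int lo = id * (N / 2), hi = lo + (N / 2);
    long s = 0;
    for (int i = lo; i < hi; i++)
        s += data[i];
    partial_sum[id] = s;
    return NULL;
}

int main(void)
{
    for (int i = 0; i < N; i++)
        data[i] = i;

    pthread_t t[2];
    int ids[2] = {0, 1};
    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, split_worker, &ids[i]);

    /* The join plays the role of the inserted synchronization: only after
     * both halves finish is the combined result observable. */
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);

    printf("sum = %ld\n", partial_sum[0] + partial_sum[1]);
    return 0;
}
```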
Abstract:
A mechanism is provided for automatic use of large pages. An operating system loader performs aggressive contiguous allocation followed by demand paging of small pages into a best-effort contiguous and naturally aligned physical address range sized for a large page. The operating system detects when the large page is fully populated and switches the mapping to use large pages. If the operating system runs low on memory, the operating system can free portions and degrade gracefully.
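A minimal C sketch of the bookkeeping this implies is shown below; the page sizes, structure, and function names are assumptions. It tracks which small pages inside a large-page-sized, naturally aligned region have been demand-paged in and flags the region for promotion once every slot is populated.

```c
/* Sketch under assumptions: the OS tracks, per large-page-sized region, which
 * small pages have been demand-paged in; once every slot is populated it can
 * switch the mapping to a single large page. All names are hypothetical. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define SMALL_PAGE   4096u
#define LARGE_PAGE   (2u * 1024 * 1024)           /* e.g. 2 MiB              */
#define SLOTS        (LARGE_PAGE / SMALL_PAGE)    /* small pages per region  */

typedef struct {
    uint64_t base;              /* naturally aligned physical base address */
    uint32_t populated;         /* count of small pages faulted in         */
    bool     bitmap[SLOTS];     /* which small-page slots are present      */
    bool     promoted;          /* mapping switched to a large page?       */
} large_region_t;

/* Called on each small-page demand fault that lands inside the region. */
void note_small_page_in(large_region_t *r, uint64_t phys_addr)
{
    uint32_t slot = (uint32_t)((phys_addr - r->base) / SMALL_PAGE);
    if (slot >= SLOTS || r->bitmap[slot])
        return;
    r->bitmap[slot] = true;
    if (++r->populated == SLOTS && !r->promoted) {
        r->promoted = true;     /* region fully populated */
        printf("promote region at 0x%llx to a large-page mapping\n",
               (unsigned long long)r->base);
    }
}
```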
Abstract:
A method and processor architecture for achieving a high level of concurrency and latency hiding in an “infinite-thread processor architecture” with a limited number of hardware threads is disclosed. A preferred embodiment defines “fork” and “join” instructions for spawning new context-switched threads. Context switching is used to hide the latency of both memory-access operations (i.e., loads and stores) and arithmetic/logical operations. When an operation executing in a thread incurs a latency having the potential to delay the instruction pipeline, the latency is hidden by performing a context switch to a different thread. When the result of the operation becomes available, a context switch back to that thread is performed to allow the thread to continue.
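The following C program is an illustrative software model, not the disclosed hardware: it mimics hiding a long operation latency by marking the issuing thread as waiting and switching to another ready thread, then resuming the thread once its result is available. All names and the fixed four-cycle latency are assumptions.

```c
/* Minimal software model (hypothetical names) of hiding operation latency by
 * switching to another ready thread context until the result is available. */
#include <stdbool.h>
#include <stdio.h>

#define NTHREADS 3

typedef struct {
    int  id;
    int  wait_cycles;   /* >0 while an issued operation's result is pending */
    bool issued;        /* has this thread issued its long-latency load?    */
    bool done;
} hw_thread_t;

static hw_thread_t threads[NTHREADS];

/* Pick the next thread whose pending latency has expired. */
static hw_thread_t *pick_ready(void)
{
    for (int i = 0; i < NTHREADS; i++)
        if (!threads[i].done && threads[i].wait_cycles == 0)
            return &threads[i];
    return NULL;
}

int main(void)
{
    for (int i = 0; i < NTHREADS; i++)
        threads[i] = (hw_thread_t){ .id = i };

    int remaining = NTHREADS;
    for (int cycle = 0; remaining > 0; cycle++) {
        /* Latencies drain every cycle regardless of which thread runs. */
        for (int i = 0; i < NTHREADS; i++)
            if (threads[i].wait_cycles > 0)
                threads[i].wait_cycles--;

        hw_thread_t *t = pick_ready();
        if (t == NULL)
            continue;                        /* every thread is stalled     */

        if (!t->issued) {
            printf("cycle %d: thread %d issues a load, switch away\n",
                   cycle, t->id);
            t->issued = true;
            t->wait_cycles = 4;              /* latency to hide             */
        } else {
            printf("cycle %d: thread %d resumes and completes\n",
                   cycle, t->id);
            t->done = true;                  /* analogous to a join         */
            remaining--;
        }
    }
    return 0;
}
```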
Abstract:
An extended register processor includes a register file having a legacy register set and an extended register set. The extended register set includes a plurality of extended registers accessible only to extended register instructions. The processor maps extended register references to physical extended registers at run time and includes a configurable extended register mapping unit to support this functionality. The mapping unit is accessible to an instruction decoder, which detects extended register references and forwards them to the mapping unit. The mapping unit returns the physical extended register corresponding to the extended register reference in the instruction. The mapping unit is configurable so that, for example, the mapping is specific to a code block. An extended register allocation instruction causes the processor to allocate a portion of the extended register set to the code block in which the allocation instruction is located and to configure the mapping unit to reflect the allocation.
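A small C sketch of this mapping behavior follows, with hypothetical structures and a simple bump allocator standing in for the allocation policy: an allocation call configures a per-code-block mapping, and a lookup call translates an extended register reference into a physical extended register.

```c
/* Sketch with hypothetical structures: a per-code-block mapping from extended
 * register references (as seen by the decoder) to physical extended registers,
 * configured when the block's allocation instruction executes. */
#include <stdio.h>

#define EXT_PHYS_REGS 64   /* physical extended register file size */
#define EXT_LOGICAL   16   /* extended references a block may use  */

typedef struct {
    int base;              /* first physical register allocated to the block */
    int count;             /* number of physical registers allocated         */
} ext_mapping_t;

static ext_mapping_t current_map;    /* state of the mapping unit            */
static int           next_free_phys; /* simple bump allocator over the file  */

/* Models the extended register allocation instruction for a code block. */
int ext_alloc(int regs_needed)
{
    if (regs_needed > EXT_LOGICAL ||
        next_free_phys + regs_needed > EXT_PHYS_REGS)
        return -1;                           /* allocation fails             */
    current_map.base  = next_free_phys;
    current_map.count = regs_needed;
    next_free_phys   += regs_needed;
    return 0;
}

/* Called by the decoder when it sees an extended register reference;
 * returns the physical extended register, or -1 for an invalid reference. */
int ext_map(int logical_ref)
{
    if (logical_ref < 0 || logical_ref >= current_map.count)
        return -1;
    return current_map.base + logical_ref;
}

int main(void)
{
    ext_alloc(4);                            /* code block declares 4 registers */
    printf("xr2 maps to physical %d\n", ext_map(2));
    return 0;
}
```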
Abstract:
A method for managing data access in a mobile device is provided in the illustrative embodiments. Using a data manager executing in the mobile device, a data item is configured in a data model. A value parameter of the data item is populated with data and a status parameter of the data item is populated with a status indication. A subscription to the data item is received from a mobile application executing in the mobile device. In response to the subscription, the data and the status of the data item are sent to the mobile application.
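The C sketch below illustrates the subscription flow with a hypothetical API: a data item carries a value parameter and a status parameter, and subscribing immediately delivers both to the mobile application through a callback.

```c
/* Illustrative sketch (hypothetical API): a data item holding a value and a
 * status, plus a subscribe call that immediately delivers both to the
 * subscribing application via a callback. */
#include <stdio.h>

typedef enum { STATUS_FRESH, STATUS_STALE, STATUS_UNAVAILABLE } item_status_t;

typedef void (*subscriber_cb)(const char *value, item_status_t status);

typedef struct {
    const char   *name;
    const char   *value;      /* value parameter of the data item      */
    item_status_t status;     /* status parameter of the data item     */
    subscriber_cb subscriber; /* at most one subscriber in this sketch */
} data_item_t;

/* The data manager sends the current data and status on subscription. */
void subscribe(data_item_t *item, subscriber_cb cb)
{
    item->subscriber = cb;
    cb(item->value, item->status);
}

static void app_on_update(const char *value, item_status_t status)
{
    printf("app received: %s (status %d)\n", value, status);
}

int main(void)
{
    data_item_t position = { "gps_position", "47.37,8.54", STATUS_FRESH, NULL };
    subscribe(&position, app_on_update);   /* mobile application subscribes */
    return 0;
}
```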
Abstract:
The page tables in existing art are modified to allow virtual address resolution by mapping to multiple overlapping entries and resolving a physical address from the most specific entry. This enables more efficient use of system resources by allowing smaller frames to shadow larger frames. A page table is selected. When a virtual address in a request corresponds to an entry in the page table that identifies a next page table associated with a large frame, a determination is made that the virtual address corresponds to an entry in the next page table, the entry in the next page table referencing a small frame overlay for the large frame. The virtual address is mapped to a physical address in the small frame overlay using data of the entry in the next page table. The physical address in a process-specific view of the large frame is returned.
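The resolution order can be sketched in C as follows; the frame sizes and field names are assumptions. A lookup first checks the next (overlay) page table for a small-frame entry covering the address and falls back to the large frame only when no more specific entry is present.

```c
/* Sketch under assumptions (sizes and field names are hypothetical): resolve a
 * virtual address by preferring the most specific mapping, i.e. a small-frame
 * overlay entry in a next-level table shadowing part of a large frame. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define SMALL_FRAME   4096ull
#define LARGE_FRAME   (2ull * 1024 * 1024)
#define OVERLAY_SLOTS (LARGE_FRAME / SMALL_FRAME)

typedef struct {
    bool     present;
    uint64_t small_frame_pa;             /* physical base of the small frame */
} overlay_entry_t;

typedef struct {
    uint64_t        large_frame_pa;      /* physical base of the large frame */
    overlay_entry_t next[OVERLAY_SLOTS]; /* next page table: overlay entries */
} large_entry_t;

/* Map one virtual address that falls inside the given large-frame entry. */
uint64_t resolve(const large_entry_t *e, uint64_t va)
{
    uint64_t large_off = va % LARGE_FRAME;
    uint64_t slot      = large_off / SMALL_FRAME;

    if (e->next[slot].present)           /* most specific entry wins */
        return e->next[slot].small_frame_pa + (large_off % SMALL_FRAME);

    return e->large_frame_pa + large_off;    /* fall back to the large frame */
}

int main(void)
{
    static large_entry_t e = { .large_frame_pa = 0x40000000ull };
    e.next[1] = (overlay_entry_t){ true, 0x80000000ull };  /* shadow slot 1 */

    printf("0x%llx\n", (unsigned long long)resolve(&e, 0x1000)); /* overlay     */
    printf("0x%llx\n", (unsigned long long)resolve(&e, 0x3000)); /* large frame */
    return 0;
}
```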