摘要:
A cache subsystem may comprise a multi-way set associative cache and a data memory that holds a contiguous block of memory defined by an address stored in a register. Local variables (e.g., Java local variables) may be stored in the data memory. The data memory preferably is adapted to store two groups of local variables. A first group comprises local variables associated with finished methods and a second group comprises local variables associated with unfinished methods. Further, local variables are saved to, or fetched from, external memory upon a context change based on a threshold value differentiating the first and second groups. The first value may comprise a threshold address or an allocation bit associated with each of a plurality of lines forming the data memory.
摘要:
A processor (e.g., a co-processor) comprising a decoder coupled to a pre-decoder, in which the decoder decodes a current instruction in parallel with the pre-decoder pre-decoding a subsequent instruction. In particular, the pre-decoder examines at least five Bytecodes in parallel with the decoder decoding a current instruction. The pre-decoder determines if a subsequent instruction contains a prefix. If a prefix is detected in at least one of the five Bytecodes, a program counter skips the prefix and changes the behavior of the decoder during the decoding of the subsequent instruction.
摘要:
In some embodiments, a storage medium comprises application software that performs one or more operations and that directly manages a device. The application software comprises instructions that initialize an application data structure (e.g., an object or array) usable by the application software to manage the device and also comprises instructions that map the application data structure to a memory associated with the device without the use of a device driver. In other embodiments, a method comprises initializing an application data structure to manage a hardware device and mapping the application data structure to a memory associated with the hardware device without the use of a device driver. The application data structure may store a single dimensional data structure or a multi-dimensional data structure. In some embodiments, the device being managed by the application software may comprise a display and the application software may comprise Java code.
摘要:
A technique comprises receiving an instruction and dynamically changing the instruction's semantic based on programmable information that is separate from the instruction. The change in semantic may comprise the inclusion of monitoring code that determines a performance characteristic associated with the instruction or a change in the instruction's operation (e.g., the inclusion of read or write barrier operations to support a garbage collector).
摘要:
A digital system is provided with a several processors, a private level one (L1) cache associated with each processor, a shared level two (L2) cache having several segments per entry, and a level three (L3) physical memory. The shared L2 cache architecture is embodied with 4-way associativity, four segments per entry and four valid and dirty bits. When the L2-cache misses, the penalty to access to data within the L3 memory is high. The system supports miss under miss to let a second miss interrupt a segment prefetch being done in response to a first miss. Thus, an interruptible SDRAM to L2-cache prefetch system with miss under miss support is provided. A shared translation look-aside buffer (TLB) is provided for L2 accesses, while a private TLB is associated with each processor. A micro TLB (&mgr;TLB) is associated with each resource that can initiate a memory transfer. The L2 cache, along with all of the TLBs and &mgr;TLBs have resource ID fields and task ID fields associated with each entry to allow flushing and cleaning based on resource or task. Configuration circuitry is provided to allow the digital system to be configured on a task by task basis in order to reduce power consumption.
摘要:
Methods and apparatuses are disclosed for implementing a processor with a split stack. In some embodiments, the processor includes a main stack and a micro-stack. The micro-stack preferably is implemented in the core of the processor, whereas the main stack may be implemented in areas that are external to the core of the processor. Operands are preferably provided to an arithmetic logic unit (ALU) by the micro-stack, and in the case of underflow (micro-stack empty), operands may be fetched from the main stack. Operands are written to the main stack during overflow (micro-stack full) or by explicit flushing of the micro-stack. By optimizing the size of the micro-stack, the number of operands fetched from the main stack may be reduced, and consequently the processor's power consumption may be reduced.
摘要:
A system comprises a first processor, a second processor coupled to the first processor, memory coupled to, and shared by, the first and second processors, and a synchronization unit coupled to the first and second processors. The second processor preferably comprises stack storage that resides in the core of the second processor. Further, the second processor executes stack-based instructions while the first processor executes one or more tasks including, for example, managing the memory via an operating system that executes only on the first processor. Associated methods are also disclosed.
摘要:
An electronic system comprises a processor, memory coupled to the processor, and an application programming interface that causes an embedded garbage collection object to be active. The memory stores one or more objects that selectively have references from root objects. The embedded garbage collection object preferably uses control data to cause objects to be removed from said memory, the removed objects comprise those objects that were created while an embedded garbage collection object was active and that do not have references from root objects.
摘要:
A process and associated system comprise pre-allocating a portion of memory in a first processor based upon a control input and determining in a second processor if the portion of the pre-allocated memory can satisfy a memory allocation request. Further, if a portion of pre-allocated memory can satisfy a memory allocation request, the technique includes assigning the pre-allocated portion of memory to the allocation request. However, if a portion of pre-allocated memory cannot satisfy a memory allocation request, the technique includes allocating a portion of memory in the first processor to the allocation request.
摘要:
A system comprises a first processor having cache memory, a second processor having cache memory and a coherence buffer that can be enabled and disabled by the first processor. The system also comprises a memory subsystem coupled to the first and second processors. For a write transaction originating from the first processor, the first processor enables the second processor's coherence buffer, and information associated with the first processor's write transaction is stored in the second processor's coherence buffer to maintain data coherency between the first and second processors.