Abstract:
A method of generating a feature descriptor includes determining a first output histogram of an input by processing a first group of pixels of the input to determine first contributions to bins of the first output histogram. The input includes gradient orientation values and gradient magnitude values of a portion of an image that is in a region of a detected feature. After processing the first group of pixels, the method includes determining a second output histogram of the input by processing a second group of pixels of the input to determine second contributions to bins of the second output histogram.
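The following is a minimal sketch of how one group of pixels might be accumulated into an orientation histogram. The bin count (8), the linear interpolation of each magnitude-weighted vote between two adjacent bins, and all names are illustrative assumptions, not details taken from the abstract.

```c
#include <math.h>
#include <stddef.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

#define NUM_BINS 8

/* Add the contribution of each pixel in one group to the histogram.
 * orientations[] holds gradient orientations in radians in [0, 2*pi),
 * magnitudes[]  holds the corresponding gradient magnitudes. */
static void accumulate_group(const float *orientations,
                             const float *magnitudes,
                             size_t count,
                             float histogram[NUM_BINS])
{
    const float bin_width = (float)(2.0 * M_PI) / NUM_BINS;

    for (size_t i = 0; i < count; ++i) {
        float pos  = orientations[i] / bin_width;   /* fractional bin index */
        int   bin0 = (int)pos % NUM_BINS;
        int   bin1 = (bin0 + 1) % NUM_BINS;
        float frac = pos - floorf(pos);

        /* Split the magnitude-weighted vote between the two nearest bins. */
        histogram[bin0] += magnitudes[i] * (1.0f - frac);
        histogram[bin1] += magnitudes[i] * frac;
    }
}
```

A caller would invoke this once per pixel group, first for the first group and then for the second, to form the two output histograms in sequence.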
Abstract:
An interleaved multithreading pipeline operating method comprises reading an instruction packet containing at least two instructions, steering a first instruction of the instruction packet to a first execution unit for execution and generating a first result, steering a second instruction of the instruction packet to a second execution unit for execution using the first result and generating a second result, and storing the second result.
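A small software model of the packet flow just described may help: the first instruction executes and its result feeds the second instruction, whose result is then stored. The opcode set, the two "execution unit" functions, and the initial source value are hypothetical placeholders, not elements of the claimed pipeline.

```c
#include <stdint.h>
#include <stdio.h>

typedef enum { OP_ADD, OP_SHL } opcode_t;

typedef struct {
    opcode_t op;
    int32_t  operand;
} instr_t;

typedef struct {
    instr_t first;   /* steered to the first execution unit  */
    instr_t second;  /* steered to the second execution unit */
} packet_t;

/* A stand-in for one execution unit applying an instruction to a source value. */
static int32_t exec_unit(instr_t in, int32_t src)
{
    switch (in.op) {
    case OP_ADD: return src + in.operand;
    case OP_SHL: return src << in.operand;
    }
    return src;
}

int main(void)
{
    packet_t pkt = { { OP_ADD, 5 }, { OP_SHL, 2 } };
    int32_t  first_result  = exec_unit(pkt.first, 10);           /* 15 */
    int32_t  second_result = exec_unit(pkt.second, first_result); /* 60 */
    printf("stored result: %d\n", second_result);                 /* store step */
    return 0;
}
```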
Abstract:
A multithreaded processor device is disclosed and includes a processor that is configured to execute a plurality of executable program threads and a mode control register. The mode control register includes a first data field to control a first execution mode of a first of the plurality of executable program threads and a second data field to control a second execution mode of a second of the plurality of executable program threads. In a particular embodiment, the first execution mode is a run mode and the second execution mode is a low power mode.
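As a rough illustration of per-thread mode fields packed into a single register, the sketch below assumes a 2-bit field per thread and the two mode encodings named in the abstract; the field width, bit layout, and identifiers are assumptions, not the register definition itself.

```c
#include <stdint.h>

#define MODE_RUN        0x0u
#define MODE_LOW_POWER  0x1u
#define FIELD_BITS      2u
#define FIELD_MASK      0x3u

/* Set the execution-mode field for a given thread in a mode control
 * register image and return the updated value. */
static uint32_t set_thread_mode(uint32_t mcr, unsigned thread, uint32_t mode)
{
    unsigned shift = thread * FIELD_BITS;
    mcr &= ~(FIELD_MASK << shift);          /* clear the thread's field */
    mcr |=  (mode & FIELD_MASK) << shift;   /* write the new mode value */
    return mcr;
}

/* Example: thread 0 in run mode, thread 1 in low power mode.
 *   uint32_t mcr = 0;
 *   mcr = set_thread_mode(mcr, 0, MODE_RUN);
 *   mcr = set_thread_mode(mcr, 1, MODE_LOW_POWER);
 */
```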
Abstract:
A processor device is disclosed and includes a memory unit and at least one interleaved multi-threading instruction pipeline. The interleaved multi-threading instruction pipeline utilizes a number of clock cycles that is less than an instruction issue rate for each of a plurality of program threads that are stored within the memory unit. The memory unit includes six instruction caches. Further, the processor device includes six register files and each of the six register files is associated with one of the six instruction caches. Each of the plurality of program threads is associated with one of the six register files. Further, each of the plurality of program threads includes a plurality of instructions and each of the plurality of instructions is stored within one of the six instruction caches of the memory unit.
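A structural sketch of the one-to-one association described above, with each program thread owning its own register file and instruction cache, is given below. The register count, cache size, and type names are illustrative assumptions.

```c
#include <stdint.h>

#define NUM_THREADS   6
#define NUM_REGS     32
#define ICACHE_WORDS 1024

typedef struct {
    uint32_t regs[NUM_REGS];        /* register file for one program thread      */
    uint32_t icache[ICACHE_WORDS];  /* instruction cache holding that thread's
                                       instructions                              */
} thread_context_t;

typedef struct {
    thread_context_t threads[NUM_THREADS];  /* one context per program thread */
} processor_state_t;
```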
Abstract:
An instruction memory unit comprises a first memory structure operable to store program instructions, and a second memory structure operable to store program instructions fetched from the first memory structure, and to issue stored program instructions for execution. The second memory structure is operable to identify a repeated issuance of a forward program redirect construct, and issue a next program instruction already stored in the second memory structure if a resolution of the forward branching instruction is identical to a last resolution of the same. The second memory structure is further operable to issue a backward program redirect construct, determine whether a target instruction is stored in the second memory structure, issue the target instruction if the target instruction is stored in the second memory structure, and fetch the target instruction from the first memory structure if the target instruction is not stored in the second memory structure.
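The behavior of the second memory structure can be pictured with the sketch below, which assumes a small instruction buffer that remembers the last resolution of a forward redirect and can look up a backward-redirect target by address. The buffer size, the instruction representation, and every name here are illustrative.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BUF_SIZE 16

typedef struct {
    uint32_t addr[BUF_SIZE];    /* addresses of buffered instructions          */
    uint32_t instr[BUF_SIZE];   /* buffered instruction words                  */
    size_t   count;
    bool     last_fwd_taken;    /* last resolution of the forward redirect     */
} instr_buffer_t;

/* Repeated forward redirect: if its resolution matches the last one,
 * the next instruction can be issued directly from the buffer. */
static bool reuse_forward_resolution(const instr_buffer_t *buf, bool taken_now)
{
    return taken_now == buf->last_fwd_taken;
}

/* Backward redirect: issue the target from the buffer when present;
 * otherwise it must be fetched from the first memory structure. */
static bool find_backward_target(const instr_buffer_t *buf,
                                 uint32_t target_addr, uint32_t *out)
{
    for (size_t i = 0; i < buf->count; ++i) {
        if (buf->addr[i] == target_addr) {
            *out = buf->instr[i];
            return true;        /* hit: issue from the second structure  */
        }
    }
    return false;               /* miss: fetch from the first structure  */
}
```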
Abstract:
Techniques are provided for processing transmissions in a communications (e.g., CDMA) system. An aspect of the disclosed subject matter includes a method for processing instructions on a multithreaded processor. The multithreaded processor processes a plurality of threads via a plurality of processor pipelines. The method includes the step of determining the operating frequency, F, at which the multithreaded processor operates. Then, the method determines a variable thread switch timeout state for triggering the switching of the processing among the plurality of active threads. The variable thread switch timeout state varies so that each of the plurality of active threads operates at a frequency of an allocated portion of the frequency, F. The allocated portion at which the active threads operate is determined at least in part in order to optimize the operation of the multithreaded processor. The method further switches the processing from a first one of the active threads to a next one of the active threads upon the occurrence of the variable thread switch timeout state.
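One way to picture a variable (rather than fixed) switch timeout is the sketch below, which derives a per-thread cycle budget from that thread's allocated share of the frequency F. The formula, the round length, and the names are assumptions used only to illustrate the idea.

```c
#include <stdint.h>

/* Cycles granted to a thread before switching, given the total number
 * of cycles in one scheduling round and that thread's allocated share
 * of the operating frequency F. */
static uint32_t thread_switch_timeout(uint32_t cycles_per_round,
                                      double allocated_share)
{
    return (uint32_t)(cycles_per_round * allocated_share + 0.5);
}

/* Example: with a 600-cycle round, a thread allocated 1/3 of F is
 * switched out after ~200 cycles, and one allocated 1/6 after ~100. */
```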
Abstract:
Techniques are provided for processing transmissions in a communications (e.g., CDMA) system. The method and system encode and process instructions of mixed lengths (e.g., 16 bits and 32 bits) and instruction packets including instructions of mixed lengths. This includes encoding a plurality of instructions of a first length and a plurality of instructions of a second length. The method and system encode a header having at least one instruction length bit. The instruction length bit distinguishes between instructions of the first length and instructions of the second length for an associated DSP to process in a mixed stream. The method and system distinguish between the instructions of the first length and the instructions of the second length according to the contents of the instruction length bits. The header further includes bits for distinguishing between instructions of varying lengths in an instruction packet.
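As an illustration, the decoder sketch below assumes a packet header carrying one length bit per instruction slot (0 = 16-bit, 1 = 32-bit) and a little-endian payload; the header layout and slot count are assumptions, not the encoding defined in the abstract.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_SLOTS 4

/* Walk the packet payload, consuming 2 or 4 bytes per slot as
 * indicated by the corresponding header length bit, and return the
 * total number of payload bytes consumed. */
static size_t decode_packet(uint8_t header, const uint8_t *payload,
                            uint32_t out[MAX_SLOTS])
{
    size_t offset = 0;

    for (size_t slot = 0; slot < MAX_SLOTS; ++slot) {
        if (header & (1u << slot)) {            /* 32-bit instruction */
            out[slot] = (uint32_t)payload[offset]
                      | (uint32_t)payload[offset + 1] << 8
                      | (uint32_t)payload[offset + 2] << 16
                      | (uint32_t)payload[offset + 3] << 24;
            offset += 4;
        } else {                                /* 16-bit instruction */
            out[slot] = (uint32_t)payload[offset]
                      | (uint32_t)payload[offset + 1] << 8;
            offset += 2;
        }
    }
    return offset;
}
```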
Abstract:
A method of constraining data represented in a deep neural network is described. The method includes determining an initial shifting specified to convert a fixed-point input value to a floating-point output value. The method also includes determining an additional shifting specified to constrain a dynamic range during converting of the fixed-point input value to the floating-point output value. The method further includes performing both the initial shifting and the additional shifting together to form a dynamic range constrained, normalized floating-point output value.
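A minimal sketch of combining the two shift amounts into a single scaling step follows. The Q-format, the choice to express the range constraint as an extra right shift, and the function name are assumptions made purely to illustrate the idea of applying both shifts together.

```c
#include <math.h>
#include <stdint.h>

/* Convert a Q(frac_bits) fixed-point value to float.  The initial
 * shift (frac_bits) removes the fixed-point scaling; the additional
 * shift (range_shift) constrains the dynamic range.  Both shifts are
 * applied together as one combined scale by 2^-(frac_bits+range_shift). */
static float fixed_to_constrained_float(int32_t fixed, int frac_bits,
                                        int range_shift)
{
    int total_shift = frac_bits + range_shift;   /* combined shift amount */
    return (float)fixed * ldexpf(1.0f, -total_shift);
}

/* Example: fixed_to_constrained_float(0x4000, 15, 1) scales a Q15
 * value of 0.5 down by one extra bit, yielding 0.25f. */
```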
Abstract:
A processor includes priority adjustment circuitry configured to adjust a priority of a thread, among multiple threads configured to execute tasks, to a software-defined priority value or a designated high priority value. The processor also includes circuitry configured to identify a lowest priority thread of the multiple threads and a control unit configured to cause the lowest priority thread to take a pending interrupt.
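The selection step can be pictured with the sketch below, which picks the thread with the lowest priority value to take a pending interrupt. The priority convention (larger value = higher priority), the thread count, and the names are assumptions for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_THREADS 4

/* Return the index of the thread with the lowest priority value;
 * ties resolve to the lowest-numbered thread. */
static size_t lowest_priority_thread(const uint8_t priority[NUM_THREADS])
{
    size_t lowest = 0;
    for (size_t t = 1; t < NUM_THREADS; ++t) {
        if (priority[t] < priority[lowest])
            lowest = t;
    }
    return lowest;
}

/* A control unit would then direct the pending interrupt to the
 * thread returned by lowest_priority_thread(priority). */
```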
Abstract:
Methods and apparatus for encoding information regarding a hardware loop of a set of packets are provided, each packet (400) containing instructions. The information is encoded into one or more bits of at least one instruction (300) in the set of packets. The information may indicate whether a packet is or is not an end packet of the loop. Information regarding two hardware loops may be encoded, where information regarding the first loop is encoded into an instruction at a first position in each packet and information regarding the second loop is encoded into an instruction at a second position in each packet. End instruction information may be encoded into an instruction not having encoded loop information, at the same bit positions reserved for the encoded loop information, the end instruction information indicating whether an instruction is the last instruction of a packet and the length of a packet.
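A simple sketch of marking and testing loop information in reserved bits of an instruction word follows. The chosen bit positions (14 and 15) and the two-loop layout are assumptions used purely for illustration, not the encoding claimed above.

```c
#include <stdbool.h>
#include <stdint.h>

#define LOOP0_END_BIT (1u << 14)   /* loop-0 info carried in one instruction position */
#define LOOP1_END_BIT (1u << 15)   /* loop-1 info carried in another position         */

/* Mark an instruction as belonging to the end packet of a hardware loop. */
static uint32_t mark_loop_end(uint32_t instr, uint32_t loop_bit)
{
    return instr | loop_bit;
}

/* Query whether the packet containing this instruction ends the given loop. */
static bool is_loop_end(uint32_t instr, uint32_t loop_bit)
{
    return (instr & loop_bit) != 0;
}
```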