Queuing cache for vectors with elements in predictable order
    1.
    Granted invention patent
    Queuing cache for vectors with elements in predictable order (in force)

    Publication number: US07246203B2

    Publication date: 2007-07-17

    Application number: US10993972

    Filing date: 2004-11-19

    IPC classification: G06F12/00

    CPC classification: G06F12/0862 G06F12/126

    Abstract: A cache for storing data elements is disclosed. The cache includes a cache memory having one or more lines and one or more cache line counters, each associated with a line of the cache memory. In operation, one of the cache line counters is incremented when a request is received to prefetch a data element into the cache memory and is decremented when the data element is consumed. Optionally, one or more reference queues may be used to store the locations of data elements in the cache memory. In one embodiment, data cannot be evicted from cache lines unless the associated cache line counters indicate that the prefetched data has been consumed.
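
The counter behavior described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the class name QueuingCache, the one-block-per-line layout and the first-free-line replacement choice are hypothetical and are not taken from the patent.

```python
from collections import deque


class QueuingCache:
    """Toy model: one pending-use counter per cache line plus a reference queue."""

    def __init__(self, num_lines):
        self.tags = [None] * num_lines     # memory block held by each line
        self.counters = [0] * num_lines    # prefetched-but-unconsumed elements
        self.queue = deque()               # line indices in consumption order

    def prefetch(self, block):
        """Request a data element; return False if no line may be evicted."""
        if block in self.tags:
            line = self.tags.index(block)
        else:
            free = [i for i, count in enumerate(self.counters) if count == 0]
            if not free:
                return False               # every line still holds unconsumed data
            line = free[0]                 # assumed policy: first evictable line
            self.tags[line] = block        # evict and refill
        self.counters[line] += 1           # incremented on the prefetch request
        self.queue.append(line)
        return True

    def consume(self):
        """Consume the oldest prefetched element and decrement its line counter."""
        line = self.queue.popleft()
        self.counters[line] -= 1
        return self.tags[line]
```

In this model a line becomes evictable only when its counter returns to zero, which mirrors the embodiment in the last sentence of the abstract.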

    Method and apparatus for elimination of prolog and epilog instructions in a vector processor using data validity tags and sink counters
    3.
    Granted invention patent
    Method and apparatus for elimination of prolog and epilog instructions in a vector processor using data validity tags and sink counters (lapsed)

    Publication number: US07415601B2

    Publication date: 2008-08-19

    Application number: US10652135

    Filing date: 2003-08-29

    IPC classification: G06F9/30

    Abstract: A method and apparatus for the elimination of prolog and epilog instructions in a vector processor. To eliminate the prolog, a functional unit of the vector processor has at least one input for receiving an input data value tagged with a data validity tag and an output for outputting an intermediate result tagged with a data validity tag. The data validity tags indicate the validity of the data. Before a loop is executed, the data validity tags are set to indicate that the associated data values are invalid. During execution of the loop body, a functional unit checks the validity of its input data. If all of the input data values are valid, the functional operation is performed and the corresponding data validity tag is set to indicate that the result is valid. If any of the input data values is invalid, the data validity tag of the result is set to indicate that the result is invalid. To eliminate the epilog, an iteration counter is associated with each sink unit of the vector processor. When a specified number of data values have been produced by a particular sink, no more data values are produced by that sink. The instructions for the pipelined loop body may be repeated, without alteration, to eliminate prolog and epilog instructions.
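
A minimal software sketch of the two mechanisms follows. The two-stage pipeline (double, then add one), the (value, valid) tuple encoding and the function names are assumptions made for illustration; the patent describes hardware functional units, not this code.

```python
def tagged(op, *inputs):
    """Apply op only if every input is valid; otherwise emit an invalid result."""
    if all(valid for _, valid in inputs):
        return (op(*(value for value, _ in inputs)), True)
    return (None, False)


def run_pipelined_loop(data, cycles):
    """Repeat one unchanged loop body; data is assumed to be non-empty."""
    invalid = (None, False)
    doubled = plus_one = invalid       # tags are cleared before the loop starts
    sink_remaining = len(data)         # sink counter: values still to be produced
    out = []
    for cycle in range(cycles):
        # Sink: store only valid results, and only while the counter is non-zero.
        if plus_one[1] and sink_remaining > 0:
            out.append(plus_one[0])
            sink_remaining -= 1
        # Each functional unit reads the value produced in the previous cycle.
        plus_one = tagged(lambda x: x + 1, doubled)
        source = (data[cycle % len(data)], True)   # the source keeps streaming
        doubled = tagged(lambda x: 2 * x, source)
    return out
```

During the first cycles the invalid tags suppress stores, which stands in for the prolog. After three values the sink counter suppresses further stores, which stands in for the epilog, even though the source in this sketch keeps supplying data.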

    Streaming vector processor with reconfigurable interconnection switch
    4.
    Granted invention patent
    Streaming vector processor with reconfigurable interconnection switch (lapsed)

    Publication number: US07159099B2

    Publication date: 2007-01-02

    Application number: US10184583

    Filing date: 2002-06-28

    IPC classification: G06F15/00

    Abstract: A re-configurable, streaming vector processor (100) is provided which includes a number of function units (102), each having one or more inputs for receiving data values and an output for providing a data value, a re-configurable interconnection switch (104) and a micro-sequencer (118). The re-configurable interconnection switch (104) includes one or more links, each link operable to couple an output of a function unit (102) to an input of a function unit (102) as directed by the micro-sequencer (118). The vector processor may also include one or more input-stream units (122) for retrieving data from memory. Each input-stream unit is directed by a host processor and has a defined interface (116) to the host processor. The vector processor also includes one or more output-stream units (124) for writing data to memory or to the host processor. The defined interface of the input-stream and output-stream units forms a first part of the programming model. The instructions, stored in a memory in the sequencer, that direct the re-configurable interconnection switch form a second part of the programming model.

    Data processing system using multiple addressing modes for SIMD operations and method thereof
    5.
    Granted invention patent
    Data processing system using multiple addressing modes for SIMD operations and method thereof (in force)

    Publication number: US07275148B2

    Publication date: 2007-09-25

    Application number: US10657797

    Filing date: 2003-09-08

    IPC classification: G06F9/312 G06F15/80

    Abstract: Various load and store instructions may be used to transfer multiple vector elements between registers in a register file and memory. A cnt parameter may be used to indicate a total number of elements to be transferred to or from memory, and an rcnt parameter may be used to indicate a maximum number of vector elements that may be transferred to or from a single register within a register file. Also, the instructions may use a variety of different addressing modes. The memory element size may be specified independently from the register element size such that source and destination sizes may differ within an instruction. With some instructions, a vector stream may be initiated and conditionally enqueued or dequeued. Truncation or rounding fields may be provided such that source data elements may be truncated or rounded when transferred. Also, source data elements may be sign-extended or unsigned-extended when transferred.
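
The interaction of cnt and rcnt can be pictured with a short sketch. It assumes a contiguous, unit-stride load and models registers as Python lists; the function name load_vector and the handling of unused register fields are hypothetical, since the abstract does not specify them.

```python
def load_vector(memory, start, cnt, rcnt):
    """Load cnt elements from memory, placing at most rcnt in each register.

    Assumes rcnt > 0 and a unit-stride addressing mode.
    """
    registers = []
    for offset in range(0, cnt, rcnt):
        take = min(rcnt, cnt - offset)    # the last register may be partly filled
        begin = start + offset
        registers.append(memory[begin:begin + take])
    return registers
```

For example, with cnt = 10 and rcnt = 4 this sketch fills two registers with four elements each and a third register with the remaining two.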

    Method of programming linear graphs for streaming vector computation
    6.
    Granted invention patent
    Method of programming linear graphs for streaming vector computation (lapsed)

    Publication number: US06934938B2

    Publication date: 2005-08-23

    Application number: US10184743

    Filing date: 2002-06-28

    CPC classification: G06F8/314

    Abstract: A method for producing a formatted description of a computation representable by a data-flow graph, and a computer for performing a computation so described. A source instruction is generated for each input of the data-flow graph, a computational instruction is generated for each node of the data-flow graph, and a sink instruction is generated for each output of the data-flow graph. The computational instruction for a node includes a descriptor of an operation performed at the node and a descriptor of each instruction that produces an input to the node. The formatted description is a sequential instruction list comprising source instructions, computational instructions and sink instructions. Each instruction has an instruction identifier, and the descriptor of each instruction that produces an input to the node is the instruction identifier. The computer is directed by a program of instructions to implement a computation representable by a data-flow graph.
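
The linearization described in the abstract can be sketched as follows. The tuple layout, the use of the list position as instruction identifier and the function name linearize are assumptions for illustration and not the format defined in the patent.

```python
def linearize(inputs, nodes, outputs):
    """Emit (identifier, opcode, operand identifiers) tuples for a data-flow graph.

    inputs:  list of input names
    nodes:   list of (name, operation, [operand names]) in topological order
    outputs: list of (output name, producing node name)
    """
    ids = {}
    program = []

    def emit(name, opcode, operands):
        ids[name] = len(program)           # assumed identifier: position in the list
        program.append((ids[name], opcode, [ids[o] for o in operands]))

    for name in inputs:                    # one source instruction per graph input
        emit(name, "source", [])
    for name, operation, operands in nodes:  # one computational instruction per node
        emit(name, operation, operands)
    for name, producer in outputs:         # one sink instruction per graph output
        emit(name, "sink", [producer])
    return program
```

Each computational instruction refers to its producers only by identifier, so the list can be read sequentially without reconstructing the graph.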

    Scheduler of program instructions for streaming vector processor having interconnected functional units
    7.
    Granted invention patent
    Scheduler of program instructions for streaming vector processor having interconnected functional units (lapsed)

    Publication number: US07140019B2

    Publication date: 2006-11-21

    Application number: US10184772

    Filing date: 2002-06-28

    IPC classification: G06F9/50 G06F9/44

    CPC classification: G06F8/445

    Abstract: A method for scheduling a computation for execution on a computer with a number of interconnected functional units. The computation is representable by a data-flow graph with a number of nodes connected by edges. A loop-period of the computation is calculated and the nodes are scheduled for throughput by assigning an execution cycle and a functional unit to each node of the data-flow graph. The scheduling of flexible nodes is adjusted to minimize the number of interconnections required in each execution cycle. The edges of the data-flow graph are allocated to one or more of the interconnections between functional units. The scheduling method may be used, for example, to optimize the interconnection fabric design for an ASIC or as part of a compiler for a re-configurable streaming vector processor.

    Interconnection device with integrated storage
    8.
    Granted invention patent
    Interconnection device with integrated storage (lapsed)

    Publication number: US06850536B2

    Publication date: 2005-02-01

    Application number: US10184609

    Filing date: 2002-06-28

    CPC classification: H04L49/103 H04L49/30

    Abstract: An interconnection device (300) with a number of links (306, 308, 310, 312 and 314), each link having a number of link input ports (302), link output ports (304) and storage registers (316). An input selection switch (402) is coupled to a selected link input port to receive an input data token. The storage registers (316) may be used to store input data tokens. A storage access switch (404) is coupled to the input selection switch (402) and to the storage registers (316) and may be used to select the current input data token or a token from the storage registers as an output data token. An output selection switch (406) receives the output data token and provides it to a selected link output port. The interconnection device may, for example, be used to connect the inputs and outputs of the processing elements of a vector processor or digital signal processor.
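
The path of a token through one link can be modeled in a few lines. This is a sketch only: the class name Link, the step method with its parameters, and the read-before-store ordering within a cycle are assumptions that the abstract does not state.

```python
class Link:
    """One link: input selection -> storage access -> output selection."""

    def __init__(self, num_outputs, num_registers):
        self.registers = [None] * num_registers
        self.num_outputs = num_outputs

    def step(self, inputs, in_port, out_port, store_to=None, read_from=None):
        token = inputs[in_port]                     # input selection switch
        if read_from is not None:
            out_token = self.registers[read_from]   # replay a stored token
        else:
            out_token = token                       # pass the current token
        if store_to is not None:
            self.registers[store_to] = token        # keep the token for a later cycle
        outputs = [None] * self.num_outputs
        outputs[out_port] = out_token               # output selection switch
        return outputs
```

Storing a token in one cycle and selecting it in a later cycle lets the link delay data between processing elements without a separate register file.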

    Memory interface with fractional addressing
    9.
    Granted invention patent
    Memory interface with fractional addressing (lapsed)

    Publication number: US06799261B2

    Publication date: 2004-09-28

    Application number: US10184582

    Filing date: 2002-06-28

    IPC classification: G06F12/00

    Abstract: A memory interface device (100) providing a fractional address interface between a data processor (104) and a memory system (102) and a method for retrieving intermediate data values from a memory system using fractional addressing. The device includes an address generator (108) for generating first and second memory addresses, the first memory address being less than or equal to a specified fractional address, the second memory address being greater than or equal to the fractional address. The device also includes a memory access unit (110) coupled to the address generator (108) for retrieving first and second data values from the memory system (102) at the first and second memory addresses, respectively. The device also includes a data access unit (112) for interpolating between the first and second data values and passing the interpolated value to the data processor (104). The memory interface has application in a variety of data processing systems, including digital signal processors and streaming vector processors.
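
A fractional read can be sketched as below. The abstract only says that the device interpolates between the two retrieved values; linear interpolation, the floor and ceiling choice of addresses and the function name read_fractional are assumptions made for this example.

```python
import math


def read_fractional(memory, address):
    """Return a value interpolated between the two neighbors of a fractional address."""
    lo = math.floor(address)     # first address, <= the fractional address
    hi = math.ceil(address)      # second address, >= the fractional address
    first, second = memory[lo], memory[hi]
    frac = address - lo          # position between the two addresses, in [0, 1)
    return first + frac * (second - first)
```

With memory = [10.0, 20.0, 40.0], reading address 1.25 fetches 20.0 and 40.0 and returns 25.0. An integer address makes both generated addresses equal, so the stored value is returned unchanged.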

    METHOD AND APPARATUS FOR BATTERY-AWARE DYNAMIC BANDWIDTH ALLOCATION FOR GROUPS OF WIRELESS SENSOR NODES IN A WIRELESS SENSOR NETWORK
    10.
    Invention patent application
    METHOD AND APPARATUS FOR BATTERY-AWARE DYNAMIC BANDWIDTH ALLOCATION FOR GROUPS OF WIRELESS SENSOR NODES IN A WIRELESS SENSOR NETWORK (in force)

    Publication number: US20080211666A1

    Publication date: 2008-09-04

    Application number: US11681634

    Filing date: 2007-03-02

    IPC classification: H04Q7/00 G08B19/00 H04B7/00

    Abstract: A method and apparatus that allocates bandwidth among wireless sensor nodes in wireless sensor groups in a wireless sensor network (WSN) is disclosed. The method may include forming a plurality of wireless sensor node groups from a plurality of wireless sensor nodes based on battery levels of the wireless sensor nodes, allocating transmission time slots for the wireless sensor nodes in each of the wireless sensor node groups based on at least one channel quality metric, determining average battery levels for each of the wireless sensor node groups and the average battery level of all of the wireless sensor nodes, and determining differences between the average battery level of each of the wireless sensor node groups and the average battery level of all of the wireless sensor nodes, wherein if any difference in the average battery levels is above a predetermined threshold, the plurality of wireless sensor nodes is regrouped according to the battery levels of the plurality of wireless sensor nodes to minimize any variance in average battery level across all of the wireless sensor node groups.
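
The regrouping test in the abstract can be sketched as follows. Groups are modeled as lists of battery levels, and the snake-order dealing used in regroup is a simple balancing heuristic chosen for illustration; it is not the method claimed in the application and is not guaranteed to reach the minimum variance. Time-slot allocation by channel quality is left out.

```python
from statistics import mean


def needs_regrouping(groups, threshold):
    """True if any group average differs from the overall average by more than threshold."""
    overall = mean(level for group in groups for level in group)
    return any(abs(mean(group) - overall) > threshold for group in groups)


def regroup(groups):
    """Rebalance by dealing nodes, sorted by battery level, in snake order (assumed heuristic)."""
    levels = sorted((level for group in groups for level in group), reverse=True)
    k = len(groups)
    new_groups = [[] for _ in range(k)]
    for i, level in enumerate(levels):
        round_number, position = divmod(i, k)
        # Reverse the dealing direction on every other round to even out the averages.
        index = position if round_number % 2 == 0 else k - 1 - position
        new_groups[index].append(level)
    return new_groups
```

For groups [[90, 80], [20, 10]] and a threshold of 10, the group averages (85 and 15) are far from the overall average of 50, so regrouping is triggered; the sketch then yields [[90, 10], [80, 20]], where both groups average 50.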
