System and method for performing efficient conditional vector operations for data parallel architectures involving both input and conditional vector values
    1.
    Granted patent
    Status: in force

    Publication No.: US07818539B2

    Publication date: 2010-10-19

    Application No.: US11511157

    Filing date: 2006-08-28

    IPC classification: G06F15/00 G06F15/76

    Abstract: A processor implements conditional vector operations in which, for example, an input vector containing multiple operands to be used in conditional operations is divided into two or more output vectors based on a condition vector. Each output vector can then be processed at full processor efficiency without cycles wasted due to branch latency. Data to be processed are divided into two groups based on whether or not they satisfy a given condition by, e.g., steering each to one of two index vectors. Once the data have been segregated in this way, subsequent processing can be performed without conditional operations, processor cycles wasted due to branch latency, incorrect speculation or execution of unnecessary instructions due to predication. Other examples of conditional operations include combining one or more input vectors into a single output vector based on a condition vector, conditional vector switching, conditional vector combining, and conditional vector load balancing.
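
The split described in this abstract can be illustrated in software. The following is a minimal sketch only, not the patented hardware mechanism: the function names `split_by_condition` and `combine_by_condition`, and the use of plain Python lists as "vectors", are assumptions made for illustration.

```python
def split_by_condition(values, condition):
    """Steer each element into one of two output vectors using a condition vector."""
    if len(values) != len(condition):
        raise ValueError("values and condition must have the same length")
    taken, not_taken = [], []
    for value, flag in zip(values, condition):
        (taken if flag else not_taken).append(value)
    return taken, not_taken


def combine_by_condition(taken, not_taken, condition):
    """Inverse operation: merge two vectors back into one, guided by the condition vector."""
    taken_iter, not_taken_iter = iter(taken), iter(not_taken)
    return [next(taken_iter) if flag else next(not_taken_iter) for flag in condition]


values = [3, -1, 4, -1, 5, -9]
condition = [v >= 0 for v in values]
non_negative, negative = split_by_condition(values, condition)

# Each group now runs through a uniform loop with no data-dependent branch in its body.
doubled = [2 * v for v in non_negative]
negated = [-v for v in negative]
result = combine_by_condition(doubled, negated, condition)
```

Once the data are segregated, each group is processed by straight-line code, and the same condition vector is reused to restore the original element order.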

    SYSTEM AND METHOD FOR MANAGING INPUT/OUTPUT DATA OF PERIPHERAL DEVICES
    2.
    Patent application
    Status: in force

    Publication No.: US20140304440A1

    Publication date: 2014-10-09

    Application No.: US14245205

    Filing date: 2014-04-04

    IPC classification: G06F3/06

    CPC classification: G06F13/4027 G06F2003/0691

    Abstract: A method for communicating data between peripheral devices and an embedded processor includes receiving, at a data buffer unit of the embedded processor, data from a peripheral device. The method also includes copying the data from the data buffer unit into a bridge buffer of the embedded processor as a bridge buffer message. Additionally, the method includes creating, after storing the data as a bridge buffer message, a peripheral device message comprising the bridge buffer message, and sending the peripheral device message to a thread message queue of a subscriber.
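
The message flow in this abstract (data buffer, then bridge buffer message, then peripheral device message, then subscriber queue) might be sketched as follows. All class, method, and field names here (`Bridge`, `subscribe`, `on_data`, `device_id`) are hypothetical and chosen for illustration; the abstract does not specify an API, and `queue.Queue` merely stands in for a thread message queue.

```python
import queue
from dataclasses import dataclass


@dataclass(frozen=True)
class BridgeBufferMessage:
    payload: bytes  # immutable copy of the data buffer contents


@dataclass(frozen=True)
class PeripheralDeviceMessage:
    device_id: str  # hypothetical identifier for the originating peripheral
    bridge_message: BridgeBufferMessage


class Bridge:
    def __init__(self):
        self.bridge_buffer = []  # stored bridge buffer messages
        self.subscribers = {}    # device_id -> list of subscriber queues

    def subscribe(self, device_id):
        """Register a subscriber and return its thread-safe message queue."""
        inbox = queue.Queue()
        self.subscribers.setdefault(device_id, []).append(inbox)
        return inbox

    def on_data(self, device_id, data_buffer):
        """Copy received data into the bridge buffer, wrap it, and deliver it."""
        bridge_message = BridgeBufferMessage(bytes(data_buffer))  # copy step
        self.bridge_buffer.append(bridge_message)                 # store step
        device_message = PeripheralDeviceMessage(device_id, bridge_message)
        for inbox in self.subscribers.get(device_id, []):
            inbox.put(device_message)                             # send step
        return device_message


bridge = Bridge()
inbox = bridge.subscribe("uart0")
data_buffer = bytearray(b"\x01\x02")
bridge.on_data("uart0", data_buffer)
data_buffer[0] = 0xFF  # reusing the data buffer does not alter the delivered message
received = inbox.get_nowait()
```

Copying into the bridge buffer before delivery lets the data buffer be reused for the next transfer while subscribers still hold a stable message.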

    System and method for re-ordering memory references for access to memory

    Publication No.: US20060215481A1

    Publication date: 2006-09-28

    Application No.: US11434392

    Filing date: 2006-05-15

    IPC classification: G11C8/18

    Abstract: A memory processing approach involves implementation of memory status-driven access. According to an example embodiment, addresses received at an address buffer are processed for access to a memory relative to an active location in the memory. Addresses corresponding to an active location in the memory array are processed prior to addresses that do not correspond to an active location. Data is read from the memory to a read buffer and ordered in a manner commensurate with the order of received addresses at the address buffer (e.g., thus facilitating access to the memory in an order different from that received at the address buffer while maintaining the order from the read buffer).
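
The re-ordering idea can be sketched in a few lines. This is an illustrative model under stated assumptions, not the claimed circuit: the "active location" is modelled as a single open row that stays fixed during one pass, and `ROW_SIZE`, `row_of`, and `reorder_reads` are hypothetical names.

```python
ROW_SIZE = 4  # assumed number of addresses per memory row


def row_of(address):
    """Map an address to its row (assumed simple linear layout)."""
    return address // ROW_SIZE


def reorder_reads(addresses, memory, active_row):
    """Service addresses in the active row first; return data in arrival order."""
    # sorted() is stable, so arrival order is preserved within each group.
    order = sorted(range(len(addresses)),
                   key=lambda i: row_of(addresses[i]) != active_row)
    service_order = []
    read_buffer = [None] * len(addresses)
    for i in order:
        service_order.append(addresses[i])
        read_buffer[i] = memory[addresses[i]]  # slot i keeps the arrival position
    return service_order, read_buffer


memory = {address: address * 10 for address in range(16)}
addresses = [9, 2, 8, 5, 3]
service_order, read_buffer = reorder_reads(addresses, memory, active_row=0)
```

Here addresses 2 and 3 fall in the active row and are accessed first, yet the read buffer still presents the data in the order the addresses were received.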

    Memory system and approach
    4.
    Patent application
    Status: in force

    Publication No.: US20050105381A1

    Publication date: 2005-05-19

    Application No.: US11019979

    Filing date: 2004-12-21

    IPC classification: G11C5/00 G11C7/10 G11C11/408

    Abstract: A memory processing approach involves implementation of memory status-driven access. According to an example embodiment, addresses received at an address buffer are processed for access to a memory relative to an active location in the memory. Addresses corresponding to an active location in the memory array are processed prior to addresses that do not correspond to an active location. Data is read from the memory to a read buffer and ordered in a manner commensurate with the order of received addresses at the address buffer (e.g., thus facilitating access to the memory in an order different from that received at the address buffer while maintaining the order from the read buffer).

    System and method for managing input/output data of peripheral devices
    5.
    Granted patent
    Status: in force

    Publication No.: US08984184B2

    Publication date: 2015-03-17

    Application No.: US14245205

    Filing date: 2014-04-04

    IPC classification: G06F3/06 G06F13/38

    CPC classification: G06F13/4027 G06F2003/0691

    Abstract: A method for communicating data between peripheral devices and an embedded processor includes receiving, at a data buffer unit of the embedded processor, data from a peripheral device. The method also includes copying the data from the data buffer unit into a bridge buffer of the embedded processor as a bridge buffer message. Additionally, the method includes creating, after storing the data as a bridge buffer message, a peripheral device message comprising the bridge buffer message, and sending the peripheral device message to a thread message queue of a subscriber.

    SYSTEM AND METHOD FOR CONTEXT-INDEPENDENT CODES FOR OFF-CHIP INTERCONNECTS
    6.
    Patent application
    Status: in force

    Publication No.: US20080140987A1

    Publication date: 2008-06-12

    Application No.: US11953028

    Filing date: 2007-12-08

    IPC classification: G06F12/06

    Abstract: A system and method for context-independent coding using frequency-based mapping schemes, sequence-based mapping schemes, memory trace-based mapping schemes, and/or transition statistics-based mapping schemes in order to reduce off-chip interconnect power consumption. State-of-the-art context-dependent, double-ended codes for processor-SDRAM off-chip interfaces require the transmitter and receiver (memory controller and SDRAM) to collaborate using the current and previously transmitted values to encode and decode data. In contrast, the memory controller can use a context-independent code to encode data stored in SDRAM and subsequently decode that data when it is retrieved, allowing the use of commodity memories. A single-ended, context-independent code is realized by assigning limited-weight codes using a frequency-based mapping technique. Experimental results show that such a code can reduce the power consumption of an uncoded off-chip interconnect by an average of 30% with less than a 0.1% degradation in performance.
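
A frequency-based assignment of limited-weight codewords can be sketched as below. This is a simplified illustration under stated assumptions, not the scheme evaluated in the patent: it assumes that driving a 1 bit costs more energy than a 0 bit, so the most frequent values receive the codewords with the fewest ones; `build_code`, `value_bits`, and `code_bits` are hypothetical names.

```python
from collections import Counter


def build_code(trace, value_bits, code_bits):
    """Map frequent values to low-Hamming-weight codewords (fixed, context-independent)."""
    if code_bits < value_bits:
        raise ValueError("code_bits must be at least value_bits")
    n_values = 1 << value_bits
    # Candidate codewords, fewest 1 bits first; keep only as many as there are values.
    codewords = sorted(range(1 << code_bits),
                       key=lambda w: (bin(w).count("1"), w))[:n_values]
    frequency = Counter(trace)
    # Most frequent values first; ties broken by value for determinism.
    values = sorted(range(n_values), key=lambda v: (-frequency[v], v))
    encode = dict(zip(values, codewords))
    decode = {codeword: value for value, codeword in encode.items()}
    return encode, decode


trace = [5, 5, 5, 2, 2, 7]  # assumed sample of observed 3-bit data values
encode, decode = build_code(trace, value_bits=3, code_bits=4)
max_weight = max(bin(codeword).count("1") for codeword in encode.values())
```

Because the table depends only on the value being encoded and not on previously transmitted words, the memory controller can encode on write and decode on read by itself, which is what lets the memory remain a commodity part.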

    System and method for performing efficient conditional vector operations for data parallel architectures involving both input and conditional vector values
    7.
    Patent application
    Status: in force

    Publication No.: US20070150700A1

    Publication date: 2007-06-28

    Application No.: US11511157

    Filing date: 2006-08-28

    IPC classification: G06F15/00

    Abstract: A processor implements conditional vector operations in which, for example, an input vector containing multiple operands to be used in conditional operations is divided into two or more output vectors based on a condition vector. Each output vector can then be processed at full processor efficiency without cycles wasted due to branch latency. Data to be processed are divided into two groups based on whether or not they satisfy a given condition by, e.g., steering each to one of two index vectors. Once the data have been segregated in this way, subsequent processing can be performed without conditional operations, processor cycles wasted due to branch latency, incorrect speculation or execution of unnecessary instructions due to predication. Other examples of conditional operations include combining one or more input vectors into a single output vector based on a condition vector, conditional vector switching, conditional vector combining, and conditional vector load balancing.

    System and method for performing efficient conditional vector operations for data parallel architectures involving both input and conditional vector values
    8.
    Granted patent
    Status: in force

    Publication No.: US07100026B2

    Publication date: 2006-08-29

    Application No.: US09871301

    Filing date: 2001-05-30

    IPC classification: G06F7/38

    Abstract: A processor implements conditional vector operations in which, for example, an input vector containing multiple operands to be used in conditional operations is divided into two or more output vectors based on a condition vector. Each output vector can then be processed at full processor efficiency without cycles wasted due to branch latency. Data to be processed are divided into two groups based on whether or not they satisfy a given condition by, e.g., steering each to one of two index vectors. Once the data have been segregated in this way, subsequent processing can be performed without conditional operations, processor cycles wasted due to branch latency, incorrect speculation or execution of unnecessary instructions due to predication. Other examples of conditional operations include combining one or more input vectors into a single output vector based on a condition vector, conditional vector switching, conditional vector combining, and conditional vector load balancing.

    Method and system for scalable ethernet
    9.
    Granted patent
    Status: in force

    Publication No.: US08761152B2

    Publication date: 2014-06-24

    Application No.: US12578240

    Filing date: 2009-10-13

    IPC classification: H04L12/28

    Abstract: A computer readable medium comprising computer readable code for data transfer. The computer readable code, when executed, performs a method. The method includes receiving, at a first Axon, an ARP request from a source host directed to a target host. The method also includes obtaining a first route from the first Axon to the second Axon, and generating a target identification corresponding to the target host. The method further includes sending an Axon-ARP request to the second Axon using the first route, and receiving an Axon-ARP reply from the second Axon, where the Axon-ARP reply includes a second route. The method further includes storing the first route in storage space on the first Axon, where the storage space is indexed by the target identification, and sending an ARP reply to the first host, where the source host is configured to send a packet to the target host.

    System and method for context-independent codes for off-chip interconnects
    10.
    Granted patent
    Status: in force

    Publication No.: US07979666B2

    Publication date: 2011-07-12

    Application No.: US11953028

    Filing date: 2007-12-08

    IPC classification: G06F13/00

    Abstract: A system and method for context-independent coding using frequency-based mapping schemes, sequence-based mapping schemes, memory trace-based mapping schemes, and/or transition statistics-based mapping schemes in order to reduce off-chip interconnect power consumption. State-of-the-art context-dependent, double-ended codes for processor-SDRAM off-chip interfaces require the transmitter and receiver (memory controller and SDRAM) to collaborate using the current and previously transmitted values to encode and decode data. In contrast, the memory controller can use a context-independent code to encode data stored in SDRAM and subsequently decode that data when it is retrieved, allowing the use of commodity memories. A single-ended, context-independent code is realized by assigning limited-weight codes using a frequency-based mapping technique. Experimental results show that such a code can reduce the power consumption of an uncoded off-chip interconnect by an average of 30% with less than a 0.1% degradation in performance.
