Processing stream instruction in IC of mesh connected matrix of processors containing pipeline coupled switch transferring messages over consecutive cycles from one link to another link or memory
    11.
    Type: Granted invention patent
    Status: In force

    Publication No.: US07958341B1

    Publication date: 2011-06-07

    Application No.: US12168861

    Filing date: 2008-07-07

    IPC classification: G06F13/14

    CPC classification: G06F15/17381 G06F15/167

    Abstract: In some embodiments, each matrix processor in a matrix of mesh-interconnected matrix processors includes an instruction processing pipeline, and a hardware data switch capable of streaming data to/from one or more inter-processor matrix links and/or a matrix processor local memory link in response to execution of a data streaming instruction by the instruction processing pipeline. The data switch can transfer each data stream, which includes multiple words, at wire speed, one word per cycle. After initiating a data stream, the processing pipeline can execute other instructions, including streaming instructions, while a stream transfer is in progress. Different data streaming instructions may be used to transfer data streams from local memory to one or more inter-processor links, from an inter-processor link to local memory, from an inter-processor link to one or more inter-processor links, and from an inter-processor link to one or more inter-processor links and synchronously to local memory.
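
The streaming behavior described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the class `DataSwitch`, its methods, and the list-based links are hypothetical names chosen for illustration and do not come from the patent. It shows one stream fanning out to an inter-processor link and to local memory at one word per cycle, while the issuing pipeline remains free to do other work.

```python
from collections import deque


class DataSwitch:
    """Toy model of a per-processor data switch (hypothetical names)."""

    def __init__(self):
        self.active = []  # streams in flight: (pending words, destinations)

    def start_stream(self, words, destinations):
        """Model a streaming instruction: queue `words` for all destinations."""
        pending = deque(words)
        if pending:
            self.active.append((pending, destinations))

    def tick(self):
        """Advance one cycle: every active stream moves exactly one word."""
        still_active = []
        for pending, destinations in self.active:
            word = pending.popleft()
            for dest in destinations:
                dest.append(word)  # the same word fans out to each destination
            if pending:
                still_active.append((pending, destinations))
        self.active = still_active

    @property
    def busy(self):
        return bool(self.active)


def run_demo():
    switch = DataSwitch()
    link_east, local_memory = [], []
    # One stream goes to another link and, in the same cycles, to local memory.
    switch.start_stream([10, 11, 12, 13], [link_east, local_memory])
    cycles = other_instructions = 0
    while switch.busy:
        switch.tick()
        other_instructions += 1  # the pipeline is not stalled by the stream
        cycles += 1
    return cycles, link_east, local_memory, other_instructions
```

A four-word stream completes in four calls to `tick()`, and both destinations receive the same words in the same order.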

    Random access memory controller with out of order execution
    12.
    Type: Granted invention patent
    Status: In force

    Publication No.: US07281110B1

    Publication date: 2007-10-09

    Application No.: US11498375

    Filing date: 2006-08-03

    Applicant: Sorin C. Cismas

    Inventor: Sorin C. Cismas

    IPC classification: G06F12/00

    Abstract: A memory controller for a multi-bank random access memory (RAM) such as SDRAM includes a transaction slicer for slicing complex client transactions into simple slices, and a command scheduler for re-ordering preparatory memory commands such as activate and precharge in an order that can be different from the order of the corresponding client transactions. The command scheduler may also re-order memory access commands such as read and write. The slicing and out-of-order command scheduling allow a reduction in memory latency. The data transfer to and from clients can be kept in order.
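
The slicing and early issue of preparatory commands can be sketched as follows. Everything concrete here is an assumption made for illustration, not taken from the patent: the row size `ROW_WORDS`, the bank count `NUM_BANKS`, row-interleaved bank mapping, a one-slice lookahead, and a generic `ACCESS` command standing in for read or write. Real SDRAM timing constraints are not modeled.

```python
from dataclasses import dataclass

ROW_WORDS = 8   # words per row (assumed)
NUM_BANKS = 4   # consecutive rows are interleaved across banks (assumed)


@dataclass(frozen=True)
class Slice:
    client: str
    bank: int
    row: int
    first_col: int
    length: int


def slice_transaction(client, start, length):
    """Split a linear burst into slices that each stay within one row."""
    slices = []
    addr, remaining = start, length
    while remaining > 0:
        global_row, col = divmod(addr, ROW_WORDS)
        chunk = min(remaining, ROW_WORDS - col)
        slices.append(Slice(client, global_row % NUM_BANKS,
                            global_row // NUM_BANKS, col, chunk))
        addr += chunk
        remaining -= chunk
    return slices


def prepare(s, open_row):
    """Emit the precharge/activate commands needed to open the slice's row."""
    cmds = []
    if open_row.get(s.bank) != s.row:
        if s.bank in open_row:
            cmds.append(("PRECHARGE", s.bank))
        cmds.append(("ACTIVATE", s.bank, s.row))
        open_row[s.bank] = s.row
    return cmds


def schedule(slices):
    """Hoist preparation of the next slice ahead of the current access."""
    commands, open_row = [], {}
    for i, s in enumerate(slices):
        commands += prepare(s, open_row)
        # Look ahead one slice: open its row early if it uses another bank.
        if i + 1 < len(slices) and slices[i + 1].bank != s.bank:
            commands += prepare(slices[i + 1], open_row)
        commands.append(("ACCESS", s.client, s.bank, s.row))  # data stays in order
    return commands
```

For a burst that crosses a row boundary, the second bank's `ACTIVATE` is issued before the first slice's `ACCESS`, while the `ACCESS` commands themselves keep the client's order.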

    Multithreaded data/context flow processing architecture
    13.
    Type: Granted invention patent
    Status: In force

    Publication No.: US06889310B2

    Publication date: 2005-05-03

    Application No.: US09927625

    Filing date: 2001-08-09

    Applicant: Sorin C. Cismas

    Inventor: Sorin C. Cismas

    IPC classification: G06F17/50 G06F13/00

    CPC classification: G06F17/5045

    Abstract: Multithreaded data- and context-flow processing is achieved by flowing data and context (thread) identification tokens through specialized cores (functional blocks, intellectual property). Each context identification token defines the identity of a context and associated context parameters affecting the processing of the data tokens. Parameter values for different contexts are stored in a distributed manner throughout the cores. Upon a context switch, only the identity of the new context is propagated. The parameter values for the new context are retrieved from the distributed storage locations. Different cores of the system and different pipestages within a core can work simultaneously in different contexts. The described architecture does not require long propagation distances for parameters upon context switches, or that an entire pipeline finish processing in one context before starting processing in another. The system can be effectively controlled by the flow of data and context identification tokens therethrough.
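
The idea that only a context identity travels with the data, while each core looks up its own parameters locally, can be sketched in a few lines. The `Core` class, the token tuples, and the example parameters are hypothetical and chosen only for illustration. The sketch pushes tokens through the cores one at a time, so it does not model different pipestages working in different contexts simultaneously.

```python
class Core:
    """A processing core that stores per-context parameters locally."""

    def __init__(self, name, params_by_context, op):
        self.name = name
        self.params_by_context = params_by_context  # this core's share of the storage
        self.op = op
        self.current_context = None

    def process(self, token):
        kind, value = token
        if kind == "context":
            # Only the context identity arrives; no parameter values travel.
            self.current_context = value
            return token  # forward the context token downstream unchanged
        param = self.params_by_context[self.current_context]
        return ("data", self.op(value, param))


def run_pipeline(cores, tokens):
    """Push each token through every core and collect the data results."""
    outputs = []
    for token in tokens:
        for core in cores:
            token = core.process(token)
        if token[0] == "data":
            outputs.append(token[1])
    return outputs


def demo():
    scale = Core("scale", {0: 2, 1: 10}, lambda x, p: x * p)
    offset = Core("offset", {0: 1, 1: -1}, lambda x, p: x + p)
    tokens = [("context", 0), ("data", 3), ("data", 4),
              ("context", 1), ("data", 3)]
    return run_pipeline([scale, offset], tokens)
```

The same data value `3` yields different results in context 0 and context 1, even though the context tokens carry no parameter values.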

    System and method for inverse discrete cosine transform implementation
    14.
    Type: Granted invention patent
    Status: Lapsed

    Publication No.: US5574661A

    Publication date: 1996-11-12

    Application No.: US282947

    Filing date: 1994-07-29

    Applicant: Sorin C. Cismas

    Inventor: Sorin C. Cismas

    CPC classification: G06F17/147 G06F7/49963

    Abstract: An apparatus and method for calculation of the inverse discrete cosine transform for image decompression are disclosed. The apparatus may be implemented with approximately 10,000 transistors for MPEG2 main level speed and with less than 10,000 transistors for MPEG1 main level speed.

    Hardware multithreading systems and methods
    15.
    Type: Granted invention patent
    Status: In force

    Publication No.: US08640129B2

    Publication date: 2014-01-28

    Application No.: US12818006

    Filing date: 2010-06-17

    IPC classification: G06F9/46 G06F7/38

    Abstract: According to some embodiments, a multithreaded microcontroller includes a thread control unit comprising thread control hardware (logic) configured to perform a number of multithreading system calls essentially in real time, e.g. in one or a few clock cycles. System calls can include mutex lock, wait condition, and signal instructions. The thread controller includes a number of thread state, mutex, and condition variable registers used for executing the multithreading system calls. Threads can transition between several states including free, run, ready and wait. The wait state includes interrupt, condition, mutex, I-cache, and memory substates. A thread state transition controller controls thread states, while a thread instructions execution unit executes multithreading system calls and manages thread priorities to avoid priority inversion. A thread scheduler schedules threads according to their priorities. A hardware thread profiler including global, run and wait profiler registers is used to monitor thread performance to facilitate software development.
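
The thread states and wait substates named in this abstract can be modeled in software as a small state machine. This is an illustrative sketch only: the `ThreadController` class and its methods are hypothetical, the hardware described in the patent performs these steps in logic rather than code, and priority-inversion avoidance and profiling are deliberately left out.

```python
from enum import Enum


class State(Enum):
    FREE = "free"    # thread slot not allocated (unused in this sketch)
    RUN = "run"
    READY = "ready"
    WAIT = "wait"


class WaitReason(Enum):
    INTERRUPT = "interrupt"
    CONDITION = "condition"
    MUTEX = "mutex"
    ICACHE = "i-cache"
    MEMORY = "memory"


class ThreadController:
    def __init__(self, priorities):
        self.priority = dict(priorities)  # thread -> priority, higher wins
        self.state = {t: State.READY for t in self.priority}
        self.wait_reason = {}             # thread -> (WaitReason, resource)
        self.mutex_owner = {}             # mutex -> owning thread
        self.running = None

    def schedule(self):
        """Pick the highest-priority thread that is ready or running."""
        runnable = [t for t, s in self.state.items()
                    if s in (State.READY, State.RUN)]
        if self.running is not None and self.state[self.running] is State.RUN:
            self.state[self.running] = State.READY
        self.running = max(runnable, key=self.priority.get, default=None)
        if self.running is not None:
            self.state[self.running] = State.RUN
        return self.running

    def wait(self, thread, reason, resource=None):
        self.state[thread] = State.WAIT
        self.wait_reason[thread] = (reason, resource)

    def wake(self, thread):
        del self.wait_reason[thread]
        self.state[thread] = State.READY

    def mutex_lock(self, thread, mutex):
        if mutex not in self.mutex_owner:
            self.mutex_owner[mutex] = thread
            return True
        self.wait(thread, WaitReason.MUTEX, mutex)
        return False

    def mutex_unlock(self, thread, mutex):
        if self.mutex_owner.get(mutex) != thread:
            raise ValueError("only the owner may unlock a mutex")
        del self.mutex_owner[mutex]
        waiters = [t for t, r in self.wait_reason.items()
                   if r == (WaitReason.MUTEX, mutex)]
        if waiters:
            winner = max(waiters, key=self.priority.get)
            self.wake(winner)
            self.mutex_owner[mutex] = winner  # hand the mutex to the waiter


def demo():
    tc = ThreadController({"hi": 2, "lo": 1})
    trace = [tc.schedule()]              # "hi" runs first
    tc.wait("hi", WaitReason.INTERRUPT)  # "hi" blocks on an interrupt
    trace.append(tc.schedule())
    tc.mutex_lock("lo", "m")             # "lo" takes the mutex
    tc.wake("hi")                        # the interrupt arrives
    trace.append(tc.schedule())
    tc.mutex_lock("hi", "m")             # "hi" blocks in the mutex substate
    trace.append(tc.schedule())
    tc.mutex_unlock("lo", "m")           # ownership passes to "hi"
    trace.append(tc.schedule())
    return trace
```

In `demo()`, the higher-priority thread preempts the lower one whenever it is ready, and drops to the wait state (mutex substate) while the mutex is held elsewhere.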

    Memory Word Array Organization and Prediction Combination for Memory Access
    16.
    Type: Invention patent application
    Status: In force

    Publication No.: US20130051462A1

    Publication date: 2013-02-28

    Application No.: US13454922

    Filing date: 2012-04-24

    Applicant: Sorin C. Cismas

    Inventor: Sorin C. Cismas

    IPC classification: H04N7/32

    CPC classification: H04N19/423

    Abstract: Described systems and methods allow a reduction in the memory bandwidth required in video coding (decoding/encoding) applications. According to a first aspect, the data assigned to each memory word is chosen to correspond to a 2D subarray of a larger array such as a macroblock. An array memory word organization allows reducing both the average and worst-case bandwidth required to retrieve predictions from memory in video coding applications, particularly for memory word sizes (memory bus widths) larger than the size of typical predictions. According to a second aspect, two or more 2D subarrays such as video predictions are retrieved from memory simultaneously as part of a larger 2D array, if retrieving the larger array requires fewer clock cycles than retrieving the subarrays individually. Allowing the combination of multiple predictions in one memory access operation can lead to a reduction in the average bandwidth required to retrieve predictions from memory.
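
The first aspect of this abstract can be made concrete by counting how many memory words a prediction fetch touches under two word layouts. The sizes below are assumptions chosen for illustration and do not come from the patent: a 16-pixel memory word organized either as a 16x1 line segment or as a 4x4 subarray, and a 9x9-pixel prediction at an arbitrary alignment.

```python
def words_touched(x, y, w, h, word_w, word_h):
    """Number of word_w x word_h memory words overlapped by a w x h block at (x, y)."""
    cols = (x + w - 1) // word_w - x // word_w + 1
    rows = (y + h - 1) // word_h - y // word_h + 1
    return cols * rows


def worst_and_average(w, h, word_w, word_h):
    """Worst-case and average word counts over every alignment of the block."""
    counts = [words_touched(x, y, w, h, word_w, word_h)
              for x in range(word_w) for y in range(word_h)]
    return max(counts), sum(counts) / len(counts)


linear = worst_and_average(9, 9, 16, 1)  # 16 pixels of one row per word
array = worst_and_average(9, 9, 4, 4)    # a 4x4 pixel subarray per word
```

Under these assumed sizes, the linear layout needs 18 words in the worst case and 13.5 on average, while the 4x4 layout needs 9 words for every alignment. The second aspect (combining several predictions into one larger fetch) is not modeled here.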

    Matrix processor proxy systems and methods
    17.
    Type: Granted invention patent
    Status: In force

    Publication No.: US08327114B1

    Publication date: 2012-12-04

    Application No.: US12168849

    Filing date: 2008-07-07

    IPC classification: G06F15/76

    Abstract: In some embodiments, processor-to-processor and/or broadcast proxies are designated in a microprocessor matrix comprising a plurality of mesh-interconnected matrix processors when default processor-to-processor or broadcast routing algorithms used by data switches within the matrix to route messages would not deliver the messages to all intended recipients. The broadcast proxies broadcast messages within individual non-overlapping broadcast domains of the matrix. P-to-P and broadcast proxies may be designated as part of a boot-time testing/initialization sequence. Improving system fault tolerance allows improving semiconductor processing yields, which may be of particular significance in relatively large integrated circuits including large numbers of relatively-complex matrix processors.
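
A processor-to-processor proxy can be illustrated on a small mesh with one faulty processor. The abstract does not say which default routing algorithm is used, so this sketch assumes dimension-order (x first, then y) routing. The function names and the brute-force proxy search are hypothetical, and broadcast domains are not modeled.

```python
def xy_route(src, dst):
    """Assumed default route: move along x first, then along y."""
    (x, y), (dx, dy) = src, dst
    path = [(x, y)]
    while x != dx:
        x += 1 if dx > x else -1
        path.append((x, y))
    while y != dy:
        y += 1 if dy > y else -1
        path.append((x, y))
    return path


def route_ok(src, dst, faulty):
    """True if the default route avoids every faulty processor."""
    return not any(node in faulty for node in xy_route(src, dst))


def find_proxy(src, dst, size, faulty):
    """Return None if the default route works, else a healthy relay processor."""
    if route_ok(src, dst, faulty):
        return None
    for x in range(size):
        for y in range(size):
            p = (x, y)
            if (p not in faulty and p not in (src, dst)
                    and route_ok(src, p, faulty) and route_ok(p, dst, faulty)):
                return p
    raise RuntimeError("no single proxy reaches the destination")
```

On a 3x3 mesh with processor `(1, 0)` marked faulty, the default route from `(0, 0)` to `(2, 0)` is blocked, and `find_proxy` selects `(0, 1)` as a relay whose two default-routed hops both avoid the fault. Such a search could run once during a boot-time test sequence, as the abstract suggests for proxy designation.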

    Hardware Multithreading Systems and Methods
    18.
    Type: Invention patent application
    Status: In force

    Publication No.: US20100257534A1

    Publication date: 2010-10-07

    Application No.: US12818006

    Filing date: 2010-06-17

    IPC classification: G06F9/46

    Abstract: According to some embodiments, a multithreaded microcontroller includes a thread control unit comprising thread control hardware (logic) configured to perform a number of multithreading system calls essentially in real time, e.g. in one or a few clock cycles. System calls can include mutex lock, wait condition, and signal instructions. The thread controller includes a number of thread state, mutex, and condition variable registers used for executing the multithreading system calls. Threads can transition between several states including free, run, ready and wait. The wait state includes interrupt, condition, mutex, I-cache, and memory substates. A thread state transition controller controls thread states, while a thread instructions execution unit executes multithreading system calls and manages thread priorities to avoid priority inversion. A thread scheduler schedules threads according to their priorities. A hardware thread profiler including global, run and wait profiler registers is used to monitor thread performance to facilitate software development.

    Hardware multithreading systems with state registers having thread profiling data
    19.
    Type: Granted invention patent
    Status: In force

    Publication No.: US07765547B2

    Publication date: 2010-07-27

    Application No.: US10996691

    Filing date: 2004-11-24

    Abstract: According to some embodiments, a multithreaded microcontroller includes a thread control unit comprising thread control hardware (logic) configured to perform a number of multithreading system calls essentially in real time, e.g. in one or a few clock cycles. System calls can include mutex lock, wait condition, and signal instructions. The thread controller includes a number of thread state, mutex, and condition variable registers used for executing the multithreading system calls. Threads can transition between several states including free, run, ready and wait. The wait state includes interrupt, condition, mutex, I-cache, and memory substates. A thread state transition controller controls thread states, while a thread instructions execution unit executes multithreading system calls and manages thread priorities to avoid priority inversion. A thread scheduler schedules threads according to their priorities. A hardware thread profiler including global, run and wait profiler registers is used to monitor thread performance to facilitate software development.
