Adaptive motion estimation cache organization
    1.
    Granted invention patent
    Adaptive motion estimation cache organization (in force)

    Publication No.: US08325798B1

    Publication date: 2012-12-04

    Application No.: US11305457

    Filing date: 2005-12-15

    CPC classification: H04N19/573 H04N19/433

    Abstract: In some embodiments, a motion estimation search window cache is adaptively re-organized according to frame properties including a frame width and a number of reference frames corresponding to the current frame to be encoded/decoded. The cache reorganization may include an adaptive mapping of reference frame locations to search window cache allocation units (addresses). In some embodiments, a search window is shaped as a quasi-rectangle with truncated upper left and lower right corners, having a full-frame horizontal extent. A search range is defined in a central region of the search window, and is laterally bounded by the truncated corners.

    Block and mode reordering to facilitate parallel intra prediction and motion vector prediction
    2.
    Granted invention patent
    Block and mode reordering to facilitate parallel intra prediction and motion vector prediction (in force)

    Publication No.: US08976870B1

    Publication date: 2015-03-10

    Application No.: US11512684

    Filing date: 2006-08-30

    Applicant: Sorin C. Cismas

    Inventor: Sorin C. Cismas

    IPC classification: H04N7/12 H04N19/593 H04N19/11

    Abstract: A method for processing a plurality of sub-blocks in a block of video is disclosed. The method generally includes the steps of (A) intra predicting a first group of the sub-blocks in a first quadrant of the block, (B) intra predicting a second group of the sub-blocks in a second quadrant of the block after starting the intra predicting of the first group and (C) intra predicting a third group of the sub-blocks in the first quadrant after starting the intra predicting of the second group, wherein the first group and the third group together account for all of the sub-blocks in the first quadrant.

    Memory word array organization and prediction combination for memory access
    3.
    Granted invention patent
    Memory word array organization and prediction combination for memory access (in force)

    Publication No.: US08687706B2

    Publication date: 2014-04-01

    Application No.: US13454922

    Filing date: 2012-04-24

    Applicant: Sorin C. Cismas

    Inventor: Sorin C. Cismas

    CPC classification: H04N19/423

    Abstract: Described systems and methods allow a reduction in the memory bandwidth required in video coding (decoding/encoding) applications. According to a first aspect, the data assigned to each memory word is chosen to correspond to a 2D subarray of a larger array such as a macroblock. An array memory word organization allows reducing both the average and worst-case bandwidth required to retrieve predictions from memory in video coding applications, particularly for memory word sizes (memory bus widths) larger than the size of typical predictions. According to a second aspect, two or more 2D subarrays such as video predictions are retrieved from memory simultaneously as part of a larger 2D array, if retrieving the larger array requires fewer clock cycles than retrieving the subarrays individually. Allowing the combination of multiple predictions in one memory access operation can lead to a reduction in the average bandwidth required to retrieve predictions from memory.
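
The bandwidth argument in this abstract can be made concrete with a small counting sketch. It is an illustration only, not the patented implementation: the function names, the 16-pixel word, the 16x1 versus 4x4 word shapes, and the use of "memory words touched" as a stand-in for clock cycles are all assumptions made for the example.

```python
from itertools import product


def words_touched(x, y, w, h, word_w, word_h):
    """Count memory words overlapped by a w-by-h block with top-left pixel (x, y).

    Assumption: each memory word holds a word_w-by-word_h tile of pixels,
    and tiles are aligned to multiples of their own size.
    """
    cols = (x + w - 1) // word_w - x // word_w + 1
    rows = (y + h - 1) // word_h - y // word_h + 1
    return cols * rows


def stats(w, h, word_w, word_h, period=16):
    """Average and worst-case word count over every block position in one period."""
    counts = [words_touched(x, y, w, h, word_w, word_h)
              for x, y in product(range(period), repeat=2)]
    return sum(counts) / len(counts), max(counts)


def should_combine(a, b, word_w, word_h):
    """True if fetching the bounding box of blocks a and b touches fewer words
    than fetching them separately. Blocks are (x, y, w, h) tuples."""
    x0, y0 = min(a[0], b[0]), min(a[1], b[1])
    x1 = max(a[0] + a[2], b[0] + b[2])
    y1 = max(a[1] + a[3], b[1] + b[3])
    combined = words_touched(x0, y0, x1 - x0, y1 - y0, word_w, word_h)
    separate = (words_touched(*a, word_w, word_h)
                + words_touched(*b, word_w, word_h))
    return combined < separate


linear = stats(4, 4, word_w=16, word_h=1)   # one word = 16 pixels of a single row
array = stats(4, 4, word_w=4, word_h=4)     # one word = a 4x4 pixel subarray
```

Under these assumptions a 4x4 prediction touches 4.75 words on average and 8 in the worst case with the linear 16x1 word, against 3.0625 on average and 4 in the worst case with the 4x4 word. `should_combine` mirrors the second aspect: two adjacent predictions such as `(2, 2, 4, 4)` and `(6, 2, 4, 4)` are cheaper to fetch as one larger rectangle, while two distant ones are not.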

    Random access memory controller with out of order execution
    4.

    Publication No.: US07093094B2

    Publication date: 2006-08-15

    Application No.: US10215705

    Filing date: 2002-08-09

    Applicant: Sorin C. Cismas

    Inventor: Sorin C. Cismas

    IPC classification: G06F12/00

    Abstract: A memory controller for a multi-bank random access memory (RAM) such as SDRAM includes a transaction slicer for slicing complex client transactions into simple slices, and a command scheduler for re-ordering preparatory memory commands such as activate and precharge in an order that can be different from the order of the corresponding client transactions. The command scheduler may also re-order memory access commands such as read and write. The slicing and out-of-order command scheduling allow a reduction in memory latency. The data transfer to and from clients can be kept in order.
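
The slicing step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the controller from the patent: the `Slice` type, the `slice_transaction` function, the page size, the bank count, and the page-interleaved address mapping are all hypothetical.

```python
from typing import List, NamedTuple


class Slice(NamedTuple):
    bank: int     # bank that holds this slice
    row: int      # row (page) to activate inside that bank
    column: int   # starting column inside the row
    length: int   # number of words in the slice


def slice_transaction(start: int, length: int,
                      page_words: int = 8, banks: int = 4) -> List[Slice]:
    """Cut a linear transaction into slices that never cross a page boundary.

    Assumed mapping: consecutive pages rotate through the banks, so
    page p lives in bank p % banks, row p // banks.
    """
    slices = []
    addr, end = start, start + length
    while addr < end:
        page = addr // page_words
        page_end = (page + 1) * page_words
        chunk = min(page_end, end) - addr
        slices.append(Slice(bank=page % banks, row=page // banks,
                            column=addr % page_words, length=chunk))
        addr += chunk
    return slices


slices = slice_transaction(start=6, length=12)
```

With this mapping, the three slices of the example land in banks 0, 1 and 2. That is the situation in which a scheduler could issue the activate for a later slice's bank while an earlier slice is still transferring data, and still return the data to the client in order.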

    Automatic code generation for integrated circuit design
    5.
    Granted invention patent
    Automatic code generation for integrated circuit design (in force)

    Publication No.: US06996799B1

    Publication date: 2006-02-07

    Application No.: US09634131

    Filing date: 2000-08-08

    IPC classification: G06F9/44

    CPC classification: G06F17/5045

    Abstract: An integrated circuit is designed by interconnecting pre-designed data-driven cores (intellectual property, functional blocks). Hardware description language (e.g. Verilog or VHDL) and software language (e.g. C or C++) code for interconnecting the cores is automatically generated by software tools from a central circuit specification. The central specification recites pre-designed hardware cores (intellectual property) and the interconnections between the cores. HDL and software language test benches, and timing constraints are also automatically generated from the central specification. The automatic generation of code simplifies the interconnection of pre-existing cores for the design of complex integrated circuits.
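
The idea of generating interconnect code from a central specification can be illustrated with a toy generator. Everything here is an assumption made for the example: the dictionary layout of the specification, the core and port names, and the `generate_top` function are hypothetical and do not reproduce the tool described in the patent.

```python
def generate_top(spec):
    """Emit a Verilog top-level module that instantiates and wires the cores.

    spec["cores"] maps instance name -> core (module) name.
    spec["connections"] lists (src_instance, src_port, dst_instance, dst_port).
    """
    wires = []
    ports = {inst: [".clk(clk)"] for inst in spec["cores"]}
    for src, sport, dst, dport in spec["connections"]:
        wire = f"{src}_{sport}_to_{dst}_{dport}"
        wires.append(wire)
        ports[src].append(f".{sport}({wire})")
        ports[dst].append(f".{dport}({wire})")

    lines = [f"module {spec['name']} (input wire clk);"]
    lines += [f"  wire {w};" for w in wires]
    for inst, core in spec["cores"].items():
        lines.append(f"  {core} {inst} ({', '.join(ports[inst])});")
    lines.append("endmodule")
    return "\n".join(lines)


spec = {
    "name": "top",
    "cores": {"u_src": "source_core", "u_dst": "sink_core"},
    "connections": [("u_src", "data_out", "u_dst", "data_in")],
}
verilog = generate_top(spec)
```

Keeping the cores and their connections in one data structure means the same specification could also drive other generators (test benches, constraints), which is the benefit the abstract attributes to a central specification.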

    Data flow integrated circuit architecture
    6.
    Granted invention patent
    Data flow integrated circuit architecture (in force)

    Publication No.: US6145073A

    Publication date: 2000-11-07

    Application No.: US174439

    Filing date: 1998-10-16

    Applicant: Sorin C. Cismas

    Inventor: Sorin C. Cismas

    IPC classification: G06F17/50 G06F13/00

    CPC classification: G06F17/5045

    Abstract: Pre-designed and verified data-driven hardware cores (intellectual property, functional blocks) are assembled to generate large systems on a single chip. Token transfer between cores is achieved upon synchronous assertion, over dedicated connections, of a one-bit ready signal by the transmitter and a one-bit request signal by the receiver. The ready-request signal handshake is necessary and sufficient for token transfer. There are no combinational paths through the cores, and no latches or master controller are used. The architecture and interface allow a significant simplification in the design and verification of large systems integrated on a single chip.
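
The ready-request rule in this abstract (a token moves on exactly those clock cycles where both signals are asserted) can be modelled cycle by cycle. The sketch below is a behavioural illustration only; the `simulate_handshake` function and its per-cycle signal lists are assumptions for the example, not the hardware interface from the patent.

```python
def simulate_handshake(tokens, ready_pattern, request_pattern):
    """Return (cycle, token) pairs for every cycle on which a token is transferred.

    ready_pattern[i]   -- transmitter is willing to send on cycle i
    request_pattern[i] -- receiver is willing to accept on cycle i
    The transmitter only asserts ready while it still has a token to send.
    """
    pending = list(tokens)
    received = []
    for cycle, (willing, request) in enumerate(zip(ready_pattern, request_pattern)):
        ready = bool(willing) and bool(pending)
        if ready and request:          # both asserted in the same cycle
            received.append((cycle, pending.pop(0)))
    return received


transfers = simulate_handshake(
    tokens=["a", "b", "c"],
    ready_pattern=[1, 1, 0, 1, 1],
    request_pattern=[0, 1, 1, 1, 0],
)
```

In this run tokens move only on cycles 1 and 3. On cycles 0 and 4 the receiver is not requesting, and on cycle 2 the transmitter is not ready, so nothing is transferred and no token is lost or duplicated.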

    Hardware multithreading systems and methods
    7.
    Granted invention patent
    Hardware multithreading systems and methods (in force)

    Publication No.: US08640129B2

    Publication date: 2014-01-28

    Application No.: US12818006

    Filing date: 2010-06-17

    IPC classification: G06F9/46 G06F7/38

    Abstract: According to some embodiments, a multithreaded microcontroller includes a thread control unit comprising thread control hardware (logic) configured to perform a number of multithreading system calls essentially in real time, e.g. in one or a few clock cycles. System calls can include mutex lock, wait condition, and signal instructions. The thread controller includes a number of thread state, mutex, and condition variable registers used for executing the multithreading system calls. Threads can transition between several states including free, run, ready and wait. The wait state includes interrupt, condition, mutex, I-cache, and memory substates. A thread state transition controller controls thread states, while a thread instructions execution unit executes multithreading system calls and manages thread priorities to avoid priority inversion. A thread scheduler schedules threads according to their priorities. A hardware thread profiler including global, run and wait profiler registers is used to monitor thread performance to facilitate software development.
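
The thread states and priority scheduling named in this abstract can be modelled in a few lines of software. This is only a sketch: the state and substate names come from the abstract, but the `Thread` record, the `pick_next` function, the convention that a larger number means higher priority, and the lowest-id tie-break are assumptions for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class State(Enum):
    FREE = "free"
    RUN = "run"
    READY = "ready"
    WAIT = "wait"


class WaitSubstate(Enum):
    INTERRUPT = "interrupt"
    CONDITION = "condition"
    MUTEX = "mutex"
    ICACHE = "i-cache"
    MEMORY = "memory"


@dataclass
class Thread:
    priority: int                           # assumed: larger value = higher priority
    state: State = State.FREE
    wait_on: Optional[WaitSubstate] = None  # only meaningful while state is WAIT


def pick_next(threads: Dict[int, Thread]) -> Optional[int]:
    """Return the id of the highest-priority READY thread, or None if there is none.

    Ties are broken in favour of the lowest thread id (an assumption).
    """
    ready = [(t.priority, -tid) for tid, t in threads.items()
             if t.state is State.READY]
    if not ready:
        return None
    _, negative_tid = max(ready)
    return -negative_tid


threads = {
    0: Thread(priority=1, state=State.READY),
    1: Thread(priority=5, state=State.WAIT, wait_on=WaitSubstate.MUTEX),
    2: Thread(priority=3, state=State.READY),
    3: Thread(priority=3, state=State.READY),
}
chosen = pick_next(threads)
```

Thread 1 has the highest priority but is waiting on a mutex, so it is not eligible; the scheduler picks thread 2, the lowest-numbered of the two ready threads with priority 3.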

    Memory Word Array Organization and Prediction Combination for Memory Access
    8.
    Invention patent application
    Memory Word Array Organization and Prediction Combination for Memory Access (in force)

    Publication No.: US20130051462A1

    Publication date: 2013-02-28

    Application No.: US13454922

    Filing date: 2012-04-24

    Applicant: Sorin C. Cismas

    Inventor: Sorin C. Cismas

    IPC classification: H04N7/32

    CPC classification: H04N19/423

    Abstract: Described systems and methods allow a reduction in the memory bandwidth required in video coding (decoding/encoding) applications. According to a first aspect, the data assigned to each memory word is chosen to correspond to a 2D subarray of a larger array such as a macroblock. An array memory word organization allows reducing both the average and worst-case bandwidth required to retrieve predictions from memory in video coding applications, particularly for memory word sizes (memory bus widths) larger than the size of typical predictions. According to a second aspect, two or more 2D subarrays such as video predictions are retrieved from memory simultaneously as part of a larger 2D array, if retrieving the larger array requires fewer clock cycles than retrieving the subarrays individually. Allowing the combination of multiple predictions in one memory access operation can lead to a reduction in the average bandwidth required to retrieve predictions from memory.

    Hardware Multithreading Systems and Methods
    9.
    Invention patent application
    Hardware Multithreading Systems and Methods (in force)

    Publication No.: US20100257534A1

    Publication date: 2010-10-07

    Application No.: US12818006

    Filing date: 2010-06-17

    IPC classification: G06F9/46

    Abstract: According to some embodiments, a multithreaded microcontroller includes a thread control unit comprising thread control hardware (logic) configured to perform a number of multithreading system calls essentially in real time, e.g. in one or a few clock cycles. System calls can include mutex lock, wait condition, and signal instructions. The thread controller includes a number of thread state, mutex, and condition variable registers used for executing the multithreading system calls. Threads can transition between several states including free, run, ready and wait. The wait state includes interrupt, condition, mutex, I-cache, and memory substates. A thread state transition controller controls thread states, while a thread instructions execution unit executes multithreading system calls and manages thread priorities to avoid priority inversion. A thread scheduler schedules threads according to their priorities. A hardware thread profiler including global, run and wait profiler registers is used to monitor thread performance to facilitate software development.

    Hardware multithreading systems with state registers having thread profiling data
    10.
    Granted invention patent
    Hardware multithreading systems with state registers having thread profiling data (in force)

    Publication No.: US07765547B2

    Publication date: 2010-07-27

    Application No.: US10996691

    Filing date: 2004-11-24

    Abstract: According to some embodiments, a multithreaded microcontroller includes a thread control unit comprising thread control hardware (logic) configured to perform a number of multithreading system calls essentially in real time, e.g. in one or a few clock cycles. System calls can include mutex lock, wait condition, and signal instructions. The thread controller includes a number of thread state, mutex, and condition variable registers used for executing the multithreading system calls. Threads can transition between several states including free, run, ready and wait. The wait state includes interrupt, condition, mutex, I-cache, and memory substates. A thread state transition controller controls thread states, while a thread instructions execution unit executes multithreading system calls and manages thread priorities to avoid priority inversion. A thread scheduler schedules threads according to their priorities. A hardware thread profiler including global, run and wait profiler registers is used to monitor thread performance to facilitate software development.
