    1. Adaptive motion estimation cache organization
    Invention grant (in force)

    Publication number: US08325798B1

    Publication date: 2012-12-04

    Application number: US11305457

    Application date: 2005-12-15

    CPC classification: H04N19/573 H04N19/433

    Abstract: In some embodiments, a motion estimation search window cache is adaptively reorganized according to frame properties including a frame width and a number of reference frames corresponding to the current frame to be encoded/decoded. The cache reorganization may include an adaptive mapping of reference frame locations to search window cache allocation units (addresses). In some embodiments, a search window is shaped as a quasi-rectangle with truncated upper left and lower right corners, having a full-frame horizontal extent. A search range is defined in a central region of the search window, and is laterally bounded by the truncated corners.
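    To make the adaptive mapping concrete, here is a minimal C sketch assuming a fixed cache capacity in macroblock-sized allocation units and modular (circular) vertical addressing; the constant and function names (CACHE_UNITS, rows_per_ref, cache_unit) are illustrative and not taken from the patent.

```c
#include <stdint.h>

/* Hypothetical geometry for illustration: the cache holds CACHE_UNITS
 * macroblock-sized allocation units; none of these names come from the
 * patent itself. */
#define CACHE_UNITS 4096u

/* Search-window rows cached per reference frame, derived from the frame
 * width (in macroblocks) and the number of reference frames. */
static uint32_t rows_per_ref(uint32_t frame_width_mb, uint32_t num_refs)
{
    return CACHE_UNITS / (frame_width_mb * num_refs);
}

/* Map a macroblock position in a given reference frame to an allocation
 * unit.  Vertical addressing is modular, so the full-width window slides
 * down the frame while reusing the same units. */
static uint32_t cache_unit(uint32_t ref, uint32_t mb_x, uint32_t mb_y,
                           uint32_t frame_width_mb, uint32_t num_refs)
{
    uint32_t rows = rows_per_ref(frame_width_mb, num_refs);
    uint32_t base = ref * rows * frame_width_mb;   /* per-reference region */
    return base + (mb_y % rows) * frame_width_mb + mb_x;
}
```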


    2. Matrix of processors with data stream instruction execution pipeline coupled to data switch linking to neighbor units by non-contentious command channel / data channel
    Invention grant (in force)

    Publication number: US07870365B1

    Publication date: 2011-01-11

    Application number: US12168857

    Application date: 2008-07-07

    IPC classification: G06F15/17

    CPC classification: G06F15/173

    Abstract: In some embodiments, control and data messages are transmitted non-contentiously over corresponding control and data channels of inter-processor links in a matrix of mesh-interconnected matrix processors. A data stream instruction executed by a user thread of an instruction processing pipeline of a matrix processor may initiate a data stream transfer by a hardware data switch of the matrix processor over a data channel for multiple consecutive cycles. While the data stream is being transferred, the corresponding control channel may transfer control messages non-contentiously with respect to the data stream. The control messages may be messages received from other matrix processors and/or control messages initiated by a kernel thread of the current matrix processor.
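    A rough software model of the channel pairing, assuming each link carries one control word and one data word per cycle; the structure and names (struct link, channel_send, link_cycle) are hypothetical and only illustrate that a multi-cycle data stream and control messages travel on independent channels.

```c
#include <stdint.h>
#include <stdbool.h>

/* Illustrative model only (not the patent's microarchitecture): each
 * inter-processor link pairs a control channel with a data channel as
 * independent single-entry registers, so a long data stream never blocks
 * control messages and vice versa. */
struct channel { bool busy; uint64_t word; };

struct link {
    struct channel ctrl;   /* control messages, one per cycle          */
    struct channel data;   /* data stream, held for consecutive cycles */
};

/* Try to place a word on a channel; a retry is only ever caused by
 * traffic on that same channel. */
static bool channel_send(struct channel *c, uint64_t w)
{
    if (c->busy)
        return false;
    c->word = w;
    c->busy = true;
    return true;
}

/* A data-stream word and a control message issued in the same cycle do
 * not contend: each uses its own channel of the link. */
static void link_cycle(struct link *l, uint64_t stream_word, uint64_t ctrl_msg)
{
    (void)channel_send(&l->data, stream_word);
    (void)channel_send(&l->ctrl, ctrl_msg);
}
```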


    3. Block and mode reordering to facilitate parallel intra prediction and motion vector prediction
    Invention grant (in force)

    Publication number: US08976870B1

    Publication date: 2015-03-10

    Application number: US11512684

    Application date: 2006-08-30

    Applicant: Sorin C. Cismas

    Inventor: Sorin C. Cismas

    IPC classification: H04N7/12 H04N19/593 H04N19/11

    Abstract: A method for processing a plurality of sub-blocks in a block of video is disclosed. The method generally includes the steps of (A) intra predicting a first group of the sub-blocks in a first quadrant of the block, (B) intra predicting a second group of the sub-blocks in a second quadrant of the block after starting the intra predicting of the first group and (C) intra predicting a third group of the sub-blocks in the first quadrant after starting the intra predicting of the second group, wherein the first group and the third group together account for all of the sub-blocks in the first quadrant.
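    A small C sketch of one possible sub-block reordering, with 4x4 sub-blocks numbered 0..15 in raster order within a 16x16 macroblock; the exact group membership shown here is an assumption for illustration, not the patent's schedule.

```c
#include <stdio.h>

/* Illustrative ordering only: quadrant 0 is sub-blocks {0,1,4,5},
 * quadrant 1 is {2,3,6,7}, quadrant 2 is {8,9,12,13}, quadrant 3 is
 * {10,11,14,15}.  Group A and group C together cover quadrant 0, and
 * group B (in quadrant 1) starts after group A is under way. */
static const int reordered[16] = {
    0, 1,          /* group A: first part of quadrant 0   */
    2, 3,          /* group B: start of quadrant 1        */
    4, 5,          /* group C: remainder of quadrant 0    */
    6, 7,          /* remainder of quadrant 1             */
    8, 9, 12, 13,  /* quadrant 2                          */
    10, 11, 14, 15 /* quadrant 3                          */
};

int main(void)
{
    /* Print the order in which sub-blocks would be intra predicted. */
    for (int i = 0; i < 16; i++)
        printf("predict sub-block %d\n", reordered[i]);
    return 0;
}
```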


    4. Memory word array organization and prediction combination for memory access
    Invention grant (in force)

    Publication number: US08687706B2

    Publication date: 2014-04-01

    Application number: US13454922

    Application date: 2012-04-24

    Applicant: Sorin C. Cismas

    Inventor: Sorin C. Cismas

    CPC classification: H04N19/423

    Abstract: Described systems and methods allow a reduction in the memory bandwidth required in video coding (decoding/encoding) applications. According to a first aspect, the data assigned to each memory word is chosen to correspond to a 2D subarray of a larger array such as a macroblock. An array memory word organization allows reducing both the average and worst-case bandwidth required to retrieve predictions from memory in video coding applications, particularly for memory word sizes (memory bus widths) larger than the size of typical predictions. According to a second aspect, two or more 2D subarrays such as video predictions are retrieved from memory simultaneously as part of a larger 2D array, if retrieving the larger array requires fewer clock cycles than retrieving the subarrays individually. Allowing the combination of multiple predictions in one memory access operation can lead to a reduction in the average bandwidth required to retrieve predictions from memory.
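    The bandwidth trade-off can be illustrated with a short C sketch, assuming each memory word holds an 8x2-pixel tile (a hypothetical geometry); words_for counts the words touched by a pixel region, and fetch_cost combines two predictions into one access only when their bounding box is cheaper to read.

```c
#include <stdint.h>

/* Hypothetical memory-word geometry: each word holds an 8x2-pixel tile. */
#define WORD_W 8
#define WORD_H 2

struct rect { int x, y, w, h; };   /* pixel region to fetch */

/* Words (cycles on a one-word-per-cycle bus) needed to read a region,
 * accounting for tile alignment at both edges. */
static int words_for(struct rect r)
{
    int x0 = r.x / WORD_W, x1 = (r.x + r.w - 1) / WORD_W;
    int y0 = r.y / WORD_H, y1 = (r.y + r.h - 1) / WORD_H;
    return (x1 - x0 + 1) * (y1 - y0 + 1);
}

/* Combine two predictions into one access if the bounding box is cheaper. */
static int fetch_cost(struct rect a, struct rect b)
{
    int ax1 = a.x + a.w, ay1 = a.y + a.h;
    int bx1 = b.x + b.w, by1 = b.y + b.h;
    struct rect box;
    box.x = a.x < b.x ? a.x : b.x;
    box.y = a.y < b.y ? a.y : b.y;
    box.w = (ax1 > bx1 ? ax1 : bx1) - box.x;
    box.h = (ay1 > by1 ? ay1 : by1) - box.y;

    int separate = words_for(a) + words_for(b);
    int combined = words_for(box);
    return combined < separate ? combined : separate;
}
```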


    5. Memory word array organization and prediction combination for memory access
    Invention grant (in force)

    Publication number: US08165219B2

    Publication date: 2012-04-24

    Application number: US10794280

    Application date: 2004-03-03

    Applicant: Sorin C Cismas

    Inventor: Sorin C Cismas

    CPC classification: H04N19/423

    Abstract: Described systems and methods allow a reduction in the memory bandwidth required in video coding (decoding/encoding) applications. According to a first aspect, the data assigned to each memory word is chosen to correspond to a 2D subarray of a larger array such as a macroblock. An array memory word organization allows reducing both the average and worst-case bandwidth required to retrieve predictions from memory in video coding applications, particularly for memory word sizes (memory bus widths) larger than the size of typical predictions. According to a second aspect, two or more 2D subarrays such as video predictions are retrieved from memory simultaneously as part of a larger 2D array, if retrieving the larger array requires fewer clock cycles than retrieving the subarrays individually. Allowing the combination of multiple predictions in one memory access operation can lead to a reduction in the average bandwidth required to retrieve predictions from memory.


    6. Matrix processor data switch routing systems and methods
    Invention grant (in force)

    Publication number: US08145880B1

    Publication date: 2012-03-27

    Application number: US12168853

    Application date: 2008-07-07

    IPC classification: G06F15/76

    CPC classification: G06F15/17381

    Abstract: According to some embodiments, an integrated circuit comprises a microprocessor matrix of mesh-interconnected matrix processors. Each processor comprises a data switch including a data switch link register and matrix routing logic. The data switch link register includes one or more matrix link-enable register fields specifying a link enable status (e.g. a message-independent, p-to-p, and/or broadcast link enable status) for each inter-processor matrix link of the processor. The matrix routing logic routes inter-processor messages according to the matrix link-enable register field(s). A particular link may be selected by a current matrix processor by selecting an ordered list of matrix links according to a relationship between ΔH and ΔV, and choosing the first enabled link in the selected list for routing. ΔH is the horizontal matrix position difference between the current (sender) processor and a destination processor, and ΔV is the vertical matrix position difference between the current and destination processors.
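    A hedged C sketch of the link selection step, assuming a four-link (N/S/E/W) mesh and a simple preference order based on whether |ΔH| or |ΔV| is larger; the enum names and the exact ordering are illustrative assumptions rather than the patent's definition.

```c
#include <stdlib.h>

enum link { LINK_N, LINK_S, LINK_E, LINK_W, LINK_COUNT, LINK_NONE = -1 };

/* Pick an outgoing link toward (dst_x, dst_y).  dH and dV are the
 * horizontal and vertical matrix-position differences; the preference
 * order below is one plausible choice, not the patent's exact ordering. */
static enum link route(int cur_x, int cur_y, int dst_x, int dst_y,
                       unsigned link_enable /* bit i set => link i usable */)
{
    int dH = dst_x - cur_x;
    int dV = dst_y - cur_y;

    enum link order[LINK_COUNT];
    if (abs(dH) >= abs(dV)) {               /* favour horizontal progress */
        order[0] = dH >= 0 ? LINK_E : LINK_W;
        order[1] = dV >= 0 ? LINK_S : LINK_N;
        order[2] = dV >= 0 ? LINK_N : LINK_S;
        order[3] = dH >= 0 ? LINK_W : LINK_E;
    } else {                                /* favour vertical progress */
        order[0] = dV >= 0 ? LINK_S : LINK_N;
        order[1] = dH >= 0 ? LINK_E : LINK_W;
        order[2] = dH >= 0 ? LINK_W : LINK_E;
        order[3] = dV >= 0 ? LINK_N : LINK_S;
    }

    /* The first enabled link in the selected ordered list wins. */
    for (int i = 0; i < LINK_COUNT; i++)
        if (link_enable & (1u << order[i]))
            return order[i];
    return LINK_NONE;
}
```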


    7. Matrix processor initialization systems and methods
    Invention grant (in force)

    Publication number: US08131975B1

    Publication date: 2012-03-06

    Application number: US12168837

    Application date: 2008-07-07

    IPC classification: G06F15/76

    CPC classification: G06F15/8023 G06F15/17381

    Abstract: In some embodiments, an integrated circuit comprises a microprocessor matrix including a plurality of mesh-interconnected matrix processors, wherein each matrix processor comprises a data switch configured to direct inter-processor communications within the matrix, and a mapping unit configured to generate a data switch functionality map for a plurality of data switches in the microprocessor matrix. The data switch functionality map is generated by sending a first message through the matrix, and setting a first functionality status designation for a first data switch in the data switch functionality map upon receiving a reply to the first message from the first data switch through the matrix.
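    A minimal sketch of building the functionality map in C, with a stubbed probe; probe_reply_received and the matrix dimensions are placeholders for whatever hardware mechanism actually sends the message and detects a reply.

```c
#include <stdbool.h>

#define MATRIX_COLS 8
#define MATRIX_ROWS 8

/* Stub for illustration: in hardware this would inject a message addressed
 * to processor (x, y) and report whether a reply returned through the matrix. */
static bool probe_reply_received(int x, int y)
{
    (void)x; (void)y;
    return true;   /* this sketch assumes every switch answers */
}

/* Build the data switch functionality map: one status flag per switch,
 * set when a reply to the probe message is received. */
static void build_switch_map(bool map[MATRIX_ROWS][MATRIX_COLS])
{
    for (int y = 0; y < MATRIX_ROWS; y++)
        for (int x = 0; x < MATRIX_COLS; x++)
            map[y][x] = probe_reply_received(x, y);
}
```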


    8. Random access memory controller with out of order execution

    Publication number: US07093094B2

    Publication date: 2006-08-15

    Application number: US10215705

    Application date: 2002-08-09

    Applicant: Sorin C. Cismas

    Inventor: Sorin C. Cismas

    IPC classification: G06F12/00

    Abstract: A memory controller for a multi-bank random access memory (RAM) such as SDRAM includes a transaction slicer for slicing complex client transactions into simple slices, and a command scheduler for re-ordering preparatory memory commands such as activate and precharge in an order that can be different from the order of the corresponding client transactions. The command scheduler may also re-order memory access commands such as read and write. The slicing and out-of-order command scheduling allow a reduction in memory latency. The data transfer to and from clients can be kept in order.
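    The slicing step can be sketched in C as follows, assuming a hypothetical 1 KB SDRAM row; slice_transaction cuts a client request at row boundaries so that each simple slice maps to its own activate/access/precharge sequence, while data can still be returned to the client in request order.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical SDRAM geometry for illustration: 1 KB rows (pages). */
#define ROW_BYTES 1024u

struct slice { uint32_t addr; uint32_t len; };

/* Slice a client transaction so that no slice crosses a row boundary. */
static int slice_transaction(uint32_t addr, uint32_t len,
                             struct slice out[], int max_slices)
{
    int n = 0;
    while (len > 0 && n < max_slices) {
        uint32_t room = ROW_BYTES - (addr % ROW_BYTES);  /* bytes left in row */
        uint32_t take = len < room ? len : room;
        out[n].addr = addr;
        out[n].len  = take;
        addr += take;
        len  -= take;
        n++;
    }
    return n;
}

int main(void)
{
    struct slice s[8];
    int n = slice_transaction(0x3F0, 0x100, s, 8);   /* crosses one row edge */
    for (int i = 0; i < n; i++)
        printf("slice %d: addr=0x%X len=%u\n", i, s[i].addr, s[i].len);
    return 0;
}
```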

    9. Automatic code generation for integrated circuit design
    Invention grant (in force)

    Publication number: US06996799B1

    Publication date: 2006-02-07

    Application number: US09634131

    Application date: 2000-08-08

    IPC classification: G06F9/44

    CPC classification: G06F17/5045

    Abstract: An integrated circuit is designed by interconnecting pre-designed data-driven cores (intellectual property, functional blocks). Hardware description language (e.g. Verilog or VHDL) and software language (e.g. C or C++) code for interconnecting the cores is automatically generated by software tools from a central circuit specification. The central specification recites pre-designed hardware cores (intellectual property) and the interconnections between the cores. HDL and software language test benches, and timing constraints are also automatically generated from the central specification. The automatic generation of code simplifies the interconnection of pre-existing cores for the design of complex integrated circuits.
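    As a toy illustration of generating interconnect code from a central specification, the C sketch below emits Verilog wire declarations from a small connection table; the core and port names are invented for the example and do not come from the patent.

```c
#include <stdio.h>

/* Hypothetical central specification: each entry names a source core/port
 * and the destination core/port it connects to. */
struct connection { const char *from_core, *from_port, *to_core, *to_port; };

static const struct connection spec[] = {
    { "decoder", "token_out", "scaler", "token_in" },
    { "scaler",  "token_out", "output", "token_in" },
};

int main(void)
{
    /* Emit one Verilog wire per connection, with a comment recording
     * which ports it ties together. */
    for (unsigned i = 0; i < sizeof spec / sizeof spec[0]; i++) {
        const struct connection *c = &spec[i];
        printf("wire %s_%s;  // %s.%s -> %s.%s\n",
               c->from_core, c->from_port,
               c->from_core, c->from_port, c->to_core, c->to_port);
    }
    return 0;
}
```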


    10. Data flow integrated circuit architecture
    Invention grant (in force)

    Publication number: US6145073A

    Publication date: 2000-11-07

    Application number: US174439

    Application date: 1998-10-16

    Applicant: Sorin C. Cismas

    Inventor: Sorin C. Cismas

    IPC classification: G06F17/50 G06F13/00

    CPC classification: G06F17/5045

    Abstract: Pre-designed and verified data-driven hardware cores (intellectual property, functional blocks) are assembled to generate large systems on a single chip. Token transfer between cores is achieved upon synchronous assertion, over dedicated connections, of a one-bit ready signal by the transmitter and a one-bit request signal by the receiver. The ready-request signal handshake is necessary and sufficient for token transfer. There are no combinational paths through the cores, and no latches or master controller are used. The architecture and interface allow a significant simplification in the design and verification of large systems integrated on a single chip.
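    A cycle-level C sketch of the ready/request handshake, under the assumption that a token moves exactly on a clock edge where both one-bit signals are asserted; the struct and field names are illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Transmitter drives ready and the token value; receiver drives request. */
struct tx { bool ready;   uint32_t token; };
struct rx { bool request; uint32_t token; bool got_token; };

/* One clock edge: the token transfers only when ready and request are
 * both asserted in the same cycle (the handshake is the whole protocol). */
static void clock_edge(struct tx *t, struct rx *r)
{
    if (t->ready && r->request) {
        r->token     = t->token;
        r->got_token = true;
        t->ready     = false;   /* transmitter prepares the next token     */
        r->request   = false;   /* receiver consumes before re-requesting  */
    }
}
```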
