Interleaving saturated lower half of data elements from two source registers of packed data
    13.
    Granted invention patent
    Status: Lapsed

    Publication No.: US07966482B2

    Publication date: 2011-06-21

    Application No.: US11451906

    Filing date: 2006-06-12

    Abstract: An apparatus includes an instruction decoder, first and second source registers and a circuit coupled to the decoder to receive packed data from the source registers and to pack the packed data responsive to a pack instruction received by the decoder. A first packed data element and a second packed data element are received from the first source register. A third packed data element and a fourth packed data element are received from the second source register. The circuit packs a portion of each of the packed data elements into a destination register, resulting with the portion from the second packed data element adjacent to the portion from the first packed data element, and the portion from the fourth packed data element adjacent to the portion from the third packed data element.
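The pack operation described in this abstract can be illustrated with a minimal Python sketch. It assumes the "portion" kept from each element is a narrower signed value produced by saturation (as the entry title suggests) and that registers are modeled as plain lists of integers. The names `saturate_signed` and `pack_saturated` are hypothetical and do not come from the patent.

```python
def saturate_signed(value, bits):
    """Clamp value to the range of a signed integer of the given bit width."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def pack_saturated(src1, src2, out_bits=8):
    """Narrow every element of src1, then src2, preserving element order.

    Assumption: the destination holds the narrowed elements of the first
    source followed by those of the second, so neighbours stay adjacent.
    """
    return [saturate_signed(v, out_bits) for v in list(src1) + list(src2)]


# Two 16-bit elements per source register, narrowed to signed 8-bit values.
packed = pack_saturated([100, 300], [-200, -5])
print(packed)  # [100, 127, -128, -5]
```

Values outside the 8-bit range (300 and -200 here) are clamped to 127 and -128 rather than wrapping, which is what distinguishes saturating packing from simple truncation.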


    Method for performing shift operations on packed data
    14.
    Granted invention patent
    Status: Lapsed

    Publication No.: US5666298A

    Publication date: 1997-09-09

    Application No.: US701564

    Filing date: 1996-08-22

    Abstract: A processor. The processor includes a decoder being coupled to receive a control signal. The control signal has a first source address, a second source address, a destination address, and an operation field. The first source address corresponds to a first location. The second source address corresponds to a second location. The destination address corresponds to a third location. The operation field indicates that a type of packed data shift operation is to be performed. The processor further includes a circuit being coupled to the decoder. The circuit is for shifting a first packed data being stored at the first location by a value being stored at the second location. The circuit is further for communicating a corresponding result packed data to the third location.
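As a rough illustration of the packed shift described in this abstract, the sketch below models a register as a Python integer holding several fixed-width elements and shifts each element left independently by the same count. The 64-bit register width, the 16-bit element width, the choice of a logical left shift, and the name `packed_shift_left` are all assumptions made for the example. The abstract itself only says that "a type of packed data shift operation" is performed.

```python
def packed_shift_left(packed, count, elem_bits=16, total_bits=64):
    """Shift each element of a packed value left by count bits.

    Bits shifted out of one element are discarded rather than carried
    into its neighbour; a count of elem_bits or more yields all zeros.
    """
    mask = (1 << elem_bits) - 1
    result = 0
    for i in range(total_bits // elem_bits):
        elem = (packed >> (i * elem_bits)) & mask
        shifted = (elem << count) & mask if count < elem_bits else 0
        result |= shifted << (i * elem_bits)
    return result


# Four 16-bit words, each shifted left by 4 bits.
value = 0x0001_8000_00FF_1234
print(hex(packed_shift_left(value, 4)))  # 0x10000000ff02340
```

The point of the per-element mask is that the word 0x8000 becomes 0x0000 instead of spilling its top bit into the adjacent word, which is what a plain 64-bit shift would do.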


    Packing signed word elements from two source registers to saturated signed byte elements in destination register
    15.
    Granted invention patent
    Status: Lapsed

    Publication No.: US08639914B2

    Publication date: 2014-01-28

    Application No.: US13730831

    Filing date: 2012-12-29

    Abstract: An apparatus includes an instruction decoder, first and second source registers and a circuit coupled to the decoder to receive packed data from the source registers and to unpack the packed data responsive to an unpack instruction received by the decoder. A first packed data element and a third packed data element are received from the first source register. A second packed data element and a fourth packed data element are received from the second source register. The circuit copies the packed data elements into a destination register resulting with the second packed data element adjacent to the first packed data element, the third packed data element adjacent to the second packed data element, and the fourth packed data element adjacent to the third packed data element.
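The unpack operation in this abstract amounts to interleaving the elements of the two source registers. The sketch below is one possible reading, with registers modeled as Python lists. The function name `unpack_interleave` is hypothetical.

```python
def unpack_interleave(src1, src2):
    """Interleave two equal-length element lists: a0, b0, a1, b1, ...

    With src1 holding the first and third elements and src2 holding the
    second and fourth, the result places them in order 1, 2, 3, 4.
    """
    if len(src1) != len(src2):
        raise ValueError("source registers must hold the same number of elements")
    out = []
    for a, b in zip(src1, src2):
        out.extend((a, b))
    return out


print(unpack_interleave([1, 3], [2, 4]))  # [1, 2, 3, 4]
```

Each element is copied unchanged. Only the ordering differs from the sources, which matches the abstract's statement that the circuit "copies the packed data elements into a destination register".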


    METHOD AND APPARATUS FOR UNPACKING PACKED DATA
    17.
    Invention patent application
    Status: In force

    Publication No.: US20130117540A1

    Publication date: 2013-05-09

    Application No.: US13730832

    Filing date: 2012-12-29

    Abstract: An apparatus includes an instruction decoder, first and second source registers and a circuit coupled to the decoder to receive packed data from the source registers and to unpack the packed data responsive to an unpack instruction received by the decoder. A first packed data element and a third packed data element are received from the first source register. A second packed data element and a fourth packed data element are received from the second source register. The circuit copies the packed data elements into a destination register resulting with the second packed data element adjacent to the first packed data element, the third packed data element adjacent to the second packed data element, and the fourth packed data element adjacent to the third packed data element.


    DETECTING AND OPTIMIZING FALSE SHARING
    18.
    Invention patent application
    Status: Lapsed

    Publication No.: US20110283152A1

    Publication date: 2011-11-17

    Application No.: US12780904

    Filing date: 2010-05-16

    Abstract: Systems and methods for cache optimization are provided. The method comprises tracing objects instantiated during execution of a program code under test according to type of access by one or more threads running in parallel, wherein said tracing provides information about order in which different instances of one or more objects are accessed by said one or more threads and whether the type of access is a read operation or a write operation; and utilizing tracing information to build a temporal relationship graph (TRG) for the accessed objects, wherein the objects are represented by nodes in the TRG and at least two types of edges for connecting the nodes are defined.
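To make the idea of a temporal relationship graph more concrete, here is a small Python sketch that builds one from an access trace. The abstract does not say what the two edge types are, so the sketch assumes two hypothetical kinds: an "affinity" edge when the same thread accesses two objects back to back, and a "conflict" edge when different threads access two objects back to back and at least one access is a write. The trace format and the name `build_trg` are likewise assumptions.

```python
from collections import Counter


def build_trg(trace):
    """Build a weighted temporal relationship graph from an access trace.

    trace: (thread_id, object_id, access) tuples in time order, where
    access is "r" or "w". Returns a Counter keyed by
    (object_a, object_b, edge_kind) with the number of observations.
    """
    edges = Counter()
    prev = None
    for thread, obj, access in trace:
        if prev is not None and prev[1] != obj:
            p_thread, p_obj, p_access = prev
            if p_thread == thread:
                kind = "affinity"   # hypothetical edge type 1
            elif "w" in (p_access, access):
                kind = "conflict"   # hypothetical edge type 2
            else:
                kind = None         # cross-thread reads only: no edge
            if kind is not None:
                a, b = sorted((p_obj, obj))
                edges[(a, b, kind)] += 1
        prev = (thread, obj, access)
    return edges


trace = [(1, "A", "r"), (1, "B", "r"), (2, "A", "w"), (1, "B", "r"), (2, "C", "r")]
print(dict(build_trg(trace)))
```

Under these assumptions, objects joined by heavy "conflict" edges would be candidates for placement on separate cache lines, while objects joined by heavy "affinity" edges would be candidates for placement together.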

