Patent search cpc:"G06F15/8092"

1.

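Invention publication
Reconfigurable Parallel Processing [Under examination, published]

Publication (announcement) number: US20240264975A1

Publication (announcement) date: 2024-08-08

Application number: US18619382

Application date: 2024-03-28

Applicant: XDL Technologies Inc.

Inventor: Yuan Li , Jianbin Zhu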

IPC: G06F15/80 , G06F9/30 , G06F9/34 , G06F9/38 , G06F9/445 , G06F12/0815 , G06F13/16 , G06F15/78

CPC classification number: G06F15/8023 , G06F9/3001 , G06F9/3004 , G06F9/3009 , G06F9/30098 , G06F9/34 , G06F9/3808 , G06F9/3867 , G06F9/3885 , G06F9/44505 , G06F12/0815 , G06F13/1673 , G06F15/7821 , G06F15/7867 , G06F15/7871 , G06F15/7875 , G06F15/7878 , G06F15/7885 , G06F15/7889 , G06F15/8046 , G06F15/8061 , G06F15/8069 , G06F15/8092 , G06F2212/1021 , Y02D10/00

Abstract: Processors, systems and methods are provided for thread level parallel processing. A processor may comprise a plurality of processing elements (PEs) that each may comprise a configuration buffer, a sequencer coupled to the configuration buffer of each of the plurality of PEs and configured to distribute one or more PE configurations to the plurality of PEs, and a gasket memory coupled to the plurality of PEs and being configured to store at least one PE execution result to be used by at least one of the plurality of PEs during a next PE configuration.

2.

Invention publication
RECONFIGURABLE PARALLEL PROCESSOR WITH STACKED COLUMNS FORMING A CIRCULAR DATA PATH [Under examination, published]

Publication (announcement) number: US20240160602A1

Publication (announcement) date: 2024-05-16

Application number: US17984351

Application date: 2022-11-10

Applicant: AzurEngine Technologies Zhuhai Inc.

Inventor: Ryan Braidwood , Yuan LI , Jianbin Zhu , Toshio Nagata

IPC: G06F15/80 , G06F9/30

CPC classification number: G06F15/8092 , G06F9/3001 , G06F9/30036

Abstract: Processors, systems and methods are provided for thread level parallel processing. A processor may include a plurality of columns of vector processing units arranged in a two-dimensional column array with a plurality of column stacks placed side-by-side in a first direction and each column stack having two columns stacked in a second direction and a temporary storage buffer. Each column may include a processing element (PE) that has a vector Arithmetic Logic Unit (ALU) to perform arithmetic operations in parallel threads. At a first end of the column array in the first direction, two columns in the column stack are coupled to the temporary storage buffer for one-way data flow. At a second end of the column array in the first direction, two columns are coupled to each other for one-way data flow. The column array and the temporary storage buffer may form a one-way circular data path.

3.

Granted invention patent
Method and apparatus for asynchronous processor removal of meta-stability [In force]

Publication (announcement) number: US09740487B2

Publication (announcement) date: 2017-08-22

Application number: US14480522

Application date: 2014-09-08

Applicant: Huawei Technologies Co., Ltd.

Inventor: Tao Huang , Qifan Zhang , Wuxian Shi , Yiqun Ge , Wen Tong

IPC: G06F9/30 , G06F15/76 , G06F1/08 , G06F1/10 , G06F9/38 , G06F9/50

CPC classification number: G06F9/30145 , G06F1/08 , G06F1/10 , G06F9/30036 , G06F9/30189 , G06F9/3826 , G06F9/3828 , G06F9/3836 , G06F9/3851 , G06F9/3853 , G06F9/3871 , G06F9/3877 , G06F9/3885 , G06F9/3889 , G06F9/3891 , G06F9/5011 , G06F15/8007 , G06F15/8053 , G06F15/8092 , G06F2009/3883

Abstract: A clock-less asynchronous processing circuit or system having a plurality of pipelined processing stages utilizes self-clocked generators to tune the delay needed in each of the processing stages to complete the processing cycle. Because different processing stages may require different amounts of time to complete processing or may require different delays depending on the processing required in a particular stage, the self-clocked generators may be tuned to each stage's necessary delay(s) or may be programmably configured.

4.

Granted invention patent
Apparatus and method of vector unit sharing [In force]

Publication (announcement) number: US09727526B2

Publication (announcement) date: 2017-08-08

Application number: US13981851

Application date: 2011-01-25

Applicant: Malcolm Stewart , Ali Osman Ors , Daniel Laroche

Inventor: Malcolm Stewart , Ali Osman Ors , Daniel Laroche

IPC: G06F15/76 , G06F15/80 , G06F9/30 , G06F15/78 , G06F9/38

CPC classification number: G06F15/76 , G06F9/30036 , G06F9/30112 , G06F9/3867 , G06F9/3885 , G06F9/3887 , G06F15/7867 , G06F15/8007 , G06F15/8053 , G06F15/8076 , G06F15/8084 , G06F15/8092

Abstract: A reconfigurable vector processor is described that allows the size of its vector units to be changed in order to process vectors of different sizes. The reconfigurable vector processor comprises a plurality of processor units. Each of the processor units comprises a control unit for decoding instructions and generating control signals, a scalar unit for processing instructions on scalar data, and a vector unit for processing instructions on vector data under control of control signals. The reconfigurable vector processor architecture also comprises a vector control selector for selectively providing control signals generated by one processor unit of the plurality of processor units to the vector unit of a different processor unit of the plurality of processor units.

5.

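Invention application
ASYNCHRONOUS INSTRUCTION EXECUTION APPARATUS AND METHOD [Under examination, published]

Publication (announcement) number: US20170212759A1

Publication (announcement) date: 2017-07-27

Application number: US15482550

Application date: 2017-04-07

Applicant: Huawei Technologies Co., Ltd.

Inventor: Shaola YANG , Xiaocheng LIU , Zhen XU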

IPC: G06F9/30 , G06F15/80 , G06F9/38 , G06F9/46

CPC classification number: G06F9/30036 , G06F9/3001 , G06F9/30145 , G06F9/3836 , G06F9/3871 , G06F9/3885 , G06F9/46 , G06F15/8092

Abstract: An asynchronous instruction execution apparatus and method are provided. The asynchronous instruction execution apparatus includes a vector execution unit control VXUC module and n vector execution unit data VXUD modules, where n is a positive integer. The VXUC module is configured to perform instruction decoding and token management. The n VXUD modules are cascaded, separately connected to the VXUC module, and configured to invoke an external calculation resource to perform data calculation. A bit width of data processed by the asynchronous instruction execution apparatus is M, a bit width of each VXUD module is N, and n=M/N. The asynchronous instruction execution apparatus is divided into two parts: the VXUC and the VXUD.
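The relation n=M/N in the abstract above can be illustrated with a short sketch. It splits an M-bit datum into n slices of N bits each, one slice per VXUD module. The function names and the low-slice-first ordering are assumptions made for illustration; the abstract does not say how data is partitioned across the cascaded modules.

```python
def split_into_slices(value: int, total_width: int, slice_width: int) -> list[int]:
    """Split a total_width-bit value (M) into n = M / N slices of slice_width bits (N).

    Hypothetical helper: slice 0 holds the least significant bits (an assumption).
    """
    if total_width % slice_width != 0:
        raise ValueError("M must be a multiple of N so that n = M / N is an integer")
    n = total_width // slice_width
    lane_mask = (1 << slice_width) - 1
    return [(value >> (i * slice_width)) & lane_mask for i in range(n)]


def join_slices(slices: list[int], slice_width: int) -> int:
    """Reassemble the full-width value from its slices (inverse of split_into_slices)."""
    result = 0
    for i, part in enumerate(slices):
        result |= part << (i * slice_width)
    return result


# Example: M = 32, N = 8, so n = 4 slices.
parts = split_into_slices(0x11223344, 32, 8)
```

With M = 32 and N = 8 the sketch yields four slices, matching n = 4 data modules under the stated relation.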

6.

Invention application
Data-Driven Accelerator For Machine Learning And Raw Data Analysis [Under examination, published]

Publication (announcement) number: US20170083827A1

Publication (announcement) date: 2017-03-23

Application number: US14862408

Application date: 2015-09-23

Applicant: QUALCOMM Incorporated

Inventor: Behnam Robatmili , Matthew Leslie Badin , Dario Suárez Gracia , Gheorghe Calin Cascaval , Nayeem Islam

IPC: G06N99/00

CPC classification number: G06N20/00 , G06F15/8092

Abstract: Embodiments include computing devices, apparatus, and methods implemented by the apparatus for accelerating machine learning on a computing device. Raw data may be received in the computing device from a raw data source device. The apparatus may identify key features as two dimensional matrices of the raw data such that the key features are mutually exclusive from each other. The key features may be translated into key feature vectors. The computing device may generate a feature vector from at least one of the key feature vectors. The computing device may receive a first partial output resulting from an execution of a basic linear algebra subprogram (BLAS) operation using the feature vector and a weight factor. The first partial output may be combined with a plurality of partial outputs to produce an output matrix. Receiving the raw data on the computing device may include receiving streaming raw data.
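As a rough illustration of the last steps in the abstract above, the sketch below computes one partial output per feature vector with a matrix-vector product (the kind of operation a BLAS `gemv` routine performs) and stacks the partial outputs into an output matrix. Treating the weight factor as a matrix, and combining partial outputs by stacking them as rows, are both assumptions; the abstract does not specify either. The function names are hypothetical.

```python
def gemv(weights: list[list[float]], x: list[float]) -> list[float]:
    """Plain-Python matrix-vector product, standing in for a BLAS gemv call."""
    if any(len(row) != len(x) for row in weights):
        raise ValueError("each weight row must match the feature vector length")
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def combine_partial_outputs(weights: list[list[float]],
                            feature_vectors: list[list[float]]) -> list[list[float]]:
    """Compute one partial output per feature vector and stack them as rows.

    Row-wise stacking is an illustrative assumption, not the patent's method.
    """
    return [gemv(weights, fv) for fv in feature_vectors]


weights = [[1.0, 0.0], [0.0, 2.0]]
features = [[1.0, 2.0], [3.0, 4.0]]
output_matrix = combine_partial_outputs(weights, features)
```

Because each partial output depends on only one feature vector, the rows could be computed independently as streaming data arrives, which is consistent with the streaming input the abstract mentions.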

7.

Invention application
METHOD AND APPARATUS FOR ASYNCHRONOUS PROCESSOR BASED ON CLOCK DELAY ADJUSTMENT [In force]

Publication (announcement) number: US20150074446A1

Publication (announcement) date: 2015-03-12

Application number: US14480531

Application date: 2014-09-08

Applicant: Futurewei Technologies Inc.

Inventor: Wen Tong , Yiqun Ge , Qifan Zhang , Wuxian Shi , Huang Tao

IPC: G06F1/08

CPC classification number: G06F9/30145 , G06F1/08 , G06F1/10 , G06F9/30036 , G06F9/30189 , G06F9/3826 , G06F9/3828 , G06F9/3836 , G06F9/3851 , G06F9/3853 , G06F9/3871 , G06F9/3877 , G06F9/3885 , G06F9/3889 , G06F9/3891 , G06F9/5011 , G06F15/8007 , G06F15/8053 , G06F15/8092 , G06F2009/3883

Abstract: A clock-less asynchronous processing circuit or system utilizes a self-clocked generator to adjust the processing delay (latency) needed or allowed for the processing cycle in the circuit/system. The timing of the self-clocked generator is dynamically adjustable depending on various parameters. These parameters may include the processing instruction, opcode information, the type of processing to be performed by the circuit/system, or the overall desired processing performance. The latency may also be adjusted to change processing performance, including power consumption, speed, etc.

8.

Invention application
METHOD AND APPARATUS FOR ASYNCHRONOUS PROCESSOR WITH FAST AND SLOW MODE [In force]

Publication (announcement) number: US20150074443A1

Publication (announcement) date: 2015-03-12

Application number: US14480491

Application date: 2014-09-08

Applicant: Futurewei Technologies Inc.

Inventor: Tao Huang , Qifan Zhang , Wuxian Shi , Yiqun Ge , Wen Tong

IPC: G06F1/08

CPC classification number: G06F9/30145 , G06F1/08 , G06F1/10 , G06F9/30036 , G06F9/30189 , G06F9/3826 , G06F9/3828 , G06F9/3836 , G06F9/3851 , G06F9/3853 , G06F9/3871 , G06F9/3877 , G06F9/3885 , G06F9/3889 , G06F9/3891 , G06F9/5011 , G06F15/8007 , G06F15/8053 , G06F15/8092 , G06F2009/3883

Abstract: A clock-less asynchronous processing circuit or system is configured to operate in a plurality of modes. In an initialization mode (e.g., reset, initialization, boot up), a self-clocked generator associated with the asynchronous circuit is configured to generate an active complete signal (to latch output processed data) within a first period of time after receiving a trigger signal. In a normal mode, the self-clocked generator is configured to generate the active complete signal within a second period of time after receiving the trigger signal. In one embodiment, during the initialization mode, the asynchronous circuit latches the output more slowly than in the normal mode.

9.

Invention application
APPARATUS, SYSTEMS, AND METHODS FOR LOW POWER COMPUTATIONAL IMAGING [In force]

Publication (announcement) number: US20150046675A1

Publication (announcement) date: 2015-02-12

Application number: US14458052

Application date: 2014-08-12

Applicant: Linear Algebra Technologies Limited

Inventor: Brendan BARRY , Richard RICHMOND , Fergal CONNOR , David MOLONEY

IPC: G06F15/78 , G06F9/38 , G06F13/28

CPC classification number: G06F9/3867 , G06F13/28 , G06F15/8092 , G06T1/20

Abstract: The present application discloses a computing device that can provide a low-power, highly capable computing platform for computational imaging. The computing device can include one or more processing units, for example one or more vector processors and one or more hardware accelerators, an intelligent memory fabric, a peripheral device, and a power management module. The computing device can communicate with external devices, such as one or more image sensors, an accelerometer, a gyroscope, or any other suitable sensor devices.

10.

Invention application
SYSTEMS, APPARATUSES, AND METHODS FOR SETTING AN OUTPUT MASK IN A DESTINATION WRITEMASK REGISTER FROM A SOURCE WRITE MASK REGISTER USING AN INPUT WRITEMASK AND IMMEDIATE [In force]

Publication (announcement) number: US20140223139A1

Publication (announcement) date: 2014-08-07

Application number: US13991877

Application date: 2011-12-23

Applicant: Victor W. Lee , Daehyun Kim , Tin-Fook Ngai , Jayashankar Bharadwaj , Albert Hartono , Sara Baghsorkhi , Nalini Vasudevan

Inventor: Victor W. Lee , Daehyun Kim , Tin-Fook Ngai , Jayashankar Bharadwaj , Albert Hartono , Sara Baghsorkhi , Nalini Vasudevan

IPC: G06F9/30

CPC classification number: G06F9/30036 , G06F9/30018 , G06F9/30021 , G06F9/30025 , G06F9/30072 , G06F15/8007 , G06F15/8053 , G06F15/8084 , G06F15/8092

Abstract: Embodiments of systems, apparatuses, and methods for generating, in a computer processor, a predicate mask based on a vector comparison in response to a single instruction are described.
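To make the idea of a predicate mask concrete, here is a minimal sketch of producing a bitmask from an element-wise vector comparison, with one bit per lane. The function name, the default less-than predicate, and the way the input mask gates lanes are assumptions for illustration only; they are not taken from the patent or from any real instruction set.

```python
import operator
from typing import Callable, Optional, Sequence


def compare_to_mask(a: Sequence[int],
                    b: Sequence[int],
                    predicate: Callable[[int, int], bool] = operator.lt,
                    input_mask: Optional[int] = None) -> int:
    """Return a bitmask whose bit i is set when predicate(a[i], b[i]) holds.

    Lanes whose bit is clear in input_mask are skipped and stay 0
    (a hypothetical gating rule chosen for this sketch).
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of lanes")
    if input_mask is None:
        input_mask = (1 << len(a)) - 1  # all lanes enabled
    mask = 0
    for lane, (x, y) in enumerate(zip(a, b)):
        if (input_mask >> lane) & 1 and predicate(x, y):
            mask |= 1 << lane
    return mask


full = compare_to_mask([1, 5, 3, 7], [2, 4, 6, 8])             # lanes 0, 2, 3 pass
gated = compare_to_mask([1, 5, 3, 7], [2, 4, 6, 8], input_mask=0b0110)
```

In hardware the whole comparison would be done by one instruction across all lanes at once; the loop here only models the per-lane result.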
