Abstract:
In a neural network unit, each neural processing unit (NPU) of an array of N NPUs receives respective first and second upper and lower bytes of 2N bytes from first and second RAMs. In a first mode, each NPU sign-extends the first upper byte to form a first 16-bit word and performs an arithmetic operation on the first 16-bit word and a second 16-bit word formed by the second upper and lower bytes. In a second mode, each NPU sign-extends the first lower byte to form a third 16-bit word and performs the arithmetic operation on the third 16-bit word and the second 16-bit word. In a third mode, each NPU performs the arithmetic operation on a fourth 16-bit word formed by the first upper and lower bytes and the second 16-bit word.
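To make the three modes concrete, here is a minimal C sketch of one NPU's operand selection, assuming a signed add as the arithmetic operation; all names (npu_alu, MODE_UPPER, and so on) are hypothetical, not from the patent:

```c
#include <stdint.h>
#include <stdio.h>

enum mode { MODE_UPPER = 1, MODE_LOWER = 2, MODE_WIDE = 3 };

/* first_hi/first_lo: the first upper and lower bytes from the first RAM;
   second_hi/second_lo: the second upper and lower bytes from the second RAM. */
static int16_t npu_alu(enum mode m,
                       uint8_t first_hi, uint8_t first_lo,
                       uint8_t second_hi, uint8_t second_lo)
{
    /* The second operand is always the 16-bit word formed by the
       second upper and lower bytes. */
    int16_t b = (int16_t)(((uint16_t)second_hi << 8) | second_lo);
    int16_t a;

    switch (m) {
    case MODE_UPPER: a = (int8_t)first_hi; break;  /* sign-extend upper byte */
    case MODE_LOWER: a = (int8_t)first_lo; break;  /* sign-extend lower byte */
    default:         a = (int16_t)(((uint16_t)first_hi << 8) | first_lo); break;
    }
    return (int16_t)(a + b);  /* placeholder arithmetic operation */
}

int main(void)
{
    /* first mode: sign-extend 0xFF (-1), add to 0x0010 (16) -> prints 15 */
    printf("%d\n", npu_alu(MODE_UPPER, 0xFF, 0x01, 0x00, 0x10));
    return 0;
}
```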
Abstract:
A processor comprising a plurality of processing cores, a last level cache memory (LLC) shared by the plurality of processing cores, and a neural network unit (NNU) comprising an array of neural processing units (NPU) and a memory array. The LLC comprises a plurality of slices. To transition from a first mode, in which the memory array stores neural network weights read by the plurality of NPUs, to a second mode, in which the memory array operates as a slice of the LLC in addition to the plurality of slices, the processor write-back-invalidates the LLC and updates a hashing algorithm to include the memory array as an additional slice. To transition from the second mode to the first mode, the processor write-back-invalidates the LLC and updates the hashing algorithm to exclude the memory array from the LLC.
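A minimal C sketch of the two transitions, with hypothetical names and sizes (NUM_CORE_SLICES, slice_hash, and a simple modulo slice-selection hash standing in for the patent's hashing algorithm): each direction write-back-invalidates the LLC, then updates the hash to include or exclude the NNU memory array as an extra slice:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_CORE_SLICES 4       /* assumed number of ordinary LLC slices */

static bool nnu_slice_enabled;  /* false: weight-RAM mode, true: LLC-slice mode */

/* Stub: flush all dirty LLC lines to memory and invalidate every line. */
static void llc_writeback_invalidate(void) { puts("write-back-invalidate LLC"); }

/* Hashing algorithm: map a physical address to a slice. With the NNU memory
   array enabled, addresses distribute over one additional slice. */
static unsigned slice_hash(uint64_t paddr)
{
    unsigned nslices = NUM_CORE_SLICES + (nnu_slice_enabled ? 1u : 0u);
    return (unsigned)((paddr >> 6) % nslices);  /* 64-byte lines assumed */
}

static void enter_cache_mode(void)   /* first mode -> second mode */
{
    llc_writeback_invalidate();
    nnu_slice_enabled = true;        /* hash now includes the memory array */
}

static void enter_weight_mode(void)  /* second mode -> first mode */
{
    llc_writeback_invalidate();
    nnu_slice_enabled = false;       /* hash now excludes the memory array */
}

int main(void)
{
    enter_cache_mode();
    printf("slice of 0x1000: %u\n", slice_hash(0x1000));
    enter_weight_mode();
    printf("slice of 0x1000: %u\n", slice_hash(0x1000));
    return 0;
}
```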
Abstract:
A neural network unit convolves an H×W×C input with F R×S×C filters to generate F Q×P outputs. N processing units (PU) each have a register receiving a respective word of an N-word row of a second memory and a multiplexed-register selectively receiving either a respective word of an N-word row of a first memory or a word rotated from an adjacent PU's multiplexed-register. H first memory rows hold input blocks of B words, each holding channels of respective 2-dimensional input row slices. R×S×C second memory rows hold filter blocks of B words, each holding P copies of a filter weight. B is the smallest factor of N greater than W. The PU blocks multiply-accumulate input blocks and filter blocks in column-channel-row order; they read a row of input blocks and rotate it around the N PUs while performing multiply-accumulate operations, so that each PU block receives each input block before another row is read.
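The rotate-while-multiply-accumulate idea can be sketched in C as follows. This is a deliberate simplification under stated assumptions: hypothetical names (pu_t, rotate_and_mac), N fixed at 8, and filter weights held constant for a whole row, whereas the patent reads new filter blocks from the second memory as the rotation proceeds:

```c
#include <stdint.h>
#include <stdio.h>

#define N 8  /* assumed number of processing units */

typedef struct { int32_t acc; int16_t mux_reg; int16_t weight_reg; } pu_t;

static pu_t pu[N];

/* One rotation step: each PU multiply-accumulates its current input word
   with its weight, then passes the word to the adjacent PU in the ring. */
static void rotate_and_mac(void)
{
    int16_t carry = pu[N - 1].mux_reg;
    for (int i = N - 1; i > 0; i--) {
        pu[i].acc += (int32_t)pu[i].mux_reg * pu[i].weight_reg;
        pu[i].mux_reg = pu[i - 1].mux_reg;  /* rotate toward higher index */
    }
    pu[0].acc += (int32_t)pu[0].mux_reg * pu[0].weight_reg;
    pu[0].mux_reg = carry;
}

/* Process one N-word input row: read it once from the first memory, then
   perform N rotate-and-MAC steps so each PU sees each input word exactly
   once before another row is read. */
static void process_input_row(const int16_t row[N], const int16_t weights[N])
{
    for (int i = 0; i < N; i++) {
        pu[i].mux_reg = row[i];
        pu[i].weight_reg = weights[i];
    }
    for (int step = 0; step < N; step++)
        rotate_and_mac();
}

int main(void)
{
    int16_t row[N]     = {1, 2, 3, 4, 5, 6, 7, 8};
    int16_t weights[N] = {1, 1, 1, 1, 1, 1, 1, 1};
    process_input_row(row, weights);
    printf("PU0 accumulator: %d\n", (int)pu[0].acc);  /* 1+2+...+8 = 36 */
    return 0;
}
```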
Abstract:
A processor including a front end, at least one load pipeline, and a memory system that further includes a programmable prefetcher for prefetching information from an external memory. The front end converts fetched program instructions into microinstructions, including load microinstructions, and dispatches microinstructions for execution. The load pipeline executes dispatched load microinstructions and provides load requests to the memory system. The programmable prefetcher includes a load monitor, a programmable prefetch engine, and a prefetch requester. The load monitor tracks the load requests. The prefetch engine is configured to be programmed by at least one prefetch program to operate as a programmed prefetcher, such that during operation of the processor the programmed prefetcher generates at least one prefetch address based on the load requests issued by the processor. The prefetch requester submits the at least one prefetch address to the memory system to prefetch information from the external memory.
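As an illustration of the load-monitor / prefetch-engine / prefetch-requester split, here is a minimal C sketch in which the "prefetch program" is a simple stride detector; the names (on_load_request, submit_prefetch) and the stride heuristic are assumptions, not the patent's programming model:

```c
#include <stdint.h>
#include <stdio.h>

/* Prefetch requester: submit a prefetch address to the memory system. */
static void submit_prefetch(uint64_t addr)
{
    printf("prefetch 0x%llx\n", (unsigned long long)addr);
}

static uint64_t last_load_addr;
static int64_t  last_stride;

/* Load monitor plus programmed prefetcher: track each load request and,
   when the same stride repeats, generate a prefetch address one stride
   ahead of the current load. */
static void on_load_request(uint64_t addr)
{
    int64_t stride = (int64_t)(addr - last_load_addr);
    if (stride != 0 && stride == last_stride)
        submit_prefetch(addr + (uint64_t)stride);
    last_stride = stride;
    last_load_addr = addr;
}

int main(void)
{
    on_load_request(0x1000);
    on_load_request(0x1040);
    on_load_request(0x1080);  /* stride 0x40 repeats -> prefetch 0x10C0 */
    return 0;
}
```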
Abstract:
A neural network unit has a first memory that holds elements of a data matrix and a second memory that holds elements of a convolution kernel. An array of neural processing units (NPU) each have a multiplexed register that receives a corresponding element of a row from the first memory and that also receives the multiplexed register output of an adjacent NPU, a register that receives a corresponding element of a row from the second memory, and an arithmetic unit that receives the outputs of the register, the multiplexed register, and an accumulator and performs a multiply-accumulate operation on them. For each sub-matrix of a plurality of sub-matrices of the data matrix, each multiplexed register selectively provides either the element from the first memory or the adjacent NPU's multiplexed register output, and the arithmetic unit performs a series of the multiply-accumulate operations to accumulate into the accumulator a convolution of the sub-matrix with the convolution kernel.
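A minimal C sketch of the per-NPU result: a series of multiply-accumulate operations that accumulates into the accumulator the convolution of one sub-matrix with the kernel. The 3×3 kernel size, the matrix width, and all names are illustrative assumptions, and the rotation between adjacent NPUs is elided:

```c
#include <stdint.h>
#include <stdio.h>

#define K 3  /* assumed convolution kernel dimension */
#define W 5  /* assumed data matrix width */

/* Convolve the sub-matrix whose top-left corner is (row, col) with the
   kernel, the way one NPU accumulates a single output element. */
static int32_t convolve_submatrix(const int16_t data[][W], int row, int col,
                                  const int16_t kernel[K][K])
{
    int32_t acc = 0;  /* the NPU accumulator */
    for (int r = 0; r < K; r++)
        for (int c = 0; c < K; c++)
            acc += (int32_t)data[row + r][col + c] * kernel[r][c];  /* MAC */
    return acc;
}

int main(void)
{
    int16_t data[W][W] = {{0}};
    int16_t kernel[K][K] = {{0, 0, 0}, {0, 1, 0}, {0, 0, 0}};  /* identity */
    data[2][2] = 7;
    printf("%d\n", convolve_submatrix(data, 1, 1, kernel));  /* prints 7 */
    return 0;
}
```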
Abstract:
A processor includes a front-end portion that issues instructions to execution units that execute the issued instructions. A hardware neural network unit (NNU) execution unit includes a first memory that holds data words associated with artificial neural networks (ANN) neuron outputs, a second memory that holds weight words associated with connections between ANN neurons, and a third memory that holds a program comprising NNU instructions that are distinct, with respect to their instruction set, from the instructions issued to the NNU by the front-end portion of the processor. The program performs ANN-associated computations on the data and weight words. A first instruction instructs the NNU to transfer NNU instructions of the program from architectural general purpose registers to the third memory. A second instruction instructs the NNU to invoke the program stored in the third memory.
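A minimal C sketch of the effect of the two instructions, with hypothetical names (nnu_write_program, nnu_invoke_program) and an arbitrary program-memory size; actual NNU instruction encodings and execution are elided:

```c
#include <stdint.h>
#include <stdio.h>

#define PROG_MEM_WORDS 64  /* assumed size of the third memory */

static uint64_t nnu_program_memory[PROG_MEM_WORDS];  /* the third memory */

/* First instruction: transfer one NNU instruction word from an
   architectural general purpose register into the program memory. */
static void nnu_write_program(unsigned slot, uint64_t gpr_value)
{
    if (slot < PROG_MEM_WORDS)
        nnu_program_memory[slot] = gpr_value;
}

/* Second instruction: invoke the program stored in the third memory.
   Execution of each NNU instruction is only stubbed out here. */
static void nnu_invoke_program(unsigned start_slot, unsigned count)
{
    for (unsigned i = 0; i < count; i++)
        printf("execute NNU instruction 0x%016llx\n",
               (unsigned long long)nnu_program_memory[start_slot + i]);
}

int main(void)
{
    nnu_write_program(0, 0x1ULL);  /* e.g., a multiply-accumulate instruction */
    nnu_write_program(1, 0x2ULL);  /* e.g., an activation-function instruction */
    nnu_invoke_program(0, 2);
    return 0;
}
```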
Abstract:
A processor has an instruction fetch unit that fetches ISA instructions from memory and execution units that perform operations on instruction operands to generate results according to the processor's ISA. A hardware neural network unit (NNU) execution unit performs computations associated with artificial neural networks (ANN). The NNU has an array of ALUs, a first memory that holds data words associated with ANN neuron outputs, and a second memory that holds weight words associated with connections between ANN neurons. Each ALU multiplies a portion of the data words by a portion of the weight words to generate products and accumulates the products in an accumulator as an accumulated value. Activation function units normalize the accumulated values to generate outputs associated with ANN neurons. The ISA includes at least one instruction that instructs the processor to write the data words and the weight words to the respective first and second memories.
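The per-neuron data path can be sketched in C as follows; the wide accumulator and the saturating stand-in for the activation function unit's normalization step are assumptions, as are all names:

```c
#include <stdint.h>
#include <stdio.h>

/* ALU: multiply data words by weight words and accumulate the products. */
static int64_t mac(const int16_t *data, const int16_t *weights, int n)
{
    int64_t acc = 0;  /* wide accumulator avoids overflow across n products */
    for (int i = 0; i < n; i++)
        acc += (int32_t)data[i] * weights[i];
    return acc;
}

/* Activation function unit: saturate the accumulated value back into the
   16-bit data-word range (a stand-in for the normalization step). */
static int16_t activate(int64_t acc)
{
    if (acc > INT16_MAX) return INT16_MAX;
    if (acc < INT16_MIN) return INT16_MIN;
    return (int16_t)acc;
}

int main(void)
{
    int16_t data[3]    = { 100, -50, 25 };  /* neuron outputs */
    int16_t weights[3] = {   2,   4,  8 };  /* connection weights */
    printf("%d\n", activate(mac(data, weights, 3)));  /* 200 - 200 + 200 = 200 */
    return 0;
}
```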
Abstract:
A processor includes an architectural register file loadable with micro-operations by architectural instructions of an architectural instruction set of the processor and an execution unit that executes instructions. The instructions are either architectural instructions or microinstructions into which architectural instructions are translated. The execution unit includes a decoder that decodes the instructions into micro-operations, a mode indicator that indicates one of first and second modes, a pipeline of stages to which are provided micro-operations that control circuits of the stages of the pipeline, and a multiplexer. The multiplexer selects for provision to the pipeline a micro-operation received from the decoder when the mode indicator indicates the first mode and selects for provision to the pipeline a micro-operation received from the architectural register file when the mode indicator indicates the second mode.
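A minimal C sketch of the multiplexer's selection, with stub sources standing in for the decoder and the architectural register file; all names and widths are hypothetical:

```c
#include <stdio.h>

typedef unsigned uop_t;  /* a micro-operation; width assumed */

enum mode { MODE_DECODER = 1, MODE_REGFILE = 2 };

static enum mode mode_indicator = MODE_DECODER;  /* the mode indicator */

/* Stub decoder: decode an instruction into a micro-operation. */
static uop_t decode(unsigned instruction) { return instruction & 0xFF; }

/* Stub read of a micro-operation previously loaded into the architectural
   register file by architectural instructions. */
static uop_t regfile_uop(void) { return 0xAB; }

/* The multiplexer: select which micro-operation the pipeline receives. */
static uop_t select_uop(unsigned instruction)
{
    return (mode_indicator == MODE_DECODER) ? decode(instruction)
                                            : regfile_uop();
}

int main(void)
{
    printf("0x%X\n", select_uop(0x1234));  /* first mode: from the decoder */
    mode_indicator = MODE_REGFILE;
    printf("0x%X\n", select_uop(0x1234));  /* second mode: from the register file */
    return 0;
}
```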
Abstract:
An apparatus including first and second reservation stations. The first reservation station dispatches a load microinstruction and indicates on a hold bus whether the load microinstruction is a specified load microinstruction directed to retrieve an operand from a prescribed resource other than on-core cache memory. The second reservation station is coupled to the hold bus and dispatches one or more younger microinstructions therein that depend on the load microinstruction for execution a number of clock cycles after dispatch of the load microinstruction. If the hold bus indicates that the load microinstruction is the specified load microinstruction, the second reservation station stalls dispatch of the one or more younger microinstructions until the load microinstruction has retrieved the operand. The prescribed resources include an input/output (I/O) unit configured to perform I/O operations via an I/O bus coupling an out-of-order processor to I/O resources.
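A minimal C sketch of the hold-bus handshake, with hypothetical names and an assumed fixed load-to-use latency; it models only the stall decision, not full reservation-station scheduling:

```c
#include <stdbool.h>
#include <stdio.h>

#define LOAD_LATENCY 4  /* assumed load-to-use latency in clock cycles */

static bool hold_bus;          /* asserted for specified (slow) loads */
static bool operand_returned;  /* set when the load's operand arrives */

/* First reservation station: dispatch a load, signaling on the hold bus
   whether it targets a resource other than on-core cache memory. */
static void dispatch_load(bool is_specified_load)
{
    hold_bus = is_specified_load;
    operand_returned = false;
}

/* Second reservation station: decide whether a dependent younger
   microinstruction may dispatch 'clocks_since_load' cycles after the load. */
static bool may_dispatch_dependent(int clocks_since_load)
{
    if (hold_bus)                              /* specified load: wait for data */
        return operand_returned;
    return clocks_since_load >= LOAD_LATENCY;  /* ordinary load: fixed delay */
}

int main(void)
{
    dispatch_load(true);                       /* e.g., a load from the I/O unit */
    printf("%d\n", may_dispatch_dependent(4)); /* 0: still stalled */
    operand_returned = true;
    printf("%d\n", may_dispatch_dependent(4)); /* 1: may dispatch now */
    return 0;
}
```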
Abstract:
A hardware data compressor. A hardware engine maintains first and second hash tables while it scans an input block of characters to be compressed. The first hash table is indexed by a hash of N characters of the input block. The second hash table is indexed by a hash of M characters of the input block. M is greater than two, and N is greater than M. The engine uses the first hash table to search the input block behind a current search target location for a match of at least N characters at the current search target location; when no such match is found using the first hash table, it uses the second hash table to search the input block behind the current search target location for a match of at least M characters.
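A minimal C sketch of the two-table search, with an assumed table size, an FNV-1a-style hash (not the patent's hash), and widths N = 5 and M = 3 chosen only to satisfy N > M > 2:

```c
#include <stdint.h>
#include <string.h>
#include <stdio.h>

#define N 5            /* assumed long-match hash width  */
#define M 3            /* assumed short-match hash width */
#define TABLE_SIZE 4096
#define NO_POS (-1)

static int table_n[TABLE_SIZE];  /* hash of N chars -> most recent position */
static int table_m[TABLE_SIZE];  /* hash of M chars -> most recent position */

static unsigned hash_chars(const uint8_t *p, int n)
{
    unsigned h = 2166136261u;  /* FNV-1a style hash */
    for (int i = 0; i < n; i++) h = (h ^ p[i]) * 16777619u;
    return h % TABLE_SIZE;
}

/* Search behind 'pos' for a match of at least N characters; fall back to
   the second table for a match of at least M characters if none is found. */
static int find_match(const uint8_t *in, int pos)
{
    int cand = table_n[hash_chars(in + pos, N)];
    if (cand != NO_POS && memcmp(in + cand, in + pos, N) == 0)
        return cand;             /* match of at least N characters */
    cand = table_m[hash_chars(in + pos, M)];
    if (cand != NO_POS && memcmp(in + cand, in + pos, M) == 0)
        return cand;             /* match of at least M characters */
    return NO_POS;
}

/* Maintain both tables as the engine scans the input block. */
static void insert_pos(const uint8_t *in, int pos)
{
    table_n[hash_chars(in + pos, N)] = pos;
    table_m[hash_chars(in + pos, M)] = pos;
}

int main(void)
{
    const uint8_t *in = (const uint8_t *)"abcdefXXabcdef";
    memset(table_n, 0xFF, sizeof table_n);  /* all entries become NO_POS */
    memset(table_m, 0xFF, sizeof table_m);
    for (int i = 0; i + N <= 8; i++) insert_pos(in, i);
    printf("%d\n", find_match(in, 8));      /* finds the match at position 0 */
    return 0;
}
```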