Patent search ap:("MediaTek Inc.") AND inv:"Po-Chun Fan" Page 1

1.

发明授权
Adaptive execution engine for convolution computing systems 有权

公开(公告)号：US10394929B2

公开(公告)日：2019-08-27

申请号：US15787897

申请日：2017-10-19

Applicant: MediaTek Inc.

Inventor： Sung-Fang Tsai , Pei-Kuei Tsung , Po-Chun Fan , Shou-Jen Lai

IPC: G06F17/15 , G06F17/16 , G06N3/00

Abstract: A system performs convolution computing in either a matrix mode or a filter mode. An analysis module generates a mode select signal to select the matrix mode or the filter mode based on results of analyzing convolution characteristics. The results include at least a comparison of resource utilization between the matrix mode and the filter mode. A convolution module includes processing elements, each of which further includes arithmetic computing circuitry. The convolution module is configured according to the matrix mode for performing matrix multiplications converted from convolution computations, and is configured according to the filter mode for performing the convolution computations.

2.

发明授权
Apparatus for mutual-transposition of scalar and vector data sets and related method 有权
Title translation: 标量和向量数据集相互转置的装置及相关方法

公开(公告)号：US09507601B2

公开(公告)日：2016-11-29

申请号：US14184663

申请日：2014-02-19

Applicant: MEDIATEK INC.

Inventor： Pei-Kuei Tsung , Mu-Fan Murphy Chang , Po-Chun Fan

IPC: G06F9/355 , G06F12/10 , G06F9/30

CPC classification number: G06F9/3555 , G06F9/30036 , G06F12/0207 , G06F12/1009 , G06F2212/65 , G09G5/393 , G09G2360/122 , G09G2360/123

Abstract: An apparatus for processing a plurality of data sets is disclosed, wherein one data set of the plurality of data sets includes N components and has a data type of one of a scalar type and a vector type, wherein N is a positive integer number. The apparatus includes a memory module and a data accessing module. The memory module comprises N memory units configured to store the plurality of data sets. The data accessing module is configured to write the data set into the memory module according to a write data index corresponding to the data set and one of a first writing mapping information and a second writing mapping information, wherein the first writing mapping information is employed when the data type is one of the scalar and the vector type and the second writing mapping information is employed when the data type is the other of the scalar and the vector type.

Abstract translation: 公开了一种用于处理多个数据集的装置，其中所述多个数据集中的一个数据集包括N个分量，并且具有标量类型和矢量类型之一的数据类型，其中N是正整数。该装置包括存储器模块和数据访问模块。存储器模块包括被配置为存储多个数据集的N个存储器单元。数据访问模块被配置为根据与数据集相对应的写入数据索引将数据集写入存储器模块中，并且将第一写入映射信息和第二写入映射信息中的一个写入映射信息，数据类型是标量和向量类型之一，并且当数据类型是标量和向量类型中的另一个时采用第二写入映射信息。

3.

发明申请
EFFICIENT WORK EXECUTION IN A PARALLEL COMPUTING SYSTEM 审中-公开

公开(公告)号：US20190250924A1

公开(公告)日：2019-08-15

申请号：US16395193

申请日：2019-04-25

Applicant: MediaTek Inc.

Inventor： Shou-Jen Lai , Pei-Kuei Tsung , Po-Chun Fan , Sung-Fang Tsai

IPC: G06F9/38 , G06F9/30

CPC classification number: G06F9/3887 , G06F9/30036 , G06F9/3012 , G06F9/3824 , G06F9/383 , G06F9/3851

Abstract: A computing device operative to perform parallel computations. The computing device includes a controller unit to assign workgroups to a set of batches. Each batch includes a program counter shared by M workgroups assigned to the batch, where M is a positive integer determined according to a configurable batch setting. Each batch further includes a set of thread processing units operative to execute, in parallel, a subset of work items in each of the M workgroups. Each batch further includes a spilling memory to store intermediate data of the M workgroups when one or more workgroups in the M workgroups encounters a synchronization barrier.

4.

发明授权
Memory shuffle engine for efficient work execution in a parallel computing system 有权

公开(公告)号：US10324730B2

公开(公告)日：2019-06-18

申请号：US15285472

申请日：2016-10-04

Applicant: MediaTek Inc.

Inventor： Shou-Jen Lai , Pei-Kuei Tsung , Po-Chun Fan , Sung-Fang Tsai

IPC: G06F9/46 , G06F9/38 , G06F9/30

Abstract: A computing device performs parallel computations using a set of thread processing units and a memory shuffle engine. The memory shuffle engine includes a register array to store an array of data elements retrieved from a memory buffer, and an array of input selectors. According to a first control signal, each input selector transfers at least a first data element from a corresponding subset of the register array, which is coupled to the input selector via input lines, to one or more corresponding thread processing units. According to a second control signal, each input selector transfers at least a second data element from another subset of the register array, which is coupled to another input selector via other input lines, to the one or more corresponding thread processing units.

5.

发明授权
Graphic processing system and method thereof 有权

公开(公告)号：US09760969B2

公开(公告)日：2017-09-12

申请号：US14641449

申请日：2015-03-09

Applicant: MEDIATEK INC.

Inventor： Ming-Hao Liao , Shou-Jen Lai , Chia-Hsien Chou , Po-Chun Fan , Yan-Hong Lu , Chih-Chung Cheng , Hung-Yau Lin

IPC: G06T1/20

CPC classification number: G06T1/20

Abstract: A graphic processing system and a method of graphic processing are provided. The graphic processing system has a collector, a plurality of slots, a scheduler, an arbiter and at least an arithmetic logic unit (ALU). The collector is configured to group a plurality of workitems into elementary wavefronts. Each of the elementary wavefronts comprises workitems configured to execute the same kernel code. The scheduler is configured to allocate the elementary wavefronts to the slots. Two or more of the elementary wavefronts exist at one slot to form one of a plurality of macro wavefronts. The arbiter is configured to select one of the macro wavefronts. The ALU is configured to execute workitems of at least an elementary wavefront of the selected macro wavefront and output results of execution of the workitems.

6.

发明申请
ADAPTIVE EXECUTION ENGINE FOR CONVOLUTION COMPUTING SYSTEMS 审中-公开

公开(公告)号：US20180173676A1

公开(公告)日：2018-06-21

申请号：US15787897

申请日：2017-10-19

Applicant: MediaTek Inc.

Inventor： Sung-Fang Tsai , Pei-Kuei Tsung , Po-Chun Fan , Shou-Jen Lai

IPC: G06F17/15 , G06F17/16

CPC classification number: G06F17/15 , G06F17/16 , G06N3/0454 , G06N3/063

Abstract: A system performs convolution computing in either a matrix mode or a filter mode. An analysis module generates a mode select signal to select the matrix mode or the filter mode based on results of analyzing convolution characteristics. The results include at least a comparison of resource utilization between the matrix mode and the filter mode. A convolution module includes processing elements, each of which further includes arithmetic computing circuitry. The convolution module is configured according to the matrix mode for performing matrix multiplications converted from convolution computations, and is configured according to the filter mode for performing the convolution computations.

7.

发明申请
HYBRID MEMORY ACCESS TO ON-CHIP MEMORY BY PARALLEL PROCESSING UNITS 审中-公开

公开(公告)号：US20180173463A1

公开(公告)日：2018-06-21

申请号：US15675710

申请日：2017-08-12

Applicant: MediaTek Inc.

Inventor： Po-Chun Fan , Pei-Kuei Tsung , Sung-Fang Tsai , Chia-Hsien Chou , Shou-Jen Lai

IPC: G06F3/06 , G06F12/06

CPC classification number: G06F3/0659 , G06F3/0611 , G06F3/0685 , G06F12/0638 , G06F12/0806 , G06F2212/1024 , G06F2212/205

Abstract: A system is provided to manage on-chip memory access for multiple threads. The system comprises multiple parallel processing units to execute the threads, and an on-chip memory including multiple memory units and each memory unit includes a first region and a second region. The first region and the second region have different memory addressing schemes for parallel access by the threads. The system further comprises an address decoder coupled to the parallel processing units and the on-chip memory. The address decoder is operative to activate access by the threads to memory locations in the first region or the second region according to decoded address signals from the parallel processing units.

8.

发明申请
APPARATUS FOR MUTUAL-TRANSPOSITION OF SCALAR AND VECTOR DATA SETS AND RELATED METHOD 有权
Title translation: 用于标量和矢量数据集的相互传递的装置及相关方法

公开(公告)号：US20150234662A1

公开(公告)日：2015-08-20

申请号：US14184663

申请日：2014-02-19

Applicant: MEDIATEK INC.

Inventor： Pei-Kuei Tsung , Mu-Fan Murphy Chang , Po-Chun Fan

IPC: G06F9/355 , G06F9/30 , G06F12/10

CPC classification number: G06F9/3555 , G06F9/30036 , G06F12/0207 , G06F12/1009 , G06F2212/65 , G09G5/393 , G09G2360/122 , G09G2360/123

Abstract: An apparatus for processing a plurality of data sets is disclosed, wherein one data set of the plurality of data sets includes N components and has a data type of one of a scalar type and a vector type, wherein N is a positive integer number. The apparatus includes a memory module and a data accessing module. The memory module comprises N memory units configured to store the plurality of data sets. The data accessing module is configured to write the data set into the memory module according to a write data index corresponding to the data set and one of a first writing mapping information and a second writing mapping information, wherein the first writing mapping information is employed when the data type is one of the scalar and the vector type and the second writing mapping information is employed when the data type is the other of the scalar and the vector type.

Abstract translation: 公开了一种用于处理多个数据集的装置，其中所述多个数据集中的一个数据集包括N个分量，并且具有标量类型和矢量类型之一的数据类型，其中N是正整数。该装置包括存储器模块和数据访问模块。存储器模块包括被配置为存储多个数据集的N个存储器单元。数据访问模块被配置为根据与数据集相对应的写入数据索引将数据集写入存储器模块中，并且将第一写入映射信息和第二写入映射信息中的一个写入映射信息，数据类型是标量和向量类型之一，并且当数据类型是标量和向量类型中的另一个时采用第二写入映射信息。

9.

发明公开
AUTONOMOUS COPY BETWEEN EXTERNAL MEMORY AND INTERNAL MEMORY 审中-公开

公开(公告)号：US20240319904A1

公开(公告)日：2024-09-26

申请号：US18407990

申请日：2024-01-09

Applicant: MediaTek Inc.

Inventor： Pao-Hung Kuo , Po-Chun Fan , Sheng-Yen Yang

IPC: G06F3/06

CPC classification number: G06F3/065 , G06F3/0604 , G06F3/0659 , G06F3/0683

Abstract: A method of managing access to a first memory via a second memory includes autonomously copying data from one or more of the data blocks in the first plurality of data blocks in the first memory to corresponding one or more of the data blocks in the second plurality of data blocks in the second memory sequentially. Access to the first memory with a first plurality of data blocks is at a first speed and access to the second memory with a second plurality of data blocks is at a second speed. A command is received for reading from the second memory. Responsive to receiving the command, a pointer is obtained indicating an address of a data block in the second memory that contains data copied from the first memory and that is first available for access. The data is obtained from the data block based on the pointer.

10.

发明授权
Efficient work execution in a parallel computing system 有权

公开(公告)号：US11175920B2

公开(公告)日：2021-11-16

申请号：US16395193

申请日：2019-04-25

Applicant: MediaTek Inc.

Inventor： Shou-Jen Lai , Pei-Kuei Tsung , Po-Chun Fan , Sung-Fang Tsai

IPC: G06F9/46 , G06F9/38

Abstract: A computing device operative to perform parallel computations. The computing device includes a controller unit to assign workgroups to a set of batches. Each batch includes a program counter shared by M workgroups assigned to the batch, where M is a positive integer determined according to a configurable batch setting. Each batch further includes a set of thread processing units operative to execute, in parallel, a subset of work items in each of the M workgroups. Each batch further includes a spilling memory to store intermediate data of the M workgroups when one or more workgroups in the M workgroups encounters a synchronization barrier.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification