-
Publication number: US20180253402A1
Publication date: 2018-09-06
Application number: US15907356
Filing date: 2018-02-28
Applicant: Texas Instruments Incorporated
Inventor: Arthur John Redfern, Timothy David Anderson, Kai Chirca, Chenchi Luo, Zhenhua Yu
CPC classification number: G06F17/16, G06F17/141, G06N3/0454, G06N3/063
Abstract: A method for performing a fundamental computational primitive in a device is provided, where the device includes a processor and a matrix multiplication accelerator (MMA). The method includes configuring a streaming engine in the device to stream data for the fundamental computational primitive from memory, configuring the MMA to format the data, and executing the fundamental computational primitive by the device.
-
Publication number: US09294138B2
Publication date: 2016-03-22
Application number: US14452777
Filing date: 2014-08-06
Applicant: Texas Instruments Incorporated
Inventor: Guolong Su, Arthur John Redfern
IPC: H04B1/10, H04B17/336, H04W72/04
CPC classification number: H04W72/0446, H04B17/336
Abstract: A bandpass filter includes a plurality of parallel paths, each receiving the input signal to the bandpass filter. Each path includes a first mixer, a low-pass filter, and a second mixer. The first mixer in each path is coupled to receive the input signal and mixes the input signal with a periodic mixer sequence having a period that is divided into a plurality of time slots. The mixer value is constant during each time slot. The low-pass filter in each path is operable to filter an output of the associated first mixer. The second mixer in each path is coupled to receive an output of the associated low-pass filter and mixes said filter output with a periodic mixer sequence having a period that is divided into a plurality of time slots, wherein again the mixer value is constant during each time slot. A summer sums the outputs of the second mixers of each of the paths to generate an output of the bandpass filter.
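The signal flow in this abstract (mix, low-pass filter, mix again, sum across paths) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented design: the cosine-shaped slot values, the one-pole low-pass filter, the quadrature phase spacing, and every function and parameter name below are hypothetical choices made for the example.

```python
import math

def mixer_values(num_slots, phase):
    # One constant mixer value per time slot (assumed: samples of a cosine).
    return [math.cos(2 * math.pi * (m + 0.5) / num_slots + phase)
            for m in range(num_slots)]

def one_pole_lowpass(samples, alpha):
    # Simple stand-in low-pass filter: y[n] = y[n-1] + alpha * (x[n] - y[n-1]).
    out, state = [], 0.0
    for x in samples:
        state += alpha * (x - state)
        out.append(state)
    return out

def n_path_bandpass(signal, num_paths=2, num_slots=8, samples_per_slot=4, alpha=0.01):
    period = num_slots * samples_per_slot
    output = [0.0] * len(signal)
    for k in range(num_paths):
        slots = mixer_values(num_slots, math.pi * k / num_paths)
        # Periodic mixer sequence: the value is held constant within each slot.
        mixer = [slots[(n % period) // samples_per_slot] for n in range(len(signal))]
        # First mixer, then low-pass filter.
        filtered = one_pole_lowpass([x * m for x, m in zip(signal, mixer)], alpha)
        # Second mixer, then the summer accumulates this path's contribution.
        for n in range(len(signal)):
            output[n] += filtered[n] * mixer[n]
    return output

def tone(cycles_per_sample, length):
    return [math.cos(2 * math.pi * cycles_per_sample * n) for n in range(length)]

def energy(samples):
    return sum(s * s for s in samples)
```

With these defaults the mixer period is 32 samples, so a tone at 1/32 cycles per sample sits at the passband center, while a tone at 1/8 cycles per sample is strongly attenuated.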
-
Publication number: US20240411473A1
Publication date: 2024-12-12
Application number: US18813405
Filing date: 2024-08-23
Applicant: TEXAS INSTRUMENTS INCORPORATED
Inventor: Arthur John Redfern, Asheesh Bhardwaj
Abstract: A matrix transfer accelerator (MTA) system/method that coordinates data transfers between an external data memory (EDM) and a local data memory (LDM) using matrix tiling and/or grouping is disclosed. The system utilizes foreground/background buffering that overlaps compute and data transfer operations and permits EDM-to-LDM data transfers with or without zero pad peripheral matrix filling. The system may incorporate an automated zero-fill direct memory access (DMA) controller (ZDC) that transfers data from the EDM to the LDM based on a set of DMA controller registers including data width register (DWR), transfer count register (TCR), fill count register (FCR), EDM source address register (ESR), and LDM target address register (LTR). The ZDC transfers matrix data from the EDM[ESR] to the LDM[LTR] such that EDM matrix data of DWR row data width is automatically zero-filled around a periphery of a matrix written to the LDM matrix based on the FCR value.
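The zero-fill transfer described in this abstract can be pictured with a small software model. The sketch below is an assumption-laden illustration, not the hardware's register semantics: it treats EDM and LDM as flat Python lists, interprets TCR as the number of rows to move and FCR as the pad width on every side, and the function name is hypothetical.

```python
def zero_fill_transfer(edm, ldm, esr, ltr, dwr, tcr, fcr):
    """Copy a tcr x dwr matrix from edm[esr:] into ldm[ltr:] with a zero border.

    Assumed meanings (not taken from the patent text):
      dwr = row width in elements, tcr = number of rows,
      fcr = number of zero elements added on each side.
    Returns the number of LDM elements written.
    """
    padded_width = dwr + 2 * fcr
    dst = ltr
    # Top border rows.
    for _ in range(fcr):
        ldm[dst:dst + padded_width] = [0] * padded_width
        dst += padded_width
    # Data rows, each with left and right zero fill.
    for r in range(tcr):
        src = esr + r * dwr
        ldm[dst:dst + padded_width] = [0] * fcr + edm[src:src + dwr] + [0] * fcr
        dst += padded_width
    # Bottom border rows.
    for _ in range(fcr):
        ldm[dst:dst + padded_width] = [0] * padded_width
        dst += padded_width
    return dst - ltr
```

For example, moving a 2 x 3 matrix with `fcr=1` writes a 4 x 5 block into the local memory, with the original data surrounded by a one-element ring of zeros.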
-
Publication number: US20240265062A1
Publication date: 2024-08-08
Application number: US18633703
Filing date: 2024-04-12
Applicant: Texas Instruments Incorporated
Inventor: Arthur John Redfern, Timothy David Anderson, Kai Chirca, Chenchi Luo, Zhenhua Yu
CPC classification number: G06F17/16, G06F17/141, G06N3/045, G06N3/063
Abstract: A method for performing a fundamental computational primitive in a device is provided, where the device includes a processor and a matrix multiplication accelerator (MMA). The method includes configuring a streaming engine in the device to stream data for the fundamental computational primitive from memory, configuring the MMA to format the data, and executing the fundamental computational primitive by the device.
-
Publication number: US10817587B2
Publication date: 2020-10-27
Application number: US15905250
Filing date: 2018-02-26
Applicant: TEXAS INSTRUMENTS INCORPORATED
Inventor: Arthur John Redfern, Donald Edward Steiss, Timothy David Anderson, Kai Chirca
IPC: G06F17/16
Abstract: A reconfigurable matrix multiplier (RMM) system/method allowing tight or loose coupling to supervisory control processor application control logic (ACL) in a system-on-a-chip (SOC) environment is disclosed. The RMM provides for C=A*B matrix multiplication operations having A-multiplier-matrix (AMM), B-multiplicand-matrix (BMM), and C-product-matrix (CPM), as well as C=A*B+D operations in which D-summation-matrix (DSM) represents the result of a previous multiplication operation or another previously defined matrix. The RMM provides for additional CPM LOAD/STORE paths allowing overlapping of compute/data transfer operations and provides for CPM data feedback to the AMM or BMM operand inputs from a previously calculated CPM result. The RMM anticipates the use of 8, 16, and 32-bit operand reconfigurable matrix datum in conjunction with a typical external memory bus data width of 512 bits and an instruction control unit (ICU) implemented using a series of RMM configuration words (RCW) and streaming opcode functions (SOF).
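The two operations named in this abstract, C=A*B and C=A*B+D, along with feeding a previous product back in as an operand, can be written out as a minimal reference model. The code below is a plain software sketch for checking arithmetic, with a hypothetical function name; it says nothing about how the RMM hardware schedules or reconfigures the computation.

```python
def matmul_add(a, b, d=None):
    """Return C = A*B, or C = A*B + D when a summation matrix D is given."""
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions of A and B do not match")
    c = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = d[i][j] if d is not None else 0
            for k in range(inner):
                acc += a[i][k] * b[k][j]
            c[i][j] = acc
    return c

# Example operands.
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
D = [[1, 0], [0, 1]]

product = matmul_add(A, B)               # C = A*B
accumulated = matmul_add(A, B, D)        # C = A*B + D
fed_back = matmul_add(product, B)        # previous C reused as the A operand
```

The last line mirrors the feedback path in the abstract, where a previously calculated product becomes an operand of the next multiplication.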
-
Publication number: US10810281B2
Publication date: 2020-10-20
Application number: US16057667
Filing date: 2018-08-07
Applicant: Texas Instruments Incorporated
Inventor: Arthur John Redfern, Donald Edward Steiss, Mihir Narendra Mody, Tarek Aziz Lahlou
Abstract: An outer product multiplier (OPM) system/method that integrates compute gating and input/output circular column rotation functions to balance time spent in compute and data transfer operations while limiting overall dynamic power dissipation is disclosed. Matrix compute gating (MCG) based on a computation decision matrix (CDM) limits the number of computations required on a per cycle basis to reduce overall matrix compute cycle power dissipation. A circular column rotation vector (CRV) automates input/output data formatting to reduce the number of data transfer operations required to achieve a given matrix computation result. Matrix function operators (MFO) utilizing these features are disclosed and include: matrix-matrix multiplication; matrix-matrix and vector-vector point-wise multiplication, addition, and assignment; matrix-vector multiplication; vector-vector inner product; matrix transpose; matrix row permute; and vector-column permute.
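A rough software analogy may help with the two mechanisms in this abstract: a decision matrix that gates which multiply-accumulates run, and a circular rotation of columns. The sketch below assumes that matrix multiplication is built up as a sum of outer products, that a truthy CDM entry enables the corresponding accumulator cell, and that rotation is a simple circular shift; all names are hypothetical and the real hardware behavior may differ.

```python
def gated_outer_product_accumulate(c, a_col, b_row, cdm):
    """Accumulate the outer product a_col (x) b_row into c where cdm allows it.

    Returns the number of multiply-accumulates actually performed.
    """
    macs = 0
    for i, a in enumerate(a_col):
        for j, b in enumerate(b_row):
            if cdm[i][j]:          # compute gating: skip disabled cells
                c[i][j] += a * b
                macs += 1
    return macs

def gated_matmul(a, b, cdm):
    """C = A*B as a sum of outer products, one per inner index k."""
    rows, inner, cols = len(a), len(b), len(b[0])
    c = [[0] * cols for _ in range(rows)]
    macs = 0
    for k in range(inner):
        a_col = [a[i][k] for i in range(rows)]
        macs += gated_outer_product_accumulate(c, a_col, b[k], cdm)
    return c, macs

def rotate_columns(row, shift):
    """Circularly rotate the elements of a row to the right by shift positions."""
    shift %= len(row)
    return row[-shift:] + row[:-shift]
```

With an all-ones decision matrix this reproduces the full product; zeroing entries of the decision matrix leaves those output cells untouched and reduces the multiply-accumulate count accordingly.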
-
Publication number: US10809933B2
Publication date: 2020-10-20
Application number: US15907042
Filing date: 2018-02-27
Applicant: TEXAS INSTRUMENTS INCORPORATED
Inventor: Arthur John Redfern, Asheesh Bhardwaj
Abstract: A matrix transfer accelerator (MTA) system/method that coordinates data transfers between an external data memory (EDM) and a local data memory (LDM) using matrix tiling and/or grouping is disclosed. The system utilizes foreground/background buffering that overlaps compute and data transfer operations and permits EDM-to-LDM data transfers with or without zero pad peripheral matrix filling. The system may incorporate an automated zero-fill direct memory access (DMA) controller (ZDC) that transfers data from the EDM to the LDM based on a set of DMA controller registers including data width register (DWR), transfer count register (TCR), fill count register (FCR), EDM source address register (ESR), and LDM target address register (LTR). The ZDC transfers matrix data from the EDM[ESR] to the LDM[LTR] such that EDM matrix data of DWR row data width is automatically zero-filled around a periphery of a matrix written to the LDM matrix based on the FCR value.
-
Publication number: US20180373678A1
Publication date: 2018-12-27
Application number: US16057667
Filing date: 2018-08-07
Applicant: Texas Instruments Incorporated
Inventor: Arthur John Redfern, Donald Edward Steiss, Mihir Narendra Mody, Tarek Aziz Lahlou
Abstract: An outer product multiplier (OPM) system/method that integrates compute gating and input/output circular column rotation functions to balance time spent in compute and data transfer operations while limiting overall dynamic power dissipation is disclosed. Matrix compute gating (MCG) based on a computation decision matrix (CDM) limits the number of computations required on a per cycle basis to reduce overall matrix compute cycle power dissipation. A circular column rotation vector (CRV) automates input/output data formatting to reduce the number of data transfer operations required to achieve a given matrix computation result. Matrix function operators (MFO) utilizing these features are disclosed and include: matrix-matrix multiplication; matrix-matrix and vector-vector point-wise multiplication, addition, and assignment; matrix-vector multiplication; vector-vector inner product; matrix transpose; matrix row permute; and vector-column permute.
-
Publication number: US20180246669A1
Publication date: 2018-08-30
Application number: US15907042
Filing date: 2018-02-27
Applicant: TEXAS INSTRUMENTS INCORPORATED
Inventor: Arthur John Redfern, Asheesh Bhardwaj
IPC: G06F3/06
Abstract: A matrix transfer accelerator (MTA) system/method that coordinates data transfers between an external data memory (EDM) and a local data memory (LDM) using matrix tiling and/or grouping is disclosed. The system utilizes foreground/background buffering that overlaps compute and data transfer operations and permits EDM-to-LDM data transfers with or without zero pad peripheral matrix filling. The system may incorporate an automated zero-fill direct memory access (DMA) controller (ZDC) that transfers data from the EDM to the LDM based on a set of DMA controller registers including data width register (DWR), transfer count register (TCR), fill count register (FCR), EDM source address register (ESR), and LDM target address register (LTR). The ZDC transfers matrix data from the EDM[ESR] to the LDM[LTR] such that EDM matrix data of DWR row data width is automatically zero-filled around a periphery of a matrix written to the LDM matrix based on the FCR value.
-
Publication number: US12009843B2
Publication date: 2024-06-11
Application number: US17195703
Filing date: 2021-03-09
Applicant: TEXAS INSTRUMENTS INCORPORATED
Inventor: Arthur John Redfern, Dan Wang
CPC classification number: H03M7/30, G06F13/28, G06F17/16, G06N3/063, H03M7/3082, H03M7/6029, H03M7/6064, G06N3/045
Abstract: A matrix compression/decompression accelerator (MCA) system/method that coordinates lossless data compression (LDC) and lossless data decompression (LDD) transfers between an external data memory (EDM) and a local data memory (LDM) is disclosed. The system implements LDC using a 2D-to-1D transformation of 2D uncompressed data blocks (2DU) within LDM to generate 1D uncompressed data blocks (1DU). The 1DU is then compressed to generate a 1D compressed superblock (CSB) in LDM. This LDM CSB may then be written to EDM with a reduced number of EDM bus cycles. The system implements LDD using decompression of CSB data retrieved from EDM to generate a 1D decompressed data block (1DD) in LDM. A 1D-to-2D transformation is then applied to the LDM 1DD to generate a 2D decompressed data block (2DD) in LDM. This 2DD may then be operated on by a matrix compute engine (MCE) using a variety of function operators.
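The round trip in this abstract (2D block to 1D block, lossless compression, then decompression and 1D back to 2D) can be mimicked in software. The sketch below is an illustration only: it uses Python's standard `zlib` module as a stand-in for whatever lossless codec the accelerator implements, assumes 8-bit matrix elements in row-major order, and uses hypothetical function names.

```python
import zlib

def compress_block(block_2d):
    """2D uncompressed block -> (rows, cols, compressed 1D bytes)."""
    rows, cols = len(block_2d), len(block_2d[0])
    # 2D-to-1D transformation: row-major flattening (assumed layout).
    flat = bytes(value for row in block_2d for value in row)
    # zlib is a stand-in codec, not the scheme described in the patent.
    return rows, cols, zlib.compress(flat)

def decompress_block(rows, cols, compressed):
    """Compressed 1D bytes -> 2D decompressed block."""
    flat = zlib.decompress(compressed)
    if len(flat) != rows * cols:
        raise ValueError("decompressed size does not match block shape")
    # 1D-to-2D transformation: rebuild the rows.
    return [list(flat[r * cols:(r + 1) * cols]) for r in range(rows)]

# A mostly-zero 16 x 16 block, the kind of data that compresses well.
block = [[0] * 16 for _ in range(16)]
block[3][5] = 7
block[10][2] = 200

rows, cols, compressed = compress_block(block)
restored = decompress_block(rows, cols, compressed)
```

Because the compressed block is smaller than the 256-byte original for sparse data like this, writing it to external memory would take fewer bus transfers, which is the benefit the abstract points to.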