-
Publication No.: US10606594B2
Publication date: 2020-03-31
Application No.: US14957912
Filing date: 2015-12-03
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Jin-seok Lee , Dong-kwan Suh , Seung-won Lee
Abstract: A method of executing, by a processor, a multi-thread including threads of the processor includes setting a mask value indicating execution of one of the threads of the processor based on an instruction, setting an inverted mask value based on the set mask value, and executing the thread of the processor based on the set mask value and the set inverted mask value.
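The mask and inverted-mask idea can be illustrated with a small sketch. This is not the claimed method itself; it assumes one common use of such masks, namely running the two sides of a branch across several threads (lanes). The function name `run_if_else` and its parameters are hypothetical.

```python
def run_if_else(values, predicate, then_op, else_op):
    """Illustrative only: run both branch paths under a mask and its inverse."""
    n = len(values)
    full = (1 << n) - 1          # one bit per thread (lane)

    # Set the mask: bit i is 1 when thread i takes the "then" path.
    mask = 0
    for lane, v in enumerate(values):
        if predicate(v):
            mask |= 1 << lane

    # The inverted mask is derived from the mask, limited to n bits.
    inverted = ~mask & full

    out = list(values)
    for lane in range(n):        # pass 1: threads enabled by the mask
        if (mask >> lane) & 1:
            out[lane] = then_op(values[lane])
    for lane in range(n):        # pass 2: threads enabled by the inverted mask
        if (inverted >> lane) & 1:
            out[lane] = else_op(values[lane])
    return mask, inverted, out


mask, inverted, out = run_if_else([1, -2, 3, -4], lambda v: v > 0,
                                  lambda v: v * 2, lambda v: -v)
```

Every thread is enabled in exactly one of the two passes, because the inverted mask is the bitwise complement of the mask within the thread count.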
-
Publication No.: US11093439B2
Publication date: 2021-08-17
Application No.: US16143922
Filing date: 2018-09-27
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Kyoung-hoon Kim , Young-hwan Park , Dong-kwan Suh , Keshava prasad Nagaraja , Suk-jin Kim , Han-su Cho , Hyun-jung Kim
Abstract: A processor for performing deep learning is provided herein. The processor includes a processing element unit including a plurality of processing elements arranged in a matrix form including a first row of processing elements and a second row of processing elements. The processing elements are fed with filter data by a first data input unit which is connected to the first row processing elements. A second data input unit feeds target data to the processing elements. A shifter composed of registers feeds instructions to the processing elements. A controller in the processor controls the processing elements, the first data input unit and second data input unit to process the filter data and target data, thus providing sum of products (convolution) functionality.
-
Publication No.: US10565017B2
Publication date: 2020-02-18
Application No.: US15669408
Filing date: 2017-08-04
Applicant: Samsung Electronics Co., Ltd.
Inventor: Dong-kwan Suh , Suk-jin Kim , Jin-sae Jung , Kang-jin Yoon
Abstract: A multi-thread processor and a method of controlling a multi-thread processor are provided. The multi-thread processor includes at least one functional unit; a mode register; and a controller configured to control the mode register to store thread mode information corresponding to a task to be processed among a plurality of thread modes, wherein the plurality of thread modes are divided based on a size and a number of at least one thread that is concurrently processed in one of the at least one functional unit, to allocate at least one thread included in the task to the at least one functional unit based on the thread mode information stored in the mode register, and to control the at least one functional unit to process the at least one thread.
-
Publication No.: US11568323B2
Publication date: 2023-01-31
Application No.: US16650083
Filing date: 2018-05-16
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Kyoung-hoon Kim , Young-hwan Park , Dong-kwan Suh , Keshava Prasad Nagaraja , Dae-hyun Kim , Suk-jin Kim , Han-su Cho , Hyun-jung Kim
Abstract: Disclosed is an electronic device. The electronic device includes a storage, and a processor configured to perform convolution processing on target data and kernel data based on stride information that indicates an interval at which the kernel data is applied to the target data stored in the storage. The processor is further configured to divide the target data into a plurality of pieces of sub-data based on first stride information, perform the convolution processing on the plurality of pieces of sub-data and a plurality of pieces of sub-kernel data respectively corresponding to the plurality of pieces of sub-data based on second stride information that is different from the first stride information, and combine a plurality of processing results. The plurality of pieces of sub-kernel data are obtained by dividing the kernel data based on the first stride information, and the second stride information indicates that the interval at which the kernel data is applied to the target data is 1.
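The decomposition described in this abstract can be checked numerically. The sketch below is a one-dimensional simplification written for illustration, not the patented implementation: the function names `conv1d` and `conv1d_split` are hypothetical, and "convolution" is taken to mean the sliding sum of products without kernel flipping, as is usual in neural-network contexts.

```python
def conv1d(x, k, stride=1):
    """Direct sliding sum of products with the given stride (no padding)."""
    n_out = (len(x) - len(k)) // stride + 1
    return [sum(x[i * stride + a] * k[a] for a in range(len(k)))
            for i in range(n_out)]


def conv1d_split(x, k, stride):
    """Same result, computed as a sum of stride-1 convolutions on sub-data."""
    n_out = (len(x) - len(k)) // stride + 1
    result = [0] * n_out
    for r in range(stride):
        sub_x = x[r::stride]     # sub-data: every stride-th sample from offset r
        sub_k = k[r::stride]     # sub-kernel: divided with the same stride
        if not sub_k:            # happens only when stride exceeds kernel length
            continue
        partial = conv1d(sub_x, sub_k, stride=1)   # second stride is 1
        for i in range(n_out):   # combine the partial results
            result[i] += partial[i]
    return result


x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
k = [1, 0, -1, 2]
print(conv1d(x, k, stride=2))        # [6, 10, 14]
print(conv1d_split(x, k, stride=2))  # [6, 10, 14]
```

The two functions agree because kernel index `a` can be written as `stride * p + r`, so each term `x[stride * (i + p) + r] * k[stride * p + r]` belongs to exactly one stride-1 convolution of a sub-data and sub-kernel pair.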
-
Publication No.: US11263018B2
Publication date: 2022-03-01
Application No.: US16462086
Filing date: 2017-10-23
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Ki-seok Kwon , Jae-un Park , Dong-kwan Suh , Kang-jin Yoon
Abstract: A vector processor is disclosed. The vector processor includes a plurality of register files provided to each of a plurality of single instruction multiple data (SIMD) lanes, storing each of a plurality of pieces of data, and respectively outputting input data to be used in a current cycle among the plurality of pieces of data, a shuffle unit for receiving a plurality of pieces of input data outputted from the plurality of register files, and performing shuffling such that the received plurality of pieces of input data respectively correspond to the plurality of SIMD lanes and outputting the same; and a command execution unit for performing a parallel operation by receiving input data outputted from the shuffle unit.
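As a rough software model of the data path in this abstract, the sketch below reads one value per SIMD lane from that lane's register file, permutes the values in a shuffle step, and applies the same operation to every lane. The function `simd_step`, the list-based register files, and the `pattern` encoding (destination lane to source lane) are assumptions made for illustration.

```python
def simd_step(register_files, cycle, pattern, op):
    """Illustrative model: register files -> shuffle unit -> execution unit."""
    # Each lane's register file outputs the value used in the current cycle.
    lane_outputs = [rf[cycle] for rf in register_files]

    # Shuffle unit: lane i receives the value produced by lane pattern[i].
    shuffled = [lane_outputs[src] for src in pattern]

    # Execution unit: the same operation runs on all lanes in parallel.
    return [op(v) for v in shuffled]


register_files = [[10, 11], [20, 21], [30, 31], [40, 41]]
result = simd_step(register_files, cycle=1, pattern=[3, 2, 1, 0],
                   op=lambda v: v + 1)
```

Here the pattern reverses the lane order, so lane 0 operates on the value read from lane 3's register file.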
-
Publication No.: US10599439B2
Publication date: 2020-03-24
Application No.: US15125023
Filing date: 2015-03-11
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Dong-kwan Suh , Suk-jin Kim , Do-hyung Kim , Tai-song Jin
Abstract: Provided are a method and apparatus for processing a very long instruction word (VLIW) instruction. The method includes acquiring a calculation allocation instruction including information regarding whether the VLIW instructions are allocated to a plurality of slots; updating a database including the information regarding whether the VLIW instructions are allocated to the plurality of slots based on the acquired calculation allocation instruction; and allocating at least one VLIW instruction to each of the plurality of slots based on the updated database.
-
Publication No.: US10318452B2
Publication date: 2019-06-11
Application No.: US15136183
Filing date: 2016-04-22
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Chae-seok Im , Dong-kwan Suh , Suk-jin Kim , Seung-won Lee
Abstract: A processor and a control method thereof are provided. The processor includes an instruction fetch module configured to receive a first instruction of an interrupt service routine without backup of data stored in a register in response to processing of the interrupt service routine being requested, a detecting module configured to analyze the received first instruction to determine whether the data stored in the register needs to be changed, an instruction generating module configured to generate a second instruction for storing data in a temporary memory when the stored data is initially changed, an instruction selecting module configured to sequentially select the generated second instruction and first instruction, and a control module configured to perform the second instruction and the first instruction.
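The on-demand register backup described here can be modeled as an instruction-stream rewrite. The sketch below is a simplified assumption, not the hardware design: instructions are tuples of the hypothetical form `(opcode, destination, *operands)`, a destination of `None` means no register is written, and `"save"` stands in for the generated second instruction.

```python
def expand_isr(isr_instructions):
    """Insert a save before the first write to each register in an ISR."""
    saved = set()
    expanded = []
    for opcode, dest, *operands in isr_instructions:
        # Detect whether this instruction changes a register for the first time.
        if dest is not None and dest not in saved:
            expanded.append(("save", dest))   # generated second instruction
            saved.add(dest)
        # The original (first) instruction follows the generated one.
        expanded.append((opcode, dest, *operands))
    return expanded


isr = [
    ("add", "r1", "r2", "r3"),
    ("cmp", None, "r1", "r4"),
    ("mov", "r1", "r5"),
    ("mov", "r2", "r1"),
]
for instruction in expand_isr(isr):
    print(instruction)
```

In this model only `r1` and `r2` are backed up, each once, and registers that the routine merely reads are never saved.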
-
Publication No.: US10013176B2
Publication date: 2018-07-03
Application No.: US15090111
Filing date: 2016-04-04
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Dong-kwan Suh , Suk-jin Kim , Young-hwan Park
CPC classification number: G06F3/061 , G06F3/0656 , G06F3/0659 , G06F3/0673 , G06F9/3001 , G06F9/30018 , G06F9/30021 , G06F9/30036 , G06F9/3004 , G06F9/30043 , G06F12/00
Abstract: Methods and apparatuses for parallel processing data are disclosed. One method includes reading items of data from a memory using at least one memory access address, confirming items of data with the same memory address among the read items of data, and masking the confirmed items of data other than one of the confirmed items of data. A correction value is generated for the memory access address using the confirmed items of data, and an operation is performed on data that has not been masked using the confirmed items of data and the correction value. Data obtained by operating on the data that has not been masked is stored as at least one representative data item for the data items with the same memory address. A schedule of a compiler of a processor is adjusted by performing bypassing of memory access address alias checking for at least one memory access address.
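One way to picture the masking and correction steps is a parallel accumulate in which several lanes target the same address. The sketch below is an interpretation, not the claimed method: it assumes the operation is addition and that the correction value for an address is the sum of the contributions from its masked lanes. The function name `accumulate_with_masking` is hypothetical.

```python
def accumulate_with_masking(memory, addresses, increments):
    """Illustrative only: one representative lane per address does the update."""
    correction = {}   # per-address sum of the masked lanes' contributions
    mask = []         # True = lane stays active, False = lane is masked
    for lane, addr in enumerate(addresses):
        if addr in correction:
            mask.append(False)                 # duplicate address: mask it
            correction[addr] += increments[lane]
        else:
            mask.append(True)                  # first lane is the representative
            correction[addr] = 0

    # Only unmasked lanes operate; each folds in its correction value.
    for lane, addr in enumerate(addresses):
        if mask[lane]:
            memory[addr] = memory[addr] + increments[lane] + correction[addr]
    return mask


memory = [0, 0, 0, 0]
mask = accumulate_with_masking(memory, [1, 3, 1, 1], [1, 1, 1, 1])
```

With three lanes targeting address 1, only the first performs the store, and the stored value still reflects all three increments.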
-
Publication No.: US11675997B2
Publication date: 2023-06-13
Application No.: US16163772
Filing date: 2018-10-18
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Kyoung-hoon Kim , Young-hwan Park , Dong-kwan Suh , Keshava Prasad , Dae-hyun Kim , Suk-jin Kim , Han-su Cho , Hyun-jung Kim
CPC classification number: G06N3/04 , G06F12/06 , G06F17/15 , G06F21/52 , G06N3/045 , G06N3/063 , G06N5/046
Abstract: Provided are a method and apparatus for processing a convolution operation in a neural network. The apparatus may include a memory, and a processor configured to read, from the memory, one of divided blocks of input data stored in the memory, generate an output block by performing the convolution operation on the one of the divided blocks with a kernel, generate a feature map by using the output block, and write the feature map to the memory.
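The block-wise flow in this abstract (read a block, convolve it, assemble the feature map) can be sketched in one dimension. The sketch assumes that adjacent blocks overlap by `len(k) - 1` input samples so that block outputs can simply be concatenated; the abstract does not state how blocks are divided, and the names `conv_valid`, `conv_by_blocks`, and `block_out` are hypothetical.

```python
def conv_valid(x, k):
    """Stride-1 sliding sum of products with no padding."""
    return [sum(x[i + a] * k[a] for a in range(len(k)))
            for i in range(len(x) - len(k) + 1)]


def conv_by_blocks(memory, k, block_out):
    """Process the input one block at a time and assemble the feature map."""
    halo = len(k) - 1                     # extra samples each block must read
    n_out = len(memory) - len(k) + 1      # total number of output samples
    feature_map = []
    for start in range(0, n_out, block_out):
        stop = min(start + block_out, n_out)
        block = memory[start:stop + halo]           # read one divided block
        feature_map.extend(conv_valid(block, k))    # append its output block
    return feature_map


data = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 1, 1]
print(conv_by_blocks(data, kernel, block_out=2))  # [6, 9, 12, 15, 18]
```

The result matches `conv_valid(data, kernel)` on the whole input, while only `block_out + len(k) - 1` samples need to be held at once.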
-
Publication No.: US10782974B2
Publication date: 2020-09-22
Application No.: US15360271
Filing date: 2016-11-23
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Young-chul Cho , Suk-jin Kim , Chul-soo Park , Dong-kwan Suh
Abstract: A VLIW (Very Long Instruction Word) interface device includes a memory configured to store instructions and data, and a processor configured to process the instructions and the data, wherein the processor includes an instruction fetcher configured to output an instruction fetch request to load the instruction from the memory, a decoder configured to decode the instruction loaded on the instruction fetcher, an arithmetic logic unit (ALU) configured to perform an operation function if the decoded instruction is an operation instruction, a memory interface scheduler configured to schedule the instruction fetch request or a data fetch request that is input from the arithmetic logic unit, and a memory operator configured to perform a memory access operation in accordance with the scheduled instruction fetch request or data fetch request.