-
Publication No.: US11720357B2
Publication Date: 2023-08-08
Application No.: US16714915
Filing Date: 2019-12-16
Inventors: Yao Zhang, Bingrui Wang
IPC Classification: G06F9/30, G06F7/491, G06N20/00, G06F16/901, G06F13/28, G06N3/02, G06N3/08, G06F9/38, G06F12/0871, H03M7/24, G06N3/063, G06F17/16
CPC Classification: G06F9/30025, G06F7/491, G06F9/3001, G06F9/30014, G06F9/30036, G06F9/30181, G06F9/3838, G06F12/0871, G06F13/28, G06F16/9027, G06N3/02, G06N3/08, G06N20/00, H03M7/24, G06F9/30101, G06F9/3873, G06F9/3877, G06F17/16, G06N3/063
Abstract: The present disclosure provides a computation device. The computation device is configured to perform a machine learning computation, and includes an operation unit, a controller unit, and a conversion unit. The storage unit is configured to obtain input data and a computation instruction. The controller unit is configured to extract and parse the computation instruction from the storage unit to obtain one or more operation instructions, and to send the one or more operation instructions and the input data to the operation unit. The operation unit is configured to perform operations on the input data according to one or more operation instructions to obtain a computation result of the computation instruction. In the examples of the present disclosure, the input data involved in machine learning computations is represented by fixed-point data, thereby improving the processing speed and efficiency of training operations.
-
Publication No.: US20230236795A1
Publication Date: 2023-07-27
Application No.: US17678270
Filing Date: 2022-02-23
Applicant: Dell Products L.P.
Inventors: Chenxi Hu, Sanping Li, Zhen Jia
CPC Classification: G06F7/4912, G06F7/4917, G06N20/00
Abstract: Embodiments of the present disclosure provide a data processing method implemented at an edge switch, an electronic device, and a program product. For example, a data processing method implemented at an edge switch is provided. The method includes receiving at least two data packets for floating-point arithmetic operations from at least one source device. In addition, the method may include acquiring corresponding floating-point numerical sequences respectively from the at least two data packets; and acquiring a floating-point arithmetic method from at least one data packet of the at least two data packets to determine a floating-point arithmetic result of the corresponding floating-point numerical sequences. The method may further include sending the floating-point arithmetic result to a target device indicated by the at least one data packet of the at least two data packets.
-
Publication No.: US20230017462A1
Publication Date: 2023-01-19
Application No.: US17809137
Filing Date: 2022-06-27
Applicant: Arm Limited
Inventor: Javier Diaz BRUGUERA
Abstract: An apparatus comprises combined divide/square root processing circuitry to perform, in response to a divide instruction, a given radix-64 iteration of a radix-64 divide operation, and in response to a square root instruction, a given radix-64 iteration of a radix-64 square root operation; in which: the combined divide/square root processing circuitry comprises shared circuitry to generate at least one output value for the given radix-64 iteration on a same data path used for both the radix-64 divide operation and the radix-64 square root operation.
-
Publication No.: US20220224515A1
Publication Date: 2022-07-14
Application No.: US17145349
Filing Date: 2021-01-10
IPC Classification: H04L9/08, H04L9/32, G06F16/245, G06F7/76, G06F7/491
Abstract: Disclosed herein are methods and systems for efficiently retrieving data from an at least partially encrypted table based record using secure Multi-Party Computation (MPC). A query received to retrieve data from a table based record comprising data items arranged in rows and columns may include a queried data item (key) which potentially matches one or more encrypted data items contained in one or more of the columns. The computing nodes, each having a respective one of a plurality of shares of a one-hot representation of each of the encrypted data items engage in the MPC session to match between a one-hot representation of the queried data item and the one-hot representation of each encrypted data item and output each matching row. The match is based on multiplying, in each encrypted data item's one-hot representation, only bits identified as hot in the queried data item's one-hot representation.
-
Publication No.: US11243771B2
Publication Date: 2022-02-08
Application No.: US16479320
Filing Date: 2019-03-06
Inventors: Chengyang Yan, Maoyuan Lao
Abstract: The present disclosure provides a data computing system. The data computing system comprises: a memory, a processor and an accelerator, wherein the memory is communicatively coupled to the processor and configured to store data to be computed and a computed result, the data being written by the processor; the processor is communicatively coupled to the accelerator and configured to control the accelerator; and the accelerator is communicatively coupled to the memory and configured to access the memory according to pre-configured control information, implement a computing process to produce the computed result and write the computed result back to the memory. The present disclosure also provides an accelerator and a method performed by an accelerator of a data computing system. The present disclosure can improve the execution efficiency of the processor and reduce the computing overhead of the processor.
-
Publication No.: US11182156B2
Publication Date: 2021-11-23
Application No.: US16744240
Filing Date: 2020-01-16
Applicant: FUJITSU LIMITED
Abstract: An information processing apparatus includes: a memory; and a processor coupled to the memory and configured to: perform an arithmetic operation using an arithmetic operation target; repeat the arithmetic operation by using a calculated arithmetic operation result; obtain a ratio of, in a first number of elements which are included in the arithmetic operation result, a second number of elements in an expressible range as a predetermined-bit fixed point; and perform the arithmetic operation by using the predetermined-bit fixed point based on the ratio.
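The ratio described in this abstract can be illustrated with a minimal sketch. It is not taken from the patent: the function names, the signed Q-format layout, the 0.99 threshold, and the 16-bit fallback are all assumptions chosen for illustration.

```python
def expressible_ratio(values, total_bits=8, frac_bits=4):
    """Fraction of values that fit the range of a signed fixed-point format.

    Assumed format: two's complement, `total_bits` wide, `frac_bits` fractional bits.
    """
    step = 2.0 ** -frac_bits
    lo = -(2 ** (total_bits - 1)) * step       # most negative representable value
    hi = (2 ** (total_bits - 1) - 1) * step    # most positive representable value
    in_range = sum(1 for v in values if lo <= v <= hi)
    return in_range / len(values)


def choose_bits(values, threshold=0.99, frac_bits=4):
    """Hypothetical policy: use 8-bit fixed point only if enough elements fit."""
    if expressible_ratio(values, 8, frac_bits) >= threshold:
        return 8
    return 16


result = [0.5, -3.0, 7.9, 100.0]       # hypothetical arithmetic operation result
ratio = expressible_ratio(result)      # three of four values fit in [-8, 7.9375]
```

Here the narrow format is rejected because one element in four falls outside its range; a looser threshold would accept it.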
-
Publication No.: US20210117156A1
Publication Date: 2021-04-22
Application No.: US17071930
Filing Date: 2020-10-15
Inventors: Mustafa Ali, Akhilesh Jaiswal, Kaushik Roy
Abstract: An in-memory vector addition method for a dynamic random access memory (DRAM) is disclosed which includes consecutively transposing two numbers across a plurality of rows of the DRAM, each number transposed across a fixed number of rows associated with a corresponding number of bits, assigning a scratch-pad including two consecutive bits for each bit of each number being added, two consecutive bits for carry-in (Cin), and two consecutive bits for carry-out-bar (Cout), assigning a plurality of bits in a transposed orientation to hold results as a sum of the two numbers, for each bit position of the two numbers: computing the associated sum of the bit position; and placing the computed sum in the associated bit of the sum.
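The per-bit-position loop in this abstract resembles a bit-serial ripple-carry addition. The sketch below shows only that arithmetic in software; it does not model the DRAM rows, the scratch-pad, or the carry-out-bar encoding, and the function name and LSB-first list layout are assumptions.

```python
def bit_serial_add(a_bits, b_bits):
    """Add two equal-width numbers given as LSB-first bit lists.

    Each loop step handles one bit position, loosely mirroring one row of a
    transposed layout. Returns the sum as an LSB-first list one bit wider.
    """
    carry = 0
    out = []
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)                  # sum bit for this position
        carry = (a & b) | (carry & (a ^ b))        # carry into the next position
    out.append(carry)                              # final carry-out
    return out


five = [1, 0, 1, 0]     # 5, least significant bit first
three = [1, 1, 0, 0]    # 3, least significant bit first
total = bit_serial_add(five, three)
```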
-
Publication No.: US10579334B2
Publication Date: 2020-03-03
Application No.: US15974643
Filing Date: 2018-05-08
Inventors: Daniel Lo, Eric Sen Chung
Abstract: A system for block floating point computation in a neural network receives a plurality of floating point numbers. An exponent value for an exponent portion of each floating point number of the plurality of floating point numbers is identified and mantissa portions of the floating point numbers are grouped. A shared exponent value of the grouped mantissa portions is selected according to the identified exponent values and then removed from the grouped mantissa portions to define multi-tiered shared exponent block floating point numbers. One or more dot product operations are performed on the grouped mantissa portions of the multi-tiered shared exponent block floating point numbers to obtain individual results. The individual results are shifted to generate a final dot product value, which is used to implement the neural network. The shared exponent block floating point computations reduce processing time with less reduction in system accuracy.
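The general idea of a shared-exponent dot product can be sketched as follows. This is a simplified single-tier illustration, not the multi-tiered scheme the abstract claims: choosing the largest exponent as the shared one, the 8-bit mantissa width, and the function names are assumptions.

```python
import math


def to_block_fp(values, mantissa_bits=8):
    """Quantize floats to signed integer mantissas that share one exponent."""
    # math.frexp(v) returns (m, e) with v == m * 2**e and 0.5 <= |m| < 1.
    shared_exp = max((math.frexp(v)[1] for v in values if v != 0.0), default=0)
    limit = 2 ** (mantissa_bits - 1) - 1
    mantissas = []
    for v in values:
        m = round(math.ldexp(v, mantissa_bits - 1 - shared_exp))
        mantissas.append(max(-limit, min(limit, m)))   # clamp to the mantissa range
    return mantissas, shared_exp


def bfp_dot(a, b, mantissa_bits=8):
    """Integer dot product of the mantissas, shifted back by the shared exponents."""
    ma, ea = to_block_fp(a, mantissa_bits)
    mb, eb = to_block_fp(b, mantissa_bits)
    acc = sum(x * y for x, y in zip(ma, mb))           # pure integer arithmetic
    shift = (ea + eb) - 2 * (mantissa_bits - 1)
    return math.ldexp(acc, shift)


a = [1.0, 0.5, -0.25]
b = [2.0, 4.0, 8.0]
approx = bfp_dot(a, b)
```

The inputs here are powers of two, so the quantization is exact; values that do not fit the shared exponent cleanly lose low-order bits.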
-
Publication No.: US10216481B2
Publication Date: 2019-02-26
Application No.: US15409749
Filing Date: 2017-01-19
Applicant: ARM Limited
Inventor: Javier Diaz Bruguera
Abstract: A data processing apparatus is provided to perform a digit-recurrence division operation to determine a quotient as a result of dividing a dividend by a divisor. Scaling circuitry scales the dividend and the divisor by a factor to produce a scaled dividend and a scaled divisor. Digit recurrence circuitry performs one or more iterations of the digit-recurrence division operation on the scaled dividend and the scaled divisor, with each iteration producing a digit of the quotient and a remainder value. The remainder value is provided as an input to the digit recurrence circuitry for a subsequent iteration. Initialization circuitry performs a first iteration of the one or more iterations and provides the digit of the quotient after the first iteration. The initialization circuitry receives, as an input, an intermediate value produced by the scaling circuitry while scaling the dividend.
-
Publication No.: US10127014B2
Publication Date: 2018-11-13
Application No.: US15852180
Filing Date: 2017-12-22
Abstract: A round-for-reround mode (preferably in a BID encoded Decimal format) of a floating point instruction prepares a result for later rounding to a variable number of digits by detecting that the least significant digit may be a 0, and if so changing it to 1 when the trailing digits are not all 0. A subsequent reround instruction is then able to round the result to any number of digits at least 2 fewer than the number of digits of the result. An optional embodiment saves a tag indicating the fact that the low order digit of the result is 0 or 5 if the trailing bits are non-zero in a tag field rather than modify the result. Another optional embodiment also saves a half-way-and-above indicator when the trailing digits represent a decimal with a most significant digit having a value of 5. An optional subsequent reround instruction is able to round the result to any number of digits fewer or equal to the number of digits of the result using the saved tags.
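The first embodiment in this abstract (change a trailing 0 to 1 when discarded digits are non-zero, then reround by at least two digits) can be sketched on plain non-negative integer coefficients. This is an illustration only: it ignores BID encoding, signs, exponents, and carry-out to a longer coefficient, and the function names and sample value are hypothetical.

```python
def round_half_even(coeff, drop):
    """Round a non-negative integer coefficient, discarding `drop` low digits."""
    base = 10 ** drop
    q, r = divmod(coeff, base)
    if 2 * r > base or (2 * r == base and q % 2 == 1):
        q += 1
    return q


def round_for_reround(coeff, drop):
    """Truncate `drop` digits; if inexact and the last kept digit is 0, make it 1."""
    q, r = divmod(coeff, 10 ** drop)
    if r != 0 and q % 10 == 0:
        q += 1          # records that non-zero digits were discarded
    return q


exact = 1234450001                                        # hypothetical exact result
direct = round_half_even(exact, 5)                        # single rounding to 5 digits
naive = round_half_even(round_half_even(exact, 3), 2)     # ordinary double rounding
safe = round_half_even(round_for_reround(exact, 3), 2)    # prepare, then reround
```

In this example ordinary double rounding lands on a false tie and yields 12344, while the prepared value rerounds to 12345, matching the single direct rounding.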