Abstract:
A compute tile includes a WCB that receives a workload of writing an output tensor of a convolution into a local memory of the compute tile. The local memory may be an SRAM. The WCB receives write transactions. A write transaction includes a data block, which is a part of the output tensor, and metadata describing one or more attributes of the data block. The WCB may store write transactions in its internal buffers. The WCB may determine whether to combine two write transactions, e.g., based on an operation mode or metadata in the write transactions. In embodiments where the WCB determines to combine the two write transactions, the WCB may combine the two write transactions into a new write transaction and write the new write transaction into the local memory or an internal memory of the WCB. As a result, the total number of write transactions for the workload can be reduced.
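The combining step can be illustrated with a minimal sketch. The abstract does not state the combining rule, so the rule used here is an assumption: two transactions are merged when their data blocks are address-contiguous and the merged block fits a size limit. The names `WriteTransaction`, `combine_writes`, and `max_bytes`, and the use of an address as the only metadata field, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WriteTransaction:
    address: int  # hypothetical metadata: byte address in local memory
    data: bytes   # data block, a part of the output tensor


def combine_writes(transactions, max_bytes=32):
    """Merge address-contiguous transactions (assumed rule), up to max_bytes each."""
    combined = []
    for t in transactions:
        last = combined[-1] if combined else None
        contiguous = last is not None and last.address + len(last.data) == t.address
        if contiguous and len(last.data) + len(t.data) <= max_bytes:
            # Replace the buffered transaction with a new, larger one.
            combined[-1] = WriteTransaction(last.address, last.data + t.data)
        else:
            combined.append(t)
    return combined


writes = [
    WriteTransaction(0, b"\x01" * 16),
    WriteTransaction(16, b"\x02" * 16),
    WriteTransaction(64, b"\x03" * 16),
]
merged = combine_writes(writes)
```

In this example the first two writes are contiguous and are merged, while the third stays separate, so three transactions become two.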
Abstract:
Sparsity processing within a compute block can be done on unpacked data. The compute block includes a sparsity decoder that generates a combined sparsity vector from an activation sparsity vector and a weight sparsity vector. The activation sparsity vector indicates positions of non-zero valued activations in an activation context. The weight sparsity vector indicates positions of non-zero valued weights in a weight context. The combined sparsity vector comprises one or more zero valued bits and one or more non-zero valued bits. The sparsity decoder may determine the position of a non-zero valued bit in the combined sparsity vector and determine an address for the non-zero valued activation and the non-zero valued weight based on the position of the non-zero valued bit. The non-zero valued activation and the non-zero valued weight may be provided to a PE for performing MAC operations.
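A minimal sketch of the decoding step follows. It assumes that the combined sparsity vector is the bitwise AND of the two input vectors and that, because the data is unpacked, the position of a set bit can be used directly as the element address. Both points are assumptions; the abstract does not spell them out. The function names are hypothetical.

```python
def combined_sparsity(act_bits, wgt_bits):
    """Assumed rule: a position survives only if both operands are non-zero."""
    return [a & w for a, w in zip(act_bits, wgt_bits)]


def nonzero_positions(bits):
    """Positions of the non-zero valued bits in a sparsity vector."""
    return [i for i, b in enumerate(bits) if b]


# Unpacked (dense-layout) activation and weight contexts.
act = [0, 3, 0, 5, 2, 0, 0, 1]
wgt = [4, 0, 0, 2, 6, 0, 7, 3]

act_sv = [int(a != 0) for a in act]  # activation sparsity vector
wgt_sv = [int(w != 0) for w in wgt]  # weight sparsity vector

comb = combined_sparsity(act_sv, wgt_sv)
positions = nonzero_positions(comb)

# With unpacked data, the bit position doubles as the address of the operands.
acc = sum(act[p] * wgt[p] for p in positions)
```

Only the positions where both operands are non-zero reach the MAC, and the result equals the full dot product of the two contexts.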
Abstract:
A DNN accelerator may perform 1×N kernel decomposition to decompose a convolutional kernel into kernel vectors, each of which includes multiple weights. Through the kernel decomposition, a weight operand may be generated from a filter. The DNN accelerator converts an input tensor into input operands. An input operand includes activations and has the same size as the weight operand. The DNN accelerator may read a first activation in the input operand from memory to an internal memory of a first PE and read a second activation in the input operand from the memory to an internal memory of a second PE. The first PE may receive the second activation from the second PE through activation broadcasting between the two PEs and perform MAC operations on the input operand and weight operand. The second PE may perform MAC operations on another input operand in the input tensor and the weight operand.
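The arithmetic behind 1×N kernel decomposition can be sketched as follows, assuming that each kernel row is treated as one kernel vector and that the partial results of all rows are accumulated. The sketch covers only the math; it does not model PEs or activation broadcasting, and the function names are hypothetical.

```python
def conv2d_valid(x, k):
    """Reference 2D convolution (cross-correlation, no padding, stride 1)."""
    H, W, R, S = len(x), len(x[0]), len(k), len(k[0])
    return [
        [
            sum(x[i + r][j + s] * k[r][s] for r in range(R) for s in range(S))
            for j in range(W - S + 1)
        ]
        for i in range(H - R + 1)
    ]


def conv2d_by_kernel_vectors(x, k):
    """Same result, computed one 1xN kernel vector (kernel row) at a time."""
    H, W, R, S = len(x), len(x[0]), len(k), len(k[0])
    out = [[0] * (W - S + 1) for _ in range(H - R + 1)]
    for r, kernel_vector in enumerate(k):
        for i in range(H - R + 1):
            for j in range(W - S + 1):
                # Input operand: same size as the weight operand.
                input_operand = x[i + r][j:j + S]
                out[i][j] += sum(a * w for a, w in zip(input_operand, kernel_vector))
    return out


x = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
k = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
reference = conv2d_valid(x, k)
decomposed = conv2d_by_kernel_vectors(x, k)
```

Neighboring output columns use input operands that overlap in all but one activation, which is the kind of reuse that sharing activations between PEs could exploit.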
Abstract:
A load module in a deep neural network (DNN) accelerator may receive a configuration parameter indicating a selection between an activation sparsity mode and a weight sparsity mode. The load module may read a sparse activation tensor, an activation sparsity bitmap, a sparse weight tensor, and a weight sparsity bitmap from a memory. The load module may densify one of the compressed tensors based on the sparsity mode and leave the other compressed tensor as is. The load module may load the dense tensor and the sparse tensor to a sparse cell. The sparse cell includes a sparsity module that may select one or more elements of the dense tensor based on the sparsity bitmap of the sparse tensor. The sparse cell also includes multiply-accumulate (MAC) units that perform MAC operations on the selected elements and the sparse tensor. MAC operations on unselected elements of the dense tensor are skipped.
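The data flow can be sketched as follows for the weight sparsity mode, under the assumption that this mode densifies the activations and keeps the weights compressed; the abstract does not say which tensor each mode densifies. The functions `densify` and `sparse_mac` are hypothetical names.

```python
def densify(values, bitmap):
    """Expand a compressed tensor: re-insert zeros where the bitmap bit is 0."""
    it = iter(values)
    return [next(it) if b else 0 for b in bitmap]


def sparse_mac(dense, sparse_values, sparse_bitmap):
    """Select dense elements by the sparse tensor's bitmap, then multiply-accumulate."""
    selected = [d for d, b in zip(dense, sparse_bitmap) if b]
    return sum(d * s for d, s in zip(selected, sparse_values))


# Compressed tensors: non-zero values only, plus one bitmap bit per position.
act_values, act_bitmap = [3, 5, 2], [0, 1, 0, 1, 1, 0]
wgt_values, wgt_bitmap = [4, 6, 7], [1, 0, 0, 0, 1, 1]

# Assumed weight sparsity mode: densify activations, leave weights as is.
dense_act = densify(act_values, act_bitmap)
result = sparse_mac(dense_act, wgt_values, wgt_bitmap)
```

Only three of the six positions are multiplied here; the MAC operations for the unselected activations are never issued.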
Abstract:
A memory array of a compute tile may store activations or weights of a DNN. The memory array may include databanks for storing contexts, context MUXs, and byte MUXs. A databank may store a context with flip-flop arrays, each of which includes a sequence of flip-flops. A logic gate and an ICG unit may gate flip-flops and control whether states of the flip-flops can be changed. The data gating can prevent a context not selected for the databank from inadvertently toggling and wasting power. A context MUX may read a context from different flip-flop arrays in a databank based on gray-coded addresses. A byte MUX can combine bits from different bytes in a context read by the context MUX. The memory array may be implemented with bit packing to reduce the distance between the context MUXs and byte MUXs, which shortens the wires connecting them.
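For readers unfamiliar with gray-coded addresses, the short sketch below shows the standard binary-reflected Gray code, in which consecutive addresses differ in exactly one bit. Whether the memory array uses this particular code, and why, is not stated in the abstract; the 3-bit address width is an arbitrary choice for illustration.

```python
def to_gray(n):
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def from_gray(g):
    """Inverse of to_gray."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


# Gray-coded addresses for eight entries (3 address bits, illustrative only).
codes = [to_gray(i) for i in range(8)]

# Number of address bits that toggle when stepping to the next address.
toggles = [bin(a ^ b).count("1") for a, b in zip(codes, codes[1:])]
```

Stepping through sequential addresses toggles a single select line at a time, whereas plain binary counting can toggle several (for example, three bits when going from 3 to 4).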
Abstract:
A building structure comprising a main steel frame structure 3 and roof 41, wall 74, window and door portions 137, 166, 142, 152 that are directly or indirectly attachable to the main structural support and are configured, in use, to combine to form a thermally insulative barrier 34, 37, 21 between the interior of the building and the external atmosphere, the barrier also acting to at least partially inhibit travel of air therebetween.
Abstract:
A percolation test apparatus comprises a water tank (12), a pipe (36) for directing water from the tank into a hole (20) in the ground, an on-off valve (32) for controlling the flow of water in the pipe, and a water level sensor (30) for detecting the level of water in the hole. To perform a test, a control unit (14) automatically turns the valve on and off at times determined by the level of water detected by the sensor. To avoid collapse of the hole, and to ensure the correct size of hole for the test, a cage (16) is lowered into the hole. A rain-proof cover (18) prevents rain from falling onto the cage or the ground surrounding the cage.
Abstract:
Salts of 3-O-(3′,3′-dimethylsuccinyl)betulinic acid (DSB) are disclosed. Particularly, the preparation, pharmaceutical evaluation, and in vivo bioavailability evaluation of N-methyl-D-glucamine and alkali metal salt forms of DSB are disclosed. Pharmaceutical compositions including these salt forms are used in methods of treating HIV and related diseases. Methods of making the salts of DSB and the pharmaceutical compositions are also provided.