Abstract:
The present invention provides a method and apparatus for data caching. The method comprises: acquiring output matrices one by one; writing the acquired output matrices alternately into two queue sets of a first cache unit, in the order in which they are acquired; writing the output matrices, stored line by line in the first cache unit, into a second cache unit one by one; determining the valid data of each output matrix in the second cache unit according to preset parameters, one by one and in the order in which the output matrices were written into the second cache unit; and writing the valid data of each output matrix into a third cache unit. The valid data stored in the third cache unit is then written sequentially into a memory, in the order in which it was written into the third cache unit. In this solution, the output matrices are cached in cache units whose write speed matches the computing speed of a processor, and the output matrices are written completely into the memory one by one in order of generation time. The invention can therefore solve the mismatch between the computing speed of the processor and the write speed of the memory.
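The staged flow described above can be illustrated with a minimal Python sketch. It is not the patented apparatus: the function name `cache_pipeline`, the parameters `valid_rows` and `valid_cols`, and the reading of "valid data" as the top-left block of each matrix are all assumptions made for illustration.

```python
from collections import deque


def cache_pipeline(output_matrices, valid_rows, valid_cols):
    """Pass matrices through three staged caches; return what reaches memory.

    Assumption: the "preset parameters" are modelled as the number of valid
    rows and columns, and valid data is the top-left block of each matrix.
    """
    # First cache unit: two queue sets that receive matrices alternately.
    queue_sets = (deque(), deque())
    for index, matrix in enumerate(output_matrices):
        queue_sets[index % 2].append(matrix)

    # Second cache unit: drain the queue sets alternately, one matrix at a
    # time, which preserves the acquisition order.
    second_cache = deque()
    for index in range(len(output_matrices)):
        second_cache.append(queue_sets[index % 2].popleft())

    # Third cache unit: keep only the valid data of each matrix, in order.
    third_cache = deque()
    while second_cache:
        matrix = second_cache.popleft()
        third_cache.append([row[:valid_cols] for row in matrix[:valid_rows]])

    # Memory: written sequentially in the order the third cache was filled.
    memory = []
    while third_cache:
        memory.append(third_cache.popleft())
    return memory
```

Because every stage is a first-in, first-out queue, the matrices reach memory in the same order in which they were generated, which is the ordering property the abstract relies on.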
Abstract:
A method of implementing packet search using double sliding windows is provided. The method uses a three-level barrel shift register to store input packet data. The position of sliding window 1 is determined from among 32 positions by a preliminary test of the link, so that the packet data is located at the center of sliding window 1 and the window position matches the transmission characteristic of the specific link to the maximum extent. After the position of sliding window 1 is determined, 32-bit packet data can be searched effectively within sliding window 1 by dynamically adjusting sliding window 2, and a 32-bit transmission offset is allowed for the packet data.
Abstract:
Provided are a neural network model for image segmentation, an image segmentation method and device using the model, and a readable storage medium. The model comprises an intelligent selection module, which in turn comprises a feature extraction unit and an intelligent selection unit. Because the feature extraction unit uses multi-scale dilated convolution to obtain information at different scales from an input feature map, it supplies a large quantity of diverse feature information for later feature screening. The intelligent selection unit trains a weight value and screens the channels of the input feature map according to the size of that weight value. The intelligent selection module can therefore maintain segmentation accuracy while reducing the number of parameters and the amount of calculation. By using this module, the neural network model of the present application can quickly extract effective features of an image; because the amount of calculation is small and the model parameters are few, the model is suitable for a mobile terminal.
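The two ideas in this abstract, multi-scale dilated convolution followed by weight-based channel screening, can be sketched in plain Python on a 1-D signal. This is only an illustration under stated assumptions: the function names are hypothetical, the weights are given as fixed numbers rather than trained, and "screening" is modelled as keeping the `keep` channels with the largest weights.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Valid-mode 1-D dilated convolution (cross-correlation form)."""
    span = (len(kernel) - 1) * dilation  # extent covered by the dilated kernel
    return [
        sum(kernel[k] * signal[i + k * dilation] for k in range(len(kernel)))
        for i in range(len(signal) - span)
    ]


def extract_features(signal, kernel, dilations):
    """Feature extraction: one output channel per dilation rate (scale)."""
    return [dilated_conv1d(signal, kernel, d) for d in dilations]


def select_channels(channels, weights, keep):
    """Channel screening: keep the `keep` channels with the largest weights.

    Assumption: weights would be learned during training; here they are
    plain numbers supplied by the caller.
    """
    ranked = sorted(range(len(channels)), key=lambda i: weights[i], reverse=True)
    kept = sorted(ranked[:keep])  # preserve the original channel order
    return [channels[i] for i in kept]
```

Dropping low-weight channels is what reduces the downstream parameter count and computation in this sketch; a real model would apply the same idea to 2-D feature maps inside a deep-learning framework.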
Abstract:
A simulation verification and test method for a multi-level coherency domain based on an extended Cache Coherence protocol. A protocol simulation model of a multi-level coherency domain CC-NUMA (Cache Coherent Non-Uniform Memory Access) system based on the extended Cache Coherence protocol is built. A protocol table lookup and state transition execution mechanism in a key node of the system ensures that the Cache Coherence protocol is maintained within a single computing domain and, at the same time, among a plurality of computing domains, and ensures the accuracy and stability of intra-domain and inter-domain transmission. A verification method driven by the evaluation of credible protocol entry transition coverage is provided: transactions are processed by loading an optimized transaction generator to drive the model, a coverage index is obtained after the run ends, and verification efficiency is higher than with a random transaction driving mechanism. By building a multi-processor multi-level coherency domain verification system model and performing the relevant simulation verification, the applicability and effectiveness of the method are further confirmed.
Abstract:
A computing method and apparatus for a convolutional neural network model. The method comprises: acquiring a computing model of a training task of a convolutional neural network model (S101); splitting the multiply-accumulate operations in the computing model of the training task into a plurality of multiply-add operation tasks (S102); determining the computing device corresponding to each multiply-add operation task according to a preset correspondence between computing models and computing devices (S103); and finally, computing each multiply-add operation task on its corresponding computing device (S104). This makes it more flexible to migrate a CNN model training task across different computing devices or to compute cooperatively on different processors, and improves computing speed.
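Steps S102 to S104 can be sketched in Python as follows. This is a toy illustration, not the claimed method: the table `PRESET_DEVICES`, the model name `"conv_forward"`, the device labels, and the round-robin assignment are all assumptions, and the "devices" are simulated by ordinary function calls.

```python
# Hypothetical preset table: computing model -> devices allowed to run it.
PRESET_DEVICES = {"conv_forward": ["cpu", "gpu"]}


def split_mac(weights, inputs, chunk_size):
    """S102: split one multiply-accumulate into multiply-add tasks."""
    return [
        (weights[i:i + chunk_size], inputs[i:i + chunk_size])
        for i in range(0, len(weights), chunk_size)
    ]


def assign_devices(model_name, tasks):
    """S103: pick a device per task from the preset table (round-robin)."""
    devices = PRESET_DEVICES[model_name]
    return [(devices[i % len(devices)], task) for i, task in enumerate(tasks)]


def run_task(task):
    """S104: compute one multiply-add task (simulated on the host)."""
    ws, xs = task
    return sum(w * x for w, x in zip(ws, xs))


def compute(model_name, weights, inputs, chunk_size=2):
    """Run all tasks and combine the partial sums into the final result."""
    assigned = assign_devices(model_name, split_mac(weights, inputs, chunk_size))
    partials = [(device, run_task(task)) for device, task in assigned]
    total = sum(value for _, value in partials)
    return total, [device for device, _ in partials]
```

Because the partial sums are independent, the tasks could run on different devices in parallel; only the final accumulation needs all of them.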
Abstract:
A member-oriented hybrid cloud operating system architecture and a communication method thereof are provided. A hybrid architecture is established based on layer, object and message models, and a member-oriented approach is applied to manage the constituent members and their processing environment. On this basis, efficient routing, read-write separation and load balancing are performed on a member processing cluster. This satisfies a cloud operating system's requirements for openness, compatibility, loose coupling and extensibility, and solves the self-management problem, the horizontal scaling problem of members, and the high-availability problem of stateful members in existing cloud operating systems.
Abstract:
A method of constructing a Share-F state in a local domain of a multi-level cache coherency domain system includes: 1) when access to S-state remote data at the same address is requested, determining the accessed data copy by querying a remote proxy directory RDIR, and determining whether the data copy is in an inter-node S state and an intra-node F state; 2) directly forwarding the data copy to a requester, and recording the data copy of the current requester as an inter-node Cache coherency domain S state and an intra-node Cache coherency domain F state; and 3) after data forwarding is completed, recording, in a remote data directory RDIR, an intra-node processor losing an F permission state as the inter-node Cache coherency domain S state and the intra-node Cache coherency domain F state.