1.
Publication No.: US20250036752A1
Publication Date: 2025-01-30
Application No.: US18713110
Application Date: 2022-10-20
Inventor: Zhe Wang , Chenggang Wu , Mengyao Xie
Abstract: A CET mechanism-based method for protecting the integrity of a general-purpose memory. In the method, general-purpose memory integrity is protected on the basis of the CET mechanism. A dedicated shadow stack page, independent of the shadow stack page maintained by the CET mechanism itself, is provided, and overhead-reduction processing adapted to the content is performed on content that is to be written to the dedicated shadow stack page and requires write-overhead reduction, so as to reduce the number of WRSS instructions executed. The integrity of sensitive data and/or sensitive code is thereby protected at lower overhead, the performance overhead incurred by the processor in protecting general-purpose memory integrity is reduced, and the processor's efficiency in processing other tasks is improved.
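The WRSS-reduction idea can be illustrated with a minimal sketch, assuming an x86-64 toolchain with CET shadow-stack support (compile with -mshstk; the `_wrssq` intrinsic from <immintrin.h> emits the WRSS instruction). The compare-before-write policy shown here is one plausible way to elide WRSS executions, not the patent's actual overhead-reduction processing.

```c
/* Minimal sketch: reducing WRSS traffic to a dedicated shadow-stack page.
 * Assumes the page holding `slot` was mapped as shadow-stack memory, so
 * ordinary stores fault and WRSS is the only legal write path, while
 * ordinary loads remain permitted. */
#include <immintrin.h>
#include <stdint.h>

static void protected_write(uint64_t *slot, uint64_t value)
{
    if (*slot == value)      /* reads of shadow-stack pages are ordinary */
        return;              /* value unchanged: elide the costly WRSS   */
    _wrssq(value, slot);     /* otherwise write through CET's WRSSQ      */
}
```

Because loads from shadow-stack memory are unrestricted, the redundant-write check costs only a normal read, which is why it can pay for itself whenever sensitive values are rewritten with unchanged contents.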
2.
Publication No.: US11841733B2
Publication Date: 2023-12-12
Application No.: US17791511
Application Date: 2020-01-08
Inventor: Ke Zhang , Yazhou Wang , Mingyu Chen , Yisong Chang , Ran Zhao , Yungang Bao
IPC: G06F13/42
CPC classification number: G06F13/4282
Abstract: A method and system for realizing an FPGA server, wherein all SoC FPGA compute nodes within the server are centrally monitored and managed by a motherboard. The motherboard comprises: a plurality of self-defined management interfaces for connecting the SoC FPGA compute nodes and supplying power and data switching to them; a management network switch module for interconnecting the SoC FPGA compute nodes and providing management connectivity; and a core control unit for managing the SoC FPGA compute nodes through the self-defined management interfaces and a self-defined management interface protocol, and for acquiring operating parameters of the SoC FPGA compute nodes so as to manage and monitor them on the basis of the management interface protocol.
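A short sketch makes the management path concrete. The frame layout, opcodes, and the `mgmt_transfer()` transport below are assumptions invented for illustration; the abstract does not publish the actual self-defined protocol.

```c
/* Illustrative sketch of a self-defined management-interface frame and
 * the core control unit's monitoring loop. All names and field layouts
 * here are hypothetical. */
#include <stdint.h>
#include <string.h>

enum mgmt_op { MGMT_GET_STATUS = 1, MGMT_POWER_ON, MGMT_POWER_OFF };

struct mgmt_frame {             /* one request/response on the interface */
    uint8_t  node_id;           /* target SoC FPGA compute node */
    uint8_t  opcode;            /* enum mgmt_op */
    uint16_t length;            /* payload length in bytes */
    uint8_t  payload[60];       /* e.g., temperature, power, link state */
};

/* Stub standing in for the transport over the management network switch. */
static int mgmt_transfer(const struct mgmt_frame *req, struct mgmt_frame *rsp)
{
    memset(rsp, 0, sizeof(*rsp));
    rsp->node_id = req->node_id;
    return 0;
}

/* Core control unit: poll every node's operating parameters. */
void poll_all_nodes(int node_count)
{
    for (int id = 0; id < node_count; id++) {
        struct mgmt_frame req = { .node_id = (uint8_t)id,
                                  .opcode  = MGMT_GET_STATUS }, rsp;
        if (mgmt_transfer(&req, &rsp) == 0) {
            /* record parameters, compare against thresholds, etc. */
        }
    }
}
```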
3.
Publication No.: US20230101208A1
Publication Date: 2023-03-30
Application No.: US17791511
Application Date: 2020-01-08
Inventor: Ke ZHANG , Yazhou WANG , Mingyu CHEN , Yisong CHANG , Ran ZHAO , Yungang BAO
IPC: G06F13/42
Abstract: A method and system for realizing an FPGA server, wherein all SoC FPGA compute nodes within the server are centrally monitored and managed by a motherboard. The motherboard comprises: a plurality of self-defined management interfaces for connecting the SoC FPGA compute nodes and supplying power and data switching to them; a management network switch module for interconnecting the SoC FPGA compute nodes and providing management connectivity; and a core control unit for managing the SoC FPGA compute nodes through the self-defined management interfaces and a self-defined management interface protocol, and for acquiring operating parameters of the SoC FPGA compute nodes so as to manage and monitor them on the basis of the management interface protocol.
4.
Publication No.: US11616662B2
Publication Date: 2023-03-28
Application No.: US17100570
Application Date: 2020-11-20
Inventor: Jinhua Tao , Tao Luo , Shaoli Liu , Shijin Zhang , Yunji Chen
IPC: H04L12/44 , H04L45/16 , H04L49/109
Abstract: The present invention provides a fractal tree structure-based data transmission device and method, a control device, and an intelligent chip. The device comprises: a central node that serves as the communication data center of a network-on-chip and broadcasts or multicasts communication data to a plurality of leaf nodes; the plurality of leaf nodes, which serve as communication data nodes of the network-on-chip and transmit the communication data to the central node; and forwarder modules for connecting the central node with the plurality of leaf nodes and forwarding the communication data. The central node, the forwarder modules, and the plurality of leaf nodes are connected in the fractal tree network structure; the central node is directly connected to M forwarder modules and/or leaf nodes, and any forwarder module is directly connected to M next-level forwarder modules and/or leaf nodes.
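A small software model shows the topology and the broadcast path the abstract describes; the `node` type and the fan-out constant `M` are illustrative stand-ins for the on-chip hardware.

```c
/* Fractal tree broadcast: the central node and every forwarder connect to
 * at most M children (next-level forwarders and/or leaves); a forwarder
 * simply re-forwards the communication data downward until it reaches
 * the leaf nodes. */
#include <stdio.h>

#define M 4                       /* fan-out at every level */

struct node {
    int is_leaf;
    int id;                       /* leaf id, valid when is_leaf != 0 */
    struct node *child[M];        /* next-level forwarders and/or leaves */
};

static void broadcast(struct node *n, int data)
{
    if (n->is_leaf) {             /* leaf: consume the communication data */
        printf("leaf %d received %d\n", n->id, data);
        return;
    }
    for (int i = 0; i < M; i++)   /* forwarder/central node: forward on */
        if (n->child[i])
            broadcast(n->child[i], data);
}
```

Multicast differs only in filtering which children are visited; the fractal property is that the same M-way structure repeats at every level.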
5.
Publication No.: US20220374733A1
Publication Date: 2022-11-24
Application No.: US17761220
Application Date: 2019-12-27
Inventor: Gaogang XIE , Xinyi ZHANG , Penghao ZHANG
Abstract: The disclosure provides a data packet classification method and system based on a convolutional neural network. The method includes: merging each rule set in a training rule set to form a plurality of merging schemes, and determining an optimal merging scheme for each rule set in the training rule set on the basis of performance evaluation; converting the prefix combination distribution of each rule set in the training rule set and of a target rule set into an image, and training a convolutional neural network model by taking the images and the corresponding optimal merging schemes as features; and classifying the target rule set on the basis of image similarity and constructing a corresponding hash table for data packet classification.
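The image-conversion step can be sketched directly: a rule set is reduced to the distribution of its (source, destination) prefix-length pairs, which forms a small single-channel image for the CNN. The two-field `rule` struct is a hypothetical simplification of a full 5-tuple classifier rule.

```c
/* Convert a rule set's prefix combination distribution into a 33x33
 * single-channel "image": cell (i, j) holds the normalized count of
 * rules whose source prefix length is i and destination prefix length
 * is j. */
#include <stddef.h>
#include <stdint.h>

struct rule { uint8_t src_plen, dst_plen; /* each in 0..32 */ };

void rules_to_image(const struct rule *rs, size_t n, float img[33][33])
{
    for (int i = 0; i < 33; i++)
        for (int j = 0; j < 33; j++)
            img[i][j] = 0.0f;
    for (size_t k = 0; k < n; k++)
        img[rs[k].src_plen][rs[k].dst_plen] += 1.0f;   /* histogram      */
    if (n)
        for (int i = 0; i < 33; i++)
            for (int j = 0; j < 33; j++)
                img[i][j] /= (float)n;                 /* scale to [0,1] */
}
```

Because IPv4 prefix lengths range over 0..32, every rule set maps to the same 33x33 input shape regardless of its size, which is what lets one trained model compare a target rule set against the training sets by image similarity.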
6.
Publication No.: US11488000B2
Publication Date: 2022-11-01
Application No.: US15770457
Application Date: 2016-06-17
Inventor: Zhen Li , Shaoli Liu , Shijin Zhang , Tao Luo , Cheng Qian , Yunji Chen , Tianshi Chen
Abstract: The present disclosure provides an operation apparatus and method for an acceleration chip for accelerating a deep neural network algorithm. The apparatus comprises a vector addition processor module, a vector function value arithmetic unit, and a vector multiplier-adder module. The three modules execute programmable instructions and interact with each other to calculate the values of the neurons and the network output result of a neural network, as well as the variation amount of the synaptic weights representing the interaction strength between the neurons on an input layer and the neurons on an output layer. All three modules are provided with an intermediate value storage region and perform read and write operations on a primary memory.
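The division of labor among the three modules can be sketched in plain C for one fully connected layer; this is a software stand-in for the datapath, with a hypothetical row-major layer layout and tanh chosen arbitrarily as the activation.

```c
/* One fully connected layer, annotated with which accelerator module each
 * step corresponds to. Layout: w is out x in, row-major. */
#include <math.h>
#include <stddef.h>

void fc_layer(const float *w, const float *b, const float *x, float *y,
              size_t in, size_t out)
{
    for (size_t o = 0; o < out; o++) {
        float acc = 0.0f;                 /* intermediate value region  */
        for (size_t i = 0; i < in; i++)
            acc += w[o * in + i] * x[i];  /* vector multiplier-adder    */
        acc += b[o];                      /* vector addition module     */
        y[o] = tanhf(acc);                /* vector function value unit */
    }
}
```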
7.
Publication No.: US20210350214A1
Publication Date: 2021-11-11
Application No.: US17250892
Application Date: 2019-05-21
Inventor: Xiaowei LI , Xin WEI , Hang LU
Abstract: Disclosed embodiments relate to a convolutional neural network computing method and system based on weight kneading, comprising: arranging original weights in a computation sequence and aligning them by bit to obtain a weight matrix; removing slack bits in the weight matrix and allowing the essential bits in each column of the weight matrix to fill the vacancies according to the computation sequence to obtain an intermediate matrix; removing null rows in the intermediate matrix to obtain a kneading matrix, wherein each row of the kneading matrix serves as a kneading weight; obtaining positional information of the activation corresponding to each bit of the kneading weight; dividing the kneading weight by bit into multiple weight segments; processing summation of the weight segments and the corresponding activations according to the positional information; and sending the processing result to an adder tree to obtain an output feature map by executing shift-and-add on the processing result.
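A minimal sketch of the kneading step, treating a group of 8-bit weights as a bit matrix (one row per weight, one column per bit position). It is a simplified software model: the patented design also records the positional information pairing each surviving bit with its activation, which this sketch elides.

```c
/* Knead a group of n 8-bit weights in place. Column by column (bit
 * position by bit position), each essential (one) bit is lifted upward
 * into the first vacancy left by slack (zero) bits, following the
 * computation sequence; all-zero rows are then dropped. Returns the
 * height of the resulting kneading matrix. */
#include <stddef.h>
#include <stdint.h>

size_t knead(uint8_t w[], size_t n)
{
    for (int bit = 0; bit < 8; bit++) {           /* one matrix column      */
        size_t top = 0;                           /* next vacancy in column */
        for (size_t r = 0; r < n; r++) {
            if (w[r] & (1u << bit)) {
                w[r]     &= (uint8_t)~(1u << bit); /* remove from old row   */
                w[top++] |= (uint8_t)(1u << bit);  /* drop into the vacancy */
            }
        }
    }
    size_t rows = 0;
    for (size_t r = 0; r < n; r++)                /* remove null rows       */
        if (w[r])
            w[rows++] = w[r];
    return rows;
}
```

The returned row count is the height of the kneading matrix; the fewer non-null rows survive, the fewer cycles the accelerator spends on slack bits.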
8.
Publication No.: US11977784B2
Publication Date: 2024-05-07
Application No.: US17908495
Application Date: 2020-07-06
Inventor: Liuying Ma , Zhenqing Liu , Jin Xiong , Dejun Jiang
IPC: G06F3/06
CPC classification number: G06F3/0659 , G06F3/0611 , G06F3/0622 , G06F3/067
Abstract: The present invention proposes a dynamic resource allocation method and system for guaranteeing the tail latency SLO of latency-sensitive applications. A plurality of request queues is created in a storage server node of a distributed storage system, with different types of requests placed in different queues; thread groups are allocated to the request queues according to the logical thread resources of the storage server node and the target tail latency requirements; thread resources are dynamically reallocated in real time; and the thread group of each request queue is bound to the physical CPU resources of the storage server node. The client sends an application's requests to the storage server node; the storage server node stores each request in the request queue corresponding to its type, uses the thread group allocated to that queue to process the requests, and sends responses to the client.
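The queue-to-CPU binding can be sketched with POSIX threads; the queue and scheduling logic are elided, and the CPU ranges are illustrative. This shows the mechanism (a thread group pinned to physical CPUs per request queue), not the patent's allocation policy.

```c
/* Spawn a thread group for one request queue and pin it to the CPU span
 * [cpu_base, cpu_base + ncpus). Linux-specific (pthread_setaffinity_np). */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdint.h>

static void *serve_queue(void *arg)
{
    int queue_id = (int)(intptr_t)arg;
    (void)queue_id;   /* pop this queue's requests, process, respond */
    return 0;
}

int bind_thread_group(int queue_id, int nthreads, int cpu_base, int ncpus)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    for (int c = 0; c < ncpus; c++)
        CPU_SET(cpu_base + c, &set);
    for (int t = 0; t < nthreads; t++) {
        pthread_t tid;
        if (pthread_create(&tid, 0, serve_queue,
                           (void *)(intptr_t)queue_id))
            return -1;
        pthread_setaffinity_np(tid, sizeof(set), &set);  /* bind to CPUs */
    }
    return 0;
}
```

Dynamic reallocation then amounts to adjusting nthreads and the CPU span per queue as the measured tail latency drifts from its SLO.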
9.
Publication No.: US20210357735A1
Publication Date: 2021-11-18
Application No.: US17250890
Application Date: 2019-05-21
Inventor: Xiaowei LI , Xin WEI , Hang LU
Abstract: Disclosed embodiments relate to a split accumulator for a convolutional neural network accelerator, comprising: arranging original weights in a computation sequence and aligning them by bit to obtain a weight matrix; removing slack bits in the weight matrix and allowing the essential bits in each column of the weight matrix to fill the vacancies according to the computation sequence to obtain an intermediate matrix; removing null rows in the intermediate matrix to obtain a kneading matrix, wherein each row of the kneading matrix serves as a kneading weight; obtaining positional information of the activation corresponding to each bit of the kneading weight; dividing the kneading weight by bit into multiple weight segments; processing summation of the weight segments and the corresponding activations according to the positional information; and sending the processing result to an adder tree to obtain an output feature map by executing shift-and-add on the processing result.
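The accumulator side can be sketched on a single weight/activation pair: split the weight into bit segments, form each narrow partial product independently, and recombine by shift-and-add. In the patented design the segments come from kneading weights and carry positional information; here the positions are simply the two segment offsets.

```c
/* Split-accumulator sketch: an 8-bit weight handled as two 4-bit
 * segments. Hardware recombines the partials with a left shift; the
 * multiplication by 16 below is the same operation, written so it stays
 * well defined for negative partial products in C. */
#include <stdint.h>

int32_t split_mac(uint8_t weight, int16_t activation)
{
    uint8_t lo = weight & 0x0F;              /* segment at bit offset 0 */
    uint8_t hi = (weight >> 4) & 0x0F;       /* segment at bit offset 4 */
    int32_t p_lo = (int32_t)lo * activation; /* narrow partial products */
    int32_t p_hi = (int32_t)hi * activation;
    return p_hi * 16 + p_lo;                 /* shift-and-add recombine */
}
```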
10.
Publication No.: US20210350204A1
Publication Date: 2021-11-11
Application No.: US17250889
Application Date: 2019-05-21
Inventor: Xiaowei LI , Xin WEI , Hang LU
IPC: G06N3/04
Abstract: Disclosed embodiments relate to a convolutional neural network accelerator, comprising: arranging original weights in a computation sequence and aligning them by bit to obtain a weight matrix; removing slack bits in the weight matrix and allowing the essential bits in each column of the weight matrix to fill the vacancies according to the computation sequence to obtain an intermediate matrix; removing null rows in the intermediate matrix to obtain a kneading matrix, wherein each row of the kneading matrix serves as a kneading weight; obtaining positional information of the activation corresponding to each bit of the kneading weight; dividing the kneading weight by bit into multiple weight segments; processing summation of the weight segments and the corresponding activations according to the positional information; and sending the processing result to an adder tree to obtain an output feature map by executing shift-and-add on the processing result.