- 专利标题: REDUCED MEMORY WRITE REQUIREMENTS IN A SYSTEM ON A CHIP USING AUTOMATIC STORE PREDICATION
-
申请号: US17391374申请日: 2021-08-02
-
公开(公告)号: US20230049442A1公开(公告)日: 2023-02-16
- 发明人: Ching-Yu Hung , Ravi P. Singh , Jagadeesh Sankaran , Yen-Te Shih , Ahmad Itani
- 申请人: NVIDIA Corporation
- 申请人地址: US CA Santa Clara
- 专利权人: NVIDIA Corporation
- 当前专利权人: NVIDIA Corporation
- 当前专利权人地址: US CA Santa Clara
- 主分类号: G06F9/38
- IPC分类号: G06F9/38 ; G06F9/30 ; G06F9/455
摘要:
In various examples, a VPU and associated components may be optimized to improve VPU performance and throughput. For example, the VPU may include a min/max collector, automatic store predication functionality, a SIMD data path organization that allows for inter-lane sharing, a transposed load/store with stride parameter functionality, a load with permute and zero insertion functionality, hardware, logic, and memory layout functionality to allow for two point and two by two point lookups, and per memory bank load caching capabilities. In addition, decoupled accelerators may be used to offload VPU processing tasks to increase throughput and performance, and a hardware sequencer may be included in a DMA system to reduce programming complexity of the VPU and the DMA system. The DMA and VPU may execute a VPU configuration mode that allows the VPU and DMA to operate without a processing controller for performing dynamic region based data movement operations.