Patent search ap:("Nvidia Corporation") AND inv:"Jagadeesh Sankaran" Page 2

11.

Granted invention patent
Performing multiple point table lookups in a single cycle in a system on chip (Active)

Publication (announcement) number: US11704067B2

Publication (announcement) date: 2023-07-18

Application number: US17391378

Application date: 2021-08-02

Applicant: NVIDIA Corporation

Inventor: Ching-Yu Hung , Ravi P. Singh , Jagadeesh Sankaran , Ahmad Itani , Yen-Te Shih

IPC: G06F3/06 , G06T1/20 , G06T1/60

CPC classification number: G06F3/0659 , G06F3/065 , G06F3/0611 , G06F3/0673 , G06T1/20 , G06T1/60

Abstract: In various examples, a VPU and associated components may be optimized to improve VPU performance and throughput. For example, the VPU may include a min/max collector, automatic store predication functionality, a SIMD data path organization that allows for inter-lane sharing, a transposed load/store with stride parameter functionality, a load with permute and zero insertion functionality, hardware, logic, and memory layout functionality to allow for two point and two by two point lookups, and per memory bank load caching capabilities. In addition, decoupled accelerators may be used to offload VPU processing tasks to increase throughput and performance, and a hardware sequencer may be included in a DMA system to reduce programming complexity of the VPU and the DMA system. The DMA and VPU may execute a VPU configuration mode that allows the VPU and DMA to operate without a processing controller for performing dynamic region based data movement operations.

12.

Invention application
HYBRID SOLUTION FOR STEREO IMAGING (Active)

Publication (announcement) number: US20230130478A1

Publication (announcement) date: 2023-04-27

Application number: US17798232

Application date: 2020-06-22

Applicant: Nvidia Corporation

Inventor: Dong Zhang , Eric Viscito , Frans Sijstermans , Jagadeesh Sankaran , Ching Hung , Yen-Te Shih , Ravi Singh

IPC: G06T7/593

Abstract: A hybrid matching approach can be used for computer vision that balances accuracy with speed and resource consumption. Stereoscopic image data can be rectified and downsampled, then analyzed using a semi-global matching (SGM) process. The use of downsampled images greatly reduces time and bandwidth requirements, while providing high accuracy disparity results. These disparity results can be provided as external hints to a fast module that can perform a robust matching process in the time needed for applications such as real time navigation. The external hints can be used, along with potentially other hints, to define a search space for use by the fast module, which can result in higher quality disparity results obtained within specified timing constraints and with limited resources. The disparity results can be used to determine distances to various objects, as may be important for vehicle navigation or robotic task performance.
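The two ideas in this abstract (using a coarse disparity as a hint to bound the search space, and converting disparity to distance) can be illustrated with a minimal sketch. It is not taken from the patent: the function names, the `scale` and `margin` parameters, and the use of the standard pinhole-stereo relation Z = f * B / d are assumptions made for illustration.

```python
def search_range_from_hint(coarse_disparity, scale, margin, max_disparity):
    """Bound the full-resolution disparity search using a downsampled hint.

    Assumption: a disparity measured on an image downsampled by `scale`
    corresponds to roughly `coarse_disparity * scale` at full resolution.
    """
    center = coarse_disparity * scale
    lo = max(0, center - margin)
    hi = min(max_disparity, center + margin)
    return lo, hi


def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Standard rectified-stereo relation: depth = focal * baseline / disparity."""
    if disparity_px <= 0:
        return float("inf")  # zero disparity means the point is at infinity
    return focal_px * baseline_m / disparity_px


# Hypothetical numbers: hint of 10 px at quarter resolution, +/-3 px margin.
print(search_range_from_hint(10, 4, 3, 128))
print(depth_from_disparity(40, 800.0, 0.5))
```

Restricting the search to a narrow window around the hint is what lets a fast matcher finish within a timing budget; the margin trades robustness against cost.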

13.

Granted invention patent
Using per memory bank load caches for reducing power use in a system on a chip (Active)

Publication (announcement) number: US11593001B1

Publication (announcement) date: 2023-02-28

Application number: US17391861

Application date: 2021-08-02

Applicant: NVIDIA Corporation

Inventor: Ching-Yu Hung , Ravi P Singh , Jagadeesh Sankaran , Yen-Te Shih , Ahmad Itani

IPC: G06F12/00 , G06F3/06 , G06F12/0802

Abstract: A VPU and associated components include a min/max collector, automatic store predication functionality, a SIMD data path organization that allows for inter-lane sharing, a transposed load/store with stride parameter functionality, a load with permute and zero insertion functionality, hardware, logic, and memory layout functionality to allow for two point and two by two point lookups, and per memory bank load caching capabilities. In addition, decoupled accelerators are used to offload VPU processing tasks to increase throughput and performance, and a hardware sequencer is included in a DMA system to reduce programming complexity of the VPU and the DMA system. The DMA and VPU execute a VPU configuration mode that allows the VPU and DMA to operate without a processing controller for performing dynamic region based data movement operations.

14.

Invention application
PERFORMING MULTIPLE POINT TABLE LOOKUPS IN A SINGLE CYCLE IN A SYSTEM ON CHIP (Active)

Publication (announcement) number: US20230053042A1

Publication (announcement) date: 2023-02-16

Application number: US17391378

Application date: 2021-08-02

Applicant: NVIDIA Corporation

Inventor: Ching-Yu Hung , Ravi P. Singh , Jagadeesh Sankaran , Ahmad Itani , Yen-Te Shih

IPC: G06F3/06 , G06T1/20 , G06T1/60

Abstract: In various examples, a VPU and associated components may be optimized to improve VPU performance and throughput. For example, the VPU may include a min/max collector, automatic store predication functionality, a SIMD data path organization that allows for inter-lane sharing, a transposed load/store with stride parameter functionality, a load with permute and zero insertion functionality, hardware, logic, and memory layout functionality to allow for two point and two by two point lookups, and per memory bank load caching capabilities. In addition, decoupled accelerators may be used to offload VPU processing tasks to increase throughput and performance, and a hardware sequencer may be included in a DMA system to reduce programming complexity of the VPU and the DMA system. The DMA and VPU may execute a VPU configuration mode that allows the VPU and DMA to operate without a processing controller for performing dynamic region based data movement operations.

15.

Invention application
USING PER MEMORY BANK LOAD CACHES FOR REDUCING POWER USE IN A SYSTEM ON A CHIP (Active)

Publication (announcement) number: US20230047233A1

Publication (announcement) date: 2023-02-16

Application number: US17391861

Application date: 2021-08-02

Applicant: NVIDIA Corporation

Inventor: Ching-Yu Hung , Ravi P Singh , Jagadeesh Sankaran , Yen-Te Shih , Ahmad Itani

IPC: G06F3/06 , G06F12/0802

Abstract: In various examples, a VPU and associated components may be optimized to improve VPU performance and throughput. For example, the VPU may include a min/max collector, automatic store predication functionality, a SIMD data path organization that allows for inter-lane sharing, a transposed load/store with stride parameter functionality, a load with permute and zero insertion functionality, hardware, logic, and memory layout functionality to allow for two point and two by two point lookups, and per memory bank load caching capabilities. In addition, decoupled accelerators may be used to offload VPU processing tasks to increase throughput and performance, and a hardware sequencer may be included in a DMA system to reduce programming complexity of the VPU and the DMA system. The DMA and VPU may execute a VPU configuration mode that allows the VPU and DMA to operate without a processing controller for performing dynamic region based data movement operations.

16.

Invention application
Object Detection Using Image Alignment for Autonomous Machine Applications (Active)

Publication (announcement) number: US20210264175A1

Publication (announcement) date: 2021-08-26

Application number: US17187228

Application date: 2021-02-26

Applicant: NVIDIA Corporation

Inventor: Dong Zhang , Sangmin Oh , Junghyun Kwon , Baris Evrim Demiroz , Tae Eun Choe , Minwoo Park , Chethan Ningaraju , Hao Tsui , Eric Viscito , Jagadeesh Sankaran , Yongqing Liang

IPC: G06K9/00 , G06K9/62 , G06K9/32 , G06N3/08 , B60W60/00

Abstract: Systems and methods are disclosed that use a geometric approach to detect objects on a road surface. A set of points within a region of interest between a first frame and a second frame are captured and tracked to determine a difference in location between the set of points in two frames. The first frame may be aligned with the second frame and the first pixel values of the first frame may be compared with the second pixel values of the second frame to generate a disparity image including third pixels. One or more subsets of the third pixels that have a disparity image value about a first threshold may be combined, and the third pixels may be scored and associated with disparity values for each pixel of the one or more subsets of the third pixels. A bounding shape may be generated based on the scoring that corresponds to the object.
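The compare, threshold, and bound steps described in this abstract can be sketched in a few lines. This is an illustrative simplification and not the patented method: it assumes the two frames are already aligned, treats the per-pixel absolute difference as the "disparity image", assumes pixels exceeding the threshold are the ones of interest, and omits the tracking and scoring stages. The function name is hypothetical.

```python
def detect_changed_region(frame_a, frame_b, threshold):
    """Return (top, left, bottom, right) bounding the pixels whose absolute
    difference between two aligned frames exceeds `threshold`, or None."""
    hits = [
        (r, c)
        for r, (row_a, row_b) in enumerate(zip(frame_a, frame_b))
        for c, (pa, pb) in enumerate(zip(row_a, row_b))
        if abs(pa - pb) > threshold  # assumed selection rule
    ]
    if not hits:
        return None
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return min(rows), min(cols), max(rows), max(cols)


# Hypothetical 3x3 grayscale frames; the lower-right pixels differ.
frame_a = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
frame_b = [[0, 0, 0], [0, 9, 8], [0, 0, 7]]
print(detect_changed_region(frame_a, frame_b, 5))
```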

17.

Granted invention patent
Performing load and store operations of 2D arrays in a single cycle in a system on a chip (Active)

Publication (announcement) number: US12099439B2

Publication (announcement) date: 2024-09-24

Application number: US17391468

Application date: 2021-08-02

Applicant: NVIDIA Corporation

Inventor: Ching-Yu Hung , Ravi P Singh , Jagadeesh Sankaran , Yen-Te Shih , Ahmad Itani

IPC: G06F12/1081 , G06F9/30 , G06F12/02 , G06F13/28

CPC classification number: G06F12/0238 , G06F9/30043 , G06F12/1081 , G06F13/28

Abstract: In various examples, a VPU and associated components may be optimized to improve VPU performance and throughput. For example, the VPU may include a min/max collector, automatic store predication functionality, a SIMD data path organization that allows for inter-lane sharing, a transposed load/store with stride parameter functionality, a load with permute and zero insertion functionality, hardware, logic, and memory layout functionality to allow for two point and two by two point lookups, and per memory bank load caching capabilities. In addition, decoupled accelerators may be used to offload VPU processing tasks to increase throughput and performance, and a hardware sequencer may be included in a DMA system to reduce programming complexity of the VPU and the DMA system. The DMA and VPU may execute a VPU configuration mode that allows the VPU and DMA to operate without a processing controller for performing dynamic region based data movement operations.

18.

Invention publication
OBJECT DETECTION USING IMAGE ALIGNMENT FOR AUTONOMOUS MACHINE APPLICATIONS (Under examination, published)

Publication (announcement) number: US20240265555A1

Publication (announcement) date: 2024-08-08

Application number: US18614160

Application date: 2024-03-22

Applicant: NVIDIA Corporation

Inventor: Dong Zhang , Sangmin Oh , Junghyun Kwon , Baris Evrim Demiroz , Tae Eun Choe , Minwoo Park , Chethan Ningaraju , Hao Tsui , Eric Viscito , Jagadeesh Sankaran , Yongqing Liang

IPC: G06T7/246 , B60W60/00 , G06F18/214 , G06N3/08 , G06V10/25 , G06V10/75 , G06V20/56 , G06V20/58

CPC classification number: G06T7/246 , B60W60/001 , G06F18/2148 , G06N3/08 , G06V10/25 , G06V10/751 , G06V20/58 , G06V20/56

Abstract: Systems and methods are disclosed that use a geometric approach to detect objects on a road surface. A set of points within a region of interest between a first frame and a second frame are captured and tracked to determine a difference in location between the set of points in two frames. The first frame may be aligned with the second frame and the first pixel values of the first frame may be compared with the second pixel values of the second frame to generate a disparity image including third pixels. Subsets of the third pixels that have a disparity image value about a first threshold may be combined, and the third pixels may be scored and associated with disparity values for each pixel of the one or more subsets of the third pixels. A bounding shape may be generated based on the scoring that corresponds to the object.

19.

Granted invention patent
Reduced memory write requirements in a system on a chip using automatic store predication (Active)

Publication (announcement) number: US11954496B2

Publication (announcement) date: 2024-04-09

Application number: US17391374

Application date: 2021-08-02

Applicant: NVIDIA Corporation

Inventor: Ching-Yu Hung , Ravi P Singh , Jagadeesh Sankaran , Yen-Te Shih , Ahmad Itani

IPC: G06F9/38

CPC classification number: G06F9/3887 , G06F9/38585

Abstract: In various examples, systems and methods for reducing write requirements in a system on chip (SoC) are described herein. For instance, a total number of iterations may be determined for processing data, such as data representing an array. In some circumstances, a set of iterations may include a first number of iterations that is less than a second number of iterations. As such, and during execution of the set of iterations, a predicate flag corresponding to an excess iteration of the set of iterations may be generated, where the excess iteration corresponds to an iteration that is part of a number of excess iterations that is associated with a difference between the first number of iterations and the second number of iterations. Based on the predicate flag, one or more first values corresponding to the iteration may be prevented from being written to memory.
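The idea of suppressing stores for excess iterations can be modeled in software as follows. This is a rough sketch under stated assumptions, not the hardware mechanism the patent claims: the fixed set size `lanes`, the doubling computation, and the function name are all hypothetical, and the predicate is modeled as a simple bounds check.

```python
def predicated_store(src, lanes):
    """Process `src` in fixed-size sets of `lanes` iterations and suppress
    the store for any iteration beyond the required total."""
    total = len(src)                       # iterations actually required
    executed = -(-total // lanes) * lanes  # iterations run, rounded up to full sets
    dst = [None] * total
    writes_suppressed = 0
    for base in range(0, executed, lanes):
        for lane in range(lanes):
            i = base + lane
            predicate = i < total          # False marks an excess iteration
            # The value is still computed for every lane, as vector hardware would.
            value = src[min(i, total - 1)] * 2
            if predicate:
                dst[i] = value             # store allowed
            else:
                writes_suppressed += 1     # store prevented by the predicate
    return dst, writes_suppressed


# Five elements processed four at a time: eight iterations run, three are excess.
print(predicated_store([1, 2, 3, 4, 5], 4))
```

Because the predicate is derived from the iteration count, the caller needs no special-case code for the final partial set.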

20.

Granted invention patent
Accelerating table lookups using a decoupled lookup table accelerator in a system on a chip (Active)

Publication (announcement) number: US11836527B2

Publication (announcement) date: 2023-12-05

Application number: US17391369

Application date: 2021-08-02

Applicant: NVIDIA Corporation

Inventor: Ravi P Singh , Ching-Yu Hung , Jagadeesh Sankaran , Ahmad Itani , Yen-Te Shih

IPC: G06F9/50 , G06F7/76 , G06F1/03

CPC classification number: G06F9/5027 , G06F1/03 , G06F7/76 , G06F9/5077

Abstract: In various examples, a VPU and associated components may be optimized to improve VPU performance and throughput. For example, the VPU may include a min/max collector, automatic store predication functionality, a SIMD data path organization that allows for inter-lane sharing, a transposed load/store with stride parameter functionality, a load with permute and zero insertion functionality, hardware, logic, and memory layout functionality to allow for two point and two by two point lookups, and per memory bank load caching capabilities. In addition, decoupled accelerators may be used to offload VPU processing tasks to increase throughput and performance, and a hardware sequencer may be included in a DMA system to reduce programming complexity of the VPU and the DMA system. The DMA and VPU may execute a VPU configuration mode that allows the VPU and DMA to operate without a processing controller for performing dynamic region based data movement operations.
