-
Publication No.: US11567746B2
Publication date: 2023-01-31
Application No.: US16927016
Filing date: 2020-07-13
Applicant: QUALCOMM TECHNOLOGIES, INC.
Inventor: Muthu M. Baskaran, Thomas Henretty, Richard A. Lethin, Benoit J. Meister
IPC: G06F8/41
Abstract: In a sequence of major computational steps or in an iterative computation, a stencil amplifier can increase the number of data elements accessed from one or more data structures in a single major step or iteration, thereby decreasing the total number of computations and/or communication operations in the overall sequence or the iterative computation. Stencil amplification, which can be optimized according to a specified parameter such as compile time, run time, code size, etc., can improve the performance of a computing system executing the sequence or the iterative computation in terms of run time, memory load, energy consumption, etc. The stencil amplifier typically determines boundaries to avoid erroneously accessing data elements not present in the one or more data structures.
-
Publication No.: US11537373B2
Publication date: 2022-12-27
Application No.: US17034895
Filing date: 2020-09-28
Applicant: QUALCOMM TECHNOLOGIES, INC.
Inventor: Muthu Manikandan Baskaran, Benoit J. Meister, Benoit Pradelle
IPC: G06F8/41
Abstract: A system for compiling programs for execution thereof using a hierarchical processing system having two or more levels of memory hierarchy can perform memory-level-specific optimizations, without exceeding a specified maximum compilation time. To this end, the compiler system employs a polyhedral model and limits the dimensions of a polyhedral program representation that is processed by the compiler at each level using a focalization operator that temporarily reduces one or more dimensions of the polyhedral representation. Semantic correctness is provided via a defocalization operator that can restore all polyhedral dimensions that had been temporarily removed.
-
Publication No.: US11500557B2
Publication date: 2022-11-15
Application No.: US16745890
Filing date: 2020-01-17
Applicant: QUALCOMM TECHNOLOGIES, INC.
Inventor: Muthu M. Baskaran, Thomas Henretty, Ann Johnson, Athanasios Konstantinidis, M. H. Langston, Janice O. McMahon, Benoit J. Meister, Paul D. Mountcastle, Aale Naqvi, Benoit Pradelle, Tahina Ramananandro, Sanket Tavarageri, Richard A. Lethin
Abstract: A compilation system using an energy model based on a set of generic and practical hardware and software parameters is presented. The model can represent the major trends in energy consumption spanning potential hardware configurations using only parameters available at compilation time. Experimental verification indicates that the model is nimble yet sufficiently precise, allowing efficient selection of one or more parameters of a target computing system so as to minimize power/energy consumption of a program while achieving other performance related goals. A voltage and/or frequency optimization and selection is presented which can determine an efficient dynamic hardware configuration schedule at compilation time. In various embodiments, the configuration schedule is chosen based on its predicted effect on energy consumption. A concurrency throttling technique based on the energy model can exploit the power-gating features exposed by the target computing system to increase the energy efficiency of programs.
-
Publication No.: US11789769B2
Publication date: 2023-10-17
Application No.: US16791361
Filing date: 2020-02-14
Applicant: QUALCOMM TECHNOLOGIES, INC.
Inventor: Muthu M. Baskaran, Thomas Henretty, M. H. Langston, Richard A. Lethin, Benoit J. Meister, Nicolas T. Vasilache, David E. Wohlford
CPC classification number: G06F9/4843
Abstract: In a system for automatic generation of event-driven, tuple-space based programs from a sequential specification, a hierarchical mapping solution can target different runtimes relying on event-driven tasks (EDTs). The solution uses loop types to encode short, transitive relations among EDTs that can be evaluated efficiently at runtime. Specifically, permutable loops translate immediately into conservative point-to-point synchronizations of distance one. A runtime-agnostic layer can be used to target the transformed code to different runtimes.
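Illustration (not part of the patent record): a minimal Python sketch of what "point-to-point synchronizations of distance one" could look like for tasks indexed by a band of permutable loops. The function name `distance_one_predecessors` and the tuple encoding of task coordinates are hypothetical. The sketch relies on the standard property that dependences in a permutable band have non-negative distance components, so waiting on the immediate predecessor in each dimension transitively covers every earlier task.

```python
def distance_one_predecessors(task, lower_bounds):
    """Return the tasks that `task` must wait on.

    `task` is a tuple of loop coordinates, one per permutable
    dimension; `lower_bounds` gives the first coordinate of each
    dimension. A task waits only on its neighbour one step back
    along each dimension, which is conservative but cheap to
    evaluate at run time.
    """
    preds = []
    for dim, lo in enumerate(lower_bounds):
        if task[dim] > lo:  # no predecessor outside the iteration space
            pred = list(task)
            pred[dim] -= 1
            preds.append(tuple(pred))
    return preds
```

For example, under these assumptions task `(2, 3)` in a two-dimensional band starting at `(0, 0)` waits on `(1, 3)` and `(2, 2)`, while task `(0, 0)` can start immediately.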
-
Publication No.: US11500621B2
Publication date: 2022-11-15
Application No.: US16876739
Filing date: 2020-05-18
Applicant: QUALCOMM TECHNOLOGIES, INC.
Inventor: Richard A. Lethin, Allen K. Leung, Benoit J. Meister, David E. Wohlford
Abstract: Methods, apparatus, and a computer software product for optimizing data transfer between two memories include determining access to master data stored in one memory and/or to local data stored in another memory such that either or both of the size of the total data transferred and the number of data transfers required to transfer the total data can be minimized. The master and/or local accesses are based, at least in part, on the respective structures of the master and local data.
-
Publication No.: US11726197B2
Publication date: 2023-08-15
Application No.: US16653201
Filing date: 2019-10-15
Applicant: QUALCOMM TECHNOLOGIES, INC.
Inventor: Muthu M. Baskaran, Thomas Henretty, Ann Johnson, Athanasios Konstantinidis, M. H. Langston, Janice O. McMahon, Benoit J. Meister, Paul D. Mountcastle, Aale Naqvi, Benoit Pradelle, Tahina Ramananandro, Sanket Tavarageri, Richard A. Lethin
CPC classification number: G01S13/723, G01S7/411, G01S7/4802, B64C39/024, B64U2101/20, G01S3/7864, G01S7/4808, G01S17/66
Abstract: A system for determining the physical path of an object can map several candidate paths to a suitable path space that can be explored using a convex optimization technique. The optimization technique may take advantage of the typical sparsity of the path space and can identify a likely physical path using a function of sensor observation as constraints. A track of an object can also be determined using a track model and a convex optimization technique.
-
Publication No.: US11573945B1
Publication date: 2023-02-07
Application No.: US17033592
Filing date: 2020-09-25
Applicant: QUALCOMM TECHNOLOGIES, INC.
Abstract: In a system for storing in memory a tensor that includes at least three modes, elements of the tensor are stored in a mode-based order for improving locality of references when the elements are accessed during an operation on the tensor. To facilitate efficient data reuse in a tensor transform that includes several iterations, on a tensor that includes at least three modes, a system performs a first iteration that includes a first operation on the tensor to obtain a first intermediate result, and the first intermediate result includes a first intermediate-tensor. The first intermediate result is stored in memory, and a second iteration is performed in which a second operation on the first intermediate result accessed from the memory is performed, so as to avoid a third operation, that would be required if the first intermediate result were not accessed from the memory.
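Illustration (not part of the patent record): a minimal Python sketch of storing tensor elements in a mode-based order, assuming this means choosing which mode varies fastest in memory. The names `mode_strides` and `linear_offset` and the convention that the last mode in `mode_order` is the fastest-varying one are assumptions made for this example.

```python
def mode_strides(shape, mode_order):
    """Strides for a dense tensor laid out in the given mode order.

    `mode_order` lists the modes from slowest-varying to
    fastest-varying, so elements that differ only in the last
    listed mode are adjacent in memory.
    """
    strides = [0] * len(shape)
    span = 1
    for mode in reversed(mode_order):
        strides[mode] = span
        span *= shape[mode]
    return strides


def linear_offset(index, strides):
    """Memory offset of the element at `index`."""
    return sum(i * s for i, s in zip(index, strides))
```

With a 2 x 3 x 4 tensor, the order `(0, 1, 2)` gives the usual row-major strides `[12, 4, 1]`, while `(2, 0, 1)` gives `[3, 1, 6]`, so an operation that walks along mode 1 touches consecutive memory locations.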