-
Publication No.: US20230011937A1
Publication Date: 2023-01-12
Application No.: US17869618
Filing Date: 2022-07-20
Applicant: Intel Corporation
Inventor: Nilesh Jain , Vui Seng Chua , Fahim Mohammad , Anindya Paul
Abstract: Example systems, methods, and apparatus to generate optimized models for Internet of Things devices are disclosed. An example apparatus includes a data receiver to collect data from a sensor of an Internet of Things device based on a first sampling frequency and a buffer having a first buffer size; a model trainer to train a model based on the data collected from the sensor; a buffer analyzer to select a second sampling frequency and to reduce the buffer to a second buffer size, the model trainer to update the model based on the second buffer size; and a platform analyzer to determine a duration of time that the Internet of Things device will take to analyze sensor data based on the updated model.
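The buffer/sampling reduction described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation; the `SensorBuffer` class, the downsampling factor, and the buffer sizes are all illustrative assumptions.

```python
from collections import deque

def downsample(data, factor):
    """Keep every `factor`-th sample to emulate a lower (second) sampling frequency."""
    return data[::factor]

class SensorBuffer:
    """Fixed-size ring buffer for sensor readings (illustrative only)."""
    def __init__(self, size):
        self.buf = deque(maxlen=size)

    def push(self, sample):
        self.buf.append(sample)

    def contents(self):
        return list(self.buf)

# Collect at the first sampling frequency into a buffer of the first size ...
buf = SensorBuffer(size=8)
for s in range(10):
    buf.push(s)

# ... then select a second (lower) frequency and reduce to a second buffer size,
# after which the model would be retrained on the smaller data stream.
small = SensorBuffer(size=4)
for s in downsample(buf.contents(), 2):
    small.push(s)
print(small.contents())  # -> [2, 4, 6, 8]
```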
-
Publication No.: US20220300795A1
Publication Date: 2022-09-22
Application No.: US17836523
Filing Date: 2022-06-09
Applicant: Intel Corporation
Inventor: Yash Akhauri , Nilesh Jain , Pasquale Cocchini , Eriko Nurvitadhi
Abstract: Systems, apparatuses and methods may provide for technology that includes a performance-enhanced decompression pipeline having first decoder hardware to convert variable length weights to fixed length keys, wherein the variable length weights are non-uniform quantization values, and second decoder hardware to convert the fixed length keys to bit values. In one example, the fixed length keys are compressed representations of the variable length weights and the bit values are bit accurate representations of the fixed length keys.
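The two-stage decode described above can be sketched in software: a first decoder maps variable-length prefix codes to fixed-length keys, and a second decoder maps each key to its non-uniform quantization value. The code table and codebook below are hypothetical, not from the patent.

```python
# Stage 1: prefix-code table, variable-length bit strings -> fixed-length keys.
PREFIX_TO_KEY = {"0": 0, "10": 1, "110": 2, "111": 3}   # hypothetical code table
# Stage 2: codebook, fixed-length key -> non-uniform quantization level.
CODEBOOK = [0.0, 0.05, -0.3, 1.2]                       # hypothetical levels

def decode(bitstream):
    """Decode a bit string of variable-length codes into weight values."""
    keys, acc = [], ""
    for bit in bitstream:
        acc += bit
        if acc in PREFIX_TO_KEY:          # first decoder: variable -> fixed
            keys.append(PREFIX_TO_KEY[acc])
            acc = ""
    return [CODEBOOK[k] for k in keys]    # second decoder: key -> value

print(decode("0101100111"))  # -> [0.0, 0.05, -0.3, 0.0, 1.2]
```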
-
Publication No.: US20250061316A1
Publication Date: 2025-02-20
Application No.: US18934700
Filing Date: 2024-11-01
Applicant: Intel Corporation
Inventor: Sameh Gobriel , Nilesh Jain , Vui Seng Chua , Juan Pablo Munoz , Gopi Krishna Jha
IPC: G06N3/0495 , G06N3/082
Abstract: Key-value (KV) cache paging schemes can improve memory management for KV caches by storing a KV cache page having key tensors and value tensors for a fixed number of tokens in a fixed-sized block in the KV cache of a worker. To further improve memory management, the schemes can be modified to implement dynamic variable quantization. Quantization level of a KV cache page can be set based on a runtime importance score of the KV cache page. In addition, the quantization level of the KV cache page can be set based on the system load. The end result is a scheme that can achieve a high compression ratio of KV cache pages in the KV cache. Fitting more KV cache pages in the KV cache can lead to higher inference throughput, higher system-level user capacity, and higher end-to-end service availability.
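The dynamic variable quantization idea, setting a page's quantization level from its runtime importance score and the system load, can be sketched as a small policy function. The thresholds and bit-widths below are illustrative assumptions, not values from the patent.

```python
def pick_bits(importance, system_load):
    """Map a KV cache page's runtime importance score (0..1) and the
    system load (0..1) to a quantization bit-width. Thresholds are
    illustrative, not from the patent."""
    if importance > 0.8:
        bits = 8            # important pages keep high precision
    elif importance > 0.4:
        bits = 4
    else:
        bits = 2
    if system_load > 0.9 and bits > 2:
        bits //= 2          # under heavy load, compress harder
    return bits

print(pick_bits(0.95, 0.5))   # -> 8
print(pick_bits(0.95, 0.95))  # -> 4
print(pick_bits(0.1, 0.2))    # -> 2
```

Lower bit-widths let more pages fit in the fixed-size KV cache blocks, which is how the scheme trades per-page precision for overall throughput and capacity.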
-
Publication No.: US12197601B2
Publication Date: 2025-01-14
Application No.: US17560193
Filing Date: 2021-12-22
Applicant: Intel Corporation
Inventor: Ren Wang , Sameh Gobriel , Somnath Paul , Yipeng Wang , Priya Autee , Abhirupa Layek , Shaman Narayana , Edwin Verplanke , Mrittika Ganguli , Jr-Shian Tsai , Anton Sorokin , Suvadeep Banerjee , Abhijit Davare , Desmond Kirkpatrick , Rajesh M. Sankaran , Jaykant B. Timbadiya , Sriram Kabisthalam Muthukumar , Narayan Ranganathan , Nalini Murari , Brinda Ganesh , Nilesh Jain
Abstract: Examples described herein relate to offload circuitry comprising one or more compute engines that are configurable to perform a workload offloaded from a process executed by a processor based on a descriptor particular to the workload. In some examples, the offload circuitry is configurable to perform the workload, among multiple different workloads. In some examples, the multiple different workloads include one or more of: data transformation (DT) for data format conversion, Locality Sensitive Hashing (LSH) for neural network (NN), similarity search, sparse general matrix-matrix multiplication (SpGEMM) acceleration of hash based sparse matrix multiplication, data encode, data decode, or embedding lookup.
-
Publication No.: US12101475B2
Publication Date: 2024-09-24
Application No.: US17127544
Filing Date: 2020-12-18
Applicant: Intel Corporation
Inventor: Brinda Ganesh , Nilesh Jain , Sumit Mohan , Faouzi Kossentini , Jill Boyce , James Holland , Zhijun Lei , Chekib Nouira , Foued Ben Amara , Hassene Tmar , Sebastian Possos , Craig Hurst
IPC: H04N19/114 , H04N19/154
CPC classification number: H04N19/114 , H04N19/154
Abstract: Techniques related to distributing the video encoding processing of an input video across hardware and software systems. Such techniques include evaluating the content of the video and determining whether the encoding operation is best performed on the hardware system only, the software system only, or a hybrid hardware and software system.
-
Publication No.: US20240311951A1
Publication Date: 2024-09-19
Application No.: US18478286
Filing Date: 2023-09-29
Applicant: Intel Corporation
Inventor: Selvakumar Panneer , Sarthak Rajesh Shah , Nilesh Jain , John Feit
CPC classification number: G06T1/20 , G06F9/5038
Abstract: Described herein is a graphics processor configured to perform time-based frame predication to bypass execution of a command buffer based on a comparison with time stamps stored in a time stamp buffer that tracks execution time for command buffers. The graphics processor can bypass a frame that will not complete in time for a target display update and trigger neural frame generation to generate the frame data for the bypassed command buffer. Dynamic render scaling is also described.
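The time-based bypass decision can be sketched as a simple predictor over recorded execution times. The averaging heuristic and the frame budget value are illustrative assumptions; the patent describes a hardware time stamp buffer, not this software model.

```python
def should_bypass(history_ms, frame_budget_ms):
    """Predict from recorded command-buffer execution times whether the next
    frame can finish before the display deadline; if not, the frame would be
    bypassed and neural frame generation triggered instead. The mean-based
    predictor here is an illustrative heuristic."""
    predicted = sum(history_ms) / len(history_ms)
    return predicted > frame_budget_ms

# A 60 Hz display gives roughly a 16.6 ms frame budget.
print(should_bypass([15.0, 17.0, 16.0], 16.6))  # -> False (render normally)
print(should_bypass([20.0, 22.0], 16.6))        # -> True  (bypass, generate frame)
```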
-
Publication No.: US20240307773A1
Publication Date: 2024-09-19
Application No.: US18478201
Filing Date: 2023-09-29
Applicant: Intel Corporation
Inventor: Selvakumar Panneer , John Feit , Sarthak Rajesh Shah , SungYe Kim , Nilesh Jain
IPC: A63F13/52
CPC classification number: A63F13/52 , A63F2300/66
Abstract: Described herein is a technique to enhance the responsiveness of gameplay for a 3D gaming application while maintaining the ability to enqueue multiple frames for processing on the GPU. Each frame or a set of workloads within a frame is submitted to the GPU with predication, such that the indicated rendering and resource manipulation commands are not actually performed if the predication condition is enabled. A low latency command can be submitted to the GPU via a copy engine command queue. The command will cause the copy engine to enable or disable predication for command buffers in the command queue. When predication for queued command buffers is enabled, command buffers for workloads that are not related to the workload that is generated in response to the user input are bypassed. High priority command buffers that include workloads generated in response to user input can then be executed immediately.
-
Publication No.: US20240029455A1
Publication Date: 2024-01-25
Application No.: US18475353
Filing Date: 2023-09-27
Applicant: Intel Corporation
Inventor: Peixi Xiong , Nilesh Jain , Ravishankar Iyer , Mrutunjayya Mrutunjayya
IPC: G06V20/64 , G06V20/70 , G06T15/20 , G06V10/56 , G06V10/774
CPC classification number: G06V20/64 , G06V20/70 , G06T15/20 , G06V10/56 , G06V10/774
Abstract: Systems, apparatuses and methods may provide for technology that encodes multi-view visual data into latent features via an aggregator encoder, decodes the latent features into one or more novel target views different from views of the multi-view visual data via a rendering decoder, and decodes the latent features into an object label via a label decoder. The operation to decode the latent features via the rendering decoder and to decode the latent features via the label decoder occur at least partially at the same time. The operation to encode, via the aggregator encoder, the multi-view visual data into the latent features further includes operations to: perform, via the aggregator encoder, semantic object recognition operations based on radiance field view synthesis operations, and perform, via the aggregator encoder, radiance field view synthesis operations based on semantic object recognition operations.
-
Publication No.: US20230409326A1
Publication Date: 2023-12-21
Application No.: US17841558
Filing Date: 2022-06-15
Applicant: Intel Corporation
Inventor: Menachem Adelman , Amit Gradstein , Simon Rubanovich , Barukh Ziv , Uri Sherman , Dana Rip , Shahar Mizrahi , Dan Baum , Rinat Rappoport , Nilesh Jain , Zeev Sperber , Gideon Stupp , Alexander Heinecke , Christopher Hughes , Evangelos Georganas
CPC classification number: G06F9/30145 , G06F9/30178 , G06F9/30047 , G06F9/3887 , G06N3/04
Abstract: Techniques and mechanisms for processor circuitry to execute a load and expand instruction of an instruction set to generate decompressed matrix data. In an embodiment, the instruction comprises a source operand which indicates a location from which compressed matrix data, and corresponding metadata, are to be accessed. A destination operand of the instruction indicates a location which is to receive decompressed metadata, which is generated, during execution of the instruction, based on the compressed matrix data and the corresponding metadata. The metadata comprises compression mask information which identifies which elements of the matrix have been masked from the compressed matrix data. In another embodiment, the instruction further comprises a count operand which identifies a total number of the unmasked matrix elements which are represented in the compressed matrix data.
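The mask-driven decompression performed by the load and expand instruction can be sketched in software: the compression mask identifies which elements were masked out, and the unmasked values are scattered back into a dense row. This is an illustrative model, not the instruction's hardware behavior.

```python
def load_and_expand(compressed, mask):
    """Expand compressed (unmasked) matrix elements back to a dense row using
    a per-element compression mask: 1 = value present in the compressed data,
    0 = masked element restored as zero. The count of 1-bits in the mask plays
    the role of the instruction's count operand."""
    it = iter(compressed)
    return [next(it) if keep else 0 for keep in mask]

# 4 unmasked values out of an 8-element row
print(load_and_expand([3, 7, 1, 5], [1, 0, 0, 1, 1, 0, 1, 0]))
# -> [3, 0, 0, 7, 1, 0, 5, 0]
```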
-
Publication No.: US11411832B2
Publication Date: 2022-08-09
Application No.: US16236290
Filing Date: 2018-12-28
Applicant: Intel Corporation
Inventor: Nilesh Jain , Vui Seng Chua , Fahim Mohammad , Anindya Paul
IPC: H04L41/14 , G06N20/00 , G06N7/00 , H04L43/022 , H04L43/16 , H04L41/16 , H04L67/12 , H04W4/38 , H04W4/70
Abstract: Example systems, methods, and apparatus to generate optimized models for Internet of Things devices are disclosed. An example apparatus includes a data receiver to collect data from a sensor of an Internet of Things device based on a first sampling frequency and a buffer having a first buffer size; a model trainer to train a model based on the data collected from the sensor; a buffer analyzer to select a second sampling frequency and to reduce the buffer to a second buffer size, the model trainer to update the model based on the second buffer size; and a platform analyzer to determine a duration of time that the Internet of Things device will take to analyze sensor data based on the updated model.