SCALABLE PARALLEL SORTING ON MANYCORE-BASED COMPUTING SYSTEMS
    101.
    Invention application
    Status: under examination (published)

    Publication number: US20150066988A1

    Publication date: 2015-03-05

    Application number: US14472752

    Filing date: 2014-08-29

    CPC classification number: G06F7/36

    Abstract: Systems and methods for sorting data, including chunking unsorted data such that each chunk is of a size that fits within a last level cache of the system. One or more threads are instantiated in each physical core of the system, and chunks assigned to physical cores are distributed evenly across the threads on the physical cores. Subchunks in the physical cores are sorted using vector intrinsics, the subchunks being data assigned to the threads in the physical cores, and the subchunks are merged to generate sorted large chunks. A binary tree, which includes leaf nodes that correspond to the sorted large chunks, is built; leaf nodes are assigned to threads, and tree nodes are assigned to a circular buffer, wherein the circular buffer is lock and synchronization free. The large chunks are sorted to generate sorted data as output.
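
The chunk, sort, and merge flow described in this abstract can be illustrated with a minimal, single-threaded Python sketch. It is not the patented implementation: the names `sort_chunk`, `merge_tree`, and `chunked_sort` are hypothetical, `chunk_size` stands in for the last-level-cache capacity, the built-in `sorted` replaces vector intrinsics, and a simple level-by-level pairwise merge replaces the thread-assigned binary tree with its lock-free circular buffers.

```python
import heapq


def sort_chunk(chunk, threads_per_core):
    """Split one cache-sized chunk evenly into subchunks, sort each, then merge them."""
    step = max(1, -(-len(chunk) // threads_per_core))  # ceiling division
    # Each subchunk models the data one thread would sort (assumption: plain sorted()).
    subchunks = [sorted(chunk[i:i + step]) for i in range(0, len(chunk), step)]
    return list(heapq.merge(*subchunks))


def merge_tree(sorted_chunks):
    """Merge sorted chunks pairwise, level by level, like a binary merge tree."""
    level = list(sorted_chunks)
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                next_level.append(list(heapq.merge(level[i], level[i + 1])))
            else:
                next_level.append(level[i])  # odd chunk moves up unchanged
        level = next_level
    return level[0] if level else []


def chunked_sort(data, chunk_size, threads_per_core=2):
    """Chunk the input, sort every chunk, and merge the sorted chunks."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    return merge_tree([sort_chunk(c, threads_per_core) for c in chunks])
```

For example, `chunked_sort([5, 3, 9, 1, 7, 2, 8], chunk_size=4)` sorts two chunks independently and then merges them at the root of the tree.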

    AUTOMATIC ASYNCHRONOUS OFFLOAD FOR MANY-CORE COPROCESSORS
    102.
    Invention application
    Status: granted

    Publication number: US20140053131A1

    Publication date: 2014-02-20

    Application number: US13940974

    Filing date: 2013-07-12

    Abstract: Methods and systems for asynchronous offload to many-core coprocessors include splitting a loop in an input source code into a sampling sub-part, a many integrated core (MIC) sub-part, and a central processing unit (CPU) sub-part; executing the sampling sub-part with a processor to determine loop characteristics including memory- and processor-operations executed by the loop; identifying optimal split boundaries based on the loop characteristics such that the MIC sub-part will complete in a same amount of time when executed on a MIC processor as the CPU sub-part will take when executed on a CPU; and modifying the input source code to split the loop at the identified boundaries, such that the MIC sub-part is executed on a MIC processor and the CPU sub-part is concurrently executed on a CPU.
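
The balancing condition in this abstract (the MIC sub-part and the CPU sub-part finish in the same amount of time) can be worked through with a short Python sketch. It rests on assumptions the abstract does not state: every iteration costs the same, the sampling run yields one per-iteration time for each processor, and data-transfer overhead is ignored. The function name `split_loop` and its parameters are hypothetical.

```python
def split_loop(total_iters, sample_iters, cpu_time_per_iter, mic_time_per_iter):
    """Return (sampling, MIC, CPU) iteration ranges as half-open (start, end) pairs.

    Balance condition (assumed uniform iteration cost):
        mic_iters * mic_time_per_iter == cpu_iters * cpu_time_per_iter
    """
    remaining = total_iters - sample_iters
    if remaining < 0:
        raise ValueError("sample_iters exceeds total_iters")
    if cpu_time_per_iter <= 0 or mic_time_per_iter <= 0:
        raise ValueError("per-iteration times must be positive")
    # The faster processor receives proportionally more iterations.
    mic_share = cpu_time_per_iter / (cpu_time_per_iter + mic_time_per_iter)
    mic_iters = round(remaining * mic_share)
    mic_end = sample_iters + mic_iters
    return (0, sample_iters), (sample_iters, mic_end), (mic_end, total_iters)
```

With 1000 iterations, 100 of them used for sampling, and a MIC that is three times faster per iteration than the CPU, the MIC receives 675 iterations and the CPU 225, so both sub-parts take 675 time units.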

    OPTIMIZING EDGE-ASSISTED AUGMENTED REALITY DEVICES

    Publication number: US20250159339A1

    Publication date: 2025-05-15

    Application number: US18945989

    Filing date: 2024-11-13

    Abstract: Systems and methods for optimizing edge-assisted augmented reality (AR) devices. To optimize the AR devices, frame capture timings of AR devices can be profiled that capture relationships between the AR devices. Requests from the AR devices can be analyzed to determine accuracy of the frame capture timings of the AR devices based on a service level objective (SLO) metric. A frame timing plan that minimizes overall timing changes of the AR devices can be determined by adapting the accuracy of the frame capture timings to optimal adjustments generated based on a change in device metrics for requests below an accuracy threshold. Current frame capture timings of cameras of the AR devices can be adjusted based on the frame timing plan by generating a response packet for the AR devices.

    ENCODING AND DECODING IMAGES USING DIFFERENTIABLE JPEG COMPRESSION

    Publication number: US20250008132A1

    Publication date: 2025-01-02

    Application number: US18755150

    Filing date: 2024-06-26

    Abstract: Systems and methods are provided for encoding and decoding images using differentiable JPEG compression, including converting images from RGB color space to YCbCr color space to obtain luminance and chrominance channels, and applying chroma subsampling to the chrominance channels to reduce resolution. The YCbCr image is divided into pixel blocks and a DCT is performed on the pixel blocks to obtain DCT coefficients. DCT coefficients are quantized using a scaled quantization table to reduce precision, and quantized DCT coefficients are encoded using lossless entropy coding, forming a compressed JPEG file. The file is decoded by reversing the lossless entropy coding to obtain quantized DCT coefficients, which are dequantized using the scaled quantization table to restore the precision. The dequantized DCT coefficients are converted back to a spatial domain using an IDCT, the chrominance channels are upsampled to original resolution, and the YCbCr image is converted back to the RGB color space.
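
A few of the steps named in this abstract (color conversion, block DCT, quantization with a scaled table, dequantization, and IDCT) can be sketched in plain Python. This is an illustration under stated assumptions, not the patented method: blocks are taken to be 8x8, the color conversion uses the common JFIF full-range coefficients, the table is multiplied by a scalar `scale` (the abstract does not say how scaling is applied), and ordinary `round` is used, so the sketch does not show how the scheme is made differentiable. Chroma subsampling and entropy coding are omitted, and all function names are hypothetical.

```python
import math

N = 8  # block edge length (assumption: 8x8 pixel blocks)


def rgb_to_ycbcr(r, g, b):
    """Convert one full-range RGB pixel to YCbCr (JFIF-style coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def _alpha(k):
    # Normalization that makes the transform orthonormal.
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def _basis(x, u):
    return math.cos((2 * x + 1) * u * math.pi / (2 * N))


def dct2(block):
    """Orthonormal 2D DCT-II of an N x N block (direct, unoptimized)."""
    return [[_alpha(u) * _alpha(v) * sum(block[x][y] * _basis(x, u) * _basis(y, v)
                                          for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct2(coeffs):
    """Inverse of dct2: maps DCT coefficients back to the spatial domain."""
    return [[sum(_alpha(u) * _alpha(v) * coeffs[u][v] * _basis(x, u) * _basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def quantize(coeffs, table, scale):
    """Divide by the scaled table and round (the lossy, non-differentiable step)."""
    return [[round(coeffs[u][v] / (table[u][v] * scale)) for v in range(N)]
            for u in range(N)]


def dequantize(quantized, table, scale):
    """Multiply by the same scaled table to restore the coefficient magnitude."""
    return [[quantized[u][v] * table[u][v] * scale for v in range(N)]
            for u in range(N)]
```

A flat table such as `[[16] * 8 for _ in range(8)]` is enough to try the round trip; each dequantized coefficient then differs from the original by at most half the quantization step `16 * scale`.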

    Dynamic, contextualized AI models
    106.
    Granted invention patent

    Publication number: US12136255B2

    Publication date: 2024-11-05

    Application number: US17577664

    Filing date: 2022-01-18

    Abstract: A method for employing a semi-supervised learning approach to improve accuracy of a small model on an edge device is presented. The method includes collecting a plurality of frames from a plurality of video streams generated from a plurality of cameras, each camera associated with a respective small model, each small model deployed in the edge device, sampling the plurality of frames to define sampled frames, performing inference on the sampled frames by using a big model, the big model shared by all of the plurality of cameras and deployed in a cloud or cloud edge, using the big model to generate labels for each of the sampled frames to generate training data, and training each of the small models with the training data to generate updated small models on the edge device.

    System for application self-optimization in serverless edge computing environments

    Publication number: US11847510B2

    Publication date: 2023-12-19

    Application number: US17964170

    Filing date: 2022-10-12

    CPC classification number: G06F9/543 G06F9/505

    Abstract: A method for implementing application self-optimization in serverless edge computing environments is presented. The method includes requesting deployment of an application pipeline on data received from a plurality of sensors, the application pipeline including a plurality of microservices, enabling communication between a plurality of pods and a plurality of analytics units (AUs), each pod of the plurality of pods including a sidecar, determining whether each of the plurality of AUs maintains any state to differentiate between stateful AUs and stateless AUs, scaling the stateful AUs and the stateless AUs, enabling communication directly between the sidecars of the plurality of pods, and reusing and resharing common AUs of the plurality of AUs across different applications.
