Abstract:
Systems and methods for sorting data, including chunking unsorted data such that each chunk is of a size that fits within a last level cache of the system. One or more threads are instantiated on each physical core of the system, and chunks assigned to the physical cores are distributed evenly across the threads on those cores. Subchunks in the physical cores are sorted using vector intrinsics, the subchunks being the data assigned to the threads in the physical cores, and the subchunks are merged to generate sorted large chunks. A binary tree, which includes leaf nodes that correspond to the sorted large chunks, is built, leaf nodes are assigned to threads, and tree nodes are assigned to a circular buffer, wherein the circular buffer is lock- and synchronization-free. The sorted large chunks are merged to generate sorted data as output.
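A minimal Python sketch of the chunk-sort-merge flow described above; the cache budget, worker count, and use of heapq.merge are illustrative stand-ins for the vectorized per-subchunk sort and the lock-free merge tree.

```python
# Sketch: split data into cache-sized chunks, sort chunks in parallel threads,
# then k-way merge the sorted chunks into the final output.
import heapq
import random
from concurrent.futures import ThreadPoolExecutor

LLC_BYTES = 32 * 1024 * 1024      # assumed last-level-cache budget
ITEM_BYTES = 8                    # assumed size of one element
CHUNK_LEN = LLC_BYTES // ITEM_BYTES

def chunked(data, size):
    for i in range(0, len(data), size):
        yield data[i:i + size]

def parallel_sort(data, workers=4):
    chunks = list(chunked(data, CHUNK_LEN))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sorted_chunks = list(pool.map(sorted, chunks))   # per-chunk sort
    return list(heapq.merge(*sorted_chunks))             # merge of sorted chunks

if __name__ == "__main__":
    data = [random.randint(0, 10**6) for _ in range(100_000)]
    assert parallel_sort(data) == sorted(data)
```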
Abstract:
Methods and systems for asynchronous offload to many-core coprocessors include splitting a loop in an input source code into a sampling sub-part, a many integrated core (MIC) sub-part, and a central processing unit (CPU) sub-part; executing the sampling sub-part with a processor to determine loop characteristics including memory- and processor-operations executed by the loop; identifying optimal split boundaries based on the loop characteristics such that the MIC sub-part will complete in the same amount of time when executed on a MIC processor as the CPU sub-part will take when executed on a CPU; and modifying the input source code to split the loop at the identified boundaries, such that the MIC sub-part is executed on a MIC processor and the CPU sub-part is concurrently executed on a CPU.
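A small Python sketch of the split-boundary idea, assuming per-iteration throughput for the CPU and the coprocessor has been estimated by the sampling sub-part; the rates and iteration counts are illustrative placeholders.

```python
# Sketch: pick a split point so the CPU part and the offloaded part
# finish in roughly the same wall-clock time.
def split_boundary(total_iters, cpu_rate, mic_rate):
    """Choose k so iterations [0, k) on the CPU and [k, total) on the
    coprocessor take about equal time: k / cpu_rate == (total - k) / mic_rate."""
    return round(total_iters * cpu_rate / (cpu_rate + mic_rate))

def run_split(total_iters, cpu_rate, mic_rate):
    k = split_boundary(total_iters, cpu_rate, mic_rate)
    cpu_part = range(0, k)              # executed on the host CPU
    mic_part = range(k, total_iters)    # offloaded asynchronously to the coprocessor
    return cpu_part, mic_part

if __name__ == "__main__":
    cpu, mic = run_split(1_000_000, cpu_rate=2.0e6, mic_rate=6.0e6)
    print(len(cpu), len(mic))           # CPU gets roughly a quarter of the iterations
```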
Abstract:
Systems and methods for optimizing edge-assisted augmented reality (AR) devices. To optimize the AR devices, frame capture timings of the AR devices can be profiled to capture relationships between the AR devices. Requests from the AR devices can be analyzed to determine the accuracy of the frame capture timings of the AR devices based on a service level objective (SLO) metric. A frame timing plan that minimizes overall timing changes of the AR devices can be determined by adapting the accuracy of the frame capture timings to optimal adjustments, generated based on a change in device metrics, for requests below an accuracy threshold. Current frame capture timings of cameras of the AR devices can be adjusted based on the frame timing plan by generating a response packet for the AR devices.
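A simplified Python sketch of building a frame timing plan: capture offsets are spread evenly over one frame interval while each device stays close to its current offset. The frame interval, offsets, and assignment rule are assumptions; the SLO-based accuracy check is omitted.

```python
# Sketch: assign evenly spaced capture slots to devices while keeping the
# total change from their current offsets small.
def frame_timing_plan(current_offsets_ms, frame_interval_ms=33.3):
    n = len(current_offsets_ms)
    slots = [i * frame_interval_ms / n for i in range(n)]
    # Assign slots to devices in order of their current offsets, which keeps
    # overall timing changes small under this simple model.
    order = sorted(range(n), key=lambda i: current_offsets_ms[i])
    return {dev: slot for slot, dev in zip(slots, order)}   # device -> new offset

if __name__ == "__main__":
    print(frame_timing_plan([1.0, 2.0, 30.0, 31.0]))
```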
Abstract:
Methods and systems of training a neural network include training a feature extractor and a classifier using a first set of training data that includes one or more base cases. The classifier is trained with few-shot adaptation using a second set of training data, smaller than the first set of training data, while keeping parameters of the feature extractor constant.
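A minimal PyTorch sketch of the two-stage procedure: the feature extractor and classifier are trained on the larger first set, then only the classifier is adapted on a small second set while the extractor's parameters are held constant. Network shapes, epochs, and the toy data loaders are illustrative assumptions.

```python
import torch
import torch.nn as nn

extractor = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 128))
classifier = nn.Linear(128, 10)
loss_fn = nn.CrossEntropyLoss()

def train(loader, params, epochs=1):
    opt = torch.optim.SGD(params, lr=0.01)
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss = loss_fn(classifier(extractor(x)), y)
            loss.backward()
            opt.step()

# Stage 1: train both modules on the first (base) training set.
base_loader = [(torch.randn(32, 64), torch.randint(0, 10, (32,)))]
train(base_loader, list(extractor.parameters()) + list(classifier.parameters()))

# Stage 2: few-shot adaptation of the classifier only; extractor is frozen.
for p in extractor.parameters():
    p.requires_grad_(False)
few_shot_loader = [(torch.randn(5, 64), torch.randint(0, 10, (5,)))]
train(few_shot_loader, classifier.parameters())
```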
Abstract:
Systems and methods are provided for encoding and decoding images using differentiable JPEG compression, including converting images from RGB color space to YCbCr color space to obtain luminance and chrominance channels, and applying chroma subsampling to the chrominance channels to reduce resolution. The YCbCr image is divided into pixel blocks and a DCT is performed on the pixel blocks to obtain DCT coefficients. The DCT coefficients are quantized using a scaled quantization table to reduce precision, and the quantized DCT coefficients are encoded using lossless entropy coding, forming a compressed JPEG file. The compressed JPEG file is decoded by reversing the lossless entropy coding to obtain the quantized DCT coefficients, which are dequantized using the scaled quantization table to restore the precision. The dequantized DCT coefficients are converted back to the spatial domain using an IDCT, the chrominance channels are upsampled to the original resolution, and the YCbCr image is converted back to the RGB color space.
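A Python sketch of the per-block transform and quantization round trip on a single 8x8 luminance block, using a uniform placeholder quantization table; chroma subsampling, entropy coding, and the differentiable relaxations are omitted.

```python
import numpy as np
from scipy.fft import dctn, idctn

def rgb_to_y(rgb):
    # Luminance channel of the RGB -> YCbCr conversion (BT.601 weights).
    return 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]

quant_table = np.full((8, 8), 16.0)   # placeholder table; scaled by quality in practice

def encode_block(block):
    coeffs = dctn(block - 128.0, norm="ortho")   # forward DCT on the centered block
    return np.round(coeffs / quant_table)        # quantize to reduce precision

def decode_block(qcoeffs):
    coeffs = qcoeffs * quant_table               # dequantize
    return idctn(coeffs, norm="ortho") + 128.0   # inverse DCT back to pixel values

if __name__ == "__main__":
    rgb = np.random.randint(0, 256, (8, 8, 3)).astype(float)
    y = rgb_to_y(rgb)
    recon = decode_block(encode_block(y))
    print(float(np.abs(recon - y).max()))        # small reconstruction error
```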
Abstract:
A method for employing a semi-supervised learning approach to improve accuracy of a small model on an edge device is presented. The method includes collecting a plurality of frames from a plurality of video streams generated from a plurality of cameras, each camera associated with a respective small model, each small model deployed in the edge device, sampling the plurality of frames to define sampled frames, performing inference on the sampled frames by using a big model, the big model shared by all of the plurality of cameras and deployed in a cloud or cloud edge, using the big model to generate labels for each of the sampled frames to generate training data, and training each of the small models with the training data to generate updated small models on the edge device.
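A short Python sketch of the label-generation loop, assuming `big_model` and the per-camera `small_models` expose simple predict/train interfaces; the sampling rate and the model objects are illustrative placeholders.

```python
import random

def sample_frames(stream, rate=0.1):
    return [f for f in stream if random.random() < rate]

def update_small_models(camera_streams, big_model, small_models, rate=0.1):
    for cam_id, stream in camera_streams.items():
        sampled = sample_frames(stream, rate)
        # The shared big model (in the cloud / cloud edge) produces labels.
        training_data = [(frame, big_model(frame)) for frame in sampled]
        # Each edge-side small model is retrained on its own labeled data.
        small_models[cam_id].train(training_data)

if __name__ == "__main__":
    class StubSmallModel:
        def train(self, data):
            print(f"trained on {len(data)} labeled frames")
    streams = {"cam0": list(range(100)), "cam1": list(range(100))}
    update_small_models(streams, big_model=lambda f: f % 2,
                        small_models={"cam0": StubSmallModel(), "cam1": StubSmallModel()})
```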
Abstract:
Systems and methods for scaling in a container orchestration platform are described that include configuring an autoscaler in a control plane of the container orchestration platform to receive stream data from a data exchange system that is measuring stream processing of a pipeline of microservices for an application. The systems and methods further include controlling a number of deployment pods in at least one node of the container orchestration platform to meet requirements for the application provided by the pipeline of microservices.
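A Python sketch of the kind of scaling decision such an autoscaler could derive from stream metrics, assuming a measured input rate per microservice and a per-pod processing capacity; the thresholds and metric names are placeholders.

```python
import math

def desired_replicas(events_per_sec, per_pod_capacity, min_pods=1, max_pods=20):
    wanted = math.ceil(events_per_sec / per_pod_capacity)
    return max(min_pods, min(max_pods, wanted))   # clamp to deployment limits

def reconcile(metrics, per_pod_capacity=500):
    # metrics: microservice name -> observed events/sec from the data exchange system
    return {svc: desired_replicas(rate, per_pod_capacity) for svc, rate in metrics.items()}

if __name__ == "__main__":
    print(reconcile({"decode": 1800, "detect": 4200, "aggregate": 300}))
```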
Abstract:
Methods and systems for camera configuration include configuring an image capture parameter of a camera according to a multi-objective reinforcement learning aggregated reward function. Respective quality estimates for analytics are determined after configuring the image capture parameter. The aggregated reward function is updated based on the quality estimates.
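A Python sketch of an aggregated multi-objective reward and a simple bandit-style update over candidate exposure settings; the weights, candidate settings, and running-average update are assumptions standing in for the reinforcement learning method above.

```python
import random

WEIGHTS = {"detection": 0.6, "tracking": 0.4}     # assumed analytics objectives

def aggregated_reward(quality_estimates):
    return sum(WEIGHTS[k] * q for k, q in quality_estimates.items())

settings = [50, 100, 200]                          # candidate exposure values (illustrative)
value = {s: 0.0 for s in settings}
count = {s: 0 for s in settings}

def choose_setting(eps=0.1):
    if random.random() < eps:
        return random.choice(settings)             # explore
    return max(settings, key=lambda s: value[s])   # exploit best-known setting

def update(setting, quality_estimates):
    r = aggregated_reward(quality_estimates)
    count[setting] += 1
    value[setting] += (r - value[setting]) / count[setting]   # running mean reward

if __name__ == "__main__":
    s = choose_setting()
    update(s, {"detection": 0.8, "tracking": 0.6})
    print(s, value[s])
```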
Abstract:
A method for implementing application self-optimization in serverless edge computing environments is presented. The method includes requesting deployment of an application pipeline on data received from a plurality of sensors, the application pipeline including a plurality of microservices, enabling communication between a plurality of pods and a plurality of analytics units (AUs), each pod of the plurality of pods including a sidecar, determining whether each of the plurality of AUs maintains any state to differentiate between stateful AUs and stateless AUs, scaling the stateful AUs and the stateless AUs, enabling communication directly between the sidecars of the plurality of pods, and reusing and resharing common AUs of the plurality of AUs across different applications.
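A Python sketch of differentiating stateful from stateless analytics units (AUs) when scaling; the AU descriptors and the two scaling actions are illustrative placeholders for the mechanism described above.

```python
def scale_aus(aus, target_replicas):
    plan = {}
    for au in aus:
        if au["stateful"]:
            # Stateful AUs need their state handled (e.g., moved or repartitioned)
            # before the replica count changes, so they follow a separate path.
            plan[au["name"]] = {"action": "scale_with_state", "replicas": target_replicas}
        else:
            # Stateless AUs can simply be replicated behind the pod sidecar.
            plan[au["name"]] = {"action": "replicate", "replicas": target_replicas}
    return plan

if __name__ == "__main__":
    aus = [{"name": "decoder", "stateful": False}, {"name": "tracker", "stateful": True}]
    print(scale_aus(aus, target_replicas=3))
```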
Abstract:
A method is provided for managing applications for sensors. In one embodiment, the method includes loading a plurality of applications and links for communicating with a plurality of sensors on a platform having an interface for entry of a requested use case; and copying a configuration from a grouping of application instances being applied to a first sensor performing a function comprising the requested use case. The method may further include applying the configuration for the grouping of application instances to a second set of sensors to automatically conform the plurality of sensors on the platform to perform the requested use case.
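A Python sketch of copying the application-instance configuration from one sensor to a second set of sensors so they perform the same use case; the configuration fields and sensor identifiers are illustrative placeholders.

```python
import copy

def apply_use_case(platform, source_sensor, target_sensors):
    template = copy.deepcopy(platform[source_sensor])   # grouping of app instances + settings
    for sensor in target_sensors:
        cfg = copy.deepcopy(template)
        cfg["sensor_id"] = sensor                        # rebind the copy to the new sensor
        platform[sensor] = cfg
    return platform

if __name__ == "__main__":
    platform = {"cam-1": {"sensor_id": "cam-1",
                          "apps": ["detector", "counter"],
                          "use_case": "occupancy"}}
    print(apply_use_case(platform, "cam-1", ["cam-2", "cam-3"]))
```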