Systems and methods for predicting performance of applications on an internet of things (IoT) platform

    Publication number: US10338967B2

    Publication date: 2019-07-02

    Application number: US15410178

    Filing date: 2017-01-19

    Abstract: Performance prediction systems and methods for an Internet of Things (IoT) platform and applications include obtaining input(s) comprising one of (i) user requests and (ii) sensor observations from sensor(s); invoking Application Programming Interfaces (APIs) of the platform based on the input(s); identifying open flow (OF) and closed flow (CF) requests of system(s) connected to the platform; identifying workload characteristics of the OF and CF requests to obtain segregated OF and segregated CF requests, and a combination of open and closed flow requests; executing performance tests with the APIs based on the workload characteristics; measuring resource utilization of the system(s) and computing service demands of resource(s) from the measured utilization and the user requests processed by the platform per unit time; executing the performance tests with the invoked APIs based on the volume of workload characteristics pertaining to the application(s); and predicting, using a queuing network, the performance of the application(s) for that volume of workload characteristics.
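The final step, predicting performance with a queuing network from measured service demands, can be sketched with exact Mean Value Analysis, a standard closed-network technique. The two-resource demand values below are hypothetical placeholders, not figures from the patent:

```python
def mva(service_demands, n_users, think_time=0.0):
    """Exact Mean Value Analysis for a closed queuing network.

    service_demands: per-resource service demand (seconds per request).
    Returns (throughput, response_time) at n_users concurrent users.
    """
    queue = [0.0] * len(service_demands)  # mean queue length per resource
    throughput = response = 0.0
    for n in range(1, n_users + 1):
        # Residence time at each resource grows with its queue length.
        residence = [d * (1 + q) for d, q in zip(service_demands, queue)]
        response = sum(residence)
        throughput = n / (response + think_time)  # interactive response-time law
        queue = [throughput * r for r in residence]  # Little's law per resource
    return throughput, response

# Hypothetical example: two resources (CPU, disk) with demands measured at low load.
tput, resp = mva([0.05, 0.02], n_users=10)
```

With zero think time, throughput and response time always satisfy Little's law (X * R = N), and throughput stays below the bottleneck bound 1/max-demand.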

    OPTIMAL DEPLOYMENT OF EMBEDDINGS TABLES ACROSS HETEROGENEOUS MEMORY ARCHITECTURE FOR HIGH-SPEED RECOMMENDATIONS INFERENCE

    Publication number: US20240119008A1

    Publication date: 2024-04-11

    Application number: US18455890

    Filing date: 2023-08-25

    CPC classification number: G06F12/0897 G06N3/063 G06F2212/1024

    Abstract: Works in the literature fail to leverage embedding access patterns and memory units' access/storage capabilities, which, when combined, can yield high-speed heterogeneous systems by dynamically re-organizing embedding table partitions across hardware during inference. A method and system for optimal deployment of embedding tables across a heterogeneous memory architecture for high-speed recommendations inference is disclosed, which dynamically partitions and organizes embedding tables across fast memory architectures to reduce access time. Partitions are chosen to take advantage of the past access patterns of those tables, ensuring that frequently accessed data is available in the fast memory most of the time. Partitioning and replication are used to co-optimize memory access time and resources. Dynamic organization of embedding tables changes the location of embeddings, and hence requires an efficient mechanism to track whether a required embedding is present in the fast memory, along with its current address, for faster look-up; this is performed using a spline-based learned index.
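The access-pattern-driven placement idea can be illustrated with a minimal frequency-based partitioner that keeps the hottest embedding rows in the fast memory tier. This is only a sketch of the placement heuristic; it does not implement the patent's spline-based learned index, and the row ids and capacity are invented for illustration:

```python
from collections import Counter

def partition_embeddings(access_log, fast_capacity):
    """Place the most frequently accessed embedding rows in fast memory.

    access_log: sequence of row ids observed during past inference.
    fast_capacity: number of rows the fast memory tier can hold.
    Returns (fast_set, slow_set).
    """
    freq = Counter(access_log)
    hot = [row for row, _ in freq.most_common(fast_capacity)]
    fast = set(hot)
    slow = set(freq) - fast
    return fast, slow

# Hypothetical access log: rows 3 and 7 are hottest, so they go to the fast tier.
fast, slow = partition_embeddings([3, 3, 3, 7, 7, 1, 9], fast_capacity=2)
```

In a real system the partition would be recomputed periodically as access patterns drift, which is what makes the address-tracking index necessary.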

    Systems and methods for performance evaluation of input/output (I/O) intensive enterprise applications

    Publication number: US11151013B2

    Publication date: 2021-10-19

    Application number: US15882568

    Filing date: 2018-01-29

    Abstract: The present disclosure provides systems and methods for performance evaluation of Input/Output (I/O) intensive enterprise applications. Representative workloads may be generated for enterprise applications using synthetic benchmarks that can be used across multiple platforms with different storage systems. I/O traces are captured for an application of interest at low concurrencies, and features that significantly affect performance are extracted, fed to a synthetic benchmark, and replayed on a target system, thereby accurately recreating the behavior of the application. Statistical methods are used to extrapolate the extracted features to predict performance at higher concurrency levels without generating traces at those levels. The method does not require deploying the application or database on the target system, since the performance of the system depends on access patterns rather than actual data. Identical access patterns are re-created using only replicas of the database files, of the same size as in the real database.
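The extrapolation step, projecting a workload feature measured at low concurrencies to an unmeasured higher concurrency, can be sketched with an ordinary least-squares line fit. The concurrency levels and feature values below are hypothetical, and a real model may be nonlinear:

```python
def linear_extrapolate(xs, ys, x_new):
    """Ordinary least-squares line fit, used to project a workload
    feature (e.g. I/O requests per second) to an unmeasured concurrency."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my + slope * (x_new - mx)

# Hypothetical: traces captured at concurrency 1, 2, 4; project the feature at 16.
projected = linear_extrapolate([1, 2, 4], [120, 235, 470], 16)
```

The fit is exact when the measured points lie on a line, which makes it easy to sanity-check before trusting an extrapolation.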

    Methods and systems for designing correlation filter

    Publication number: US09953394B2

    Publication date: 2018-04-24

    Application number: US15055338

    Filing date: 2016-02-26

    CPC classification number: G06T1/20 G06F13/4282 G06K9/00973 G06K9/6203 G06T1/60

    Abstract: This disclosure relates generally to correlation filters, and more particularly to the design of a correlation filter. In one embodiment, a system for designing a correlation filter in a multi-processor system includes a multi-core processor coupled to a first memory and one or more co-processors coupled to one or more respective second memories. The multi-core processor partitions each of a plurality of frames associated with media content into a plurality of pixel-columns, and systematically stores said pixel-columns width-wise in a plurality of temporary matrices by means of a plurality of threads of the multi-core processor. The plurality of temporary matrices are transferred by the multi-core processor to the one or more respective second memories in a plurality of streams simultaneously, in an asynchronous mode. A plurality of filter harmonics of the correlation filter are computed by performing compute operations involving at least the plurality of temporary matrices, to obtain the correlation filter.
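The column-wise partitioning and width-wise (transposed) storage described above can be sketched in a few lines. The tiny 2x4 frame and partition count are illustrative only, and real frames would use contiguous buffers rather than nested lists:

```python
def partition_frame_columns(frame, n_parts):
    """Split a frame (list of rows) into pixel-column partitions and store
    each partition width-wise (transposed), so each column becomes a row."""
    width = len(frame[0])
    step = -(-width // n_parts)  # ceiling division: columns per partition
    partitions = []
    for start in range(0, width, step):
        cols = [[row[c] for row in frame]
                for c in range(start, min(start + step, width))]
        partitions.append(cols)  # each entry: one pixel-column stored as a row
    return partitions

# Hypothetical 2x4 frame split into 2 column partitions.
frame = [[1, 2, 3, 4],
         [5, 6, 7, 8]]
parts = partition_frame_columns(frame, 2)
# parts[0] = [[1, 5], [2, 6]]  (columns 0-1, stored width-wise)
```

Storing columns width-wise makes each partition contiguous, which is what allows the temporary matrices to be streamed to co-processor memory asynchronously.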

    System and method facilitating performance prediction of multi-threaded application in presence of resource bottlenecks

    Publication number: US09317330B2

    Publication date: 2016-04-19

    Application number: US14183461

    Filing date: 2014-02-18

    Abstract: The present disclosure generally relates to a system and method for predicting the performance of a multi-threaded application, and particularly to a system and method for predicting the performance of the multi-threaded application in the presence of resource bottlenecks. In one embodiment, a system for predicting performance of a multi-threaded software application is disclosed. The system may include one or more processors and a memory storing processor-executable instructions for configuring a processor to: represent one or more queuing networks corresponding to resources, the resources being employed to run the multi-threaded application; detect, based on the one or more queuing networks, the concurrency level at which a first resource bottleneck is encountered; determine, based on the concurrency level, performance metrics associated with the multi-threaded application; and predict the performance of the multi-threaded application based on the performance metrics.
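One standard way to estimate the concurrency level at which the first resource bottleneck appears is the classical asymptotic bound from queuing analysis, N* = (sum of service demands + think time) / max demand. This is a textbook approximation, not necessarily the patent's exact detection method, and the demand values below are invented for illustration:

```python
def bottleneck_concurrency(service_demands, think_time=0.0):
    """Asymptotic bound: the concurrency at which the highest-demand
    resource first saturates, N* = (sum of demands + think time) / max demand."""
    d_max = max(service_demands)
    return (sum(service_demands) + think_time) / d_max

# Hypothetical demands for CPU, disk, and network, with 0.5 s think time.
n_star = bottleneck_concurrency([0.05, 0.02, 0.01], think_time=0.5)
```

Below N* throughput grows roughly linearly with concurrency; above it, the bottleneck resource saturates and response time climbs instead.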


    Exactly-once transaction semantics for fault tolerant FPGA based transaction systems

    Publication number: US10965519B2

    Publication date: 2021-03-30

    Application number: US16283242

    Filing date: 2019-02-22

    Abstract: This disclosure relates generally to methods and systems for providing exactly-once transaction semantics for fault-tolerant FPGA-based transaction systems. The systems comprise middleware components at the server as well as the client end. The server comprises Hosts and FPGAs. The FPGAs control transaction execution (the application processing logic also resides in the FPGA) and provide fault tolerance with high performance by means of a modified TCP implementation. The Hosts buffer and persist transaction records for failure recovery and for achieving exactly-once transaction semantics. The monitoring and fault-detecting components are distributed across the FPGAs and Hosts. Exactly-once transaction semantics is implemented without sacrificing performance by switching between a high-performance mode and a conservative mode depending on component failures. PCIe switches for connectivity between FPGAs and Hosts ensure FPGAs remain available even if Hosts fail. When FPGAs provide more processing elements and memory, the Hosts may be eliminated.
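The exactly-once idea, never executing a replayed request twice while still returning its original result, is commonly realized with per-client sequence numbers and cached responses. The sketch below shows only that generic deduplication pattern in plain Python, not the patent's FPGA/modified-TCP implementation; all names are hypothetical:

```python
class ExactlyOnceReceiver:
    """Idempotent receiver: remembers the last applied sequence number per
    client so requests replayed after a failure are not executed twice."""

    def __init__(self):
        self.last_applied = {}   # client id -> highest sequence applied
        self.results = {}        # (client, seq) -> cached response

    def submit(self, client, seq, operation):
        if seq <= self.last_applied.get(client, -1):
            return self.results[(client, seq)]  # duplicate: replay cached result
        result = operation()                    # execute exactly once
        self.last_applied[client] = seq
        self.results[(client, seq)] = result
        return result

rx = ExactlyOnceReceiver()
calls = []
op = lambda: calls.append(1) or len(calls)
first = rx.submit("c1", 0, op)
again = rx.submit("c1", 0, op)   # retransmission after a simulated failure
```

The operation runs once even though it is submitted twice; in a fault-tolerant deployment the `last_applied` and `results` state would be persisted, which is the role the abstract assigns to the Hosts.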

    Method and system for pre-deployment performance estimation of input-output intensive workloads

    Publication number: US10558549B2

    Publication date: 2020-02-11

    Application number: US15361129

    Filing date: 2016-11-25

    Abstract: A method and system are provided for pre-deployment performance estimation of input-output intensive workloads. In particular, the present application provides a method and system for predicting the performance of an input-output intensive distributed enterprise application on multiple storage devices without deploying the application and the complete database in the target environment. The present method comprises generating the input-output traces of an application on a source system with varying concurrencies; replaying the generated traces from the source system on a target system to which the application needs to be migrated; gathering performance data in the form of resource utilization, throughput and response time from the target system; and extrapolating the data gathered from the target system in order to accurately predict the performance of multi-threaded input-output intensive applications on the target system at higher concurrencies.
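One simple way to project response time from gathered utilization data is the open single-queue (M/M/1) approximation R = S / (1 - U). The service time and arrival rate below are hypothetical, and the actual method may use different statistical models:

```python
def mm1_response_time(service_time, arrival_rate):
    """M/M/1 approximation: project response time from offered load.

    Utilization U = arrival_rate * service_time; response time R = S / (1 - U).
    """
    u = arrival_rate * service_time
    if u >= 1.0:
        raise ValueError("queue is unstable at this load")
    return service_time / (1.0 - u)

# Hypothetical disk: 5 ms service time at a projected 150 req/s (U = 0.75).
r = mm1_response_time(0.005, 150)
```

The sharp growth of R as U approaches 1 is why extrapolating to higher concurrencies needs a queuing model rather than a straight line through low-load measurements.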
