Automated end-to-end analysis of customer service requests

    Publication No.: US10191837B2

    Publication date: 2019-01-29

    Application No.: US15415047

    Application date: 2017-01-25

    Applicant: VMware, Inc.

    Abstract: An automated end-to-end analysis of customer service requests is disclosed. A core dump is received, wherein the core dump corresponds to a customer service request regarding a crash of a computer system. The core dump is automatically analyzed with a processor to generate analysis results. A graphical representation for display on a graphical user interface of a computer is generated, wherein the graphical representation corresponds to the analysis results for the core dump.

    HOST AND DPU COORDINATION FOR DPU MAINTENANCE EVENTS

    Publication No.: US20240241728A1

    Publication date: 2024-07-18

    Application No.: US18097601

    Application date: 2023-01-17

    Applicant: VMware, Inc.

    CPC classification number: G06F9/4403

    Abstract: Disclosed are various examples of host and data processing unit (DPU) coordination for DPU maintenance events. A host device can have a DPU device connected to it. A data processing unit (DPU) maintenance process executed by a host device can quiesce applications or virtual machines of the host device, and call a DPU isolation interface that isolates the DPU device to prevent host panic. A kernel process of the host device unloads a driver of the DPU device from the host device and removes the DPU device from a device manager of the host device. A DPU maintenance action is performed once the DPU device is isolated.

    Managing Virtual Machines in the Presence of Uncorrectable Memory Errors

    Publication No.: US20210216394A1

    Publication date: 2021-07-15

    Application No.: US16743895

    Application date: 2020-01-15

    Applicant: VMware, Inc.

    Abstract: Techniques for migrating virtual machines (VMs) in the presence of uncorrectable memory errors are provided. According to one set of embodiments, a source host hypervisor of a source host system can determine, for each guest memory page of a VM to be migrated from the source host system to a destination host system, whether the guest memory page is impacted by an uncorrectable memory error in a byte-addressable memory of the source host system. If the source host hypervisor determines that the guest memory page is impacted, the source host hypervisor can transmit a data packet to a destination host hypervisor of the destination host system that includes error metadata identifying the guest memory page as being corrupted. Alternatively, if the source host hypervisor determines that the guest memory page is not impacted, the source host hypervisor can attempt to read the guest memory page from the byte-addressable memory in a memory exception-safe manner.
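    The per-page decision described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: `PagePacket`, `MemoryReadError`, `build_packets`, and the `read_page` callback are hypothetical names introduced here, and treating a failed exception-safe read as a corrupted page is an assumption, since the abstract does not say what happens when the read attempt fails.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional, Set


class MemoryReadError(Exception):
    """Hypothetical stand-in for a memory exception raised while reading a page."""


@dataclass(frozen=True)
class PagePacket:
    """Hypothetical packet sent from the source to the destination hypervisor."""
    page_number: int
    corrupted: bool                # error metadata: page is marked as corrupted
    data: Optional[bytes] = None   # page contents; None when the page is corrupted


def build_packets(
    page_numbers: Iterable[int],
    impacted: Set[int],
    read_page: Callable[[int], bytes],
) -> Iterator[PagePacket]:
    """Yield one packet per guest memory page of the VM being migrated."""
    for page_number in page_numbers:
        if page_number in impacted:
            # Known uncorrectable error: send metadata only, never touch the page.
            yield PagePacket(page_number, corrupted=True)
            continue
        try:
            # "Exception-safe" read: a memory exception is caught, not fatal.
            data = read_page(page_number)
        except MemoryReadError:
            # Assumption (not stated in the abstract): a page whose read
            # faults is reported as corrupted as well.
            yield PagePacket(page_number, corrupted=True)
        else:
            yield PagePacket(page_number, corrupted=False, data=data)
```

    A caller would pass the VM's guest page numbers, the set of pages already known to be impacted, and a reader that raises `MemoryReadError` on a faulting page.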

    Computer crash risk assessment
    Type: Granted invention patent

    Publication No.: US10331508B2

    Publication date: 2019-06-25

    Application No.: US15415235

    Application date: 2017-01-25

    Applicant: VMware, Inc.

    Abstract: A computer-implemented method for assessing the risk of a future crash occurring on a computer system is disclosed. Crash results are received from a crash analysis system. The crash results are analyzed, at a processor, to determine the likelihood of the future crash occurring on the computer system. Information regarding the likelihood of the future crash occurring on the computer system is provided to a user of the computer system.

    Detecting X86 CPU register corruption from kernel crash dumps
    Type: Granted invention patent
    Status: In force

    Publication No.: US09552250B2

    Publication date: 2017-01-24

    Application No.: US14669049

    Application date: 2015-03-26

    Applicant: VMware, Inc.

    Abstract: Discovering a hardware failure in a processor is disclosed. When an operating system or application fails, a function containing the instruction that failed along with the register set of the CPU at the failure is recorded. The function is analyzed into its basic blocks. The failing instruction, the failing basic block, the definitions that reach the failing instruction, and the CPU register set at the failure provide information to determine whether the failure was caused by hardware or software. If, after a complete search of the definitions reaching the failing instruction, the search discovers a first definition defining the failing instruction and a second definition defining the first definition such that the second definition reaches the failing instruction and the first definition assigns a register value that does not match a register value in the failing instruction, then a hardware failure is the cause of the crash.
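    One simplified reading of the check in this abstract is sketched below. It assumes a single basic block, models only plain register-to-register copies, and uses hypothetical names (`Instr`, `suspect_hardware`) that do not come from the patent. If the failing instruction reads a register that was last written by a copy from another register, and that source register has not been overwritten since, the two recorded register values should be equal; a mismatch points toward hardware rather than software.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical, heavily simplified instruction model."""
    dest: Optional[str] = None    # register written, if any
    src: Optional[str] = None     # source register when this is a plain copy
    uses: Tuple[str, ...] = ()    # registers read by the instruction


def suspect_hardware(block: List[Instr], failing_idx: int,
                     regs: Dict[str, int]) -> bool:
    """Return True when the recorded registers contradict the code in `block`.

    `regs` is the CPU register set recorded at the failure.
    """
    for reg in block[failing_idx].uses:
        # First definition: the last write to `reg` before the failing instruction.
        d1 = next((i for i in range(failing_idx - 1, -1, -1)
                   if block[i].dest == reg), None)
        if d1 is None or block[d1].src is None:
            continue  # not defined in this block, or not a plain copy
        src = block[d1].src
        # The definition of `src` must still reach the failing instruction,
        # i.e. `src` must not be overwritten between d1 and the failure.
        if any(block[i].dest == src for i in range(d1 + 1, failing_idx)):
            continue
        if regs[reg] != regs[src]:
            return True  # copy says they are equal, the crash dump disagrees
    return False
```

    A real analysis would work on the function's full control-flow graph and on actual x86 semantics; this sketch only shows the shape of the comparison.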

    Culprit module detection and signature back trace generation

    Publication No.: US10338990B2

    Publication date: 2019-07-02

    Application No.: US15415089

    Application date: 2017-01-25

    Applicant: VMware, Inc.

    Abstract: In a crash analysis system, a method for analyzing a core dump corresponding to a crash of a computer system is disclosed. A core dump is received, wherein the core dump corresponds to a crash of a computer system. A culprit module responsible for the crash of the computer system is determined. A signature back trace, which pertains to a symptom of the crash of the computer system, is generated.

    Determination of a culprit thread after a physical central processing unit lockup

    Publication No.: US10331546B2

    Publication date: 2019-06-25

    Application No.: US15415261

    Application date: 2017-01-25

    Applicant: VMware, Inc.

    Abstract: An automated end-to-end analysis of customer service requests is disclosed. A core dump is received, wherein the core dump corresponds to a customer service request regarding a crash of a computer system. A processor automatically analyzes the core dump to determine if a pcpu lockup of the computer system is due to a software issue. Provided the pcpu lockup of the computer system is due to the software issue, the processor determines which thread is a culprit thread responsible for the pcpu lockup of the computer system.
