CONSISTENT DATA STORAGE IN DISTRIBUTED COMPUTING SYSTEMS

    Publication No.: US20200218701A1

    Publication date: 2020-07-09

    Application No.: US16818404

    Filing date: 2020-03-13

    Abstract: Methods and apparatus for providing consistent data storage in distributed computing systems. A consistent distributed computing file system (consistent DCFS) may be backed by an object storage service that only guarantees eventual consistency, and may leverage a data storage service (e.g., a database service) to store and maintain a file system/directory structure (a consistent DCFS directory) for the consistent DCFS that may be accessed by compute nodes for file/directory information relevant to the data objects in the consistent DCFS, rather than relying on the information maintained by the object storage service. The compute nodes may reference the consistent DCFS directory to, for example, store and retrieve strongly consistent metadata referencing data objects in the consistent DCFS. The compute nodes may, for example, retrieve metadata from the consistent DCFS directory to determine whether the object storage service is presenting all of the data that it is supposed to have.
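    The consistency check described in this abstract can be sketched roughly as follows. This is a toy illustration only, not the patented implementation: the classes `EventuallyConsistentStore` and `ConsistentDirectory`, the `settle` method, and the key names are all hypothetical stand-ins for an object storage service and a database-backed directory.

```python
class EventuallyConsistentStore:
    """Toy object store whose listings may lag behind writes (hypothetical)."""

    def __init__(self):
        self._objects = {}      # everything that has been written
        self._visible = set()   # keys that listings currently report

    def put(self, key, data):
        # The object is stored, but listings do not show it until settle().
        self._objects[key] = data

    def settle(self):
        # Simulates the store eventually becoming consistent.
        self._visible = set(self._objects)

    def list_keys(self, prefix):
        return sorted(k for k in self._visible if k.startswith(prefix))


class ConsistentDirectory:
    """Stand-in for the strongly consistent metadata directory (hypothetical)."""

    def __init__(self):
        self._entries = {}

    def record(self, key, size):
        self._entries[key] = {"size": size}

    def expected_keys(self, prefix):
        return sorted(k for k in self._entries if k.startswith(prefix))


def missing_keys(store, directory, prefix):
    """Keys the directory knows about that the store is not yet presenting."""
    listed = set(store.list_keys(prefix))
    return [k for k in directory.expected_keys(prefix) if k not in listed]


store = EventuallyConsistentStore()
directory = ConsistentDirectory()
for name in ("job/part-0", "job/part-1"):
    store.put(name, b"data")
    directory.record(name, 4)

before = missing_keys(store, directory, "job/")   # listing still lags
store.settle()
after = missing_keys(store, directory, "job/")    # listing has caught up
```

    In this sketch a compute node would wait or retry while `missing_keys` is non-empty, instead of trusting the store's own listing.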

    Managing authorized execution of code

    Publication No.: US10574646B2

    Publication date: 2020-02-25

    Application No.: US15922735

    Filing date: 2018-03-15

    Inventor: Peter Sirota

    Abstract: Techniques are described for providing customizable sign-on functionality, such as via an access manager system that provides single sign-on functionality and other functionality to other services for use with those services' users. The access manager system may maintain various sign-on and other account information for various users, and provide single sign-on functionality for those users using that maintained information on behalf of multiple unrelated services with which those users interact. The access manager may allow a variety of types of customizations to single sign-on functionality and/or other functionality available from the access manager, such as on a per-service basis via configuration by an operator of the service, such as co-branding customizations, customizations of information to be gathered from users, customizations of authority that may be delegated to other services to act on behalf of users, etc., and with the customizations that are available being determined specifically for that service.

    Controlling access to services via usage models

    Publication No.: US10291715B1

    Publication date: 2019-05-14

    Application No.: US14144377

    Filing date: 2013-12-30

    Abstract: Techniques are described for facilitating interactions between computing systems, such as in accordance with usage models that are configured for available services by the providers of the services. In some situations, the services are Web services, and an electronic Web service (“WS”) marketplace is provided via which third-party WS providers make their WSes available to third-party WS consumers who purchase access to those WSes via the electronic marketplace based on configured usage models selected by the consumers. Some or all of the one or more usage models configured for an available WS may each have associated use prices and/or non-price use conditions, and if so access to those WSes using those usage models may be provided only if a consumer requesting access provides appropriate payment and otherwise satisfies the specified use conditions for a selected usage model.
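    The access rule in this abstract (access only if the consumer pays the use price and satisfies the non-price use conditions of the selected usage model) might be sketched as below. All names here (`UsageModel`, `may_access`, the `commercial` flag, the price values) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class UsageModel:
    """A usage model with a use price and non-price use conditions (hypothetical)."""
    name: str
    price: float
    conditions: List[Callable[[dict], bool]] = field(default_factory=list)


def may_access(model: UsageModel, consumer: dict, payment: float) -> bool:
    """Grant access only if payment covers the price and every condition holds."""
    paid = payment >= model.price
    return paid and all(condition(consumer) for condition in model.conditions)


# Example usage model: a non-commercial tier that costs 5.0 per period.
noncommercial = UsageModel(
    name="noncommercial",
    price=5.0,
    conditions=[lambda consumer: not consumer.get("commercial", False)],
)

allowed = may_access(noncommercial, {"commercial": False}, payment=5.0)
underpaid = may_access(noncommercial, {"commercial": False}, payment=1.0)
wrong_use = may_access(noncommercial, {"commercial": True}, payment=5.0)
```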

    Executing parallel jobs with message passing on compute clusters

    Publication No.: US10148736B1

    Publication date: 2018-12-04

    Application No.: US14281582

    Filing date: 2014-05-19

    Abstract: A client may submit a job to a service provider that processes a large data set and that employs a message passing interface (MPI) to coordinate the collective execution of the job on multiple compute nodes. The framework may create a MapReduce cluster (e.g., within a VPC) and may generate a single key pair for the cluster, which may be downloaded by nodes in the cluster and used to establish secure node-to-node communication channels for MPI messaging. A single node may be assigned as a mapper process and may launch the MPI job, which may fork its commands to other nodes in the cluster (e.g., nodes identified in a hostfile associated with the MPI job), according to the MPI interface. A rankfile may be used to synchronize the MPI job and another MPI process used to download portions of the data set to respective nodes in the cluster.

    SAVING PROGRAM EXECUTION STATE
    Patent application

    Publication No.: US20180129570A1

    Publication date: 2018-05-10

    Application No.: US15867516

    Filing date: 2018-01-10

    CPC classification number: G06F11/1451 G06F9/4806 G06F9/485 G06F11/1469

    Abstract: Techniques are described for managing distributed execution of programs. In at least some situations, the techniques include decomposing or otherwise separating the execution of a program into multiple distinct execution jobs that may each be executed on a distinct computing node, such as in a parallel manner with each execution job using a distinct subset of input data for the program. In addition, the techniques may include temporarily terminating and later resuming execution of at least some execution jobs, such as by persistently storing an intermediate state of the partial execution of an execution job, and later retrieving and using the stored intermediate state to resume execution of the execution job from the intermediate state. Furthermore, the techniques may be used in conjunction with a distributed program execution service that executes multiple programs on behalf of multiple customers or other users of the service.
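    The suspend-and-resume idea in this abstract can be illustrated with a minimal sketch, assuming a single execution job that sums its subset of the input and stores its intermediate state in a JSON file. The function `run_job`, the `stop_after` parameter, and the state layout are hypothetical; a real service would use its own persistent storage and job model.

```python
import json
import os
import tempfile


def run_job(items, state_path, stop_after=None):
    """Sum `items`, persisting progress so a later call can resume the job."""
    state = {"next_index": 0, "total": 0}
    if os.path.exists(state_path):
        # Resume from the stored intermediate state of a partial execution.
        with open(state_path) as f:
            state = json.load(f)

    processed = 0
    while state["next_index"] < len(items):
        if stop_after is not None and processed >= stop_after:
            break  # temporarily terminate before the job is finished
        state["total"] += items[state["next_index"]]
        state["next_index"] += 1
        processed += 1

    # Persist the (possibly intermediate) state.
    with open(state_path, "w") as f:
        json.dump(state, f)

    done = state["next_index"] == len(items)
    return done, state["total"]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "job-0.json")
    first = run_job([1, 2, 3, 4], path, stop_after=2)   # stops part-way
    second = run_job([1, 2, 3, 4], path)                # resumes and finishes
```

    The second call does not reprocess the first two items; it continues from the stored index and running total.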

    SAVING PROGRAM EXECUTION STATE
    Patent application (status: under examination, published)

    Publication No.: US20150169412A1

    Publication date: 2015-06-18

    Application No.: US14571093

    Filing date: 2014-12-15

    CPC classification number: G06F11/1451 G06F9/4806 G06F9/485 G06F11/1469

    Abstract: Techniques are described for managing distributed execution of programs. In at least some situations, the techniques include decomposing or otherwise separating the execution of a program into multiple distinct execution jobs that may each be executed on a distinct computing node, such as in a parallel manner with each execution job using a distinct subset of input data for the program. In addition, the techniques may include temporarily terminating and later resuming execution of at least some execution jobs, such as by persistently storing an intermediate state of the partial execution of an execution job, and later retrieving and using the stored intermediate state to resume execution of the execution job from the intermediate state. Furthermore, the techniques may be used in conjunction with a distributed program execution service that executes multiple programs on behalf of multiple customers or other users of the service.

    MANAGING DISTRIBUTED EXECUTION OF PROGRAMS
    Patent application (status: under examination, published)

    Publication No.: US20140330981A1

    Publication date: 2014-11-06

    Application No.: US14338150

    Filing date: 2014-07-22

    CPC classification number: H04L67/1008 G06F9/485 H04L29/08135 H04L67/16

    Abstract: Techniques are described for managing distributed execution of programs. In some situations, the techniques include determining configuration information to be used for executing a particular program in a distributed manner on multiple computing nodes and/or include providing information and associated controls to a user regarding ongoing distributed execution of one or more programs to enable the user to modify the ongoing distributed execution in various manners. Determined configuration information may include, for example, configuration parameters such as a quantity of computing nodes and/or other measures of computing resources to be used for the executing, and may be determined in various manners, including by interactively gathering values for at least some types of configuration information from an associated user (e.g., via a GUI that is displayed to the user) and/or by automatically determining values for at least some types of configuration information (e.g., for use as recommendations to a user).

    Connector interface for data pipeline
    Granted patent (status: in force)

    Publication No.: US08812752B1

    Publication date: 2014-08-19

    Application No.: US13764711

    Filing date: 2013-02-11

    CPC classification number: G06F9/52 G06F9/542 G06Q10/063

    Abstract: Methods and systems for a connector interface in a data pipeline are disclosed. A pipeline comprising two data source nodes and an activity node is configured. Each data source node represents data from a different data source, and the activity node represents a workflow activity that uses the data as input. Two connectors which implement the same connector interface are triggered. In response, data is acquired at each connector from the corresponding data source through the connector interface. The data is sent from the connectors to the activity node through the connector interface. The workflow activity is performed using the acquired data.
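    The pipeline shape in this abstract (two data source nodes, two connectors sharing one connector interface, and one activity node) might look like the following sketch. The names `Connector`, `ListConnector`, `CsvTextConnector`, and `run_activity` are hypothetical and only illustrate the pattern; they are not taken from the patent.

```python
from abc import ABC, abstractmethod


class Connector(ABC):
    """Common connector interface implemented by every connector (hypothetical)."""

    @abstractmethod
    def acquire(self):
        """Return the data from the underlying data source as a list of ints."""


class ListConnector(Connector):
    """Connector over an in-memory list data source."""

    def __init__(self, rows):
        self._rows = rows

    def acquire(self):
        return list(self._rows)


class CsvTextConnector(Connector):
    """Connector over a comma-separated text data source."""

    def __init__(self, text):
        self._text = text

    def acquire(self):
        return [int(value) for value in self._text.split(",")]


def run_activity(connectors, activity):
    """Trigger each connector, then pass the acquired data to the activity."""
    inputs = [connector.acquire() for connector in connectors]
    return activity(*inputs)


# Two different data sources, one interface, one workflow activity.
result = run_activity(
    [ListConnector([1, 2]), CsvTextConnector("3,4")],
    lambda a, b: sum(a) + sum(b),
)
```

    Because the activity only sees the output of `acquire`, a new data source can be added by writing another `Connector` subclass without changing the activity.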
