-
Publication number: US10318399B2
Publication date: 2019-06-11
Application number: US13796923
Filing date: 2013-03-12
Applicant: Netflix, Inc.
Inventor: Philip Simon Tuffs , Roy Rapoport , Ariel Tseitlin
Abstract: Techniques for evaluating a second version of software. Embodiments selectively route incoming requests to software instances within a plurality of baseline instances and a plurality of canary instances, where the baseline instances run a first software version and the canary instances run the second software version. The software instances are monitored to collect performance data for a plurality of performance metrics. Embodiments calculate aggregate baseline performance metrics, where each of the aggregate baseline performance metrics is calculated based on the collected performance data for the plurality of baseline instances. For each of the performance metrics and canary instances, embodiments calculate a relative performance value that measures the collected performance data for the respective canary instance and for the respective performance metric, relative to the corresponding aggregate baseline performance metric. A final measure of performance is calculated for the second version of software, based on the relative performance values.
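The scoring steps in this abstract can be sketched roughly as follows. This is an illustrative sketch only, not the patented method: the function name `canary_score`, the use of a mean as the aggregate, the ratio as the relative performance value, the `tolerance` parameter, and the assumption that lower metric values are better are all hypothetical choices.

```python
from statistics import mean


def canary_score(baseline, canary, tolerance=1.2):
    """Score a canary fleet against a baseline fleet.

    baseline, canary: {instance_id: {metric_name: value}}.
    Assumptions (not from the source): lower values are better, every
    instance reports every metric, and baseline aggregates are non-zero.
    """
    metrics = sorted({m for data in baseline.values() for m in data})

    # Aggregate baseline metric: here, the mean across all baseline instances.
    aggregate = {m: mean(inst[m] for inst in baseline.values()) for m in metrics}

    # Relative performance value for each canary instance and each metric:
    # the canary's value divided by the aggregate baseline value.
    relative = {
        cid: {m: data[m] / aggregate[m] for m in metrics}
        for cid, data in canary.items()
    }

    # Final measure: fraction of (instance, metric) pairs within tolerance.
    values = [v for per_instance in relative.values() for v in per_instance.values()]
    score = sum(v <= tolerance for v in values) / len(values)
    return aggregate, relative, score


baseline = {
    "b1": {"latency_ms": 100.0, "errors": 2.0},
    "b2": {"latency_ms": 120.0, "errors": 4.0},
}
canary = {
    "c1": {"latency_ms": 110.0, "errors": 3.0},
    "c2": {"latency_ms": 220.0, "errors": 3.0},
}
aggregate, relative, score = canary_score(baseline, canary)
```

With this sample data, three of the four relative values fall within the tolerance, so a deployment gate built on this sketch would compare the resulting score against whatever pass threshold the operator chooses.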
-
Publication number: US20140282422A1
Publication date: 2014-09-18
Application number: US13796923
Filing date: 2013-03-12
Applicant: NETFLIX, INC.
Inventor: Philip Simon Tuffs , Roy Rapoport , Ariel Tseitlin
IPC: G06F11/34
CPC classification number: G06F11/3452 , G06F11/3428 , G06F2201/865
Abstract: Techniques for evaluating a second version of software. Embodiments selectively route incoming requests to software instances within a plurality of baseline instances and a plurality of canary instances, where the baseline instances run a first software version and the canary instances run the second software version. The software instances are monitored to collect performance data for a plurality of performance metrics. Embodiments calculate aggregate baseline performance metrics, where each of the aggregate baseline performance metrics is calculated based on the collected performance data for the plurality of baseline instances. For each of the performance metrics and canary instances, embodiments calculate a relative performance value that measures the collected performance data for the respective canary instance and for the respective performance metric, relative to the corresponding aggregate baseline performance metric. A final measure of performance is calculated for the second version of software, based on the relative performance values.
-
Publication number: US09584395B1
Publication date: 2017-02-28
Application number: US14079483
Filing date: 2013-11-13
Applicant: NETFLIX, INC.
Inventor: Roy Rapoport , Brent Pitman , Brian Harrington , Daniel Muino
IPC: G06F15/173 , H04L12/26
CPC classification number: H04L43/16 , G06F11/008 , G06F11/0709 , G06F11/076 , G06F11/328 , G06F11/3409 , G06F11/3452 , G06F11/3476 , G06F2201/81 , H04L41/0681 , H04L41/147 , H04L43/04
Abstract: Techniques for adaptive metric collection, metric storage, and alert thresholds are described. In an approach, a metric collector computer processes metrics as a collection of key/value pairs. The key/value pairs represent the dimensionality of the metrics and allow for semantic queries on the metrics based on keys. In an approach, a storage controller computer maintains a storage system with multiple storage tiers ranked by speed of access. The storage computer stores policy data that specifies the rules by which metric records are stored across the multiple storage tiers. Periodically, the storage computer moves database records to higher or lower tiers based on the policy data. In an approach, a metric collector, in response to receiving a new metric, generates a predicted metric value based on previously recorded metric values and measures the deviation from the new metric value to determine whether an alert is appropriate.
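The adaptive alert threshold described at the end of this abstract can be illustrated with a minimal sketch. The class name `DeviationAlerter`, the moving-average predictor, and the `window` and `max_relative_deviation` parameters are assumptions made for illustration; the abstract does not specify how the prediction is computed.

```python
from collections import deque


class DeviationAlerter:
    """Flag a metric value that deviates too far from a predicted value.

    Assumption (not from the source): the prediction is the mean of the
    last `window` recorded values, and the deviation limit is relative.
    """

    def __init__(self, window=5, max_relative_deviation=0.5):
        self.history = deque(maxlen=window)
        self.max_relative_deviation = max_relative_deviation

    def observe(self, value):
        """Record `value`; return True if it warrants an alert."""
        alert = False
        if self.history:
            # Predict the new value from previously recorded values.
            predicted = sum(self.history) / len(self.history)
            # Measure how far the new value is from the prediction.
            deviation = abs(value - predicted)
            limit = self.max_relative_deviation * max(abs(predicted), 1e-9)
            alert = deviation > limit
        self.history.append(value)
        return alert


alerter = DeviationAlerter(window=3, max_relative_deviation=0.5)
results = [alerter.observe(v) for v in (100, 110, 105, 300)]
```

Because the threshold follows recent history instead of a fixed constant, a metric that drifts slowly does not alert, while the sudden jump to 300 in the sample sequence does.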
-
Publication number: US20170149644A1
Publication date: 2017-05-25
Application number: US15425905
Filing date: 2017-02-06
Applicant: NETFLIX, INC.
Inventor: Roy Rapoport , Brent Pitman , Brian Harrington , Daniel Muino
CPC classification number: H04L43/16 , G06F11/008 , G06F11/0709 , G06F11/076 , G06F11/328 , G06F11/3409 , G06F11/3452 , G06F11/3476 , G06F2201/81 , H04L41/0681 , H04L41/147 , H04L43/04
Abstract: Techniques for adaptive metric collection, metric storage, and alert thresholds are described. In an approach, a metric collector computer processes metrics as a collection of key/value pairs. The key/value pairs represent the dimensionality of the metrics and allow for semantic queries on the metrics based on keys. In an approach, a storage controller computer maintains a storage system with multiple storage tiers ranked by speed of access. The storage computer stores policy data that specifies the rules by which metric records are stored across the multiple storage tiers. Periodically, the storage computer moves database records to higher or lower tiers based on the policy data. In an approach, a metric collector, in response to receiving a new metric, generates a predicted metric value based on previously recorded metric values and measures the deviation from the new metric value to determine whether an alert is appropriate.
-
Publication number: US20140281739A1
Publication date: 2014-09-18
Application number: US13826942
Filing date: 2013-03-14
Applicant: NETFLIX, INC.
Inventor: Philip Simon Tuffs , Roy Rapoport , Ariel Tseitlin
IPC: G06F11/34
CPC classification number: G06F11/3452 , G06F11/0709 , G06F11/079 , G06F11/3409 , G06F11/3495
Abstract: Techniques are described for identifying a root cause of a pattern of performance data in a system including a plurality of services. Embodiments provide dependency information for each of the plurality of services, where at least one of the plurality of services is dependent upon a first one of the plurality of services. Each of the plurality of services is monitored to collect performance data for the respective service. Embodiments further analyze the performance data to identify a cluster of services that each follow a pattern of performance data. The first one of the services in the cluster of services is determined to be a root cause of the pattern of performance data, based on the determined dependency information for each of the plurality of services.
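One simple way to read the root-cause step is sketched below, assuming the cluster of services sharing the pattern has already been identified. The function name `root_causes`, the direct-dependency map, and the rule "a clustered service that depends on no other clustered service" are illustrative assumptions, not the claimed method.

```python
def root_causes(depends_on, cluster):
    """Return clustered services that depend on no other clustered service.

    depends_on: {service: set of services it directly depends on}.
    cluster: services that all follow the same performance pattern.
    Assumption (not from the source): the most upstream service in the
    cluster is treated as the likely root cause.
    """
    cluster = set(cluster)
    return sorted(
        service
        for service in cluster
        if not (depends_on.get(service, set()) & cluster)
    )


# Hypothetical dependency information for four services.
depends_on = {
    "api": {"auth", "db"},
    "auth": {"db"},
    "db": set(),
    "search": set(),
}
suspects = root_causes(depends_on, {"api", "auth", "db"})
```

In this example `api` and `auth` both depend on `db`, so the shared pattern is attributed to `db`; `search` is ignored because it is not part of the cluster.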
-
Publication number: US11683234B2
Publication date: 2023-06-20
Application number: US15042116
Filing date: 2016-02-11
Applicant: Netflix, Inc.
Inventor: Roy Rapoport , Christopher Sanden , Cody Rioux , Gregory Burrell
CPC classification number: H04L41/12 , G06F11/3409 , G06F11/3452 , H04L67/10
Abstract: One embodiment of the invention disclosed herein provides techniques for detecting and remediating an outlier server in a distributed computer system. A control server retrieves a group of time-series data sets associated with a first time period, where each time-series data set represents a performance metric for a different server in a group of servers. The control server generates a cluster that includes two or more of the time-series data sets, where the performance metric for each server that is associated with one of the time-series data sets in the cluster is within a threshold distance from the performance metric for the servers that are associated with the other time-series data sets in the cluster. The control server determines that a particular time-series data set corresponds to a server included in the group of servers and is not included in the cluster, and marks the server as an outlier server.
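The outlier-marking idea can be sketched as follows. This is a deliberately simplified illustration: the function name `find_outliers`, Euclidean distance between equal-length series, and the rule that a server is clustered if it lies within the threshold of at least one other server are all assumptions, and the patent may use a different clustering procedure.

```python
import math


def find_outliers(series, threshold):
    """Return servers whose time series is not close to any other server's.

    series: {server_name: list of metric samples for one time period}.
    Assumptions (not from the source): all series have equal length and
    distance is Euclidean. Two misbehaving servers that resemble each
    other would form their own group and go undetected by this sketch.
    """
    names = sorted(series)

    def distance(a, b):
        return math.dist(series[a], series[b])

    # A server belongs to the cluster if some other server is within threshold.
    clustered = {
        a
        for a in names
        for b in names
        if a != b and distance(a, b) <= threshold
    }
    # Any server left outside the cluster is marked as an outlier.
    return [name for name in names if name not in clustered]


series = {
    "s1": [10, 11, 10],
    "s2": [10, 10, 11],
    "s3": [11, 10, 10],
    "s4": [50, 55, 60],
}
outliers = find_outliers(series, threshold=5)
```

A remediation step, such as removing the marked server from rotation, would then act on the returned list.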
-
Publication number: US11212208B2
Publication date: 2021-12-28
Application number: US16701065
Filing date: 2019-12-02
Applicant: NETFLIX, INC.
Inventor: Roy Rapoport , Brent Pitman , Brian Harrington , Daniel Muino
Abstract: Techniques for adaptive metric collection, metric storage, and alert thresholds are described. In an approach, a metric collector computer processes metrics as a collection of key/value pairs. The key/value pairs represent the dimensionality of the metrics and allow for semantic queries on the metrics based on keys. In an approach, a storage controller computer maintains a storage system with multiple storage tiers ranked by speed of access. The storage computer stores policy data that specifies the rules by which metric records are stored across the multiple storage tiers. Periodically, the storage computer moves database records to higher or lower tiers based on the policy data. In an approach, a metric collector, in response to receiving a new metric, generates a predicted metric value based on previously recorded metric values and measures the deviation from the new metric value to determine whether an alert is appropriate.
-
Publication number: US10691814B2
Publication date: 2020-06-23
Application number: US15960468
Filing date: 2018-04-23
Applicant: NETFLIX, INC.
Inventor: Ariel Tseitlin , Roy Rapoport , Jason Chan
IPC: H04L29/06 , G06F21/60 , G06F16/28 , G06F21/45 , G06F21/57 , H04L9/32 , H04L12/26 , G06F9/50 , H04L12/24 , G06F11/30 , G06F21/00 , H04L29/08
Abstract: A security application manages security and reliability of networked applications executing on a collection of interacting computing elements within a distributed computing architecture. The security application monitors various classes of resources utilized by the collection of nodes within the distributed computing architecture and determines whether utilization of a class of resources is approaching a pre-determined maximum limit. The security application performs a vulnerability scan of a networked application to determine whether the networked application is prone to a risk of intentional or inadvertent breach by an external application. The security application scans a distributed computing architecture for the existence of access control lists (ACLs), and stores ACL configurations and configuration changes in a database. The security application scans a distributed computing architecture for the existence of security certificates, places newly discovered security certificates in a database, and deletes outdated security certificates. Advantageously, security and reliability are improved in a distributed computing architecture.
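The certificate bookkeeping mentioned in this abstract might look like the sketch below. The function name `refresh_certificates`, the fingerprint-to-expiry mapping, and the reading of "outdated" as "already expired" are assumptions for illustration; the abstract does not define them.

```python
from datetime import datetime, timezone


def refresh_certificates(inventory, discovered, now):
    """Merge newly discovered certificates and drop outdated ones.

    inventory, discovered: {fingerprint: expiry datetime}.
    Assumption (not from the source): a certificate is outdated once
    its expiry time is at or before `now`.
    """
    updated = dict(inventory)
    for fingerprint, expiry in discovered.items():
        # Only add certificates that are not already recorded.
        updated.setdefault(fingerprint, expiry)
    # Delete certificates that have expired.
    return {fp: exp for fp, exp in updated.items() if exp > now}


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
inventory = {
    "aa:01": datetime(2023, 6, 1, tzinfo=timezone.utc),   # already expired
    "bb:02": datetime(2025, 6, 1, tzinfo=timezone.utc),
}
discovered = {
    "bb:02": datetime(2025, 6, 1, tzinfo=timezone.utc),   # already known
    "cc:03": datetime(2024, 9, 1, tzinfo=timezone.utc),   # newly discovered
}
current = refresh_certificates(inventory, discovered, now)
```

A real scanner would persist the result to the database the abstract mentions; the in-memory dictionary here only stands in for that store.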
-
Publication number: US09953173B2
Publication date: 2018-04-24
Application number: US14703862
Filing date: 2015-05-04
Applicant: NETFLIX, INC.
Inventor: Ariel Tseitlin , Roy Rapoport , Jason Chan
CPC classification number: G06F21/604 , G06F9/50 , G06F11/302 , G06F11/3051 , G06F17/30598 , G06F21/00 , G06F21/45 , G06F21/577 , G06F2209/504 , G06F2221/034 , G06F2221/2141 , H04L9/3268 , H04L41/12 , H04L43/16 , H04L63/101 , H04L63/1408 , H04L63/1433 , H04L67/10 , Y02D10/22
Abstract: A security application manages security and reliability of networked applications executing on a collection of interacting computing elements within a distributed computing architecture. The security application monitors various classes of resources utilized by the collection of nodes within the distributed computing architecture and determines whether utilization of a class of resources is approaching a pre-determined maximum limit. The security application performs a vulnerability scan of a networked application to determine whether the networked application is prone to a risk of intentional or inadvertent breach by an external application. The security application scans a distributed computing architecture for the existence of access control lists (ACLs), and stores ACL configurations and configuration changes in a database. The security application scans a distributed computing architecture for the existence of security certificates, places newly discovered security certificates in a database, and deletes outdated security certificates. Advantageously, security and reliability are improved in a distributed computing architecture.
-
Publication number: US09582395B2
Publication date: 2017-02-28
Application number: US13826942
Filing date: 2013-03-14
Applicant: Netflix, Inc.
Inventor: Philip Simon Tuffs , Roy Rapoport , Ariel Tseitlin
CPC classification number: G06F11/3452 , G06F11/0709 , G06F11/079 , G06F11/3409 , G06F11/3495
Abstract: Techniques are described for identifying a root cause of a pattern of performance data in a system including a plurality of services. Embodiments provide dependency information for each of the plurality of services, where at least one of the plurality of services is dependent upon a first one of the plurality of services. Each of the plurality of services is monitored to collect performance data for the respective service. Embodiments further analyze the performance data to identify a cluster of services that each follow a pattern of performance data. The first one of the services in the cluster of services is determined to be a root cause of the pattern of performance data, based on the determined dependency information for each of the plurality of services.
-