-
Publication No.: US09813467B1
Publication Date: 2017-11-07
Application No.: US15452241
Filing Date: 2017-03-07
Applicant: Ryan Barrett, Taylor Sittler, Krishna Pant, Zhenghua Li, Katsuya Noguchi, Nishant Bhat
Inventor: Ryan Barrett, Taylor Sittler, Krishna Pant, Zhenghua Li, Katsuya Noguchi, Nishant Bhat
CPC Classification: G06F17/30, G06F17/30516
Abstract: Techniques are disclosed for processing and aligning incomplete data. A stream of data is received from a data source including a plurality of reads. While receiving the stream of data and prior to having received all of the plurality of reads, a set of reads is extracted from the plurality of reads. Each of the set of reads is aligned to a corresponding portion of a reference data set. For each particular position of a plurality of particular positions of the reference data set, a subset of reads of the aligned set of reads is identified. A value of a client data set is generated based on the subset of reads. A variable is generated based on the client data set. Data is routed when a condition, based on the variable, is satisfied.
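The flow in this abstract (align reads as they arrive, then derive a per-position value before the stream ends) can be illustrated with a minimal sketch. It is not the patented method: the toy `REFERENCE` string, the mismatch-counting `align` helper, the majority-vote value, and the `should_route` condition are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Iterator, List

REFERENCE = "ACGTACGTAC"  # hypothetical toy reference data set


def align(read: str, reference: str) -> int:
    """Return the reference offset with the fewest mismatches (naive scan)."""
    best_offset, best_mismatches = 0, len(read) + 1
    for offset in range(len(reference) - len(read) + 1):
        mismatches = sum(a != b for a, b in zip(read, reference[offset:]))
        if mismatches < best_mismatches:
            best_offset, best_mismatches = offset, mismatches
    return best_offset


def process_stream(
    stream: Iterable[str], reference: str, batch_size: int, positions: List[int]
) -> Iterator[Dict[int, str]]:
    """Yield per-position majority values after each batch, before the stream ends."""
    pileup: Dict[int, Counter] = defaultdict(Counter)
    batch: List[str] = []

    def flush() -> Dict[int, str]:
        # Align every read in the batch and add its bases to the pileup.
        for read in batch:
            offset = align(read, reference)
            for i, base in enumerate(read):
                pileup[offset + i][base] += 1
        batch.clear()
        # Value at each requested position: the most frequent base seen so far.
        return {p: pileup[p].most_common(1)[0][0] for p in positions if p in pileup}

    for read in stream:
        batch.append(read)
        if len(batch) == batch_size:
            yield flush()
    if batch:  # process any reads left over when the stream closes
        yield flush()


def should_route(calls: Dict[int, str], reference: str) -> bool:
    """Assumed routing condition: any value differs from the reference."""
    return any(base != reference[p] for p, base in calls.items())
```

Because `process_stream` is a generator, a caller can inspect the intermediate values and decide whether to route data while later reads are still arriving.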
-
Publication No.: US09811391B1
Publication Date: 2017-11-07
Application No.: US15449579
Filing Date: 2017-03-03
Applicant: Ryan Barrett, Taylor Sittler, Krishna Pant, Zhenghua Li
Inventor: Ryan Barrett, Taylor Sittler, Krishna Pant, Zhenghua Li
CPC Classification: G06F19/18, G06F9/5083, G06F19/22, G06F19/24
Abstract: Embodiments in the disclosure are directed to the use of distributed computing to align reads against multiple portions of a reference dataset. Aligned portions of the reference dataset that correspond with an above-threshold alignment score can be assessed for the presence of sparse indicators that can be categorized and used to influence a determination of a state transition likelihood. Various tasks associated with the processing of reads (e.g., alignment, sparse indicator detection, and/or determination of a state transition likelihood) may be able to take advantage of parallel processing and can be distributed among the machines while considering the resource utilization of those machines. Different load-balancing mechanisms can be employed in order to achieve even resource utilization across the machines, and in some cases may involve assessing various processing characteristics that reflect a predicted resource expenditure and/or time profile for each task to be processed by a machine.
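One common way to balance load using a predicted cost per task is a greedy "longest task first, least-loaded machine next" assignment. The sketch below shows that general technique only; the abstract does not say which load-balancing mechanism is used, and the task names, costs, and machine names are hypothetical.

```python
import heapq
from typing import Dict, List


def assign_tasks(
    predicted_cost: Dict[str, float], machines: List[str]
) -> Dict[str, List[str]]:
    """Greedy balancing: give the costliest remaining task to the least-loaded machine."""
    # Min-heap of (current predicted load, machine name).
    heap = [(0.0, m) for m in machines]
    heapq.heapify(heap)
    assignment: Dict[str, List[str]] = {m: [] for m in machines}
    # Placing large tasks first tends to give a more even final load.
    for task, cost in sorted(predicted_cost.items(), key=lambda kv: -kv[1]):
        load, machine = heapq.heappop(heap)
        assignment[machine].append(task)
        heapq.heappush(heap, (load + cost, machine))
    return assignment
```

With predicted costs of 5, 4, 3, and 2 on two machines, this assignment ends with a predicted load of 7 on each machine.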
-
Publication No.: US09773031B1
Publication Date: 2017-09-26
Application No.: US15489473
Filing Date: 2017-04-17
Applicant: Krishna Pant, Taylor Sittler, Ryan Barrett
Inventor: Krishna Pant, Taylor Sittler, Ryan Barrett
IPC Classification: G06F17/30
CPC Classification: G06F17/30303, G06F17/30312, G06F17/30569, G06K9/00483, G06K9/6249
Abstract: Techniques are disclosed for accurately identifying duplications and deletions using depth vectors. A depth vector is generated for each of multiple clients based on a set of reads that is received and aligned to a reference data set. The depth vectors are transformed to produce multiple components. Each of the components is assigned an order based on the extent to which it accounts for cross-client differences in the depth vectors. Each of the components includes an intensity, multiple values, and multiple client weights. A subset of the components is identified based on the order. A sparse indicator and positional data for the sparse indicator can be determined from the components in the subset, and one or more clients can be identified as being associated with the components.
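The component structure described here (an intensity, per-position values, and per-client weights, ordered by how much cross-client variation each explains) resembles a singular value decomposition of the centered depth matrix. That reading is an assumption; the abstract does not name the transformation. The sketch below extracts only the leading component by power iteration, and `top_component` and the toy depths are hypothetical.

```python
import math
from typing import List, Tuple


def top_component(
    depths: List[List[float]], iterations: int = 50
) -> Tuple[float, List[float], List[float]]:
    """Return (intensity, per-position values, per-client weights) of the leading component.

    depths[i][j] is the read depth of client i at reference position j.
    """
    n_clients, n_pos = len(depths), len(depths[0])
    # Center each position so the component captures cross-client differences.
    means = [sum(row[j] for row in depths) / n_clients for j in range(n_pos)]
    x = [[row[j] - means[j] for j in range(n_pos)] for row in depths]
    v = [1.0 / math.sqrt(n_pos)] * n_pos
    for _ in range(iterations):
        # One power-iteration step on X^T X: v <- normalize(X^T (X v)).
        u = [sum(r[j] * v[j] for j in range(n_pos)) for r in x]
        w = [sum(x[i][j] * u[i] for i in range(n_clients)) for j in range(n_pos)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    # Client weights here are the projections of each client onto the component.
    weights = [sum(r[j] * v[j] for j in range(n_pos)) for r in x]
    intensity = math.sqrt(sum(c * c for c in weights))
    return intensity, v, weights
```

For three clients where only the third has doubled depth at positions 1 and 2, the values concentrate on those two positions and the third client carries the largest weight, which is the pattern one would expect for a duplication.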
-
Publication No.: US20170161105A1
Publication Date: 2017-06-08
Application No.: US15366409
Filing Date: 2016-12-01
Applicant: Ryan Barrett, Katsuya Noguchi, Nishant Bhat, Zhengua Li, Kurt Smith
Inventor: Ryan Barrett, Katsuya Noguchi, Nishant Bhat, Zhengua Li, Kurt Smith
CPC Classification: G06F11/3419, G06F9/4881, G06F9/4887, G06F11/3024, G06F19/00, G06N99/005
Abstract: Methods and systems disclosed herein relate generally to data processing by applying machine learning techniques to iteration data to identify anomaly subsets of iteration data. More specifically, iteration data for individual iterations of a workflow involving a set of tasks may contain a client data set, client-associated sparse indicators and their classifications, and a set of processing times for the set of tasks performed in that iteration of the workflow. These individual iterations of the workflow may also be associated with particular data sources. Using the iteration data, anomaly subsets within the iteration data can be identified, such as data items resulting from systematic error associated with particular data sources, sets of sparse indicators to be validated or double-checked, or tasks that are associated with long processing times. The anomaly subsets can be provided in a generated communication or report in order to optimize future iterations of the workflow.
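As a rough illustration of one anomaly subset mentioned above (tasks with long processing times), the sketch below flags unusually slow task runs with a median-absolute-deviation score. This simple statistic stands in for the machine learning techniques the abstract refers to, which are not specified; the function name, threshold, and data layout are assumptions.

```python
from statistics import median
from typing import Dict, List, Tuple


def long_running_tasks(
    iterations: List[Dict[str, float]], threshold: float = 3.5
) -> List[Tuple[int, str]]:
    """Return (iteration index, task name) pairs whose processing time is an outlier.

    iterations[i] maps each task name to its processing time in iteration i.
    """
    flagged: List[Tuple[int, str]] = []
    tasks = sorted({task for it in iterations for task in it})
    for task in tasks:
        times = [it[task] for it in iterations if task in it]
        med = median(times)
        mad = median(abs(t - med) for t in times)
        if mad == 0:
            continue  # no spread to compare against; skip this task
        for idx, it in enumerate(iterations):
            # Modified z-score; only unusually long times are flagged.
            if task in it and 0.6745 * (it[task] - med) / mad > threshold:
                flagged.append((idx, task))
    return flagged
```

The flagged pairs could then be listed in a report so that the slow task is examined before the next iteration of the workflow.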
-
Publication No.: US09678794B1
Publication Date: 2017-06-13
Application No.: US15366409
Filing Date: 2016-12-01
Applicant: Ryan Barrett, Katsuya Noguchi, Nishant Bhat, Zhengua Li, Kurt Smith
Inventor: Ryan Barrett, Katsuya Noguchi, Nishant Bhat, Zhengua Li, Kurt Smith
CPC Classification: G06F11/3419, G06F9/4881, G06F9/4887, G06F11/3024, G06F19/00, G06N99/005
Abstract: Methods and systems disclosed herein relate generally to data processing by applying machine learning techniques to iteration data to identify anomaly subsets of iteration data. More specifically, iteration data for individual iterations of a workflow involving a set of tasks may contain a client data set, client-associated sparse indicators and their classifications, and a set of processing times for the set of tasks performed in that iteration of the workflow. These individual iterations of the workflow may also be associated with particular data sources. Using the iteration data, anomaly subsets within the iteration data can be identified, such as data items resulting from systematic error associated with particular data sources, sets of sparse indicators to be validated or double-checked, or tasks that are associated with long processing times. The anomaly subsets can be provided in a generated communication or report in order to optimize future iterations of the workflow.
-
Publication No.: US09811438B1
Publication Date: 2017-11-07
Application No.: US15592949
Filing Date: 2017-05-11
Applicant: Ryan Barrett, Katsuya Noguchi, Nishant Bhat, Zhengua Li, Kurt Smith
Inventor: Ryan Barrett, Katsuya Noguchi, Nishant Bhat, Zhengua Li, Kurt Smith
CPC Classification: G06F11/3419, G06F9/4881, G06F9/4887, G06F11/3024, G06F19/00, G06N99/005
Abstract: Methods and systems disclosed herein relate generally to data processing by applying machine learning techniques to iteration data to identify anomaly subsets of iteration data. More specifically, iteration data for individual iterations of a workflow involving a set of tasks may contain a client data set, client-associated sparse indicators and their classifications, and a set of processing times for the set of tasks performed in that iteration of the workflow. These individual iterations of the workflow may also be associated with particular data sources. Using the iteration data, anomaly subsets within the iteration data can be identified, such as data items resulting from systematic error associated with particular data sources, sets of sparse indicators to be validated or double-checked, or tasks that are associated with long processing times. The anomaly subsets can be provided in a generated communication or report in order to optimize future iterations of the workflow.
-
Publication No.: US09811552B1
Publication Date: 2017-11-07
Application No.: US15133089
Filing Date: 2016-04-19
Applicant: Katsuya Noguchi, Krishna Pant, Ryan Barrett, Elad Gil, Othman Laraki
Inventor: Katsuya Noguchi, Krishna Pant, Ryan Barrett, Elad Gil, Othman Laraki
CPC Classification: G06F17/30371, G06F17/30327, G06F17/30598, G06F17/30867
Abstract: Techniques, systems, and products for analyzing sparse indicators and generating communications based on bucketing of sparse indicators are disclosed.
-
Publication No.: US09811439B1
Publication Date: 2017-11-07
Application No.: US15489492
Filing Date: 2017-04-17
Applicant: Ryan Barrett, Krishna Pant
Inventor: Ryan Barrett, Krishna Pant
CPC Classification: G06F11/3612, G06F11/26
Abstract: Techniques are disclosed for using functional testing to detect run-time impacts of code modifications. A method includes accessing a workflow including a plurality of stages for processing reads. The stages are defined based on modifiable code and include a first stage for aligning reads with a corresponding portion of a reference data set and a second stage for collectively analyzing data corresponding to the aligned reads. The method includes identifying functional testing specifications to correspond with the workflow, including a definition of which stages are to be performed during functional testing, a reduced reference data set, and a set of reads. The method includes performing the functional testing using the reduced reference data set and the set of reads, detecting a result generated via the performance, and outputting the result.
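The functional-testing arrangement described here (a specification that names the stages to run, a reduced reference data set, and a small set of reads) can be sketched as follows. The stage functions, the `WORKFLOW` mapping, and the specification keys are hypothetical stand-ins, not the structures used in the patent.

```python
import time
from typing import Any, Callable, Dict

State = Dict[str, Any]


def align_stage(state: State) -> State:
    """First stage: locate each read in the reference (-1 if it does not occur)."""
    state["offsets"] = [state["reference"].find(r) for r in state["reads"]]
    return state


def analyze_stage(state: State) -> State:
    """Second stage: analyze the aligned reads collectively."""
    state["aligned_count"] = sum(o >= 0 for o in state["offsets"])
    return state


# The workflow maps stage names to modifiable stage code.
WORKFLOW: Dict[str, Callable[[State], State]] = {
    "align": align_stage,
    "analyze": analyze_stage,
}


def run_functional_test(
    workflow: Dict[str, Callable[[State], State]], spec: Dict[str, Any]
) -> State:
    """Run only the stages named in the spec on the reduced inputs and time the run."""
    state: State = {"reference": spec["reduced_reference"], "reads": list(spec["reads"])}
    start = time.perf_counter()
    for name in spec["stages"]:
        state = workflow[name](state)
    state["elapsed_seconds"] = time.perf_counter() - start
    return state
```

Running the same specification before and after a code change and comparing the outputs and elapsed time gives a quick signal of whether the modification altered behavior or run time.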
-
Publication No.: US20170255790A1
Publication Date: 2017-09-07
Application No.: US15163191
Filing Date: 2016-05-24
Applicant: Ryan Barrett, Othman Laraki, Wendy McKennon, Katsuya Noguchi, Huy Hong
Inventor: Ryan Barrett, Othman Laraki, Wendy McKennon, Katsuya Noguchi, Huy Hong
CPC Classification: G06F21/6218, G06F17/30867, G06F21/6245, G06F21/6254, G06Q10/00, G06Q50/22, G16H10/00
Abstract: Methods and systems disclosed herein relate generally to processing data requests from external assessment systems. More specifically, an interface that accepts an identification of one or more genes is made available to external assessment systems. Upon receiving a request identifying one or more genes, a type of access authorized for the requesting external assessment system is assessed. When it is determined that the type of data access indicates that the external assessment system is authorized to access data for the one or more genes, a data repository is queried to identify client data that corresponds to the one or more genes and that indicates or can be used to detect a presence of client-associated variants. A response data set that includes at least some of the client data is transmitted to the external assessment system.
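The request handling outlined in this abstract (check the requester's authorized access for the named genes, then query the repository) might look like the sketch below. The permission table, the in-memory repository, and the `handle_request` function are assumptions for illustration; a real system would sit behind a network interface and a database.

```python
from typing import Dict, List, Set


def handle_request(
    system_id: str,
    genes: List[str],
    permissions: Dict[str, Set[str]],
    repository: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """Return client data for the requested genes if the system is authorized."""
    allowed = permissions.get(system_id, set())
    denied = sorted(set(genes) - allowed)
    if denied:
        # Refuse the whole request if any requested gene is outside the authorized set.
        raise PermissionError(f"{system_id} is not authorized for: {', '.join(denied)}")
    # Query the repository for client data corresponding to each gene.
    return {gene: list(repository.get(gene, [])) for gene in genes}
```

Rejecting the entire request when any gene is unauthorized is one possible policy; returning only the authorized subset would be another.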
-
Publication No.: US09785792B2
Publication Date: 2017-10-10
Application No.: US15163191
Filing Date: 2016-05-24
Applicant: Ryan Barrett, Othman Laraki, Wendy McKennon, Katsuya Noguchi, Huy Hong
Inventor: Ryan Barrett, Othman Laraki, Wendy McKennon, Katsuya Noguchi, Huy Hong
CPC Classification: G06F21/6218, G06F17/30867, G06F21/6245, G06F21/6254, G06Q10/00, G06Q50/22, G16H10/00
Abstract: Methods and systems disclosed herein relate generally to processing data requests from external assessment systems. More specifically, an interface that accepts an identification of one or more genes is made available to external assessment systems. Upon receiving a request identifying one or more genes, a type of access authorized for the requesting external assessment system is assessed. When it is determined that the type of data access indicates that the external assessment system is authorized to access data for the one or more genes, a data repository is queried to identify client data that corresponds to the one or more genes and that indicates or can be used to detect a presence of client-associated variants. A response data set that includes at least some of the client data is transmitted to the external assessment system.