CROSS-PLATFORM TRANSFORMATION OF GENE EXPRESSION DATA
    Invention application (under examination, published)

    Publication number: WO2016059604A1

    Publication date: 2016-04-21

    Application number: PCT/IB2015/057952

    Filing date: 2015-10-16

    Abstract: Data-driven, generalized regression-based frameworks that support the transformation of measurements (including, but not limited to, gene expression) from one platform to another over a wide dynamic range, using selected summary statistics or feature values as predictors for the model parameters. The framework consists of primary model training and transformation, plus additional levels of categorical regression and transformation.


    SYSTEM AND METHOD FOR AUTO-CONFIGURABLE DATA COMPRESSION FRAMEWORK

    Publication number: WO2022106289A1

    Publication date: 2022-05-27

    Application number: PCT/EP2021/081320

    Filing date: 2021-11-11

    Inventor: CHEUNG, Yee Him

    Abstract: A method (100) for compressing and decompressing a data file, comprising: (i) receiving (120) a data file for compression comprising a plurality of different attributes; (ii) identifying (130) a first attribute of the plurality of different attributes; (iii) selecting (140) a plurality of compression types and/or configurations; (iv) compressing (150) at least some of the data from the received data file for the identified first attribute using each of the selected plurality of compression types and/or configurations; (v) determining (160) which one of the selected plurality of compression types and/or configurations is most suitable for compression; (vi) generating (170) a compression parameter data structure comprising an identification of the selected plurality of compression types and/or configurations; (vii) compressing (180) the data from the received data file for the first attribute to generate a compressed data file; and (viii) storing (190) the compression parameter data structure and the compressed data file.
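
    The trial-and-select loop in steps (iii) to (viii) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented method: the candidate codecs (Python's standard zlib, bz2 and lzma modules), the "smallest output on a sample" suitability criterion, and the names choose_codec, compress_attribute and decompress_attribute are all hypothetical.

```python
import bz2
import lzma
import zlib
from typing import Callable, Dict, Tuple

# Assumed candidate set: name -> (compress, decompress). The abstract does not
# name specific compression types; these are stand-ins from the standard library.
CODECS: Dict[str, Tuple[Callable[[bytes], bytes], Callable[[bytes], bytes]]] = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def choose_codec(sample: bytes) -> Tuple[str, Dict[str, int]]:
    """Compress a sample with every candidate and pick the smallest output.

    "Smallest output" is an assumed stand-in for "most suitable".
    """
    sizes = {name: len(comp(sample)) for name, (comp, _) in CODECS.items()}
    best = min(sizes, key=sizes.get)
    return best, sizes


def compress_attribute(values: bytes, sample_size: int = 4096):
    """Return (parameter structure, compressed payload) for one attribute."""
    best, sizes = choose_codec(values[:sample_size])
    params = {
        "candidates": sorted(CODECS),  # which codecs were tried
        "selected": best,              # which one was used for the payload
        "sample_sizes": sizes,         # compressed size of the sample per codec
    }
    payload = CODECS[best][0](values)
    return params, payload


def decompress_attribute(params: dict, payload: bytes) -> bytes:
    """Use the stored parameter structure to pick the matching decompressor."""
    return CODECS[params["selected"]][1](payload)
```

    Storing the parameter structure next to the payload is what lets a reader decompress without knowing in advance which codec won the trial.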

    SUB-POPULATION DETECTION AND QUANTIZATION OF RECEPTOR-LIGAND STATES FOR CHARACTERIZING INTER-CELLULAR COMMUNICATION AND INTRATUMORAL HETEROGENEITY
    Invention application (under examination, published)

    Publication number: WO2017178345A1

    Publication date: 2017-10-19

    Application number: PCT/EP2017/058322

    Filing date: 2017-04-07

    Abstract: A system for characterizing intercellular communication and heterogeneity in cancer tumors, and more particularly a method for detecting sub-populations and receptor-ligand states for providing predictive information in relation to cancer and cancer treatment is disclosed. The system comprises the steps of obtaining from a NGS sequencer, single-cell RNA-seq for a plurality of cells within a tumor, correlation with a plurality of data sets from a curated gene list of receptor-ligand pairs, normalizing their transcript abundance data, assigning states (e.g. 0,1,2,3) to each curated receptor-ligand pair in each cell (e.g. depending on {L:R} = {0:0, 0:1, 1:0, 1:1}), thereby forming a matrix of receptor-ligand states, extracting sub-groups from the matrix that are not invariant and applying unsupervised clustering methods to identify sub-clusters, identifying sub-populations within the set based on pair-wise distances between individual cells and similarity of cellular transcriptomes, identifying expressed ligands and receptors across the sub-populations, cross-referencing against the curated set of receptor-ligand pairs and visually displaying the results to the clinician via a mapping module. The method can be used to study intercellular communication to elicit the etiology of diseases, and can be used to measure the disruption of intercellular communication to diagnose similarly disrupted disease patterns across patients.

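
    The state-assignment step in the abstract above maps each {L:R} on/off combination to one of four states. A minimal sketch of that encoding follows, assuming a simple expression threshold decides whether a gene is "on" and that the state is computed as 2*L + R; the threshold, the function names, and the example expression values are hypothetical and not taken from the patent.

```python
from typing import Dict, List, Sequence, Tuple


def pair_state(ligand_on: bool, receptor_on: bool) -> int:
    """Encode {L:R} = 0:0, 0:1, 1:0, 1:1 as states 0, 1, 2, 3."""
    return 2 * int(ligand_on) + int(receptor_on)


def state_matrix(
    cells: Sequence[Dict[str, float]],
    pairs: Sequence[Tuple[str, str]],
    threshold: float = 0.0,
) -> List[List[int]]:
    """Build a cells x pairs matrix of receptor-ligand states.

    Each cell is a mapping gene -> normalized abundance; a gene counts as
    expressed when its value exceeds `threshold` (an assumed rule).
    """
    return [
        [
            pair_state(cell.get(lig, 0.0) > threshold, cell.get(rec, 0.0) > threshold)
            for lig, rec in pairs
        ]
        for cell in cells
    ]


def variable_columns(matrix: List[List[int]]) -> List[int]:
    """Indices of pairs whose state is not invariant across cells."""
    if not matrix:
        return []
    return [j for j in range(len(matrix[0])) if len({row[j] for row in matrix}) > 1]


# Made-up abundances for three cells and two (ligand, receptor) pairs.
cells = [
    {"TGFB1": 3.0, "TGFBR2": 0.0},
    {"TGFB1": 0.0, "TGFBR2": 2.0},
    {"TGFB1": 1.5, "TGFBR2": 4.0},
]
pairs = [("TGFB1", "TGFBR2"), ("EGF", "EGFR")]
matrix = state_matrix(cells, pairs)
```

    Only the non-invariant columns returned by variable_columns would be passed on to clustering, since a pair with the same state in every cell cannot separate sub-populations.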

    SYSTEM AND METHODS FOR THE EFFICIENT IDENTIFICATION AND EXTRACTION OF SEQUENCE PATHS IN GENOME GRAPHS

    Publication number: WO2021063904A1

    Publication date: 2021-04-08

    Application number: PCT/EP2020/077158

    Filing date: 2020-09-29

    Inventor: CHEUNG, Yee Him

    Abstract: A method for storing, by a processor, a genome graph representing a plurality of individual genomes, including: storing a linear representation of a reference genome in a data storage; receiving a first genome; identifying variations in the first genome from the reference genome; generating graph edges for each variation in the first genome from the reference genome; generating for each generated graph edge: an edge identifier that uniquely identifies the current edge in the genome graph; a start edge identifier that identifies the edge from which the current edge branches out; a start position that indicates the position on the start edge that serves as an anchoring point for the current edge; an end edge identifier that identifies the edge into which the current edge joins in; an end position that indicates the position on the end edge that serves as an anchoring point for the current edge; and a sequence indicating the nucleotide sequence of the current edge; and storing the edge identifier, start edge identifier, start position, end edge identifier, end edge position, and sequence for each generated graph edge in the data storage. Based on this genome graph data structure, we further propose a scheme for specifying a path, which may traverse one or more edges, and the ways to extend existing genomic data formats such as SAM, VCF and MPEG-G to support the use of genome graph reference using our proposed coordinate system.
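
    The six per-edge fields listed in the abstract map naturally onto a small record type. The sketch below is an illustration only: the coordinate convention (0-based positions, the edge leaves after start_pos bases and rejoins at index end_pos), the use of edge id 0 for the linear reference, and the helper names are assumptions rather than details from the patent.

```python
from dataclasses import dataclass

REFERENCE_EDGE_ID = 0  # assumed id for the linear reference genome


@dataclass(frozen=True)
class Edge:
    edge_id: int      # uniquely identifies this edge in the graph
    start_edge: int   # edge from which this edge branches out
    start_pos: int    # anchoring position on the start edge
    end_edge: int     # edge into which this edge joins
    end_pos: int      # anchoring position on the end edge
    sequence: str     # nucleotide sequence carried by this edge


def edge_from_variant(edge_id: int, pos: int, ref_allele: str, alt_allele: str) -> Edge:
    """Create an edge for a variant against the reference.

    Assumed convention: the edge leaves the reference after `pos` bases and
    rejoins it at index `pos + len(ref_allele)`.
    """
    return Edge(
        edge_id=edge_id,
        start_edge=REFERENCE_EDGE_ID,
        start_pos=pos,
        end_edge=REFERENCE_EDGE_ID,
        end_pos=pos + len(ref_allele),
        sequence=alt_allele,
    )


def spell_path(reference: str, edge: Edge) -> str:
    """Spell the sequence of a path that takes one detour off the reference."""
    if edge.start_edge != REFERENCE_EDGE_ID or edge.end_edge != REFERENCE_EDGE_ID:
        raise ValueError("this sketch only handles edges anchored on the reference")
    return reference[: edge.start_pos] + edge.sequence + reference[edge.end_pos :]


reference = "ACGTACGT"
snv = edge_from_variant(1, 3, "T", "G")        # substitution at index 3
deletion = edge_from_variant(2, 2, "GT", "")   # deletes indices 2 and 3
```

    Because every edge names the edge it branches from and joins into, nested variation (an edge anchored on another variant edge) fits the same record; the spell_path helper here deliberately handles only the single-detour case.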

    USING K-MERS FOR RAPID QUALITY CONTROL OF SEQUENCING DATA WITHOUT ALIGNMENT

    Publication number: WO2019091986A1

    Publication date: 2019-05-16

    Application number: PCT/EP2018/080376

    Filing date: 2018-11-07

    CPC classification number: G16B30/00 C12Q1/68 G16B20/00 G16B45/00 G16B50/30

    Abstract: A method (200) for evaluating nucleic acid sequencing data using a quality control analysis system (300), comprising: receiving (210) a plurality of reads of a nucleic acid sequence; extracting (220) a plurality of k-mers from the plurality of reads; identifying (230), using the plurality of extracted k-mers, one or more of a plurality of annotated k-mers found in the plurality of reads, wherein the plurality of extracted k-mers are stored in an annotation database (350), and further wherein the annotated k-mers are annotated with annotation information about the one or more nucleic acid sequences from which the annotated k-mers are generated; gathering (240), based on the identified annotated k-mers found in the plurality of reads, annotation information about the plurality of reads; and determining (250), based on the gathered annotation information, a quality control metric for at least some of the plurality of reads.
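
    The alignment-free flow described above (extract k-mers, look them up in an annotation database, aggregate the annotations into a metric) can be sketched as follows. This is a toy illustration under assumptions: the annotation database is modelled as a plain dictionary from k-mer to a source label, and the metric (the share of annotated k-mer hits carrying a given label) is one possible quality control metric, not necessarily the one in the patent.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator


def kmers(read: str, k: int) -> Iterator[str]:
    """Yield every overlapping substring of length k from a read."""
    for i in range(len(read) - k + 1):
        yield read[i : i + k]


def annotate_reads(reads: Iterable[str], annotation_db: Dict[str, str], k: int) -> Counter:
    """Count annotated k-mer hits per label across all reads.

    `annotation_db` maps a k-mer to a label describing the sequence it was
    generated from (hypothetical labels such as "human" or "adapter").
    """
    hits: Counter = Counter()
    for read in reads:
        for kmer in kmers(read, k):
            label = annotation_db.get(kmer)
            if label is not None:
                hits[label] += 1
    return hits


def fraction_from(hits: Counter, label: str) -> float:
    """Share of annotated k-mer hits that carry `label` (0.0 if no hits)."""
    total = sum(hits.values())
    return hits[label] / total if total else 0.0


# Toy database and reads, made up for illustration.
db = {"ACGT": "human", "TTTT": "adapter"}
hits = annotate_reads(["ACGTA", "TTTTT"], db, k=4)
adapter_share = fraction_from(hits, "adapter")
```

    Dictionary lookups keep the cost linear in the total read length, which is the main reason a k-mer approach can run before (or instead of) alignment.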

    RELEVANCE FEEDBACK TO IMPROVE THE PERFORMANCE OF CLUSTERING MODEL THAT CLUSTERS PATIENTS WITH SIMILAR PROFILES TOGETHER
    Invention application (under examination, published)

    Publication number: WO2017158472A1

    Publication date: 2017-09-21

    Application number: PCT/IB2017/051345

    Filing date: 2017-03-08

    Abstract: In patient cohort identification, clustering (30) of patients is performed using a patient comparison metric dependent on a set of features (24). Information is displayed on sample patients who are similar or dissimilar to a query patient according to the clustering. User inputted comparison values are received comparing the sample patients with the query patient. The set of features and/or feature weights are adjusted to generate an adjusted patient comparison metric having improved agreement with the user inputted comparison values. The clustering is repeated using the adjusted patient comparison metric. A patient cohort is identified from a cluster (34) containing the query patient produced by the last clustering repetition. The information on the sample patients may be shown by simultaneously displaying two or more graphical modality representations (70, 72, 74) each plotting the sample patients and the query patient against two or more features of the modality.

