Abstract:
Reference feature vectors are constructed to represent reference genetic data sets of a reference population. The reference feature vectors are transformed using a linear transformation to generate reduced-dimensionality vector representations of the reference genetic data sets. A tree-based spatial data structure is constructed to index the reference genetic data sets as data points defined by at least some dimensions of the reduced-dimensionality vector representations. The linear transformation may be generated by performing feature reduction on the reference feature vectors. A feature vector representing a proband genetic data set is transformed using the same linear transformation to generate a reduced-dimensionality vector representation, which is located in the tree-based spatial data structure to perform population assignment for the proband genetic data set.
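
The pipeline summarized above can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: feature reduction is taken to be principal component analysis (approximated here by power iteration), the tree-based spatial data structure is taken to be a k-d tree, and population assignment is taken to be the label of the nearest indexed reference point. The allele-count vectors, the labels `pop_A` and `pop_B`, and all function names are hypothetical and chosen only for illustration.

```python
import random

def fit_linear_transform(rows, k, iters=200, seed=0):
    """Assumed feature reduction: top-k principal axes via power iteration."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    rng = random.Random(seed)
    components = []
    for _ in range(k):
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(iters):
            # Deflate: remove the part of v along axes already found.
            for c in components:
                dot = sum(a * b for a, b in zip(v, c))
                v = [a - dot * b for a, b in zip(v, c)]
            # Multiply by X^T X without forming the covariance matrix.
            xv = [sum(a * b for a, b in zip(row, v)) for row in centered]
            w = [sum(row[j] * s for row, s in zip(centered, xv)) for j in range(d)]
            norm = sum(a * a for a in w) ** 0.5
            v = [a / norm for a in w]
        components.append(v)
    return mean, components

def project(vec, mean, components):
    """Apply the linear transformation to one feature vector."""
    shifted = [x - m for x, m in zip(vec, mean)]
    return [sum(a * b for a, b in zip(shifted, c)) for c in components]

def build_kdtree(points, depth=0):
    """points: list of (coords, label) pairs; returns a nested dict or None."""
    if not points:
        return None
    axis = depth % len(points[0][0])
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def nearest(node, target, best=None):
    """Return (squared_distance, label) of the closest indexed point."""
    if node is None:
        return best
    coords, label = node["point"]
    dist = sum((a - b) ** 2 for a, b in zip(coords, target))
    if best is None or dist < best[0]:
        best = (dist, label)
    diff = target[node["axis"]] - coords[node["axis"]]
    near, far = ("left", "right") if diff < 0 else ("right", "left")
    best = nearest(node[near], target, best)
    # Search the far side only if the splitting plane is closer than the best hit.
    if diff * diff < best[0]:
        best = nearest(node[far], target, best)
    return best

# Hypothetical reference population: allele counts (0, 1, 2) at eight sites.
reference = [
    ([0, 0, 1, 0, 2, 2, 1, 2], "pop_A"),
    ([0, 1, 0, 0, 2, 1, 2, 2], "pop_A"),
    ([1, 0, 0, 0, 2, 2, 2, 1], "pop_A"),
    ([2, 2, 1, 2, 0, 0, 1, 0], "pop_B"),
    ([2, 1, 2, 2, 0, 1, 0, 0], "pop_B"),
    ([1, 2, 2, 2, 0, 0, 0, 1], "pop_B"),
]

mean, components = fit_linear_transform([v for v, _ in reference], k=2)
tree = build_kdtree([(project(v, mean, components), lbl) for v, lbl in reference])

def assign_population(proband):
    """Transform the proband with the same mapping, then look it up in the tree."""
    return nearest(tree, project(proband, mean, components))[1]

print(assign_population([0, 0, 0, 1, 2, 2, 2, 2]))
```

In this sketch the transformation is fitted once on the reference vectors and reused unchanged for the proband, so the proband lands in the same reduced coordinate space that the tree indexes. A production system would more likely rely on an established linear algebra and spatial indexing library than on hand-written routines.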