Fast PCA method for big discrete data

    Publication No.: US09922058B2

    Publication date: 2018-03-20

    Application No.: US14333130

    Filing date: 2014-07-16

    IPC classification: G06F19/00 G06F17/30

    CPC classification: G06F17/30306

    Abstract: This disclosure relates to further approximating multiple data vectors of a dataset. The data vectors are initially approximated by one or more stored principal components. A processor performs multiple iterations of determining an updated estimate of a further principal component, based on the data vectors that are initially approximated by the stored principal components, such that the updated estimate further approximates the dataset. In each iteration the processor constrains the updated estimate of the further principal component to be orthogonal to each of the stored principal components. The data vectors of the dataset are not manipulated; they remain the same data vectors that are approximated by the stored principal components.
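
The iteration described in the abstract can be illustrated with a minimal sketch. The abstract does not state the update rule, so the power-iteration step used here is an assumption, as are all names (`next_component`, `project_out`, `stored`) and the toy data. What the sketch is meant to show is the constraint: the data rows are only read, never deflated, and orthogonality to the stored components is enforced on the estimate itself in every iteration.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project_out(w, stored):
    """Remove from w its component along each stored (orthonormal) vector."""
    for v in stored:
        c = dot(w, v)
        w = [wi - c * vi for wi, vi in zip(w, v)]
    return w


def normalize(w):
    norm = math.sqrt(dot(w, w))
    if norm == 0.0:
        raise ValueError("estimate collapsed to the zero vector")
    return [wi / norm for wi in w]


def next_component(data, stored, iterations=200, seed=0):
    """Estimate one further principal direction of `data` (a list of rows).

    Assumption: a plain power iteration stands in for the update rule,
    which the abstract leaves unspecified.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    w = normalize(project_out([rng.gauss(0, 1) for _ in range(dim)], stored))
    for _ in range(iterations):
        # Power step w <- X^T (X w), computed from the unmodified data rows.
        scores = [dot(row, w) for row in data]
        w = [sum(s * row[j] for s, row in zip(scores, data)) for j in range(dim)]
        # Constraint: keep the estimate orthogonal to every stored component.
        w = normalize(project_out(w, stored))
    return w


# Toy, already-centred data whose variance is largest along axis 0, then axis 1.
data = [[3.0, 0.0, 0.0], [-3.0, 0.0, 0.0],
        [0.0, 2.0, 0.0], [0.0, -2.0, 0.0],
        [0.0, 0.0, 1.0], [0.0, 0.0, -1.0]]
first = next_component(data, stored=[])
second = next_component(data, stored=[first])
```

Because the constraint is applied to the estimate rather than to the data, `data` is identical before and after both calls. A conventional deflation approach would instead subtract each found component from every data vector, which is what the abstract says is avoided.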

    2. FAST PCA METHOD FOR BIG DISCRETE DATA
    Document type: patent application
    Legal status: in force

    Publication No.: US20150026134A1

    Publication date: 2015-01-22

    Application No.: US14333130

    Filing date: 2014-07-16

    IPC classification: G06F17/30

    CPC classification: G06F17/30306

    Abstract: This disclosure relates to further approximating multiple data vectors of a dataset. The data vectors are initially approximated by one or more stored principal components. A processor performs multiple iterations of determining an updated estimate of a further principal component, based on the data vectors that are initially approximated by the stored principal components, such that the updated estimate further approximates the dataset. In each iteration the processor constrains the updated estimate of the further principal component to be orthogonal to each of the stored principal components. The data vectors of the dataset are not manipulated; they remain the same data vectors that are approximated by the stored principal components.
