1.
Publication No.: US09183203B1
Publication date: 2015-11-10
Application No.: US13252559
Filing date: 2011-10-04
IPC classification: G06F17/30
CPC classification: G06F17/30011, G06F17/3061, G06F17/3064, G06F17/30646, G06F17/30651, G06F17/3069, G06F17/30696, G06F2216/11
Abstract: The GENERALIZED DATA MINING AND ANALYTICS APPARATUSES, METHODS AND SYSTEMS (“GDMA”), in various embodiments, may identify statistical relationships among query terms by analyzing a corpus of electronic documents. Inputs may be automatically generated and/or user provided. In one embodiment, a method includes: accessing a term tensor associated with at least one term in a corpus of documents, wherein the term tensor comprises a plurality of data type vectors corresponding respectively to a plurality of term-correlated data types correlated with the at least one term in the corpus and each data type vector comprising a plurality of binned data type values with corresponding weighted occurrence values derived from the corpus; providing at least one of the plurality of term-correlated data types for selectable display; receiving at least one term-correlated data type selection; and providing data type values associated with the at least one term-correlated data type selection for display.
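Illustrative sketch (not taken from the patent): the abstract describes a term tensor made up of per-data-type vectors of binned values with weighted occurrence values, followed by a select-and-display step. The Python below is one possible reading of that structure under stated assumptions. All names (`build_term_tensor`, `selectable_data_types`, `values_for_selection`, `SAMPLE_CORPUS`), the document layout, the use of raw term counts as weights, and the fixed-width numeric binning are hypothetical choices made for illustration only.

```python
from collections import defaultdict

# Hypothetical toy corpus: each document has free text plus typed fields.
SAMPLE_CORPUS = [
    {"text": "solar panel cost and solar output", "fields": {"year": 2009, "country": "US"}},
    {"text": "wind turbine", "fields": {"year": 2010, "country": "DE"}},
    {"text": "solar farm", "fields": {"year": 2011, "country": "DE"}},
    {"text": "cheap solar", "fields": {"year": 2012, "country": "US"}},
]


def build_term_tensor(corpus, term, bin_width=None):
    """Return {data_type: {binned_value: weighted_occurrence}} for one term.

    Assumption: a document's weight is the number of times the term
    appears in its text; documents without the term are skipped.
    """
    bin_width = bin_width or {}
    tensor = defaultdict(lambda: defaultdict(float))
    for doc in corpus:
        weight = doc["text"].lower().split().count(term.lower())
        if weight == 0:
            continue
        for data_type, value in doc["fields"].items():
            width = bin_width.get(data_type)
            if width:
                # Assumed binning: floor numeric values to a fixed-width bin.
                value = (value // width) * width
            tensor[data_type][value] += weight
    return {data_type: dict(vector) for data_type, vector in tensor.items()}


def selectable_data_types(tensor):
    """Data types correlated with the term, offered for selection."""
    return sorted(tensor)


def values_for_selection(tensor, selection):
    """Binned values for the selected data types, heaviest bins first."""
    return {
        data_type: sorted(
            tensor[data_type].items(), key=lambda kv: (-kv[1], str(kv[0]))
        )
        for data_type in selection
    }


tensor = build_term_tensor(SAMPLE_CORPUS, "solar", bin_width={"year": 5})
print(selectable_data_types(tensor))
print(values_for_selection(tensor, ["country"]))
```

A nested dictionary is used here only because it keeps the example short; the abstract does not say how the tensor is stored or how the weights are derived.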