Data import system for data analysis system
    1.
    Granted patent
    Data import system for data analysis system (status: active)

    Publication number: US06718336B1

    Publication date: 2004-04-06

    Application number: US09672622

    Filing date: 2000-09-29

    IPC classification: G06F17/30

    Abstract: A data import system enables access to data of multiple types from multiple data sources of different formats and provides an interface for importing data into a data analysis system. The interface enables a user to customize the formatting of the data as the data is being imported into a data analysis system. A user may select first user defined options for operating on a first data set received during a data importation process. An intermediate representation of the data set is generated based on the first user defined options. A user may specify second user defined options based on the intermediate representation during the data importation process. The second user defined options are processed to produce a final data representation of the data set to be used for analysis of the data. The intermediate representation may be a data table. The processing of a data set may include merging a first and second data set to produce the final data representation. The second user defined options may enable a user to select a basic operation for merging the data sets or to select a non-basic operation for merging the data sets. The basic operation may combine data sets in response to a user's selection of a first graphical interface control, and the non-basic operation may combine the data sets based on user selection of at least two graphical interface controls from a group of graphical interface controls.
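The two-stage import described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names `build_intermediate` and `merge_tables`, the choice of a delimiter and a header flag as the "first user defined options", and the mapping of "basic" to a row append and "non-basic" to a keyed join are all assumptions made for illustration.

```python
import csv
import io


def build_intermediate(raw_text, delimiter=",", has_header=True):
    """Stage 1: apply the first user-defined options (assumed here to be
    a delimiter and a header flag) and return an intermediate table as a
    list of row dictionaries."""
    rows = list(csv.reader(io.StringIO(raw_text), delimiter=delimiter))
    if has_header:
        header, body = rows[0], rows[1:]
    else:
        # No header row: invent positional column names.
        header = [f"col{i}" for i in range(len(rows[0]))]
        body = rows
    return [dict(zip(header, row)) for row in body]


def merge_tables(left, right, key=None):
    """Stage 2: apply the second user-defined options to two intermediate
    tables. With key=None this is an assumed 'basic' merge (append rows);
    with a key it is an assumed 'non-basic' merge (inner join on key)."""
    if key is None:
        return left + right
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


first = build_intermediate("id,name\n1,Ada\n2,Bob\n")
second = build_intermediate("id;score\n1;90\n3;75\n", delimiter=";")
final = merge_tables(first, second, key="id")
```

The user inspects the intermediate tables (`first`, `second`) before choosing how to merge them, which mirrors the abstract's point that the second set of options is specified on the basis of the intermediate representation.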

    ISOLATING DESIRED CONTENT, METADATA, OR BOTH FROM SOCIAL MEDIA
    2.
    Patent application
    ISOLATING DESIRED CONTENT, METADATA, OR BOTH FROM SOCIAL MEDIA (status: active)

    Publication number: US20120221545A1

    Publication date: 2012-08-30

    Application number: US13036776

    Filing date: 2011-02-28

    IPC classification: G06F17/30

    CPC classification: G06F17/30705 G06F17/30864

    Abstract: Desired content, metadata, or both can be isolated from the full content of social media websites having content-rich pages. Achieving this can include obtaining from the content-rich pages a language-independent representation having a hierarchical structure of nodes and then generating a node representation for each node. Feature vectors for the nodes are generated and a label is assigned to each node representation according to a schema. Assignment can occur by executing a trained classification algorithm on the feature vectors. The schema has schema elements and each schema element corresponds to a label. For each schema element, all node representations having matching labels are gathered and then one node representation is elected from among those with matching labels to be assigned to a schema element field in a template. The template can be applied to extract desired content, metadata, or both according to the schema from all the content-rich pages.
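The label, gather, and elect steps in this abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the method claimed in the patent: the `Node` fields, the three toy features, the rule-based `classify` function (a stand-in for the trained classifier), and the "highest confidence wins" election rule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    path: str  # position in the hierarchy, e.g. "html/body/div/p"
    tag: str
    text: str


def features(node):
    """Toy feature vector: (text length, tree depth, heading flag)."""
    return (len(node.text), node.path.count("/"), int(node.tag in {"h1", "h2"}))


def classify(vec):
    """Rule-based stand-in for a trained classifier.
    Returns an assumed (label, confidence) pair."""
    length, _depth, is_heading = vec
    if is_heading:
        return "title", 0.9
    if length > 40:
        return "body", min(1.0, length / 100)
    return "other", 0.5


def build_template(nodes, schema):
    """For each schema element, gather the nodes whose label matches it
    and elect the most confident one (election rule is an assumption)."""
    labeled = [(node, *classify(features(node))) for node in nodes]
    template = {}
    for element in schema:
        candidates = [(conf, node) for node, label, conf in labeled if label == element]
        if candidates:
            template[element] = max(candidates, key=lambda c: c[0])[1].path
    return template


nodes = [
    Node("html/body/h1", "h1", "Post title"),
    Node("html/body/div/p", "p", "x" * 60),
    Node("html/body/div/span", "span", "short"),
    Node("html/body/div[2]/p", "p", "y" * 90),
]
template = build_template(nodes, schema=["title", "body"])
```

The resulting template maps each schema element to one node path, which could then be reused to pull the same fields from other pages with the same layout.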

    Data visualization methods, data visualization devices, data visualization apparatuses, and articles of manufacture
    4.
    Granted patent
    Data visualization methods, data visualization devices, data visualization apparatuses, and articles of manufacture (status: active)

    Publication number: US09069847B2

    Publication date: 2015-06-30

    Application number: US11256225

    Filing date: 2005-10-21

    IPC classification: G06F17/00 G06F17/30

    Abstract: Data visualization methods, data visualization devices, data visualization apparatuses, and articles of manufacture are described according to some aspects. In one aspect, a data visualization method includes accessing a plurality of initial documents at a first moment in time, first processing the initial documents providing processed initial documents, first identifying a plurality of first associations of the initial documents using the processed initial documents, generating a first visualization depicting the first associations, accessing a plurality of additional documents at a second moment in time after the first moment in time, second processing the additional documents providing processed additional documents, second identifying a plurality of second associations of the additional documents and at least some of the initial documents, wherein the second identifying comprises identifying using the processed initial documents and the processed additional documents, and generating a second visualization depicting the second associations.
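The incremental flow in this abstract (process a first batch, find associations, then process a later batch and relate it to the earlier documents) can be sketched in Python. The abstract does not say how documents are processed or how associations are scored, so the term-set processing, the Jaccard similarity measure, the threshold, and the `AssociationTracker` class are all assumptions; the rendering of the visualization itself is left out, and the returned edge set is only the input such a view might draw.

```python
from itertools import combinations


def process(text):
    """Assumed processing step: reduce a document to a set of lowercase terms."""
    return frozenset(word.lower().strip(".,") for word in text.split())


def jaccard(a, b):
    """Jaccard similarity of two term sets (assumed association measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class AssociationTracker:
    def __init__(self, threshold=0.25):
        self.threshold = threshold
        self.processed = {}  # doc id -> term set, kept across batches

    def add_batch(self, docs):
        """Process a new batch and return the associations that involve at
        least one new document, reusing earlier processed documents."""
        new = {doc_id: process(text) for doc_id, text in docs.items()}
        self.processed.update(new)
        links = set()
        for a, b in combinations(sorted(self.processed), 2):
            if a in new or b in new:
                if jaccard(self.processed[a], self.processed[b]) >= self.threshold:
                    links.add((a, b))
        return links


tracker = AssociationTracker()
first_links = tracker.add_batch({
    "d1": "solar panel efficiency",
    "d2": "solar panel cost",
    "d3": "ocean tides",
})
second_links = tracker.add_batch({"d4": "ocean tides energy"})
```

Keeping the processed term sets from the first batch is what lets the second call link `d4` to the earlier `d3` without reprocessing the initial documents.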

    Isolating desired content, metadata, or both from social media
    5.
    Granted patent
    Isolating desired content, metadata, or both from social media (status: active)

    Publication number: US08239425B1

    Publication date: 2012-08-07

    Application number: US13036776

    Filing date: 2011-02-28

    IPC classification: G06F7/00

    CPC classification: G06F17/30705 G06F17/30864

    Abstract: Desired content, metadata, or both can be isolated from the full content of social media websites having content-rich pages. Achieving this can include obtaining from the content-rich pages a language-independent representation having a hierarchical structure of nodes and then generating a node representation for each node. Feature vectors for the nodes are generated and a label is assigned to each node representation according to a schema. Assignment can occur by executing a trained classification algorithm on the feature vectors. The schema has schema elements and each schema element corresponds to a label. For each schema element, all node representations having matching labels are gathered and then one node representation is elected from among those with matching labels to be assigned to a schema element field in a template. The template can be applied to extract desired content, metadata, or both according to the schema from all the content-rich pages.
