Decision-Theoretic Control of Crowd-Sourced Workflows
    1.
    Type: Invention application
    Status: Under examination (published)

    Publication No.: US20110313933A1

    Publication date: 2011-12-22

    Application No.: US13049769

    Filing date: 2011-03-16

    IPC classification: G06Q10/00

    CPC classification: G06Q10/10 G06Q10/103

    Abstract: Systems and methods for the decision-theoretic control and optimization of crowd-sourced workflows utilize a computing device to map a workflow to complete a directive. The directive includes a utility function, and the workflow comprises an ordered task set. Decision points precede and follow each task in the task set, and each decision point may require (a) posting a call for workers to complete instances of tasks in the task set; (b) adjusting parameters of tasks in the task set; or (c) submitting an artifact generated by a worker as output. The computing device accesses a plurality of workers having capability parameters that describe the workers' respective abilities to complete tasks. The computing device implements the workflow by optimizing and/or selecting user-preferred choices at decision points according to the utility function and submits an artifact as output. The computing device may also implement a training phase to ascertain worker capability parameters.

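The decision-point idea in the abstract above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the `State` fields, the linear `utility` function, the improvement model inside `post_task`, and the one-step lookahead in `choose_action`. It models only two of the three choices named in the abstract (posting a call for a worker and submitting the artifact) and omits parameter adjustment.

```python
from dataclasses import dataclass


@dataclass
class State:
    quality: float  # estimated artifact quality in [0, 1] (hypothetical measure)
    spent: float    # money paid to workers so far


def utility(state: State) -> float:
    # Hypothetical utility function: reward for quality minus money spent.
    return 100.0 * state.quality - state.spent


def post_task(state: State, worker_accuracy: float, price: float) -> State:
    # Assumed improvement model: one paid task closes a fraction of the
    # remaining quality gap, scaled by the worker capability parameter.
    gain = (1.0 - state.quality) * worker_accuracy * 0.5
    return State(state.quality + gain, state.spent + price)


def choose_action(state: State, worker_accuracy: float, price: float) -> str:
    # One-step lookahead: post another task only if it raises expected utility.
    if utility(post_task(state, worker_accuracy, price)) > utility(state):
        return "post"
    return "submit"


def run_workflow(state: State, worker_accuracy: float, price: float,
                 max_steps: int = 50) -> State:
    # Visit decision points until submitting is the better choice.
    for _ in range(max_steps):
        if choose_action(state, worker_accuracy, price) == "submit":
            break
        state = post_task(state, worker_accuracy, price)
    return state
```

Calling `run_workflow(State(quality=0.2, spent=0.0), worker_accuracy=0.8, price=2.0)` keeps posting tasks while the expected quality gain is worth more than the price, then stops and returns the state to submit.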

    Method and apparatus of automatically generating a procedure for extracting information from textual information sources
    2.
    Type: Granted invention patent
    Status: Expired

    Publication No.: US06304870B1

    Publication date: 2001-10-16

    Application No.: US08982857

    Filing date: 1997-12-02

    IPC classification: G06F17/30

    Abstract: A procedure is disclosed for automatically constructing wrappers for performing information extraction from sites such as Internet resources that display relevant information interspersed with extraneous text fragments, such as HTML formatting commands or advertisements. The procedure has three basic steps. First, a set of example pages is collected with a subroutine named GatherExamples. GatherExamples is provided with information describing how to pose example queries to the site whose wrapper is to be learned. Second, these example pages are labeled by a subroutine named LabelExamples; that is, the information to be extracted from each example is identified for use in the third step. The LabelExamples subroutine uses a general framework for labeling pages using site-specific heuristics called recognizers, and also allows users to correct and modify the recognized instances. Finally, the labeled example pages are passed to a BuildWrapper subroutine, which constructs a wrapper.

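The three steps named in the abstract above can be sketched in Python as follows. This is an illustrative sketch, not the patented procedure: the function names mirror the subroutine names in the abstract, but their signatures are assumptions, the recognizer is assumed to be a compiled regular expression, and the wrapper built here is a simple left/right delimiter pattern chosen for brevity. User correction of labels is omitted.

```python
import os
import re
from typing import Callable, List, Pattern, Tuple

Labeled = List[Tuple[str, List[Tuple[int, int]]]]


def gather_examples(fetch: Callable[[str], str], queries: List[str]) -> List[str]:
    # Step 1: pose example queries to the site and collect the result pages.
    return [fetch(q) for q in queries]


def label_examples(pages: List[str], recognizer: Pattern[str]) -> Labeled:
    # Step 2: a recognizer (here, a regex) marks the spans to be extracted.
    return [(p, [m.span() for m in recognizer.finditer(p)]) for p in pages]


def common_suffix(strings: List[str]) -> str:
    # Longest shared ending, computed as the shared prefix of reversed strings.
    return os.path.commonprefix([s[::-1] for s in strings])[::-1]


def build_wrapper(labeled: Labeled) -> Callable[[str], List[str]]:
    # Step 3: learn the text that always precedes and follows a labeled span.
    lefts = [page[:start] for page, spans in labeled for start, _ in spans]
    rights = [page[end:] for page, spans in labeled for _, end in spans]
    left, right = common_suffix(lefts), os.path.commonprefix(rights)
    if not left or not right:
        raise ValueError("no consistent delimiters found in the examples")
    pattern = re.compile(re.escape(left) + "(.*?)" + re.escape(right), re.S)
    return lambda page: pattern.findall(page)
```

The point of the third step is that the learned delimiters generalize beyond the recognizer: once built, the wrapper extracts any text between the delimiters, including values the recognizer itself would not have matched.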

    Method and system for network information access
    4.
    Type: Granted invention patent
    Status: Expired

    Publication No.: US5995959A

    Publication date: 1999-11-30

    Application No.: US12441

    Filing date: 1998-01-23

    IPC classification: G06F9/44 G06F17/30

    Abstract: This invention provides methods to locate and plan the retrieval of data from networked information sources in response to a user query. The methods utilize descriptions of the information sources, of the information domain of the sources, and of the query. The methods of this invention integrate both legacy systems and full relational databases with an efficient, domain-independent query-planning algorithm; reason about the capabilities of different information sources; handle partial goal satisfaction (i.e., gather as much data as possible when not everything the user requested can be gathered); are both sound and complete; and are efficient.

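Partial goal satisfaction, as described in the abstract above, can be illustrated with a small Python sketch. It is not the patent's query-planning algorithm: it assumes each source description is just the set of attributes the source can supply, and it uses a greedy selection that makes no soundness or completeness guarantee. The name `plan_sources` and the example source names are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def plan_sources(wanted: Set[str],
                 sources: Dict[str, Set[str]]) -> Tuple[List[str], Set[str]]:
    """Pick sources covering as many wanted attributes as possible.

    Returns the chosen source names and the attributes no source can supply.
    """
    remaining = set(wanted)
    plan: List[str] = []
    while remaining:
        # Choose the source that supplies the most still-missing attributes;
        # sorting the names first makes tie-breaking deterministic.
        best = max(sorted(sources),
                   key=lambda name: len(sources[name] & remaining),
                   default=None)
        if best is None or not sources[best] & remaining:
            break  # nothing more can be gathered: return a partial plan
        plan.append(best)
        remaining -= sources[best]
    return plan, remaining


# Hypothetical source descriptions: one legacy system, one web directory.
SOURCES = {
    "legacy_db": {"name", "phone"},
    "web_dir": {"name", "email"},
}
```

With these descriptions, `plan_sources({"name", "email", "fax"}, SOURCES)` selects only `web_dir` and reports `fax` as unobtainable instead of failing the whole query.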