Structuring unstructured web data using crowdsourcing
    21.
    Granted invention patent
    Structuring unstructured web data using crowdsourcing (status: in force)

    Publication No.: US09460419B2

    Publication date: 2016-10-04

    Application No.: US12971976

    Filing date: 2010-12-17

    IPC classification: G06F17/30 G06Q10/10

    CPC classification: G06Q10/101 G06F17/30882

    Abstract: A crowdsourcing data structuring system and method for capturing unstructured data from the Web and adding structure by placing the data in a document that is accessible by others in a cloud computing environment. Using crowdsourcing, the unstructured data is annotated, amended, and verified to add structure to it. An anchor and update module converts the data to a pointer that links the document to the data at an information source, and stores the pointer in the document rather than the data itself. The data displayed in the document is updated whenever the information source is updated. A contribution module allows users to add data to the document, a validation module allows users to determine the validity of the data linked to in the document, and an expert ranking module allows users to rank the expert or contributor of the data in the document.
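
The pointer-instead-of-copy idea in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `Anchor` and `Document` classes, the `fetch` callback, and the in-memory `WEB` dictionary standing in for an information source are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Anchor:
    """Hypothetical pointer to one value at an information source."""
    source_url: str
    key: str


@dataclass
class Document:
    """Stores anchors only; the data itself stays at the source."""
    anchors: list = field(default_factory=list)

    def add(self, anchor):
        self.anchors.append(anchor)

    def render(self, fetch):
        # Resolve every pointer at display time, so the rendered
        # values always reflect the current state of the source.
        return [fetch(a.source_url)[a.key] for a in self.anchors]


# In-memory stand-in for the Web (assumption, illustration only).
WEB = {"https://example.com/prices": {"widget": "9.99"}}


def fetch(url):
    return WEB[url]


doc = Document()
doc.add(Anchor("https://example.com/prices", "widget"))
before = doc.render(fetch)

# The information source changes; the document is not edited.
WEB["https://example.com/prices"]["widget"] = "10.49"
after = doc.render(fetch)
```

Because the document holds only the anchor, `after` reflects the updated source without any change to the document itself.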

    Systematic approach to uncover visual ambiguity vulnerabilities
    22.
    Granted invention patent
    Systematic approach to uncover visual ambiguity vulnerabilities (status: in force)

    Publication No.: US08539585B2

    Publication date: 2013-09-17

    Application No.: US11768134

    Filing date: 2007-06-25

    IPC classification: G06F21/00

    Abstract: To achieve end-to-end security, traditional machine-to-machine security measures are insufficient if the integrity of the graphical user interface (GUI) is compromised. GUI logic flaws are a category of software vulnerabilities that result from logic flaws in the GUI implementation. The invention described here is a technology for uncovering these flaws using a systematic reasoning approach. Major steps in the technology include: (1) mapping a visual invariant to a program invariant; (2) formally modeling the program logic, the user actions, and the execution context, and systematically exploring possible violations of the program invariant; (3) finding real spoofing attacks based on the exploration.

    Interactive web crawler
    23.
    Granted invention patent
    Interactive web crawler (status: in force)

    Publication No.: US08538949B2

    Publication date: 2013-09-17

    Application No.: US13163001

    Filing date: 2011-06-17

    IPC classification: G06F17/30

    Abstract: The claimed subject matter provides a system or method for web crawling hidden files. An exemplary method includes loading a web page with a browser agent, and executing any dynamic elements hosted on the web page using the browser agent to insert pre-determined values. A list of form controls may be retrieved from the web page using the browser agent, and the controls may be analyzed using a driver component. Form control values may be sent from the driver component to the browser agent, and either an event may be submitted to the web page by the browser agent or scripted content may be run to trigger operations on the web page corresponding to the form control values. A URL may be generated for various form control values using a generalizer.
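
The abstract does not say how the generalizer builds URLs. As a rough sketch only, the following assumes a form submitted via HTTP GET and enumerates one URL per combination of control values; the function name `generate_urls`, the control names, and the example host are hypothetical.

```python
from itertools import product
from urllib.parse import urlencode


def generate_urls(action_url, controls):
    """Yield one GET URL per combination of form control values.

    controls maps a control name to the list of values to try.
    """
    names = sorted(controls)  # fixed order keeps the output deterministic
    for combo in product(*(controls[n] for n in names)):
        yield action_url + "?" + urlencode(list(zip(names, combo)))


# Hypothetical form with two controls.
controls = {"category": ["books", "music"], "page": ["1", "2"]}
urls = list(generate_urls("https://example.com/search", controls))
```

The number of URLs grows multiplicatively with the number of values per control, so a real crawler would presumably need to prune combinations; that pruning is outside this sketch.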

    Web page load time prediction and simulation
    24.
    Granted invention patent
    Web page load time prediction and simulation (status: in force)

    Publication No.: US08078691B2

    Publication date: 2011-12-13

    Application No.: US12547704

    Filing date: 2009-08-26

    IPC classification: G06F15/173

    Abstract: Web page load time prediction and simulation includes determining an original page load time (PLT) of a web page and timing information of each web object of the web page in a scenario. Each object is also annotated with client delay information based on a parental dependency graph (PDG) of the web page. The timing information of each web object is further adjusted to reflect a second scenario that includes one or more modified parameters. The page loading of the web page is then simulated based on the adjusted timing information of each web object and the PDG of the web page to estimate a new PLT of the web page.
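
A much-simplified sketch of the simulation step is given below. It assumes each web object starts once all of its PDG parents have finished, then incurs a fixed client delay followed by a network time, and it models the second scenario as a single scaling factor on network time. The data layout, the `scale` parameter, and the example page are all assumptions for illustration, not the patent's model.

```python
def simulate_plt(objects, scale=1.0):
    """Estimate page load time in milliseconds.

    objects maps name -> (parents, client_delay_ms, network_ms).
    scale multiplies every network time to model a modified scenario.
    """
    finish = {}

    def finish_time(name):
        if name not in finish:
            parents, delay, net = objects[name]
            # An object starts when its last parent has finished.
            start = max((finish_time(p) for p in parents), default=0.0)
            finish[name] = start + delay + net * scale
        return finish[name]

    # The page is loaded when the last object finishes.
    return max(finish_time(n) for n in objects)


# Hypothetical page with four objects.
page = {
    "index.html": ([], 0.0, 100.0),
    "app.js": (["index.html"], 10.0, 80.0),
    "style.css": (["index.html"], 5.0, 40.0),
    "data.json": (["app.js"], 20.0, 60.0),
}

original = simulate_plt(page)           # 270.0
faster = simulate_plt(page, scale=0.5)  # 150.0
```

Client delays are left unscaled in the second run, which is why halving the network time does not halve the estimated PLT.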

    PROBABILISTIC GRADIENT BOOSTED MACHINES
    25.
    Invention patent application
    PROBABILISTIC GRADIENT BOOSTED MACHINES (status: under examination, published)

    Publication No.: US20110264609A1

    Publication date: 2011-10-27

    Application No.: US12764979

    Filing date: 2010-04-22

    Applicants: Chao Liu, Yi-Min Wang

    Inventors: Chao Liu, Yi-Min Wang

    IPC classification: G06F15/18

    CPC classification: G06N20/00

    Abstract: Probabilistic gradient boosted machines are described herein. A probabilistic gradient boosted machine can be utilized to learn a function based at least in part upon sets of observations of a target attribute that is common across a plurality of entities and feature vectors that are representative of such entities. The sets of observations are assumed to accord to a distribution function in the exponential family. The learned function is utilized to generate values that are employed to parameterize the distribution function, such that sets of observations can be predicted for different entities.

    DISTRIBUTED NON-NEGATIVE MATRIX FACTORIZATION
    26.
    Invention patent application
    DISTRIBUTED NON-NEGATIVE MATRIX FACTORIZATION (status: in force)

    Publication No.: US20110246573A1

    Publication date: 2011-10-06

    Application No.: US12750772

    Filing date: 2010-03-31

    IPC classification: G06F15/16

    CPC classification: G06F17/16

    Abstract: Architecture that scales up the non-negative matrix factorization (NMF) technique to a distributed NMF (denoted DNMF) to handle large matrices, for example, on a web scale that can include millions and billions of data points. To analyze web-scale data, DNMF is applied through parallelism on distributed computer clusters, for example, with thousands of machines. In order to maximize the parallelism and data locality, matrices are partitioned in the short dimension. The probabilistic DNMF can employ not only Gaussian and Poisson NMF techniques, but also exponential NMF for modeling web dyadic data (e.g., dwell time of a user on browsed web pages).

    DETECTING USER-MODE ROOTKITS
    27.
    Invention patent application
    DETECTING USER-MODE ROOTKITS (status: in force)

    Publication No.: US20110099632A1

    Publication date: 2011-04-28

    Application No.: US12983849

    Filing date: 2011-01-03

    IPC classification: G06F11/00

    Abstract: A method and system for determining whether resources of a computer system are being hidden is provided. The security system invokes a high-level function of user mode that is intercepted and filtered by the malware to identify resources. The security system also directly invokes a low-level function of kernel mode that is not intercepted and filtered by the malware to identify resources. After invoking the high-level function and the low-level function, the security system compares the identified resources. If the low-level function identified a resource that was not identified by the high-level function, then the security system may consider the resource to be hidden.
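
The comparison step described in this abstract amounts to a set difference between two views of the same resources. The sketch below shows only that comparison; the two input lists are hard-coded stand-ins, since actually enumerating resources through high-level and low-level functions is OS-specific and not shown. The function name `find_hidden` and the process names are hypothetical.

```python
def find_hidden(high_level_view, low_level_view):
    """Return resources seen by the low-level scan but not the high-level one."""
    return sorted(set(low_level_view) - set(high_level_view))


# Stand-in results (assumption): the high-level enumeration has been
# filtered by malware, the low-level enumeration has not.
high = ["explorer.exe", "svchost.exe"]
low = ["explorer.exe", "svchost.exe", "evil.exe"]

hidden = find_hidden(high, low)
```

A resource that appears only in the high-level view is ignored here, because the abstract flags only resources missing from the high-level view.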

    Changed file identification, software conflict resolution and unwanted file removal
    28.
    Granted invention patent
    Changed file identification, software conflict resolution and unwanted file removal (status: lapsed)

    Publication No.: US07765592B2

    Publication date: 2010-07-27

    Application No.: US10830334

    Filing date: 2004-04-22

    IPC classification: G06F21/00 G06F12/14

    CPC classification: G06F9/44505 G06F8/65

    Abstract: As computer programs grow more complex, extensible, and connected, it becomes increasingly difficult for users to understand what has changed on their machines and what impact those changes have. An embodiment of the invention is described via a software tool, called AskStrider, that answers those questions by correlating volatile process information with persistent-state context information and change history. AskStrider scans a system for active components, matches them against a change log to identify recently updated and hence more interesting state, and searches for context information to help users understand the changes. Several real-world cases are provided to demonstrate the effectiveness of using AskStrider to quickly identify the presence of unwanted software, to determine if a software patch is potentially breaking an application, and to detect lingering components left over from an unclean uninstallation.
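
The matching step (active components against a change log, most recent first) can be sketched as follows. This is not AskStrider's implementation: the function name `recently_changed`, the seven-day window, and the file names and timestamps are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def recently_changed(active, change_log, now, window=timedelta(days=7)):
    """Return active components changed within the window, newest first.

    active     -- names of components currently loaded or running
    change_log -- mapping of component name -> last change time
    """
    hits = [
        (change_log[name], name)
        for name in active
        if name in change_log and now - change_log[name] <= window
    ]
    return [name for _, name in sorted(hits, reverse=True)]


now = datetime(2004, 4, 22)
change_log = {
    "a.dll": datetime(2004, 4, 21),  # active, changed yesterday
    "b.dll": datetime(2004, 1, 1),   # active, but changed long ago
    "c.dll": datetime(2004, 4, 20),  # changed recently, but not active
    "e.sys": datetime(2004, 4, 19),  # active, changed three days ago
}
active = ["a.dll", "b.dll", "d.exe", "e.sys"]

interesting = recently_changed(active, change_log, now)
```

Only components that are both active and recently changed are reported, which mirrors the abstract's point that recently updated state is the more interesting state.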

    System and method for protecting privacy and anonymity of parties of network communications
    29.
    Granted invention patent
    System and method for protecting privacy and anonymity of parties of network communications (status: lapsed)

    Publication No.: US07669049B2

    Publication date: 2010-02-23

    Application No.: US11072143

    Filing date: 2005-03-04

    IPC classification: G06F9/00

    Abstract: A system and method are provided for handling network communications between a client and a target server on the Internet to protect the privacy and anonymity of the client. For a session between the client and the target server, a routing control server sets up a routing chain using a plurality of Web servers randomly selected from a pool of participating Web servers as routers for routing messages between the client and the target server. To prevent traffic analysis, an "onion encryption" scheme is applied to the messages as they are forwarded along the routing chain. A payment service cooperating with the routing control server allows a user to pay for the privacy protection service without revealing her real identity.
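
The layering idea behind onion encryption can be shown with a toy sketch. Important assumptions: the XOR keystream built from SHA-256 below is for illustration only and is not secure; the router names, keys, `wrap` and `peel` helpers, and the `next_hop|payload` layer format are hypothetical and are not the scheme defined in the patent.

```python
import hashlib


def xor_stream(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher (NOT secure): XOR with a SHA-256 keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(b ^ k for b, k in zip(data, stream))


def wrap(message: bytes, route, target: str) -> bytes:
    """Build the onion from the innermost layer (last router) outward.

    route is a list of (router_name, key) pairs in forwarding order.
    """
    next_hops = [name for name, _ in route[1:]] + [target]
    onion = message
    for (_, key), hop in zip(reversed(route), reversed(next_hops)):
        onion = xor_stream(key, hop.encode() + b"|" + onion)
    return onion


def peel(onion: bytes, key: bytes):
    """Remove one layer; a router learns only the next hop."""
    hop, _, inner = xor_stream(key, onion).partition(b"|")
    return hop.decode(), inner


route = [("r1", b"key-1"), ("r2", b"key-2"), ("r3", b"key-3")]
onion = wrap(b"GET /", route, "target.example")

trace = []
for _, key in route:
    hop, onion = peel(onion, key)
    trace.append(hop)
```

After the loop, `trace` lists the hop each router forwarded to and `onion` holds the original message; each router decrypts only its own layer and sees nothing beyond its next hop.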
