BOTTOM-UP ANALYSIS OF NETWORK SITES
    11.
    Type: Invention patent application
    Status: In force

    Publication No.: US20100262693A1

    Publication date: 2010-10-14

    Application No.: US12421644

    Filing date: 2009-04-10

    IPC classification: G06F15/173

    Abstract: An approach for identifying suspect network sites in a network environment entails using one or more malware analysis modules to identify distribution sites that host malicious content and/or benign content. The approach then uses a linking analysis module to identify landing sites that are linked to the distribution sites. These linked sites are identified as suspect sites for further analysis. This analysis can be characterized as “bottom up” because it is initiated by the detection of potentially problematic distribution sites. The approach can also perform linking analysis to identify a suspect network site based on a number of alternating paths between that network site and a set of distribution sites that are known to host malicious content. The approach can also train a classifier module to predict whether an unknown landing site is a malicious landing site or a benign landing site.
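
The bottom-up step described in this abstract (start from flagged distribution sites, then walk links backwards to the landing sites that point at them) can be sketched as a reverse-index lookup. This is only an illustrative sketch: the `(landing_site, distribution_site)` pair format, the function name, and the example host names are assumptions, not details taken from the patent.

```python
from collections import defaultdict


def find_suspect_landing_sites(links, malicious_distribution_sites):
    """Return landing sites that link to any flagged distribution site.

    links: iterable of (landing_site, distribution_site) pairs
           (hypothetical input format chosen for this sketch).
    """
    # Reverse index: distribution site -> landing sites that point to it.
    linked_from = defaultdict(set)
    for landing, distribution in links:
        linked_from[distribution].add(landing)

    # Bottom-up: begin at the flagged distribution sites and collect
    # every landing site that links to at least one of them.
    suspects = set()
    for site in malicious_distribution_sites:
        suspects |= linked_from.get(site, set())
    return suspects


# Hypothetical example data.
links = [
    ("blog.example", "cdn-bad.example"),
    ("news.example", "cdn-good.example"),
    ("forum.example", "cdn-bad.example"),
]
suspects = find_suspect_landing_sites(links, {"cdn-bad.example"})
print(sorted(suspects))
```

The sketch covers only the direct-link case; the alternating-path analysis and the classifier mentioned in the abstract are not modeled here.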

    Robust indexing and retrieval of electronic ink
    12.
    Type: Granted invention patent
    Status: In force

    Publication No.: US07646940B2

    Publication date: 2010-01-12

    Application No.: US11397436

    Filing date: 2006-04-04

    IPC classification: G06K9/60

    CPC classification: G06K9/00422 G06K9/6247

    Abstract: A unique system and method that facilitates indexing and retrieving electronic ink objects with improved efficiency and accuracy is provided. Handwritten words or characters are mapped to a low-dimensional representation through a process of segmentation, stroke classification using a neural network, and projection along directions found using OPCA, for example. The employment of OPCA makes these low-dimensional representations robust to handwriting variations or noise. Each handwritten word or set of characters is stored along with a neighborhood hyperrectangle that represents word variations. Redundant bit vectors are used to index the hyperrectangles for efficient storage and retrieval. Ink-based queries can be submitted in order to retrieve at least one ink object. To do so, the ink query is processed to determine its query point, which is represented by a (query) hyperrectangle. A data store can be searched for any hyperrectangles that match the query hyperrectangle.
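
The matching step at the end of this abstract (find stored hyperrectangles that match a query hyperrectangle) can be illustrated with a plain overlap test. This is a sketch under stated assumptions: "match" is taken to mean axis-aligned overlap, rectangles are built from a point plus a fixed radius, and the store is scanned linearly. The redundant-bit-vector index that the abstract describes is not implemented here; all names are hypothetical.

```python
def make_rect(point, radius):
    """Axis-aligned hyperrectangle around `point` as (lower, upper) corners."""
    return ([x - radius for x in point], [x + radius for x in point])


def overlaps(a, b):
    """True if the two hyperrectangles intersect in every dimension."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    return all(
        al <= bh and bl <= ah
        for al, ah, bl, bh in zip(a_lo, a_hi, b_lo, b_hi)
    )


def search(store, query_rect):
    """Linear scan standing in for the indexed lookup."""
    return [name for name, rect in store.items() if overlaps(rect, query_rect)]


# Hypothetical 2-D projections of two stored ink words.
store = {
    "hello": make_rect([0.0, 0.0], 0.5),
    "world": make_rect([3.0, 3.0], 0.5),
}
query = make_rect([0.4, -0.2], 0.5)
matches = search(store, query)
print(matches)
```

A linear scan costs time proportional to the number of stored objects, which is presumably why the patent indexes the hyperrectangles instead.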

    DATA STORAGE STRUCTURE
    13.
    Type: Invention patent application
    Status: In force

    Publication No.: US20090222408A1

    Publication date: 2009-09-03

    Application No.: US12038813

    Filing date: 2008-02-28

    IPC classification: G06F7/06

    CPC classification: G06F17/3033 G06F17/30424

    Abstract: Efficient data storage and retrieval (e.g., in terms of time and space requirements) is facilitated by implementing an indexing structure comprising an indexing array. That is, a functional relationship between elements of a source set and elements of a query result set can be stored in the indexing structure. This allows, for example, a query regarding whether an element is a member of a set (e.g., whether a particular website or Uniform Resource Locator (URL) has been visited before) as well as a relationship between the member set and the query (e.g., the number of hyperlinks in the website the last time it was visited) to be resolved efficiently.

    Scalable minimal perfect hashing
    14.
    Type: Granted invention patent
    Status: In force

    Publication No.: US07792877B2

    Publication date: 2010-09-07

    Application No.: US11799370

    Filing date: 2007-05-01

    IPC classification: G06F7/00 G06F17/00

    CPC classification: G06F17/30949 Y10S707/953

    Abstract: A minimal perfect hash function can be created for input data by dividing the input data into multiple collections, with each collection comprising fewer elements than the input data as a whole. Subsequently, minimal perfect hash functions can be created for each of the collections and the resulting hash values can be offset by a value equivalent to the number of input data in preceding collections. The minimal perfect hash function can, thereby, be derived in parallel and can consume substantially less storage space. To further save storage space, the internal state of each individual minimal perfect hash function can be further compressed using algorithms exploiting a skewed distribution of values in a lookup table comprising the internal state.
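
The divide-and-offset construction in this abstract can be sketched as follows. The sketch assumes distinct string keys, uses SHA-256 as a stand-in seeded hash, and finds each collection's minimal perfect hash by brute-force seed search, which is practical only for small collections. None of these choices come from the patent, and the compression of the internal state is not shown.

```python
import hashlib


def seeded_hash(key, seed):
    """Deterministic seeded hash (SHA-256 is an assumption of this sketch)."""
    digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "little")


def build_collection_mphf(keys):
    """Brute-force a seed mapping `keys` one-to-one onto range(len(keys))."""
    n = len(keys)
    if n == 0:
        return 0
    seed = 0
    while len({seeded_hash(k, seed) % n for k in keys}) != n:
        seed += 1
    return seed


def build(keys, num_collections):
    """Partition keys, build one MPHF per collection, record offsets."""
    collections = [[] for _ in range(num_collections)]
    for k in keys:
        collections[seeded_hash(k, "partition") % num_collections].append(k)

    seeds, offsets, sizes, total = [], [], [], 0
    for coll in collections:
        offsets.append(total)  # number of keys in preceding collections
        seeds.append(build_collection_mphf(coll))  # independent per collection
        sizes.append(len(coll))
        total += len(coll)
    return seeds, offsets, sizes


def lookup(key, seeds, offsets, sizes):
    """Hash value = offset of the key's collection + its local MPHF value."""
    c = seeded_hash(key, "partition") % len(seeds)
    return offsets[c] + seeded_hash(key, seeds[c]) % sizes[c]


keys = [f"url-{i}" for i in range(12)]
seeds, offsets, sizes = build(keys, num_collections=4)
values = sorted(lookup(k, seeds, offsets, sizes) for k in keys)
print(values)
```

Because each call to `build_collection_mphf` depends only on its own collection, the per-collection builds could run in parallel, which matches the parallel derivation the abstract mentions.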

    Allograph based writer adaptation for handwritten character recognition
    15.
    Type: Granted invention patent
    Status: In force

    Publication No.: US07646913B2

    Publication date: 2010-01-12

    Application No.: US11305968

    Filing date: 2005-12-19

    IPC classification: G06K9/00

    Abstract: The claimed subject matter provides a system and/or a method that facilitates analyzing and/or recognizing a handwritten character. An interface component can receive at least one handwritten character. A personalization component can train a classifier based on an allograph related to a handwriting style to provide handwriting recognition for the at least one handwritten character. In addition, the personalization component can employ any suitable combiner to provide optimized recognition.

    High Performance Script Behavior Detection Through Browser Shimming
    16.
    Type: Invention patent application
    Status: In force

    Publication No.: US20080320498A1

    Publication date: 2008-12-25

    Application No.: US11767486

    Filing date: 2007-06-23

    IPC classification: G06F13/00

    CPC classification: G06F17/30864 G06F9/45508

    Abstract: The behavior of browser applications, such as web browsers, can be controlled in part by script-based instructions present within documents read by those browsers. To analyze such scripts in an efficient manner, a script analyzer can identify the scripts in the document, divide them into script modules, and order the modules to represent an interpretational flow. The script can be interpreted and executed on a line-by-line basis and its behavior analyzed. Prior to interpretation, the scripts can be reviewed for delay conditionals, and such statements can be modified for more efficient interpretation. Additionally, if, during interpretation, the script generates new script, or modifies existing script, such new scripts can themselves be interpreted. External function calls made by the script can be intercepted and responded to in a generic fashion, limiting the need to create a document object model, based on the document's data, solely for script analysis purposes.

    Scalable minimal perfect hashing
    17.
    Type: Invention patent application
    Status: In force

    Publication No.: US20080275847A1

    Publication date: 2008-11-06

    Application No.: US11799370

    Filing date: 2007-05-01

    IPC classification: G06F17/30

    CPC classification: G06F17/30949 Y10S707/953

    Abstract: A minimal perfect hash function can be created for input data by dividing the input data into multiple collections, with each collection comprising fewer elements than the input data as a whole. Subsequently, minimal perfect hash functions can be created for each of the collections and the resulting hash values can be offset by a value equivalent to the number of input data in preceding collections. The minimal perfect hash function can, thereby, be derived in parallel and can consume substantially less storage space. To further save storage space, the internal state of each individual minimal perfect hash function can be further compressed using algorithms exploiting a skewed distribution of values in a lookup table comprising the internal state.

    SEARCH QUERY MONETIZATION-BASED RANKING AND FILTERING
    18.
    Type: Invention patent application
    Status: Under examination (published)

    Publication No.: US20080033797A1

    Publication date: 2008-02-07

    Application No.: US11461552

    Filing date: 2006-08-01

    IPC classification: G06Q30/00

    Abstract: Advertiser monetization information is utilized to determine a search query monetization value that can be employed in web-search ranking to facilitate ranking search results and/or in email spam filtering to reduce unsolicited emails and the like. Various methods can be employed to filter and/or rank and the like based on the search query monetization value. This can include biasing based on high values and/or low values. The search query monetization value can be determined based on, for example, independent phrases and/or bids. In other instances, personal user advertising interactions can be employed as well to facilitate search result ranking and/or email spam filtering. Employment of search query monetization value techniques can substantially reduce various types of subversive/undesired information.

    INTERACTIVE PAPER SYSTEM
    19.
    Type: Invention patent application
    Status: In force

    Publication No.: US20120207391A1

    Publication date: 2012-08-16

    Application No.: US13365569

    Filing date: 2012-02-03

    IPC classification: G06K9/34 G06K9/18

    Abstract: A printer, scanner device and methods for using same are described herein. A printer device may include a dedicated input that, when actuated, generates and sends a request to a computer for known data or a predetermined print job, e.g., schedule information from a personal information management (PIM) application. A scanner device may include another dedicated input that, when actuated, automatically scans a document fed to the device by the user and sends the scanned image to IM (or other) software on a computer, bypassing the need to manipulate the scanned image using scanner software. The device may be used with printed metapaper, which includes a barcode or other indicia identifying the metapaper and corresponds to a stored template image of the metapaper. When the metapaper is rescanned, the scan can be compared to the stored template information to identify changes and synchronize the changes with the IM software.

    Robust personalization through biased regularization
    20.
    Type: Granted invention patent
    Status: In force

    Publication No.: US07886266B2

    Publication date: 2011-02-08

    Application No.: US11278949

    Filing date: 2006-04-06

    IPC classification: G06F9/44

    CPC classification: G10L15/07

    Abstract: The subject disclosure pertains to systems and methods for personalization of a recognizer. In general, recognizers can be used to classify input data. During personalization, a recognizer is provided with samples specific to a user, entity or format to improve performance for the specific user, entity or format. Biased regularization can be utilized during personalization to maintain recognizer performance for non-user specific input. In one aspect, regularization can be biased to the original parameters of the recognizer, such that the recognizer is not modified excessively during personalization.
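
The idea of regularizing toward the recognizer's original parameters can be illustrated with a one-parameter least-squares model. The abstract does not state an objective function, so the form below is an assumption: minimize sum((w*x - y)^2) + lam*(w - w0)^2, where `w0` is the original parameter and `lam` controls how strongly personalization is pulled back toward it. The closed-form minimizer is w = (sum(x*y) + lam*w0) / (sum(x*x) + lam).

```python
def personalize(xs, ys, w0, lam):
    """Closed-form minimizer of sum((w*x - y)**2) + lam * (w - w0)**2.

    w0:  original (pre-personalization) parameter.
    lam: regularization strength; 0 ignores w0, large values stay near w0.
    All names are hypothetical and chosen for this sketch.
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return (sxy + lam * w0) / (sxx + lam)


# Hypothetical user-specific samples that favor w = 2; the original model has w0 = 1.
xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
w_unregularized = personalize(xs, ys, w0=1.0, lam=0.0)
w_balanced = personalize(xs, ys, w0=1.0, lam=14.0)
w_conservative = personalize(xs, ys, w0=1.0, lam=1e9)
print(w_unregularized, w_balanced, w_conservative)
```

With `lam=0` the model fits the user samples exactly; as `lam` grows, the personalized parameter moves back toward `w0`, which is the sense in which the recognizer "is not modified excessively".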