High performance content alteration architecture and techniques
    1.
    Granted patent
    High performance content alteration architecture and techniques (in force)

    Publication No.: US07505946B2

    Publication date: 2009-03-17

    Application No.: US10815086

    Filing date: 2004-03-31

    IPC classification: G06F17/00 G06F17/20

    Abstract: The present invention provides a unique system and method that facilitates obtaining high performance and more secure HIPs. More specifically, the HIPs can be generated in part by caching pre-rendered characters and/or pre-rendered arcs as bitmaps in binary form and then selecting any number of the characters and/or arcs randomly to form a HIP sequence. The warp field can be pre-computed and converted to integers in binary form and can include a plurality of sub-regions. The warp field can be cached as well. Any one sub-region can be retrieved from the warp field cache and mapped to the HIP sequence to warp the HIP. Thus, the pre-computed warp field can be used to warp multiple HIP sequences. The warping can occur in binary form and at a high resolution to mitigate reverse engineering. Following, the warped HIP sequence can be down-sampled and texture and/or color can be added as well to improve its appearance.
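
    Illustrative sketch (not taken from the patent): the abstract describes caching pre-rendered glyph bitmaps, pre-computing an integer warp field once, and mapping a randomly chosen sub-region of that field onto each HIP sequence. The Python below shows one way this could look. The names GLYPH_CACHE, compose, make_warp_field, random_subregion and warp, the tiny 3x4 glyphs, and the sinusoidal displacement formula are all assumptions made for illustration; down-sampling, texture and color are omitted.

```python
import math
import random

# Hypothetical cache of pre-rendered glyphs as binary bitmaps (rows of 0/1).
GLYPH_CACHE = {
    "A": [[0, 1, 0], [1, 0, 1], [1, 1, 1], [1, 0, 1]],
    "B": [[1, 1, 0], [1, 1, 1], [1, 0, 1], [1, 1, 1]],
}


def compose(chars):
    """Concatenate cached glyph bitmaps side by side to form a HIP sequence."""
    glyphs = [GLYPH_CACHE[c] for c in chars]
    return [sum((g[row] for g in glyphs), []) for row in range(len(glyphs[0]))]


def make_warp_field(width, height, amplitude=2.0, period=16.0):
    """Pre-compute integer (dx, dy) displacements once; reusable for many HIPs.

    The sinusoidal formula is an assumed stand-in for a real warp field.
    """
    field = []
    for y in range(height):
        row = []
        for x in range(width):
            dx = int(round(amplitude * math.sin(2 * math.pi * y / period)))
            dy = int(round(amplitude * math.cos(2 * math.pi * x / period)))
            row.append((dx, dy))
        field.append(row)
    return field


def random_subregion(field, width, height, rng):
    """Pick the top-left offset of a random sub-region of the cached field."""
    off_x = rng.randrange(len(field[0]) - width + 1)
    off_y = rng.randrange(len(field) - height + 1)
    return off_x, off_y


def warp(bitmap, field, off_x, off_y):
    """Warp a binary bitmap with the field sub-region starting at the offset."""
    h, w = len(bitmap), len(bitmap[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = field[off_y + y][off_x + x]
            sx, sy = x + dx, y + dy
            if 0 <= sx < w and 0 <= sy < h:  # pixels pulled from outside stay 0
                out[y][x] = bitmap[sy][sx]
    return out


# Usage: one cached field serves many independently generated sequences.
rng = random.Random()
warp_field = make_warp_field(64, 64)
hip = compose([rng.choice("AB") for _ in range(4)])
ox, oy = random_subregion(warp_field, len(hip[0]), len(hip), rng)
warped = warp(hip, warp_field, ox, oy)
```

    Because the displacements are stored as integers, the per-HIP work in this sketch reduces to table lookups and index arithmetic.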

    Logical structure layout identification and classification for offline character recognition
    2.
    Granted patent
    Logical structure layout identification and classification for offline character recognition (in force)

    Publication No.: US07844114B2

    Publication date: 2010-11-30

    Application No.: US11299873

    Filing date: 2005-12-12

    IPC classification: G06K9/18

    CPC classification: G06K9/80

    Abstract: A method and system for implementing character recognition is described herein. An input character is received. The input character is composed of one or more logical structures in a particular layout. The layout of the one or more logical structures is identified. One or more of a plurality of classifiers are selected based on the layout of the one or more logical structures in the input character. The entire character is input into the selected classifiers. The selected classifiers classify the logical structures. The outputs from the selected classifiers are then combined to form an output character vector.

    Scalable hash-based character recognition
    3.
    Granted patent
    Scalable hash-based character recognition (in force)

    Publication No.: US07664323B2

    Publication date: 2010-02-16

    Application No.: US11045792

    Filing date: 2005-01-28

    IPC classification: G06K9/00

    Abstract: The subject invention leverages a scalable character glyph hash table to provide an efficient means to identify print characters where the character glyphs are identical over independent presentation. The hash table allows for quick determinations of glyph meta data as, for example, a pre-filter to traditional OCR techniques. The hash table can be trained for a particular environment, user, language, character set (e.g., alphabet), document type, and/or specific document and the like. This permits substantial flexibility and increases in speed in identifying unknown glyphs. The hash table itself can be composed of single or multiple tables that have a specific optimization purpose. In one instance of the subject invention, traditional OCR techniques can be utilized to update the hash tables as needed based on glyph frequency. This keeps the hash tables from growing by limiting updates that reduce its performance, while adding frequently determined glyphs to increase the pre-filter performance.
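
    Illustrative sketch (not taken from the patent): the abstract describes a glyph hash table used as a pre-filter in front of a slower OCR engine, with the table updated only for glyphs that recur often enough. The Python below is a minimal version of that idea. The class name GlyphHashFilter, the fallback_ocr callable, the min_count threshold and the use of SHA-256 over the raw bitmap bytes are all assumptions; a single table is shown where the abstract also allows several.

```python
import hashlib
from collections import Counter


class GlyphHashFilter:
    """Hash-table pre-filter in front of a slower OCR function (hypothetical)."""

    def __init__(self, fallback_ocr, min_count=2):
        self.table = {}                  # glyph hash -> recognized character
        self.miss_counts = Counter()     # how often an unknown glyph was seen
        self.fallback_ocr = fallback_ocr
        self.min_count = min_count       # assumed frequency threshold

    @staticmethod
    def key(glyph_bitmap: bytes) -> str:
        # Identical bitmaps hash to the same key, so lookups are exact-match.
        return hashlib.sha256(glyph_bitmap).hexdigest()

    def recognize(self, glyph_bitmap: bytes):
        """Return (label, source) where source is 'hash' or 'ocr'."""
        k = self.key(glyph_bitmap)
        if k in self.table:
            return self.table[k], "hash"
        label = self.fallback_ocr(glyph_bitmap)
        self.miss_counts[k] += 1
        # Only frequently seen glyphs are promoted, which bounds table growth.
        if self.miss_counts[k] >= self.min_count:
            self.table[k] = label
        return label, "ocr"
```

    A caller would pass in whatever OCR routine it already has; glyphs seen fewer than min_count times never enter the table, so rare glyphs do not slow down lookups for common ones.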

    Segmentation based content alteration techniques
    5.
    Granted patent
    Segmentation based content alteration techniques (in force)

    Publication No.: US07653944B2

    Publication date: 2010-01-26

    Application No.: US11046996

    Filing date: 2005-01-31

    IPC classification: G06F7/04 G06F17/30 H04N7/16

    Abstract: The subject invention provides a unique system and method that facilitates creating HIP challenges (HIPs) that can be readily segmented and solved by human users but that are too difficult for non-human users. More specifically, the system and method utilize a variety of unique alteration techniques that are segmentation-based. For example, the system and method employ thicker arcs or occlusions that do not intersect characters already placed in the HIP. The thickness of the arc can be measured or determined by the thickness of the characters in the HIP. In addition to increasing the thickness, the arcs can be lengthened because longer arcs tend to resemble pieces of characters and may be harder to erode. Usability maps can be generated and used to selectively place clutter or occlusions and to selectively warp characters or the character sequence to facilitate human recognition of the characters.

    Unfolded convolution for fast feature extraction
    6.
    Granted patent
    Unfolded convolution for fast feature extraction (in force)

    Publication No.: US07634137B2

    Publication date: 2009-12-15

    Application No.: US11250819

    Filing date: 2005-10-14

    IPC classification: G06K9/46

    CPC classification: G06K9/4628 G06K2209/01

    Abstract: Systems and methods are described that facilitate performing feature extraction across multiple received input features to reduce computational overhead associated with feature processing related to, for instance, optical character recognition. Input feature information can be unfolded and concatenated to generate an aggregated input matrix, which can be convolved with a kernel matrix to produce output feature information for multiple output features concurrently.
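
    Illustrative sketch (not taken from the patent): the abstract describes unfolding input features, concatenating them into one aggregated matrix, and combining that matrix with a kernel matrix so that several output features are produced at once. The pure-Python code below shows the general "unfold, then matrix multiply" idea. The function names, the list-of-lists data layout, and the choice of stride 1 with no padding and no kernel flip (i.e. cross-correlation, as is common in convolutional networks) are assumptions.

```python
def unfold(features, k):
    """Build the aggregated input matrix.

    features: list of 2D maps of equal size. Each output row corresponds to
    one k-by-k window position and holds that window's values from every
    input feature, concatenated in row-major order.
    """
    h, w = len(features[0]), len(features[0][0])
    rows = []
    for y in range(h - k + 1):
        for x in range(w - k + 1):
            row = []
            for f in features:
                for dy in range(k):
                    row.extend(f[y + dy][x:x + k])
            rows.append(row)
    return rows


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)]
            for row in a]


def unfolded_convolution(features, kernels, k):
    """kernels[o][i] is the k-by-k kernel linking input i to output o.

    Returns a matrix with one row per window position and one column per
    output feature, so all output features come from a single product.
    """
    columns = []
    for per_output in kernels:
        col = []
        for kern in per_output:
            for kernel_row in kern:
                col.extend(kernel_row)
        columns.append(col)
    kernel_matrix = [list(r) for r in zip(*columns)]  # (n_in*k*k) x n_out
    return matmul(unfold(features, k), kernel_matrix)


# Usage: one 3x3 input feature, two 2x2 kernels (two output features).
feature = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernels = [[[[1, 0], [0, 1]]], [[[1, 1], [1, 1]]]]
result = unfolded_convolution([feature], kernels, 2)
# result == [[6, 12], [8, 16], [12, 24], [14, 28]]
```

    The gain in a real system would come from replacing matmul with an optimized matrix-multiply routine; the nested-list version here only demonstrates the data layout.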

    Interactive paper system
    7.
    Granted patent
    Interactive paper system (in force)

    Publication No.: US08115948B2

    Publication date: 2012-02-14

    Application No.: US11379649

    Filing date: 2006-04-21

    IPC classification: G06F3/12

    Abstract: A printer, scanner device and methods for using same are described herein. A printer device may include a dedicated input that, when actuated, generates and sends a request to a computer for known data or a predetermined print job, e.g., schedule information from a personal information management (PIM) application. A scanner device may include another dedicated input that, when actuated, automatically scans a document fed to the device by the user and sends the scanned image to IM (or other) software on a computer, bypassing the need to manipulate the scanned image using scanner software. The device may be used with printed metapaper, which includes a barcode or other indicia identifying the metapaper and corresponds to a stored template image of the metapaper. When the metapaper is rescanned, the scan can be compared to the stored template information to identify changes and synchronize the changes with the IM software.

    CLOAKING DETECTION UTILIZING POPULARITY AND MARKET VALUE
    8.
    Patent application
    CLOAKING DETECTION UTILIZING POPULARITY AND MARKET VALUE (in force)

    Publication No.: US20080154847A1

    Publication date: 2008-06-26

    Application No.: US11613725

    Filing date: 2006-12-20

    IPC classification: G06F17/30 G06F17/00

    CPC classification: G06F17/30864 G06Q30/02

    Abstract: The subject disclosure pertains to systems and methods that facilitate detection of cloaked web pages. Commercial value of search terms and/or queries can be indicative of the likelihood that web pages associated with the keywords or queries are cloaked. Commercial value can be determined based upon popularity of terms and/or advertisement market value as established based upon advertising revenue, fees and the like. Commercial value can be utilized in conjunction with term frequency difference analysis to identify a cloaked page automatically. In addition, commercial values of terms associated with web pages can be used to order or prioritize web pages for further analysis.
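
    Illustrative sketch (not taken from the application): the abstract combines term frequency difference analysis with the commercial value of terms, and uses commercial value to prioritize pages for further analysis. The Python below shows one possible scoring scheme. It assumes, beyond what the abstract states, that two copies of each page are available (one as served to a crawler, one as served to a browser), that terms are whitespace-separated, and that a per-term commercial_value table exists; all function names and weights are hypothetical.

```python
from collections import Counter


def term_freq_difference(crawler_text, browser_text):
    """Absolute per-term count difference between two copies of a page."""
    a = Counter(crawler_text.lower().split())
    b = Counter(browser_text.lower().split())
    return {t: abs(a[t] - b[t]) for t in set(a) | set(b) if a[t] != b[t]}


def cloaking_score(crawler_text, browser_text, commercial_value):
    """Weight each differing term by its (assumed) commercial value."""
    diff = term_freq_difference(crawler_text, browser_text)
    return sum(n * commercial_value.get(t, 0.0) for t, n in diff.items())


def prioritize(pages, commercial_value):
    """Order page URLs by descending score for further analysis.

    pages: dict mapping URL -> (crawler_text, browser_text).
    """
    return sorted(
        pages,
        key=lambda url: cloaking_score(*pages[url], commercial_value),
        reverse=True,
    )
```

    In this sketch a page whose two copies are identical scores zero, and differences in terms with no recorded commercial value do not raise the score.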

    Interactive paper system
    9.
    Granted patent
    Interactive paper system (in force)

    Publication No.: US08797579B2

    Publication date: 2014-08-05

    Application No.: US13365569

    Filing date: 2012-02-03

    IPC classification: G06F3/12

    Abstract: A printer, scanner device and methods for using same are described herein. A printer device may include a dedicated input that, when actuated, generates and sends a request to a computer for known data or a predetermined print job, e.g., schedule information from a personal information management (PIM) application. A scanner device may include another dedicated input that, when actuated, automatically scans a document fed to the device by the user and sends the scanned image to IM (or other) software on a computer, bypassing the need to manipulate the scanned image using scanner software. The device may be used with printed metapaper, which includes a barcode or other indicia identifying the metapaper and corresponds to a stored template image of the metapaper. When the metapaper is rescanned, the scan can be compared to the stored template information to identify changes and synchronize the changes with the IM software.

    Bottom-up analysis of network sites
    10.
    Granted patent
    Bottom-up analysis of network sites (in force)

    Publication No.: US08161130B2

    Publication date: 2012-04-17

    Application No.: US12421644

    Filing date: 2009-04-10

    IPC classification: G06F15/16

    Abstract: An approach for identifying suspect network sites in a network environment entails using one or more malware analysis modules to identify distribution sites that host malicious content and/or benign content. The approach then uses a linking analysis module to identify landing sites that are linked to the distribution sites. These linked sites are identified as suspect sites for further analysis. This analysis can be characterized as “bottom up” because it is initiated by the detection of potentially problematic distribution sites. The approach can also perform linking analysis to identify a suspect network site based on a number of alternating paths between that network site and a set of distribution sites that are known to host malicious content. The approach can also train a classifier module to predict whether an unknown landing site is a malicious landing site or a benign landing site.
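
    Illustrative sketch (not taken from the patent): the abstract starts from distribution sites already flagged by malware analysis and works backwards to the landing sites that link to them. The Python below shows only that first linking step. The function name, the representation of the link graph as a dict of site to linked sites, and the example host names are assumptions; the alternating-path analysis and the classifier mentioned in the abstract are not modelled.

```python
def find_suspect_landing_sites(links, distribution_sites):
    """Return landing sites that link to known distribution sites.

    links: dict mapping each site to an iterable of sites it links to
           (hypothetical crawl output).
    distribution_sites: set of sites flagged by malware analysis.
    Result maps each suspect landing site to the flagged sites it reaches.
    """
    suspects = {}
    for site, targets in links.items():
        hits = distribution_sites.intersection(targets)
        if hits and site not in distribution_sites:
            suspects[site] = sorted(hits)
    return suspects


# Usage with made-up hosts.
links = {
    "blog.example": ["cdn.example", "bad-host.example"],
    "shop.example": ["cdn.example"],
    "bad-host.example": ["bad-host.example"],
}
flagged = {"bad-host.example"}
suspects = find_suspect_landing_sites(links, flagged)
# suspects == {"blog.example": ["bad-host.example"]}
```

    The search is "bottom up" in the sense used by the abstract: it begins at the flagged distribution sites and reports the sites pointing at them, rather than crawling outward from landing pages.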
