Methods and apparatus for generating difference files
    21.
    Granted patent, in force

    Publication No.: US08498965B1

    Publication date: 2013-07-30

    Application No.: US12710060

    Filing date: 2010-02-22

    IPC classification: G06F17/00

    CPC classification: G06F8/658

    Abstract: One embodiment relates to a computer-implemented method for generating difference data between reference and target files. A difference engine implemented using a computer receives the reference and target files. The difference engine performs a first procedure to generate difference data representing the difference between the reference and target files if the size of the reference file is less than a first threshold size. The difference engine performs a second procedure to generate the difference data if the size of the reference file is less than a second threshold size and greater than the first threshold size. The difference engine performs a third procedure to generate the difference data if the size of the reference file is greater than the second threshold size. Other embodiments relate to apparatus for generating difference data between reference and target files and for reconstructing the target files from the reference files using the difference data.
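
    The size-based choice among the three procedures can be sketched as below. This is only an illustration: the function name and the procedure labels are hypothetical, the abstract gives no threshold values, and it does not say what happens when the size equals a threshold, so this sketch assumes a boundary size falls through to the next procedure.

```python
def select_procedure(reference_size, first_threshold, second_threshold):
    """Pick one of three differencing procedures from the reference file size.

    Hypothetical helper: the labels "first", "second" and "third" stand in
    for the procedures named in the abstract, which are not described there.
    """
    if not 0 < first_threshold < second_threshold:
        raise ValueError("thresholds must satisfy 0 < first < second")
    if reference_size < first_threshold:
        return "first"
    # Assumption: a size equal to a threshold is handled by the next procedure.
    if reference_size < second_threshold:
        return "second"
    return "third"
```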

    Document fingerprinting with asymmetric selection of anchor points
    22.
    Granted patent, in force

    Publication No.: US08359472B1

    Publication date: 2013-01-22

    Application No.: US12731874

    Filing date: 2010-03-25

    Applicant: Liwei Ren, Qiuer Xu

    Inventor: Liwei Ren, Qiuer Xu

    IPC classification: H04L9/32

    Abstract: One embodiment relates to a computer-implemented process for generating document fingerprints. A document is normalized to create a normalized text string. A first hash function with a sliding hash window is applied to the normalized text string to generate an array of hash values. Candidate anchoring points are selected by applying a first filter to the array of hash values. The anchoring points are chosen by applying a second filter to the candidate anchoring points. Finally, a second hash function is applied to substrings located at the chosen anchoring points to generate hash values for use as fingerprints for the document. Other embodiments and aspects are also disclosed.
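
    The pipeline in this abstract (normalize, sliding-window hash, two filters, second hash) can be sketched as follows. The abstract does not specify the hash functions or the filters, so everything concrete here is an assumption: a polynomial rolling hash as the first hash, "hash divisible by `modulus`" as the first filter, a minimum gap between anchors as the second filter, and SHA-256 as the second hash. It does not attempt the asymmetric selection named in the title.

```python
import hashlib

BASE = 257
MOD = (1 << 61) - 1


def normalize(text):
    """Assumed normalization: lowercase and keep only letters and digits."""
    return "".join(ch.lower() for ch in text if ch.isalnum())


def window_hashes(s, window):
    """Polynomial rolling hash of every length-`window` substring of s."""
    if window <= 0 or len(s) < window:
        return []
    h = 0
    for ch in s[:window]:
        h = (h * BASE + ord(ch)) % MOD
    top = pow(BASE, window - 1, MOD)
    hashes = [h]
    for i in range(window, len(s)):
        # Drop the outgoing character, shift, and add the incoming one.
        h = ((h - ord(s[i - window]) * top) * BASE + ord(s[i])) % MOD
        hashes.append(h)
    return hashes


def fingerprints(text, window=8, modulus=16, min_gap=32, span=64):
    s = normalize(text)
    hashes = window_hashes(s, window)
    # First filter (assumed): keep positions whose hash is divisible by modulus.
    candidates = [i for i, h in enumerate(hashes) if h % modulus == 0]
    # Second filter (assumed): keep candidates at least min_gap apart.
    anchors = []
    for i in candidates:
        if not anchors or i - anchors[-1] >= min_gap:
            anchors.append(i)
    # Second hash: SHA-256 over the substring starting at each anchor.
    return [hashlib.sha256(s[i:i + span].encode("utf-8")).hexdigest()
            for i in anchors]
```

    Because the anchors depend only on the normalized content, two copies of a text that differ in case or punctuation yield the same fingerprints in this sketch.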

    Cascading security architecture
    23.
    Granted patent, in force

    Publication No.: US08051487B2

    Publication date: 2011-11-01

    Application No.: US11413754

    Filing date: 2006-04-27

    IPC classification: G06F7/04

    CPC classification: G06F17/30011, G06F21/554

    Abstract: A system and a method are disclosed for sensitive document management. The system includes one or more agents, a behavior analysis engine, a local policy engine, and a local matching service. The method identifies whether a document is sensitive, identifies behaviors applied to the document, determines whether the document contains sensitive information, and determines whether to allow the identified behavior to continue based on security policies.

    Fingerprinting based entity extraction
    24.
    Granted patent, in force

    Publication No.: US07950062B1

    Publication date: 2011-05-24

    Application No.: US11833936

    Filing date: 2007-08-03

    Applicant: Liwei Ren, Shu Huang

    Inventor: Liwei Ren, Shu Huang

    IPC classification: G06F7/00, G06F17/00

    CPC classification: G06F21/55

    Abstract: A system (and a method) is disclosed for fingerprinting-based entity extraction using a rolling hash technique. The system is configured to receive an input stream of a predetermined length comprising characters, and a hash table having indexed entries. The system isolates, through a window of defined fixed length, a set of characters of the input stream. A hash key is generated and used to index into the hash table. The system compares the isolated set of characters of the input stream with the entry corresponding to the index into the hash table to determine whether there is an exact match with the entry. The system slides the fixed-length window by one character to isolate another set of characters of the input stream in response to no exact match from the comparison. Alternatively, the system stores the input stream in response to an exact match from the comparison.
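
    A minimal sketch of the window-and-lookup loop described above, under stated assumptions: the function names are hypothetical, the hash is a polynomial rolling hash chosen for illustration, every entity has exactly the window length, and matches are recorded as (offset, text) pairs while scanning continues, whereas the abstract only says the input stream is stored on an exact match.

```python
BASE = 257
MOD = (1 << 61) - 1


def hash_key(s):
    """Polynomial hash of a string (illustrative choice of hash function)."""
    h = 0
    for ch in s:
        h = (h * BASE + ord(ch)) % MOD
    return h


def extract_entities(stream, entities, window):
    """Return (offset, text) for each window of `stream` that exactly
    matches one of `entities`. All entities must have length `window`."""
    if window <= 0:
        raise ValueError("window must be positive")
    table = {}  # hash key -> set of entries sharing that key
    for entity in entities:
        if len(entity) != window:
            raise ValueError("every entity must have the window length")
        table.setdefault(hash_key(entity), set()).add(entity)

    matches = []
    if len(stream) < window:
        return matches
    top = pow(BASE, window - 1, MOD)
    h = hash_key(stream[:window])
    start = 0
    while True:
        chunk = stream[start:start + window]
        # The hash only selects a bucket; the exact comparison decides.
        if chunk in table.get(h, ()):
            matches.append((start, chunk))
        if start + window >= len(stream):
            break
        # Slide the window one character, updating the hash in O(1).
        h = ((h - ord(stream[start]) * top) * BASE
             + ord(stream[start + window])) % MOD
        start += 1
    return matches
```

    The rolling update is what keeps the scan linear in the stream length: each slide costs constant time rather than rehashing the whole window.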

    Matching engine for querying relevant documents
    25.
    Patent application, in force

    Publication No.: US20060253439A1

    Publication date: 2006-11-09

    Application No.: US11361447

    Filing date: 2006-02-24

    IPC classification: G06F17/30

    CPC classification: G06F17/30616, Y10S707/917

    Abstract: A system generates an output of documents within a particular relevance range. The system receives an initial document comprising text, a list of documents for matching, each document comprising text, and a minimum substring match length. The system normalizes the text of the documents of the list of documents. The system searches for common sub-strings between the text of the initial document and the text of each document of the list of documents. The system calculates a match percentage based on the common sub-strings found and outputs documents having a match percentage corresponding to a predetermined value. Also disclosed is a process for generating an output of documents within a particular relevance range.
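
    The matching flow above can be approximated with Python's standard `difflib`. This is a sketch under assumptions, not the patented method: normalization is taken to be lowercasing plus whitespace collapsing, `SequenceMatcher.get_matching_blocks()` stands in for the common-substring search (it returns non-overlapping matches in order, not every common substring), and the match percentage is defined here as matched characters divided by the length of the initial document.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Assumed normalization: lowercase and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def match_percentage(initial, candidate, min_len):
    """Percentage of the initial document covered by common substrings
    of at least `min_len` characters."""
    a, b = normalize(initial), normalize(candidate)
    if not a:
        return 0.0
    blocks = SequenceMatcher(None, a, b, autojunk=False).get_matching_blocks()
    matched = sum(m.size for m in blocks if m.size >= min_len)
    return 100.0 * matched / len(a)


def relevant_documents(initial, documents, min_len, low, high):
    """Return (name, percentage) for documents whose match percentage
    lies in the inclusive range [low, high]."""
    results = []
    for name, text in documents.items():
        pct = match_percentage(initial, text, min_len)
        if low <= pct <= high:
            results.append((name, pct))
    return results
```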

    File differencing and updating engines
    26.
    Patent application, under examination (published)

    Publication No.: US20050010576A1

    Publication date: 2005-01-13

    Application No.: US10616615

    Filing date: 2003-07-09

    IPC classification: G06F9/445, G06F17/00

    CPC classification: G06F8/658

    Abstract: A file differencing and updating system is provided that includes a file differencing component and a file updating component. The file differencing component, or file differencing engine, generates a difference file in a first processor-based or computer system from an original or old version and a new version of an electronic file. The file updating component, or file updating engine, generates a copy of the new file on a second processor-based or computer system using the difference file and the hosted copy of the original file.
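
    The two halves of such a system, one side producing a difference and the other rebuilding the new file from the old copy, can be illustrated with a toy copy/add encoding. The operation format and function names are assumptions made for this sketch, and `difflib.SequenceMatcher` is used as a convenient stand-in for whatever differencing algorithm the application actually describes.

```python
from difflib import SequenceMatcher


def make_diff(old, new):
    """Differencing side: encode `new` as operations against `old`.

    Each operation is ("copy", start, end), meaning reuse old[start:end],
    or ("add", data), meaning insert literal bytes.
    """
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))
        elif j2 > j1:  # "replace" or "insert": ship the new bytes
            ops.append(("add", new[j1:j2]))
        # "delete" needs no operation: the old bytes are simply not copied.
    return ops


def apply_diff(old, ops):
    """Updating side: rebuild the new file from the old copy and the diff."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)
```

    Only the "add" payloads carry new data, so the difference stays small when the two versions share most of their content.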

    Proxy database for element management system of telephone switching network

    Publication No.: US06512824B1

    Publication date: 2003-01-28

    Application No.: US09366238

    Filing date: 1999-08-03

    IPC classification: H04M7/00

    CPC classification: H04M3/2263

    Abstract: An element management system (“EMS”) interfaces between a telephone company computer or a terminal for use by a telephone company system administrator or customer service representative, and a telephone network element such as a central office or a group of central offices. In order to store and process subscriber data during the time periods when a telephone computer system is busy controlling telephone switching functions and therefore giving low priority to such data, the EMS contains a proxy database and maintains it between the periods when access to the telephone computer system is desired and available, without detrimentally involving the telephone computer system, the telephone switching network or any elements thereof. The EMS is capable of operating as the sole repository of subscriber data in a telephone computer system configured to operate in such an environment. The proxy database effectively mirrors subscriber data in one or more central offices and/or other network elements of the telephone computer system. By utilizing the proxy database, the EMS (i) provides relatively current subscriber data to telephone company personnel and systems, and (ii) accepts telephone network configuration commands from such personnel and systems, communicates the commands to the corresponding network elements, and upon receiving information indicating that the commands have been carried out, provides verification information to the client personnel and systems.

    Apparatus and methods for keyword proximity matching
    28.
    Granted patent, in force

    Publication No.: US09203623B1

    Publication date: 2015-12-01

    Application No.: US12642613

    Filing date: 2009-12-18

    IPC classification: G06F17/30, H04L9/32

    CPC classification: H04L9/32, G06F21/56

    Abstract: One embodiment relates to an apparatus configured to match a list of keywords against a target document. The apparatus includes data storage configured to store computer-readable instruction code and data, and a processor configured to access the data storage and to execute said computer-readable instruction code. The apparatus further includes a keyword searcher and a keyword object generator. The keyword searcher is configured to receive the list of keywords and a textual string corresponding to the target document file, and to search the textual string for instances of the keywords so as to generate a sequence of keyword instances. The keyword object generator is implemented using the instruction code and is configured to receive the sequence of keyword instances and to generate a keyword object, wherein the keyword object includes a range-dependent match function. Other embodiments and features are also disclosed.
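
    One way to picture the keyword searcher and the keyword object is sketched below. The abstract does not define the range-dependent match function, so this sketch assumes one plausible reading: `matches(max_range)` is true when every keyword has an instance whose start offset lies within `max_range` characters of the others. The class and function names are hypothetical, and matching is case-insensitive by assumption.

```python
from dataclasses import dataclass


def search_keywords(keywords, text):
    """Keyword searcher: return sorted (offset, keyword) instances."""
    lowered = text.lower()
    instances = []
    for kw in keywords:
        needle = kw.lower()
        if not needle:
            continue
        start = lowered.find(needle)
        while start != -1:
            instances.append((start, kw))
            start = lowered.find(needle, start + 1)
    instances.sort()
    return instances


@dataclass
class KeywordObject:
    keywords: list
    instances: list  # sorted (offset, keyword) pairs from search_keywords

    def matches(self, max_range):
        """Assumed match rule: all keywords occur within max_range characters."""
        needed = {kw for kw in self.keywords if kw}
        counts = {}
        left = 0
        for pos, kw in self.instances:
            counts[kw] = counts.get(kw, 0) + 1
            # Shrink the window until it spans at most max_range characters.
            while pos - self.instances[left][0] > max_range:
                old_kw = self.instances[left][1]
                counts[old_kw] -= 1
                if counts[old_kw] == 0:
                    del counts[old_kw]
                left += 1
            if needed and len(counts) == len(needed):
                return True
        return False
```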

    Document fingerprinting for mobile phones
    29.
    Granted patent, in force

    Publication No.: US09146704B1

    Publication date: 2015-09-29

    Application No.: US13227360

    Filing date: 2011-09-07

    Abstract: One embodiment relates to a method for providing a service which matches document fingerprints against a database of document fingerprints. Target text data on a mobile phone device is obtained, and target document fingerprints are generated for the target text data using a fingerprint generator on the mobile phone device. The target document fingerprints are transmitted to a service cloud. A feedback message is received from the service cloud. The feedback message depends on results from matching the target document fingerprints against the database of document fingerprints. Other embodiments, aspects and features are also disclosed.

    Methods and apparatus for generating difference files
    30.
    Granted patent, in force

    Publication No.: US08862555B1

    Publication date: 2014-10-14

    Application No.: US13108511

    Filing date: 2011-05-16

    IPC classification: G06F17/30

    CPC classification: G06F17/30162

    Abstract: One embodiment relates to a computer-implemented method for generating difference data between reference and target files. A difference engine performs a first procedure to generate difference data representing the difference between the reference and target files if the reference and target files are sequences of sorted data records. The first procedure may compare a lexical order of a record from the reference file against a lexical order of a record from the target file. An entry may be added to a copy list if the records are the same, and an entry may be added to an add list if the record from the reference file is lexically greater than the record from the target file. Another embodiment relates to an apparatus for generating difference data.
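
    The record-by-record comparison described here resembles a merge pass over two sorted lists, which can be sketched as follows. The list formats are assumptions: the copy list holds indices into the reference records and the add list holds the new records themselves. The abstract does not say what happens when the reference record is lexically smaller, so this sketch assumes that record is simply skipped (it is absent from the target).

```python
def diff_sorted(reference, target):
    """Compare two sorted record lists and return (copy_list, add_list)."""
    copy_list, add_list = [], []
    i = j = 0
    while i < len(reference) and j < len(target):
        if reference[i] == target[j]:
            copy_list.append(i)          # same record: copy from reference
            i += 1
            j += 1
        elif reference[i] > target[j]:
            add_list.append(target[j])   # target record missing in reference
            j += 1
        else:
            i += 1                       # assumed: reference record was deleted
    add_list.extend(target[j:])          # remaining target records are all new
    return copy_list, add_list


def reconstruct(reference, copy_list, add_list):
    """Rebuild the sorted target from the reference and the two lists."""
    return sorted([reference[i] for i in copy_list] + add_list)
```

    Since both inputs are already sorted, the comparison makes a single pass over each file.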
