On-the-fly pattern recognition with configurable bounds
    11.
    Type: Granted invention patent
    Status: In force

    Publication No.: US08370374B1

    Publication date: 2013-02-05

    Application No.: US13196480

    Filing date: 2011-08-02

    IPC classification: G06F7/00 G06F17/30

    Abstract: Some embodiments of on-the-fly pattern recognition with configurable bounds have been presented. In one embodiment, a pattern matching engine is configured based on user input, which may include values of one or more user configurable bounds on searching. Then the configured pattern matching engine is used to search for a set of features in an incoming string. A set of scores is updated based on the presence of any of the features in the string while searching for the features. Each score may indicate a likelihood of the content of the string being in a category. The search is terminated if the end of the string is reached or if the user configurable bounds are met. After terminating the search, the scores are output.
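The abstract does not say which bounds are configurable or how the scores are computed, so the following Python sketch is only one possible reading. It assumes two hypothetical bounds (`max_chars` and `score_threshold`), a hypothetical `features` table that maps each feature to per-category weights, and a naive position-by-position scan in place of whatever matching engine the patent actually uses.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Bounds:
    """Hypothetical user-configurable bounds; the abstract does not name them."""
    max_chars: Optional[int] = None          # stop after this many start positions
    score_threshold: Optional[float] = None  # stop once any score reaches this value


def scan(text: str, features: Dict[str, Dict[str, float]], bounds: Bounds) -> Dict[str, float]:
    """Search `text` for features, updating category scores during the search."""
    scores = {cat: 0.0 for weights in features.values() for cat in weights}
    limit = len(text) if bounds.max_chars is None else min(len(text), bounds.max_chars)
    for pos in range(limit):
        # A feature counts if it starts inside the bound, even if it ends past it.
        for feat, weights in features.items():
            if text.startswith(feat, pos):
                for cat, w in weights.items():
                    scores[cat] += w
        # Terminate early when the score bound is met.
        if bounds.score_threshold is not None and any(
            s >= bounds.score_threshold for s in scores.values()
        ):
            break
    # The loop ends at the end of the string or when a bound is met; output the scores.
    return scores


features = {
    "casino": {"gambling": 2.0},
    "poker": {"gambling": 1.5},
    "score": {"sports": 1.0},
}
text = "poker casino score casino"
print(scan(text, features, Bounds()))
print(scan(text, features, Bounds(score_threshold=3.0)))
```

With the threshold set, the scan stops at the first "casino" and never reaches "score", which is the kind of early exit the configurable bounds appear intended to allow.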

    Method and apparatus for identifying data patterns in a file
    13.
    Type: Granted invention patent
    Status: In force

    Publication No.: US07835361B1

    Publication date: 2010-11-16

    Application No.: US11112252

    Filing date: 2005-04-21

    Abstract: A method and apparatus for identifying data patterns of a file are described herein. In one embodiment, an exemplary process includes, but is not limited to, receiving a data packet of a data stream containing a file segment of a file originated from an external host and destined to a protected host of a local area network (LAN), the file being transmitted via multiple file segments contained in multiple data packets of the data stream, and performing a data pattern analysis on the received data packet to determine whether the received data packet contains a predetermined data pattern, without waiting for a remainder of the data stream to arrive. Other methods and apparatuses are also described.
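As a rough illustration of analysing each packet as it arrives, the Python sketch below checks every incoming payload for a single byte pattern without buffering the whole file. The class name `StreamScanner`, the single-pattern limitation, and the choice to carry a short tail across packets (so that a pattern split over a packet boundary is still found) are assumptions made for this example; the abstract does not describe how the analysis is implemented.

```python
class StreamScanner:
    """Hypothetical per-packet scanner for one predetermined byte pattern."""

    def __init__(self, pattern: bytes) -> None:
        if not pattern:
            raise ValueError("pattern must be non-empty")
        self.pattern = pattern
        self._tail = b""  # last len(pattern) - 1 bytes seen so far

    def feed(self, packet: bytes) -> bool:
        """Return True if the pattern ends inside this packet."""
        window = self._tail + packet
        found = self.pattern in window
        keep = len(self.pattern) - 1
        # Keep only enough bytes to complete a pattern that straddles packets.
        self._tail = window[-keep:] if keep else b""
        return found


scanner = StreamScanner(b"EVIL")
for payload in (b"abcEV", b"ILxyz", b"clean"):
    # A verdict is available per packet, before the rest of the stream arrives.
    print(payload, scanner.feed(payload))
```

Because the retained tail is always shorter than the pattern, a match already reported for one packet cannot be reported again for the next.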

    Method and an apparatus to store content rating information
    14.
    Type: Granted invention patent
    Status: In force

    Publication No.: US07769766B1

    Publication date: 2010-08-03

    Application No.: US10853447

    Filing date: 2004-05-24

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30899

    Abstract: A method and an apparatus to store content rating information have been disclosed. In one embodiment, the method includes receiving a user request to access a web page, sending a domain name system (DNS) request to a first one of a plurality of DNS servers from a content filtering client to get content rating information of the web page in response to the user request, and receiving from the first one DNS server a DNS response containing the content rating information to the content filtering client. Other embodiments have been claimed and described.

    Method and apparatus for identifying data patterns in a file
    16.
    Type: Granted invention patent
    Status: In force

    Publication No.: US08584238B1

    Publication date: 2013-11-12

    Application No.: US13587748

    Filing date: 2012-08-16

    IPC classification: H04L29/06

    Abstract: A method and apparatus for identifying data patterns of a file are described herein. In one embodiment, an exemplary process includes, but is not limited to, receiving a data packet of a data stream containing a file segment of a file originated from an external host and destined to a protected host of a local area network (LAN), the file being transmitted via multiple file segments contained in multiple data packets of the data stream, and performing a data pattern analysis on the received data packet to determine whether the received data packet contains a predetermined data pattern, without waiting for a remainder of the data stream to arrive. Other methods and apparatuses are also described.

    Efficient string search
    17.
    Type: Granted invention patent
    Status: In force

    Publication No.: US08577669B1

    Publication date: 2013-11-05

    Application No.: US13335743

    Filing date: 2011-12-22

    IPC classification: G06F17/28

    Abstract: Some embodiments of an efficient string search have been presented. In one embodiment, a string of bytes representing content written in a non-delimited language is received, wherein the content has been classified into a predetermined category. In a single pass through the string of bytes, a set of N-grams is searched for simultaneously. Statistical information on occurrences of the N-grams, if any, in the string of bytes is collected. In some embodiments, a model is generated based on the statistical information, where the model is usable by a content filter to classify content.
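To make the idea of a single pass over a byte string concrete, here is a minimal Python sketch that counts occurrences of several byte N-grams while walking the string once. The function name `count_ngrams` and the set-lookup approach are assumptions for illustration; the patent may well use a different structure (for example, an automaton) to search for all N-grams simultaneously.

```python
from collections import Counter
from typing import Iterable


def count_ngrams(data: bytes, ngrams: Iterable[bytes]) -> Counter:
    """Count occurrences of each target N-gram in one left-to-right pass."""
    targets = set(ngrams)
    lengths = sorted({len(g) for g in targets})
    counts: Counter = Counter()
    for i in range(len(data)):
        # At each offset, test every target length against the lookup set.
        for n in lengths:
            chunk = data[i:i + n]
            if len(chunk) == n and chunk in targets:
                counts[chunk] += 1
    return counts


# Japanese text has no word delimiters, so matching works on raw bytes.
text = "日本語の日本".encode("utf-8")
wanted = ["日本".encode("utf-8"), "本語".encode("utf-8")]
stats = count_ngrams(text, wanted)
for gram in wanted:
    print(gram.decode("utf-8"), stats[gram])
```

The resulting counts are the kind of statistical information a later model-building step could consume.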

    Training procedure for N-gram-based statistical content classification
    19.
    Type: Granted invention patent
    Status: In force

    Publication No.: US07917522B1

    Publication date: 2011-03-29

    Application No.: US12822439

    Filing date: 2010-06-24

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30705

    Abstract: A training procedure for N-gram based statistical document classification has been disclosed. In one embodiment, a set of N-grams is selected out of a second set of N-grams, each of the N-grams having a sequence of N bytes, where N is an integer. Then a statistical content classification model is generated based on occurrences of the N-grams, if any, in a set of training documents and a set of validation documents. The statistical content classification model is provided to content filters to classify content.
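The abstract leaves the selection criterion and the model form open. Purely as an illustration, the Python sketch below selects the `keep` most discriminative N-grams from a candidate set using smoothed log-odds of document frequency on training documents, then picks a decision threshold on validation documents. Every name here (`train`, `score`, `pick_threshold`), the log-odds weighting, and the accuracy-based threshold are assumptions of this sketch, not the patented procedure.

```python
import math
from collections import Counter
from typing import Dict, List


def doc_freq(ngrams: List[bytes], docs: List[bytes]) -> Counter:
    """Number of documents in which each N-gram occurs at least once."""
    return Counter(g for d in docs for g in ngrams if g in d)


def train(candidates: List[bytes], pos_docs: List[bytes],
          neg_docs: List[bytes], keep: int) -> Dict[bytes, float]:
    """Select `keep` N-grams from the candidates and assign each a weight."""
    pos, neg = doc_freq(candidates, pos_docs), doc_freq(candidates, neg_docs)

    def weight(g: bytes) -> float:
        # Add-one smoothed log-odds of appearing in an in-category document.
        p = (pos[g] + 1) / (len(pos_docs) + 2)
        q = (neg[g] + 1) / (len(neg_docs) + 2)
        return math.log(p) - math.log(q)

    selected = sorted(candidates, key=lambda g: abs(weight(g)), reverse=True)[:keep]
    return {g: weight(g) for g in selected}


def score(model: Dict[bytes, float], doc: bytes) -> float:
    """Sum the weights of the selected N-grams present in the document."""
    return sum(w for g, w in model.items() if g in doc)


def pick_threshold(model: Dict[bytes, float], val_pos: List[bytes],
                   val_neg: List[bytes]) -> float:
    """Choose the observed score that maximises accuracy on validation data."""
    scored = [(score(model, d), True) for d in val_pos]
    scored += [(score(model, d), False) for d in val_neg]
    return max((s for s, _ in scored),
               key=lambda t: sum((s >= t) == label for s, label in scored))


model = train([b"win", b"bet", b"the"],
              pos_docs=[b"bet to win", b"win the bet"],
              neg_docs=[b"the cat", b"the dog sat"], keep=2)
threshold = pick_threshold(model, [b"bet now"], [b"the end"])
print(sorted(model), score(model, b"place a bet") >= threshold)
```

In this toy run the training documents drive selection and weighting, while the validation documents only tune the cut-off, which is one plausible way to use both sets.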
