METHOD OF PROCESSING AN IMAGE TO CLARIFY TEXT IN THE IMAGE
    1.
    Patent application
    Legal status: In force

    Publication No.: US20120188612A1

    Publication date: 2012-07-26

    Application No.: US13013890

    Filing date: 2011-01-26

    IPC classification: H04N1/40

    Abstract: An image file representing at least a portion of a printed document is processed to highlight the differences between foreground material (e.g., text or other characters) and the background. The method includes selecting a neighborhood of pixels, determining a weighted average of an attribute value (e.g., luminance) for each pixel, and modifying each pixel's value based on the weighted average. Gray-level scaling, error diffusion, and a bit-level conversion are also performed so that each pixel ends up with either a first attribute value level (e.g., luminance of 0) or a second attribute value level (e.g., luminance of 255).
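
The stages named in this abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patented method: the 3x3 neighborhood, the kernel weights, the "push away from the local average" rule with its `gain`, the min-max gray-level scaling, and the choice of Floyd-Steinberg error diffusion are all hypothetical choices made for the example. The abstract names the stages but not their parameters.

```python
# Hypothetical sketch: kernel, gain, and threshold are illustrative assumptions.
KERNEL = [[1, 2, 1],
          [2, 4, 2],
          [1, 2, 1]]  # assumed weights for a 3x3 neighborhood


def weighted_average(img, y, x):
    """Weighted mean luminance of the 3x3 neighborhood (borders are clamped)."""
    h, w = len(img), len(img[0])
    total = weight_sum = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            yy = min(max(y + dy, 0), h - 1)
            xx = min(max(x + dx, 0), w - 1)
            k = KERNEL[dy + 1][dx + 1]
            total += k * img[yy][xx]
            weight_sum += k
    return total / weight_sum


def enhance(img, gain=2.0):
    """Move each pixel away from its local weighted average (assumed rule)."""
    out = []
    for y, row in enumerate(img):
        new_row = []
        for x, p in enumerate(row):
            value = p + gain * (p - weighted_average(img, y, x))
            new_row.append(min(255.0, max(0.0, value)))
        out.append(new_row)
    return out


def scale_gray(img):
    """Stretch luminance to the full 0..255 range (min-max scaling)."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:  # flat image: nothing to stretch
        return [[float(p) for p in row] for row in img]
    return [[(p - lo) * 255.0 / (hi - lo) for p in row] for row in img]


def diffuse_to_bilevel(img, threshold=128):
    """Floyd-Steinberg error diffusion; every output pixel is 0 or 255."""
    h, w = len(img), len(img[0])
    work = [[float(p) for p in row] for row in img]
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            old = work[y][x]
            new = 255 if old >= threshold else 0
            out[y][x] = new
            err = old - new
            if x + 1 < w:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < w:
                    work[y + 1][x + 1] += err * 1 / 16
    return out


def clarify_text(img):
    """Run the three stages in the order the abstract lists them."""
    return diffuse_to_bilevel(scale_gray(enhance(img)))
```

Calling `clarify_text` on a list of rows of luminance values (0 to 255) returns an image of the same shape whose pixels are all either 0 or 255.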

    Method of processing an image to clarify text in the image
    2.
    Granted patent
    Legal status: In force

    Publication No.: US08705134B2

    Publication date: 2014-04-22

    Application No.: US13013890

    Filing date: 2011-01-26

    IPC classification: H04N1/40 G06K15/00

    Abstract: An image file representing at least a portion of a printed document is processed to highlight the differences between foreground material (e.g., text or other characters) and the background. The method includes selecting a neighborhood of pixels, determining a weighted average of an attribute value (e.g., luminance) for each pixel, and modifying each pixel's value based on the weighted average. Gray-level scaling, error diffusion, and a bit-level conversion are also performed so that each pixel ends up with either a first attribute value level (e.g., luminance of 0) or a second attribute value level (e.g., luminance of 255).

    System and method for identifying and labeling fields of text associated with scanned business documents
    3.
    Granted patent
    Legal status: In force

    Publication No.: US07965891B2

    Publication date: 2011-06-21

    Application No.: US12710573

    Filing date: 2010-02-23

    IPC classification: G06K9/34

    CPC classification: G06K9/00469

    Abstract: A system for electronically distilling information from a business document uses a network scanner to electronically scan a platen area, having a business document thereon, to create a bitmap. A network server carries out a segmentation process to segment the scan-generated bitmap into a bitmap object, the bitmap object corresponding to the scanned business document; a bitmap-to-text conversion process to convert the bitmap object into a block of text; a semantic recognition process to generate a structured representation of semantic entities corresponding to the scanned business document; and a document generation process to convert the structured representation into a structured text file. The semantic recognition process includes the processes of generating, for each line of text having a keyword therein, a terminal symbol corresponding to the keyword therein; generating, for each line of text having no keyword and no numeric characters therein, an alphabetic terminal symbol; generating, for each line of text having no keyword but having a numeric character therein, an alphanumeric terminal symbol; generating a string of terminal symbols from the generated terminal symbols; determining a probable parsing of the generated string of terminal symbols; labeling each text line, according to a determined function, with non-terminal symbols; and parsing the business document information text into fields of business document information text based upon the non-terminal symbol of each text line and the determined probable parsing of the generated string of terminal symbols.
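
The first steps of the semantic recognition process (one terminal symbol per text line, then a string of terminal symbols) can be sketched as below. The keyword table, the symbol names, and the helper names are hypothetical; the abstract does not list them. The later steps (probable parsing and non-terminal labeling) need a grammar that the abstract does not give, so the sketch stops at the terminal string.

```python
import re

# Hypothetical keyword table: the abstract does not say which keywords are used.
KEYWORDS = {"phone": "PHONE_KW", "fax": "FAX_KW", "email": "EMAIL_KW"}


def terminal_symbol(line):
    """Map one line of recognized text to a terminal symbol.

    Keyword present             -> the symbol for that keyword
    No keyword, no digits       -> ALPHABETIC
    No keyword, at least 1 digit -> ALPHANUMERIC
    """
    for word in re.findall(r"[a-z]+", line.lower()):
        if word in KEYWORDS:
            return KEYWORDS[word]
    if any(ch.isdigit() for ch in line):
        return "ALPHANUMERIC"
    return "ALPHABETIC"


def terminal_string(lines):
    """Terminal symbols for all non-empty lines, in document order."""
    return [terminal_symbol(line) for line in lines if line.strip()]
```

A parser for the probable-parsing step would take the list returned by `terminal_string` as its input.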

    SYSTEM AND METHOD FOR IDENTIFYING AND LABELING FIELDS OF TEXT ASSOCIATED WITH SCANNED BUSINESS DOCUMENTS
    4.
    Patent application
    Legal status: In force

    Publication No.: US20100149606A1

    Publication date: 2010-06-17

    Application No.: US12710573

    Filing date: 2010-02-23

    IPC classification: G06F17/00 G06K9/34 H04N1/04

    CPC classification: G06K9/00469

    Abstract: A system for electronically distilling information from a business document uses a network scanner to electronically scan a platen area, having a business document thereon, to create a bitmap. A network server carries out a segmentation process to segment the scan-generated bitmap into a bitmap object, the bitmap object corresponding to the scanned business document; a bitmap-to-text conversion process to convert the bitmap object into a block of text; a semantic recognition process to generate a structured representation of semantic entities corresponding to the scanned business document; and a document generation process to convert the structured representation into a structured text file. The semantic recognition process includes the processes of generating, for each line of text having a keyword therein, a terminal symbol corresponding to the keyword therein; generating, for each line of text having no keyword and no numeric characters therein, an alphabetic terminal symbol; generating, for each line of text having no keyword but having a numeric character therein, an alphanumeric terminal symbol; generating a string of terminal symbols from the generated terminal symbols; determining a probable parsing of the generated string of terminal symbols; labeling each text line, according to a determined function, with non-terminal symbols; and parsing the business document information text into fields of business document information text based upon the non-terminal symbol of each text line and the determined probable parsing of the generated string of terminal symbols.

    System and method for identifying and labeling fields of text associated with scanned business documents
    5.
    Granted patent

    Publication No.: US07689037B2

    Publication date: 2010-03-30

    Application No.: US10970930

    Filing date: 2004-10-22

    IPC classification: G06K9/34

    CPC classification: G06K9/00469

    Abstract: A system for electronically distilling information from a business document uses a network scanner to electronically scan a platen area, having a business document thereon, to create a bitmap. A network server carries out a segmentation process to segment the scan-generated bitmap into a bitmap object, the bitmap object corresponding to the scanned business document; a bitmap-to-text conversion process to convert the bitmap object into a block of text; a semantic recognition process to generate a structured representation of semantic entities corresponding to the scanned business document; and a document generation process to convert the structured representation into a structured text file. The semantic recognition process includes the processes of generating, for each line of text having a keyword therein, a terminal symbol corresponding to the keyword therein; generating, for each line of text having no keyword and no numeric characters therein, an alphabetic terminal symbol; generating, for each line of text having no keyword but having a numeric character therein, an alphanumeric terminal symbol; generating a string of terminal symbols from the generated terminal symbols; determining a probable parsing of the generated string of terminal symbols; labeling each text line, according to a determined function, with non-terminal symbols; and parsing the business document information text into fields of business document information text based upon the non-terminal symbol of each text line and the determined probable parsing of the generated string of terminal symbols.

    SYSTEM AND METHOD FOR IDENTIFYING AND LABELING FIELDS OF TEXT ASSOCIATED WITH SCANNED BUSINESS DOCUMENTS
    6.
    Patent application

    Publication No.: US20100150397A1

    Publication date: 2010-06-17

    Application No.: US12710568

    Filing date: 2010-02-23

    IPC classification: G06K9/00 G06K9/34

    CPC classification: G06K9/00469

    Abstract: A system for electronically distilling information from a business document uses a network scanner to electronically scan a platen area, having a business document thereon, to create a bitmap. A network server carries out a segmentation process to segment the scan-generated bitmap into a bitmap object, the bitmap object corresponding to the scanned business document; a bitmap-to-text conversion process to convert the bitmap object into a block of text; a semantic recognition process to generate a structured representation of semantic entities corresponding to the scanned business document; and a document generation process to convert the structured representation into a structured text file. The semantic recognition process includes the processes of generating, for each line of text having a keyword therein, a terminal symbol corresponding to the keyword therein; generating, for each line of text having no keyword and no numeric characters therein, an alphabetic terminal symbol; generating, for each line of text having no keyword but having a numeric character therein, an alphanumeric terminal symbol; generating a string of terminal symbols from the generated terminal symbols; determining a probable parsing of the generated string of terminal symbols; labeling each text line, according to a determined function, with non-terminal symbols; and parsing the business document information text into fields of business document information text based upon the non-terminal symbol of each text line and the determined probable parsing of the generated string of terminal symbols.

    System and method for identifying and labeling fields of text associated with scanned business documents
    7.
    Granted patent
    Legal status: In force

    Publication No.: US07860312B2

    Publication date: 2010-12-28

    Application No.: US12710568

    Filing date: 2010-02-23

    IPC classification: G06K9/34

    CPC classification: G06K9/00469

    Abstract: A system for electronically distilling information from a business document uses a network scanner to electronically scan a platen area, having a business document thereon, to create a bitmap. A network server carries out a segmentation process to segment the scan-generated bitmap into a bitmap object, the bitmap object corresponding to the scanned business document; a bitmap-to-text conversion process to convert the bitmap object into a block of text; a semantic recognition process to generate a structured representation of semantic entities corresponding to the scanned business document; and a document generation process to convert the structured representation into a structured text file. The semantic recognition process includes the processes of generating, for each line of text having a keyword therein, a terminal symbol corresponding to the keyword therein; generating, for each line of text having no keyword and no numeric characters therein, an alphabetic terminal symbol; generating, for each line of text having no keyword but having a numeric character therein, an alphanumeric terminal symbol; generating a string of terminal symbols from the generated terminal symbols; determining a probable parsing of the generated string of terminal symbols; labeling each text line, according to a determined function, with non-terminal symbols; and parsing the business document information text into fields of business document information text based upon the non-terminal symbol of each text line and the determined probable parsing of the generated string of terminal symbols.

    PARKING AVAILABILITY DETECTION WITH HUMAN SENSING
    8.
    Patent application
    Legal status: Under examination (published)

    Publication No.: US20130117077A1

    Publication date: 2013-05-09

    Application No.: US13289600

    Filing date: 2011-11-04

    IPC classification: G07B15/00

    Abstract: Providing a means for people to input their observations can reduce the need for sensor deployments because humans have excellent sensing abilities. One thing people tend to observe carefully is parking availability. Parking meters and pay stations can ask people to enter their observations of parking availability and other environmental factors. The observations, being numerical in nature, can be processed to determine reasonable parking fees, the likelihood of violators in an area, and the statistical, observed, or estimated dispersion of available parking within a geographic region.
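
As a rough illustration of how numerical observations might be processed, the sketch below averages the free-space counts reported for each block and then measures how unevenly availability is spread across blocks. The `(block_id, reported_free_spaces)` record shape and the use of a population standard deviation as the "dispersion" are assumptions for the example; the abstract does not specify either.

```python
from collections import defaultdict
from statistics import mean, pstdev


def summarize(observations):
    """Aggregate human-entered observations.

    observations: iterable of (block_id, reported_free_spaces) pairs
    (a hypothetical record shape). Returns the mean report per block and
    the population standard deviation of those means across blocks.
    """
    by_block = defaultdict(list)
    for block_id, free_spaces in observations:
        by_block[block_id].append(free_spaces)
    block_means = {block: mean(values) for block, values in by_block.items()}
    means = list(block_means.values())
    dispersion = pstdev(means) if len(means) > 1 else 0.0
    return block_means, dispersion
```

A high dispersion value here would indicate that free spaces are concentrated in a few blocks rather than spread evenly over the region.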

    Systems and methods to detect models and accounts with anomalous revenue from color impressions
    9.
    Granted patent
    Legal status: In force

    Publication No.: US08352298B2

    Publication date: 2013-01-08

    Application No.: US12702052

    Filing date: 2010-02-08

    IPC classification: G06Q10/00

    CPC classification: G06Q10/10

    Abstract: Methods and systems for identifying device models or accounts exhibiting outlying behavior are disclosed. For a method of identifying a device model exhibiting outlying behavior, a processor may receive a color impression count, a monochrome impression count, and a device model for each of a plurality of devices. A proportion of color revenue may be determined for each device based on the color impression count and the monochrome impression count. The processor may determine, for each device model, a distribution of the proportion of color revenue for the one or more devices having the device model and may automatically identify one or more distributions of the proportion of color revenue exhibiting outlying behavior. Each distribution is associated with a device model.
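
The flow described in this abstract (per-device proportion, per-model distribution, outlier identification) might look like the sketch below. The per-impression prices and the outlier rule are assumptions: the abstract does not say how revenue is priced or how "outlying behavior" is detected, so the example compares each model's median proportion with the median across models, scaled by the median absolute deviation (MAD).

```python
from collections import defaultdict
from statistics import median

# Assumed per-impression prices; the abstract gives no pricing.
COLOR_PRICE = 0.08
MONO_PRICE = 0.01


def color_revenue_proportion(color_count, mono_count):
    """Share of a device's revenue that comes from color impressions."""
    color_revenue = color_count * COLOR_PRICE
    total = color_revenue + mono_count * MONO_PRICE
    return color_revenue / total if total else 0.0


def proportions_by_model(devices):
    """devices: iterable of (model, color_count, mono_count) tuples."""
    distributions = defaultdict(list)
    for model, color_count, mono_count in devices:
        distributions[model].append(
            color_revenue_proportion(color_count, mono_count))
    return dict(distributions)


def outlying_models(distributions, cutoff=3.0):
    """Models whose median proportion lies more than `cutoff` MADs from
    the median over all models (an assumed, simple outlier rule)."""
    medians = {m: median(values) for m, values in distributions.items()}
    center = median(medians.values())
    mad = median(abs(v - center) for v in medians.values())
    if mad == 0:
        return []
    return sorted(m for m, v in medians.items()
                  if abs(v - center) / mad > cutoff)
```

For example, `outlying_models(proportions_by_model(devices))` returns the sorted names of the flagged models, or an empty list when no model stands out.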

    Method of analyzing documents
    10.
    Granted patent
    Legal status: In force

    Publication No.: US08266077B2

    Publication date: 2012-09-11

    Application No.: US12241903

    Filing date: 2008-09-30

    Applicant: John C. Handley

    Inventor: John C. Handley

    IPC classification: G06F17/00 G06N5/00

    Abstract: A method for analyzing documents is disclosed. The method compares concepts consisting of groups of terms for similarity within a corpus of documents and clusters together documents that contain certain concept term sets. It may also rank the documents within each cluster according to the frequency of term co-occurrence within the concepts.
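
The clustering and ranking steps can be illustrated with a small sketch. Several details are assumptions made for the example: a concept is a plain set of lowercase terms, co-occurrence means two or more concept terms appearing in the same sentence, and a document joins a concept's cluster when it has at least one such sentence. The concept-similarity comparison mentioned in the abstract is not covered here.

```python
import re


def sentence_word_sets(text):
    """Split text into sentences and return the set of words in each."""
    return [set(re.findall(r"[a-z]+", sentence.lower()))
            for sentence in re.split(r"[.!?]", text) if sentence.strip()]


def cooccurrence_score(text, concept_terms):
    """Number of sentences containing at least two terms of the concept."""
    return sum(1 for words in sentence_word_sets(text)
               if len(words & concept_terms) >= 2)


def cluster_and_rank(docs, concepts):
    """docs: {doc_id: text}; concepts: {concept_name: set of terms}.

    Returns {concept_name: [doc_id, ...]} with each cluster ordered by
    descending co-occurrence score (ties broken by doc_id).
    """
    clusters = {}
    for name, terms in concepts.items():
        scored = [(cooccurrence_score(text, terms), doc_id)
                  for doc_id, text in docs.items()]
        members = [(score, doc_id) for score, doc_id in scored if score > 0]
        members.sort(key=lambda item: (-item[0], item[1]))
        clusters[name] = [doc_id for _, doc_id in members]
    return clusters
```

Documents that never mention two terms of a concept together are left out of that concept's cluster.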
