-
Publication No.: US20130145255A1
Publication date: 2013-06-06
Application No.: US13817366
Filing date: 2010-08-20
Applicant: Li-Wei Zheng, Jian-Ming Jin, Suk Hwan Lim, Jian Fan, Hui-Man Hou, Shi-Jun Tian
Inventor: Li-Wei Zheng, Jian-Ming Jin, Suk Hwan Lim, Jian Fan, Hui-Man Hou, Shi-Jun Tian
IPC classification: G06F17/21
CPC classification: G06F17/211, G06F16/9535, G06F16/986
Abstract: A system and method for selectively filtering web page contents are disclosed. In one example embodiment, a document object model (DOM) structure and visual information of the web page contents are generated. The DOM structure and the visual information are analyzed to determine multiple web page content attributes. One or more filtering parameters are selected from the multiple web page content attributes. The web page is filtered based on the one or more filtering parameters.
-
Publication No.: US08867837B2
Publication date: 2014-10-21
Application No.: US13812421
Filing date: 2010-07-30
Applicant: Hui-Man Hou, Li-Wei Zheng, Jian-Ming Jin, Jian Fan, Suk Hwan Lim
Inventor: Hui-Man Hou, Li-Wei Zheng, Jian-Ming Jin, Jian Fan, Suk Hwan Lim
IPC classification: G06K9/34, C07D309/28, G06K9/00
CPC classification: G06K9/00463, C07D309/28
Abstract: A system and method of detecting separator lines in a web page may include determining coordinates of visible web elements on a web page, generating an edge image of the web page based on the coordinates of the web elements, filtering edges belonging to non-separator line elements within the edge image, detecting horizontal lines within the edge image, detecting vertical lines within the edge image, and filtering short lines within the edge image. A system for detecting separator lines in a web page may include a memory device, and a processor communicatively coupled to the memory, in which the processor determines coordinates of visible web elements on a web page, generates an edge image of the web page based on the coordinates of the web elements, filters edges belonging to non-separator line elements within the edge image, detects horizontal lines within the edge image, detects vertical lines within the edge image, and filters short lines within the edge image.
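The steps named in this abstract (edge image from element coordinates, horizontal and vertical line detection, short-line filtering) can be illustrated with a minimal Python sketch. This is not the patented method: the rectangle format, the function names, and the `min_len` threshold are assumptions made for illustration, and the step that filters out non-separator elements is assumed to have happened before the rectangles are passed in.

```python
from typing import List, Tuple

# Hypothetical box format: (left, top, right, bottom), right/bottom exclusive.
Rect = Tuple[int, int, int, int]


def edge_image(rects: List[Rect], width: int, height: int) -> List[List[int]]:
    """Draw the 1-pixel border of each element box into a binary image."""
    img = [[0] * width for _ in range(height)]
    for left, top, right, bottom in rects:
        for x in range(left, right):
            img[top][x] = 1
            img[bottom - 1][x] = 1
        for y in range(top, bottom):
            img[y][left] = 1
            img[y][right - 1] = 1
    return img


def horizontal_lines(img: List[List[int]], min_len: int) -> List[Tuple[int, int, int]]:
    """Return (row, x_start, x_end) for runs of edge pixels of at least min_len.

    Shorter runs are dropped, which plays the role of the short-line filter.
    """
    lines = []
    for y, row in enumerate(img):
        start = None
        for x, value in enumerate(row + [0]):  # sentinel closes a trailing run
            if value and start is None:
                start = x
            elif not value and start is not None:
                if x - start >= min_len:
                    lines.append((y, start, x - 1))
                start = None
    return lines


def vertical_lines(img: List[List[int]], min_len: int) -> List[Tuple[int, int, int]]:
    """Return (column, y_start, y_end) by running the row scan on the transpose."""
    transposed = [list(col) for col in zip(*img)]
    return horizontal_lines(transposed, min_len)


img = edge_image([(0, 0, 10, 4)], width=12, height=6)
print(horizontal_lines(img, min_len=5))
print(vertical_lines(img, min_len=4))
```

A real implementation would obtain the boxes from a rendering engine; here they are supplied by hand.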
-
Publication No.: US20130124684A1
Publication date: 2013-05-16
Application No.: US13812092
Filing date: 2010-07-30
Applicant: Li-Wei Zheng, Jian Fan, Hui-Man Hou, Jian Ming Jin, Suk Hwan Lim
Inventor: Li-Wei Zheng, Jian Fan, Hui-Man Hou, Jian Ming Jin, Suk Hwan Lim
IPC classification: H04L29/08
CPC classification: H04L29/0809, G06F17/272
Abstract: A method for detection of visual separators in web pages using code analysis includes receiving a web page and its associated web code by a web page analysis device and analyzing the web code to detect visual separators in the web page. A web page analysis device for visual separator detection in web pages is also provided.
-
Publication No.: US20130212498A1
Publication date: 2013-08-15
Application No.: US13812800
Filing date: 2010-07-30
Applicant: Suk Hwan Lim, Hui-Man Hou, Li-Wei Zheng, Jian-Ming Jin, Marie Bird Struckaman, Rachel L. Ramaswami, Hua Zhang, Yue Yuan
Inventor: Suk Hwan Lim, Hui-Man Hou, Li-Wei Zheng, Jian-Ming Jin, Marie Bird Struckaman, Rachel L. Ramaswami, Hua Zhang, Yue Yuan
IPC classification: G06F3/0481
CPC classification: G06F3/0481, G06F16/9577
Abstract: A system and method of selecting content within a web page (110, 300) may include, with a processor (125), determining spatial coordinates of a plurality of nodes (210 through 285) within the web page (110, 300), recording coordinates of a drawn portion (610) of the web page (110, 300), and determining, with the processor (125), a number of corresponding regions (710, 910) for the drawn portion (610) of the web page (110, 300) based on the spatial coordinates of the nodes (210 through 285).
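As a rough illustration of mapping a drawn portion to page regions, the sketch below takes the bounding box of the recorded stroke points and returns the nodes whose boxes lie entirely inside it. The containment rule, the box format, and all names are assumptions for illustration; the abstract does not state how corresponding regions are determined.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, int]
Box = Tuple[int, int, int, int]  # hypothetical format: (left, top, right, bottom)


def bounding_box(points: List[Point]) -> Box:
    """Smallest axis-aligned box that contains every recorded stroke point."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def nodes_in_region(node_boxes: Dict[str, Box], drawn: List[Point]) -> List[str]:
    """Names of nodes whose boxes are fully contained in the drawn region."""
    left, top, right, bottom = bounding_box(drawn)
    return [
        name
        for name, (n_left, n_top, n_right, n_bottom) in node_boxes.items()
        if n_left >= left and n_top >= top and n_right <= right and n_bottom <= bottom
    ]


node_boxes = {
    "header": (0, 0, 100, 20),
    "article": (0, 30, 60, 90),
    "sidebar": (70, 30, 100, 90),
}
stroke = [(0, 25), (65, 25), (65, 95), (0, 95)]
print(nodes_in_region(node_boxes, stroke))
```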
-
Publication No.: US20130204867A1
Publication date: 2013-08-08
Application No.: US13812434
Filing date: 2010-07-30
Applicant: Suk Hwan Lim, Li-Wei Zheng, Jian-Ming Jin, Hui-Man Hou
Inventor: Suk Hwan Lim, Li-Wei Zheng, Jian-Ming Jin, Hui-Man Hou
IPC classification: G06F17/30
CPC classification: G06F16/24578, G06F17/2745
Abstract: A system and method for selecting main content (350) from web pages includes receiving a web page (205) by a web page analysis device (105) and scoring sub-trees (209) within the web page (205). The single sub-tree (225) with the highest final score is selected as the main content (350) of the web page (205).
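The score-and-select idea can be sketched as follows. The abstract does not say how sub-trees are scored, so the scoring rule here (characters of plain text minus characters of link text) is purely an assumption, as are the `Node` class and function names.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    """Hypothetical stand-in for a DOM element."""
    tag: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def score(node: Node, in_link: bool = False) -> int:
    """Assumed score: plain-text characters count for, link text counts against."""
    in_link = in_link or node.tag == "a"
    own = -len(node.text) if in_link else len(node.text)
    return own + sum(score(child, in_link) for child in node.children)


def subtrees(node: Node) -> Iterator[Node]:
    """Yield every sub-tree root in document order."""
    yield node
    for child in node.children:
        yield from subtrees(child)


def select_main_content(root: Node) -> Node:
    """Pick the single sub-tree with the highest score."""
    return max(subtrees(root), key=score)


nav = Node("div", children=[Node("a", "Home"), Node("a", "About")])
article = Node("div", children=[
    Node("p", "A long paragraph of article text."),
    Node("p", "More text"),
])
page = Node("body", children=[nav, article])
print(select_main_content(page).tag)
```

In this toy page the navigation links pull the score of `body` below that of the article `div`, so the article sub-tree wins.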
-
Publication No.: US08560940B2
Publication date: 2013-10-15
Application No.: US13220351
Filing date: 2011-08-29
Applicant: Hui-Man Hou, Jian-Ming Jin, Li-Mei Jiao, Suk Hwan Lim
Inventor: Hui-Man Hou, Jian-Ming Jin, Li-Mei Jiao, Suk Hwan Lim
IPC classification: G06F17/00
CPC classification: G06F17/30536, G06F17/2247, G06F17/30702
Abstract: An exemplary embodiment may generate a DOM-tree and generate a signal based on the DOM-tree and a node list. The signal may be analyzed and nodes may be selected within the signal to form a periodic wave. Repeat patterns may be detected using the periodic wave and the nodes.
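One way to picture treating a DOM tree as a signal is to flatten it into a pre-order sequence of tag names and look for the lag at which the sequence best matches a shifted copy of itself. The sketch below does exactly that; it is an assumed simplification, not the signal construction or wave analysis of the patent, and the tuple-based tree format and function names are hypothetical.

```python
from typing import List, Optional, Tuple

Tree = Tuple[str, list]  # hypothetical node format: (tag, list of child nodes)


def preorder_tags(node: Tree) -> List[str]:
    """Flatten the tree into a sequence of tag names (the 'signal')."""
    tag, children = node
    out = [tag]
    for child in children:
        out.extend(preorder_tags(child))
    return out


def best_period(signal: List[str], max_lag: Optional[int] = None) -> int:
    """Return the lag with the highest self-match ratio, or 0 if nothing repeats.

    Ties are resolved in favour of the smaller lag.
    """
    n = len(signal)
    max_lag = max_lag or n // 2
    best_lag, best_ratio = 0, 0.0
    for lag in range(1, max_lag + 1):
        matches = sum(1 for i in range(n - lag) if signal[i] == signal[i + lag])
        ratio = matches / (n - lag)
        if ratio > best_ratio:
            best_lag, best_ratio = lag, ratio
    return best_lag


item = ("li", [("a", []), ("img", [])])
page = ("ul", [item, item, item, item])
print(best_period(preorder_tags(page)))
```

A repeated list item of three nodes shows up as a period of three in the flattened sequence.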
-
Publication No.: US09047653B2
Publication date: 2015-06-02
Application No.: US13818460
Filing date: 2010-08-24
Applicant: Hui-Man Hou, Jian-Ming Jin, Yuhong Xiong
Inventor: Hui-Man Hou, Jian-Ming Jin, Yuhong Xiong
CPC classification: G06T5/00, G06T3/4038, G06T5/50, G06T2207/20221
Abstract: Disclosed is a method of blending stitched document image portions. The method identifies background pixels and foreground pixels on each boundary of the image portions. Pixels of the image portions are then modified based on a pixel value difference between corresponding background pixels on the respective boundary of the first and second portions.
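The background-difference idea can be illustrated on grayscale values. In the sketch below, a pixel counts as background when it is at least as bright as a threshold (an assumption that suits dark text on light paper), the offset is the mean difference over boundary positions where both portions show background, and the offset is added to every pixel of the second portion. The threshold, the uniform offset, and the function names are all assumptions, not details from the patent.

```python
from typing import List


def background_offset(edge_a: List[int], edge_b: List[int], threshold: int = 128) -> float:
    """Mean value difference over positions where both boundaries are background."""
    diffs = [
        a - b
        for a, b in zip(edge_a, edge_b)
        if a >= threshold and b >= threshold
    ]
    return sum(diffs) / len(diffs) if diffs else 0.0


def shift_portion(portion: List[List[int]], offset: float) -> List[List[int]]:
    """Add the offset to every pixel, clamped to the 0..255 range."""
    return [
        [min(255, max(0, round(pixel + offset))) for pixel in row]
        for row in portion
    ]


edge_a = [200, 200, 30, 200]   # boundary row of the first portion
edge_b = [180, 180, 25, 180]   # matching boundary row of the second portion
offset = background_offset(edge_a, edge_b)
print(offset)
print(shift_portion([[180, 25], [250, 180]], offset))
```

The dark pixel at the third position is treated as foreground and does not contribute to the offset.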
-
Publication No.: US08856247B2
Publication date: 2014-10-07
Application No.: US13258466
Filing date: 2009-08-18
Applicant: Jian-Ming Jin, Yuhong Xiong, Hui-Man Hou, Wei Liu
Inventor: Jian-Ming Jin, Yuhong Xiong, Hui-Man Hou, Wei Liu
CPC classification: H04L12/5855, H04L51/14
Abstract: Proposed is the use of an email stamp for representing an email address. Because the email stamp comprises information about one or more email addresses of a recipient, it may be processed in accordance with an optical recognition process so as to identify the email address of the recipient and enable an email to be automatically sent to the recipient.
-
Publication No.: US20130155463A1
Publication date: 2013-06-20
Application No.: US13812104
Filing date: 2009-07-30
Applicant: Jian-Ming Jin, Liwei Zheng, Xi Wang Zhuang, Suk Hwan Lim, Hui-Man Hou
Inventor: Jian-Ming Jin, Liwei Zheng, Xi Wang Zhuang, Suk Hwan Lim, Hui-Man Hou
IPC classification: G06F3/0484, G06F3/12
CPC classification: G06F3/04842, G06F3/12, G06F17/272
Abstract: A method for selecting user desirable content from web pages includes receiving a web page, representing the web page as a Document Object Model (DOM) tree, computing visual and coordinate information of each DOM node within the DOM tree, determining the desirable DOM path, determining the desirable DOM node from the desirable DOM path, and selecting a single DOM node with the highest final score. The single DOM node with the highest final score is selected as the user desirable content of the web page.
-
Publication No.: US20130031461A1
Publication date: 2013-01-31
Application No.: US13220351
Filing date: 2011-08-29
Applicant: Hui-Man Hou, Jian-Ming Jin, Li-Mei Jiao, Suk Hwan Lim
Inventor: Hui-Man Hou, Jian-Ming Jin, Li-Mei Jiao, Suk Hwan Lim
IPC classification: G06F17/00
CPC classification: G06F17/30536, G06F17/2247, G06F17/30702
Abstract: An exemplary embodiment may generate a DOM-tree and generate a signal based on the DOM-tree and a node list. The signal may be analyzed and nodes may be selected within the signal to form a periodic wave. Repeat patterns may be detected using the periodic wave and the nodes.