Patent search: ap:("Suk Hwan Lim" OR "Li-Wei Zheng" OR "Jian-Ming Jin" OR "Hui-Man Hou") AND inv:"Suk Hwan Lim" (page 1)

1.

Patent application
Selecting Content Within a Web Page (status: under examination, published)

Publication No.: US20130212498A1

Publication date: 2013-08-15

Application No.: US13812800

Filing date: 2010-07-30

Applicants: Suk Hwan Lim, Hui-Man Hou, Li-Wei Zheng, Jian-Ming Jin, Marie Bird Struckaman, Rachel L. Ramaswami, Hua Zhang, Yue Yuan

Inventors: Suk Hwan Lim, Hui-Man Hou, Li-Wei Zheng, Jian-Ming Jin, Marie Bird Struckaman, Rachel L. Ramaswami, Hua Zhang, Yue Yuan

IPC classification: G06F3/0481

CPC classification: G06F3/0481, G06F16/9577

Abstract: A system and method of selecting content within a web page (110, 300) may include, with a processor (125), determining spatial coordinates of a plurality of nodes (210 through 285) within the web page (110, 300), recording coordinates of a drawn portion (610) of the web page (110, 300), and determining, with the processor (125), a number of corresponding regions (710, 910) for the drawn portion (610) of the web page (110, 300) based on the spatial coordinates of the nodes (210 through 285).
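
Illustrative sketch (not taken from the patent): one simple way to map a drawn portion onto page nodes is to compare the bounding box of the stroke with each node's bounding box. The box layout, the `min_cover` ratio, and all function names below are assumptions made for this example only.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom)
Point = Tuple[float, float]


def stroke_bounds(points: List[Point]) -> Box:
    """Bounding box of the recorded stroke coordinates."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def overlap_area(a: Box, b: Box) -> float:
    """Area of the intersection of two boxes, or 0.0 if they do not overlap."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0.0


def candidate_regions(nodes: Dict[str, Box], points: List[Point],
                      min_cover: float = 0.5) -> List[str]:
    """Ids of nodes with at least min_cover of their area inside the stroke box."""
    bounds = stroke_bounds(points)
    picked = []
    for node_id, box in nodes.items():
        area = (box[2] - box[0]) * (box[3] - box[1])
        if area > 0 and overlap_area(box, bounds) / area >= min_cover:
            picked.append(node_id)
    return picked


# Hypothetical page layout and a roughly rectangular stroke around the article.
nodes = {"header": (0, 0, 100, 20), "article": (0, 30, 60, 90), "sidebar": (70, 30, 100, 90)}
stroke = [(5, 35), (55, 35), (55, 85), (5, 85)]
print(candidate_regions(nodes, stroke))
```

A coverage ratio is used here so that a loosely drawn stroke still selects the node it mostly encloses; the abstract itself does not say how regions are matched.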

2.

Patent application
Selection of Main Content in Web Pages (status: under examination, published)

Publication No.: US20130204867A1

Publication date: 2013-08-08

Application No.: US13812434

Filing date: 2010-07-30

Applicants: Suk Hwan Lim, Li-Wei Zheng, Jian-Ming Jin, Hui-Man Hou

Inventors: Suk Hwan Lim, Li-Wei Zheng, Jian-Ming Jin, Hui-Man Hou

IPC classification: G06F17/30

CPC classification: G06F16/24578, G06F17/2745

Abstract: A system and method for selecting main content (350) from web pages includes receiving a web page (205) by a web page analysis device (105) and scoring sub-trees (209) within the web page (205). The single sub-tree (225) with the highest final score is selected as the main content (350) of the webpage (205).
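
Illustrative sketch (not taken from the patent): the abstract only says that sub-trees are scored and the highest-scoring one is selected. The scoring rule below (text length minus a penalty for link text) and the `Node` class are assumptions chosen to make the idea concrete.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    tag: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def text_stats(node: Node, in_link: bool = False) -> Tuple[int, int]:
    """Return (all characters, characters inside <a> elements) for a sub-tree."""
    in_link = in_link or node.tag == "a"
    total = len(node.text)
    linked = total if in_link else 0
    for child in node.children:
        child_total, child_linked = text_stats(child, in_link)
        total += child_total
        linked += child_linked
    return total, linked


def score(node: Node) -> int:
    """Assumed score: reward text, penalise link text twice over."""
    total, linked = text_stats(node)
    return total - 2 * linked


def subtrees(node: Node) -> Iterator[Node]:
    """Yield the node and every descendant, each the root of one sub-tree."""
    yield node
    for child in node.children:
        yield from subtrees(child)


def main_content(root: Node) -> Node:
    """Select the single sub-tree with the highest score."""
    return max(subtrees(root), key=score)


nav = Node("ul", children=[Node("a", "Home"), Node("a", "Contact")])
article = Node("div", children=[Node("p", "First paragraph of the story."),
                                Node("p", "Second paragraph of the story.")])
page = Node("body", children=[nav, article])
print(main_content(page).tag)
```

The link penalty keeps the page root, which contains everything, from always winning; any other scoring rule could be swapped in.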

3.

Patent application
SYSTEMS AND METHODS FOR FILTERING WEB PAGE CONTENTS (status: under examination, published)

Publication No.: US20130145255A1

Publication date: 2013-06-06

Application No.: US13817366

Filing date: 2010-08-20

Applicants: Li-Wei Zheng, Jian-Ming Jin, Suk Hwan Lim, Jian Fan, Hui-Man Hou, Shi-Jun Tian

Inventors: Li-Wei Zheng, Jian-Ming Jin, Suk Hwan Lim, Jian Fan, Hui-Man Hou, Shi-Jun Tian

IPC classification: G06F17/21

CPC classification: G06F17/211, G06F16/9535, G06F16/986

Abstract: A system and method for selectively filtering web page contents are disclosed. In one example embodiment a document object model (DOM) structure and visual information of the web page contents are generated. The document object model (DOM) structure and the visual information are analyzed to determine multiple web page content attributes. One or more filtering parameters are selected from the multiple web page content attributes. The web page is filtered based on the one or more filtering parameters.

4.

Granted patent
Detecting separator lines in a web page (status: granted, in force)

Publication No.: US08867837B2

Publication date: 2014-10-21

Application No.: US13812421

Filing date: 2010-07-30

Applicants: Hui-Man Hou, Li-Wei Zheng, Jian-Ming Jin, Jian Fan, Suk Hwan Lim

Inventors: Hui-Man Hou, Li-Wei Zheng, Jian-Ming Jin, Jian Fan, Suk Hwan Lim

IPC classification: G06K9/34, C07D309/28, G06K9/00

CPC classification: G06K9/00463, C07D309/28

Abstract: A system and method of detecting separator lines in a web page may include determining coordinates of visible web elements on a web page, generating an edge image of the web page based on the coordinates of the web elements, filtering edges belonging to non-separator line elements within the edge image, detecting horizontal lines within the edge image, detecting vertical lines within the edge image, and filtering short lines within the edge image. A system for detecting separator lines in a web page may include a memory device, and a processor communicatively coupled to the memory, in which the processor determines coordinates of visible web elements on a web page, generates an edge image of the web page based on the coordinates of the web elements, filters edges belonging to non-separator line elements within the edge image, detects horizontal lines within the edge image, detects vertical lines within the edge image, and filters short lines within the edge image.
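
Illustrative sketch (not taken from the patent): the steps named in the abstract can be approximated on a small boolean grid. Element boxes are rasterised into an edge image, runs of set pixels are collected along rows and columns, and runs shorter than `min_len` are dropped. The step that filters edges of non-separator elements is omitted, and all names and the pixel-grid representation are assumptions.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), inclusive pixels


def edge_image(boxes: List[Box], width: int, height: int) -> List[List[bool]]:
    """Draw the outline of every element box onto an empty image."""
    img = [[False] * width for _ in range(height)]
    for left, top, right, bottom in boxes:
        for x in range(left, right + 1):
            img[top][x] = img[bottom][x] = True
        for y in range(top, bottom + 1):
            img[y][left] = img[y][right] = True
    return img


def runs(line: List[bool], min_len: int) -> List[Tuple[int, int]]:
    """(start, end) index pairs of runs of True that are at least min_len long."""
    found, start = [], None
    for i, on in enumerate(line + [False]):  # sentinel closes a trailing run
        if on and start is None:
            start = i
        elif not on and start is not None:
            if i - start >= min_len:
                found.append((start, i - 1))
            start = None
    return found


def detect_lines(img: List[List[bool]], min_len: int):
    """Return horizontal lines as (y, x0, x1) and vertical lines as (x, y0, y1)."""
    horizontal = [(y, s, e) for y, row in enumerate(img) for s, e in runs(row, min_len)]
    columns = [list(col) for col in zip(*img)]
    vertical = [(x, s, e) for x, col in enumerate(columns) for s, e in runs(col, min_len)]
    return horizontal, vertical


# One wide, flat element on a 10 x 6 page; only its long edges survive min_len=5.
image = edge_image([(1, 1, 8, 3)], width=10, height=6)
print(detect_lines(image, min_len=5))
```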

5.

Patent application
VISUAL SEPARATOR DETECTION IN WEB PAGES USING CODE ANALYSIS (status: under examination, published)

Publication No.: US20130124684A1

Publication date: 2013-05-16

Application No.: US13812092

Filing date: 2010-07-30

Applicants: Li-Wei Zheng, Jian Fan, Hui-Man Hou, Jian Ming Jin, Suk Hwan Lim

Inventors: Li-Wei Zheng, Jian Fan, Hui-Man Hou, Jian Ming Jin, Suk Hwan Lim

IPC classification: H04L29/08

CPC classification: H04L29/0809, G06F17/272

Abstract: A method for detection of visual separators in web pages using code analysis includes receiving a web page and its associated web code by a web page analysis device and analyzing the web code to detect visual separators in the web page. A web page analysis device for visual separator detection in web pages is also provided.
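
Illustrative sketch (not taken from the patent): the abstract does not state which code features indicate a separator. As an assumption for this example, `<hr>` elements and inline `border-top` / `border-bottom` styles are treated as separator hints, using Python's standard `html.parser` module.

```python
from html.parser import HTMLParser
from typing import List, Tuple


class SeparatorFinder(HTMLParser):
    """Collect (tag, reason) pairs for elements that look like visual separators."""

    def __init__(self) -> None:
        super().__init__()
        self.separators: List[Tuple[str, str]] = []

    def handle_starttag(self, tag, attrs):
        # Attribute values may be None (for valueless attributes), hence the "or".
        style = (dict(attrs).get("style") or "").replace(" ", "").lower()
        if tag == "hr":
            self.separators.append((tag, "hr element"))
        elif "border-top:" in style or "border-bottom:" in style:
            self.separators.append((tag, "inline border"))


finder = SeparatorFinder()
finder.feed('<div><p>a</p><hr><div style="border-top: 1px solid #ccc">b</div></div>')
print(finder.separators)
```

A fuller analysis would also need external stylesheets and computed styles, which this sketch does not attempt.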

6.

Granted patent
Detecting repeat patterns on a web page using signals (status: granted, in force)

Publication No.: US08560940B2

Publication date: 2013-10-15

Application No.: US13220351

Filing date: 2011-08-29

Applicants: Hui-Man Hou, Jian-Ming Jin, Li-Mei Jiao, Suk Hwan Lim

Inventors: Hui-Man Hou, Jian-Ming Jin, Li-Mei Jiao, Suk Hwan Lim

IPC classification: G06F17/00

CPC classification: G06F17/30536, G06F17/2247, G06F17/30702

Abstract: An exemplary embodiment may generate a DOM-tree and generate a signal based on the DOM-tree and a node list. The signal may be analyzed and nodes may be selected within the signal to form a periodic wave. Repeat patterns may be detected using the periodic wave and the nodes.
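
Illustrative sketch (not taken from the patent): a loose analogue of the idea is to flatten the page into a sequence of tag names and look for the shift at which the sequence best matches itself. Treating the tag sequence as the "signal" and using this self-matching score are both assumptions; the patent's actual signal and periodic-wave construction are not described in the abstract.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class TagSignal(HTMLParser):
    """Record start tags in document order as a simple one-dimensional signal."""

    def __init__(self) -> None:
        super().__init__()
        self.signal: List[str] = []

    def handle_starttag(self, tag, attrs):
        self.signal.append(tag)


def best_period(signal: List[str]) -> Tuple[Optional[int], float]:
    """Return (period, match ratio) for the shift where the signal repeats best."""
    n = len(signal)
    best, best_score = None, 0.0
    for period in range(1, n // 2 + 1):
        matches = sum(signal[i] == signal[i + period] for i in range(n - period))
        ratio = matches / (n - period)
        if ratio > best_score:  # strict comparison keeps the smallest period on ties
            best, best_score = period, ratio
    return best, best_score


parser = TagSignal()
parser.feed("<ul>" + "<li><a href='#'><img src='x.png'></a></li>" * 3 + "</ul>")
print(parser.signal)
print(best_period(parser.signal))
```

Here each list item contributes the same three tags, so the reported period corresponds to one repeated record.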

7.

Patent application
Extraction of Content from a Web Page (status: under examination, published)

Publication No.: US20130283148A1

Publication date: 2013-10-24

Application No.: US13817656

Filing date: 2010-10-26

Applicants: Suk Hwan Lim, Jian-Ming Jin, Li-Wei Zheng, Jian Fan, Eamonn O'Brien-Strain, Parag Joshi

Inventors: Suk Hwan Lim, Jian-Ming Jin, Li-Wei Zheng, Jian Fan, Eamonn O'Brien-Strain, Parag Joshi

IPC classification: G06F17/22

CPC classification: G06F17/2247, G06F16/986

Abstract: A system and method are provided for extracting main content from a web page. Web page segmentation is performed on a web page to provide affinity-grouped segments. Descriptive features of at least one of the affinity-grouped segments are computed. At least one of the affinity-grouped segments is classified as a main body segment based on the computed descriptive features. Additional affinity-grouped segments are classified as to a document function based on the computed descriptive features. Classified affinity-grouped segments are assembled according to their classified document functions to provide the main content.

8.

Patent application
DETERMINING SIMILARITY BETWEEN ELEMENTS OF AN ELECTRONIC DOCUMENT (status: under examination, published)

Publication No.: US20130091150A1

Publication date: 2013-04-11

Application No.: US13805212

Filing date: 2010-06-30

Applicants: Jian-Ming Jin, Suk Hwan Lim, Li-Wei Zheng, Jian Fan, Eamonn O'Brien-Strain, Yuhong Xiong, Jerry J. Liu

Inventors: Jian-Ming Jin, Suk Hwan Lim, Li-Wei Zheng, Jian Fan, Eamonn O'Brien-Strain, Yuhong Xiong, Jerry J. Liu

IPC classification: G06F17/30

CPC classification: G06F16/24578, G06F16/951

Abstract: Disclosed is a computer-implemented method of determining similarity between first and second elements of an electronic document. The method uses a computer to calculate a plurality of measures of similarity between the first and second elements in at least two representations of the electronic document. A computer program product and system implementing this method are also disclosed.

9.

Patent application
SYSTEM AND METHOD FOR WEB PAGE SEGMENTATION USING ADAPTIVE THRESHOLD COMPUTATION (status: under examination, published)

Publication No.: US20130061132A1

Publication date: 2013-03-07

Application No.: US13696625

Filing date: 2010-05-19

Applicants: Li-Wei Zheng, Jian-Ming Jin, Suk Hwan Lim, Yuhong Xiong, Jerry J. Liu

Inventors: Li-Wei Zheng, Jian-Ming Jin, Suk Hwan Lim, Yuhong Xiong, Jerry J. Liu

IPC classification: G06F17/00

CPC classification: G06F17/2241, G06K9/00449, G06K9/325

Abstract: A system and method for an adaptive threshold Web Page segmenting is disclosed. In one embodiment, a method performed by a physical computing system having one or more processors for segmenting a Web page including a plurality of nodes includes parsing content in the Web page into the plurality of nodes using the physical computing system, obtaining feature values between each pair of nodes using the physical computing system, estimating an adaptive threshold value using the obtained feature values using the physical computing system, and segmenting the Web page by comparing the feature values associated with each pair of nodes with the estimated adaptive threshold value.
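
Illustrative sketch (not taken from the patent): the idea of a threshold derived from the page's own feature values can be shown with a single feature. The example assumes the feature is the vertical gap between consecutive nodes, considers only adjacent pairs rather than every pair, and assumes the threshold is the mean gap; none of these choices comes from the abstract.

```python
from statistics import mean
from typing import List, Optional, Tuple

Span = Tuple[int, int]  # (top, bottom) of a node, nodes sorted top to bottom


def segment(nodes: List[Span]) -> Tuple[Optional[float], List[List[int]]]:
    """Split node indices into segments wherever a gap exceeds the mean gap."""
    if not nodes:
        return None, []
    gaps = [nxt[0] - cur[1] for cur, nxt in zip(nodes, nodes[1:])]
    if not gaps:
        return None, [[0]]
    threshold = mean(gaps)  # adapts to the spacing used on this particular page
    segments = [[0]]
    for index, gap in enumerate(gaps, start=1):
        if gap > threshold:
            segments.append([index])  # large gap: start a new segment
        else:
            segments[-1].append(index)
    return threshold, segments


# Three tightly spaced nodes, a large gap, then two more nodes.
layout = [(0, 10), (12, 22), (24, 34), (60, 70), (72, 82)]
print(segment(layout))
```

Because the threshold is computed per page, a densely set page and a loosely set page can both be split sensibly without a hand-tuned constant.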

10.

Patent application
Segmenting a Web Page into Coherent Functional Blocks (status: under examination, published)

Publication No.: US20130275854A1

Publication date: 2013-10-17

Application No.: US13635410

Filing date: 2010-04-19

Applicants: Suk Hwan Lim, Jian-Ming Jin, Li-Wei Zheng, Eamonn O'Brien-Strain, Jian Fan

Inventors: Suk Hwan Lim, Jian-Ming Jin, Li-Wei Zheng, Eamonn O'Brien-Strain, Jian Fan

IPC classification: G06F17/22

CPC classification: G06F17/2247, G06F17/2705

Abstract: Segmenting a web page (110) into coherent function blocks (705-1 to 705-8) includes parsing content from the web page (110) into multiple coherent, collectively exhaustive nodes (405-1 to 405-37); calculating at least one matrix (500, 600, 605-1 to 605-4) of affinity values between each of the nodes (405-1 to 405-37); and clustering the nodes (405-1 to 405-37) into functional blocks (705-1 to 705-8) based on the affinity values in the at least one matrix (500, 600, 605-1 to 605-4).
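
Illustrative sketch (not taken from the patent): the two steps in the abstract, building a matrix of pairwise affinities and clustering nodes from it, can be shown with a single distance-based affinity and a simple merge rule. The affinity formula `1 / (1 + distance)`, the `min_affinity` cut-off, and the union-find grouping are assumptions for this example.

```python
from itertools import combinations
from math import hypot
from typing import List, Tuple

Point = Tuple[float, float]  # centre of a node's bounding box


def affinity_matrix(centers: List[Point]) -> List[List[float]]:
    """Symmetric matrix where closer nodes have affinity nearer to 1.0."""
    n = len(centers)
    matrix = [[1.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        distance = hypot(centers[i][0] - centers[j][0], centers[i][1] - centers[j][1])
        matrix[i][j] = matrix[j][i] = 1.0 / (1.0 + distance)
    return matrix


def cluster(matrix: List[List[float]], min_affinity: float) -> List[List[int]]:
    """Group node indices linked by a chain of pairs with affinity >= min_affinity."""
    parent = list(range(len(matrix)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(matrix)), 2):
        if matrix[i][j] >= min_affinity:
            parent[find(j)] = find(i)
    groups = {}
    for i in range(len(matrix)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Two visually separate columns of nodes.
centers = [(0, 0), (0, 3), (0, 6), (50, 0), (50, 3)]
print(cluster(affinity_matrix(centers), min_affinity=0.2))
```

The abstract allows several affinity matrices (for example, one per feature); combining them before clustering would be a natural extension of this sketch.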
