Granted invention patent
US08086040B2 Text representation method and apparatus (lapsed)

Text representation method and apparatus
Abstract:
A text-like data representation technique and a text-like data representation apparatus are disclosed that may: acquire image data from a scanned image; segment text regions from the image data; extract each connected component in the text regions; form clusters based on the connected components; group each connected component in the text regions into one of the clusters with similar or identical characters; generate a high-resolution representative for each cluster; generate a vector representation for each high-resolution representative; and code the text as text data by associating each connected component with its vectorized high-resolution representative and its location in the document.
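The middle steps of the abstract (extracting connected components, clustering them by shape, and coding each component as a cluster reference plus a location) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent: the function names are hypothetical, a grid of `#` and `.` characters stands in for a binarized text region, 4-connectivity is used, and similarity is measured as the number of differing pixels after aligning bounding boxes. The first member of each cluster stands in for its representative; generating a high-resolution representative and vectorizing it are omitted.

```python
from collections import deque

def connected_components(grid):
    """Return the foreground ('#') components of grid, each as a list of (row, col)."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != "#" or seen[r][c]:
                continue
            # Breadth-first flood fill over 4-connected neighbours.
            queue, pixels = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == "#" and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components

def normalize(pixels):
    """Return (top-left corner, shape), with the shape shifted to the origin."""
    top = min(y for y, _ in pixels)
    left = min(x for _, x in pixels)
    return (top, left), frozenset((y - top, x - left) for y, x in pixels)

def encode(grid, max_diff=0):
    """Cluster components by shape; return (representatives, placements)."""
    representatives = []  # one shape per cluster (first member seen)
    placements = []       # (cluster_id, (row, col)) for every component
    for pixels in connected_components(grid):
        location, shape = normalize(pixels)
        for cluster_id, rep in enumerate(representatives):
            # Assumed similarity test: count of pixels that differ.
            if len(shape ^ rep) <= max_diff:
                break
        else:
            cluster_id = len(representatives)
            representatives.append(shape)
        placements.append((cluster_id, location))
    return representatives, placements
```

For a page containing the same glyph twice, `encode` stores the shape once and records two placements that point to it, which is the source of the compactness the abstract describes. Raising `max_diff` above zero lets slightly different scans of the same character fall into one cluster.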