Automatic Detection and Transfer of Relevant Image Data to Content Collections

    Publication No.: US20210240757A1

    Publication date: 2021-08-05

    Application No.: US17170697

    Filing date: 2021-02-08

    Abstract: This application is directed to a method for automatically identifying and transferring relevant image data. The method includes obtaining a plurality of content items from a personal content collection and determining attributes based on the plurality of content items. The method includes generating a plurality of relevance rules. The method further includes obtaining unclassified content items and determining, for a first unclassified content item, a plurality of aggregate relevance scores using the plurality of relevance rules. The method includes determining whether a first aggregate relevance score and/or a second aggregate relevance score satisfy a threshold score. The method includes, in accordance with a determination that the first aggregate relevance score and the second aggregate relevance score do not satisfy the threshold score, forgoing determining the attributes corresponding to the first unclassified content item, and storing the first unclassified content item in a candidate list of the personal content collection.
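
The threshold test in this abstract can be illustrated with a minimal sketch. The abstract does not say how rules are represented or aggregated, so everything here is an assumption: rules are plain functions returning a score contribution, they are grouped under hypothetical names, each group's aggregate is a simple sum, and the "relevant" outcome for items that pass the threshold is a placeholder.

```python
from typing import Callable, Dict, List

Item = Dict[str, str]            # hypothetical content item: field name -> value
Rule = Callable[[Item], float]   # hypothetical rule: returns a score contribution


def aggregate_scores(item: Item, rule_groups: Dict[str, List[Rule]]) -> Dict[str, float]:
    """Compute one aggregate score per rule group (assumed here to be a sum)."""
    return {name: sum(rule(item) for rule in rules)
            for name, rules in rule_groups.items()}


def triage(item: Item, rule_groups: Dict[str, List[Rule]], threshold: float) -> str:
    """Return "candidate" when no aggregate score reaches the threshold.

    In that case the item goes to the candidate list and attribute
    determination is skipped, as the abstract describes. The "relevant"
    branch is an assumption; the abstract does not describe it.
    """
    scores = aggregate_scores(item, rule_groups)
    if any(score >= threshold for score in scores.values()):
        return "relevant"
    return "candidate"
```

With one rule group keyed on text and another on source, an item that scores 1.0 in either group against a threshold of 1.0 is treated as relevant, and an item that scores below 1.0 in both is placed on the candidate list.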

    Automatic scanning of document stack with a camera

    Publication No.: US10136011B2

    Publication date: 2018-11-20

    Application No.: US15438625

    Filing date: 2017-02-21

    Abstract: Automatically scanning multiple document sheets with a camera includes receiving a video stream while the camera is pointed at the multiple document sheets, detecting presence of a first top page of the multiple document sheets based on the video stream, taking a still photograph of the first top page in response to detecting presence of the first top page, detecting presence of a second top page based on the video stream by confirming that the second top page is different from the first top page and by waiting a predetermined amount of time for an image of the second top page to stabilize, and taking a still photograph of the second top page in response to detecting presence of the second top page. Detecting the pages may include determining that the camera is pointing at the stack of documents and a detected page is not obstructed.
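
The capture loop described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented method: frames are flattened grayscale sequences, "different from the first top page" is approximated by a mean absolute pixel difference, the predetermined waiting time is counted in consecutive stable frames, and the obstruction and camera-pointing checks are omitted. All names and threshold values are hypothetical.

```python
from typing import List, Sequence

Frame = Sequence[float]  # hypothetical: one flattened grayscale video frame


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average absolute per-pixel difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def scan_stack(frames: List[Frame],
               change_threshold: float = 10.0,
               stable_threshold: float = 2.0,
               stable_frames: int = 3) -> List[Frame]:
    """Return the frames that would be captured as still photographs."""
    captured: List[Frame] = []
    stable_run = 0  # consecutive frames that barely differ from their predecessor
    prev = None
    for frame in frames:
        if prev is not None and mean_abs_diff(frame, prev) <= stable_threshold:
            stable_run += 1
        else:
            stable_run = 0
        prev = frame
        if stable_run < stable_frames:
            continue  # the image has not stabilized yet
        # Capture the first stable page, or a stable page that differs
        # from the most recently captured one.
        if not captured or mean_abs_diff(frame, captured[-1]) > change_threshold:
            captured.append(frame)
    return captured
```

Counting stable frames rather than wall-clock time keeps the sketch deterministic; a real video pipeline would more likely compare timestamps.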

    Automatic scanning of document stack with a camera
    Patent type: invention grant (in force)

    Publication No.: US09578195B1

    Publication date: 2017-02-21

    Application No.: US15002444

    Filing date: 2016-01-21

    Abstract: Automatically scanning multiple document sheets with a camera includes receiving a video stream while the camera is pointed at the multiple document sheets, detecting presence of a first top page of the multiple document sheets based on the video stream, taking a still photograph of the first top page in response to detecting presence of the first top page, detecting presence of a second top page based on the video stream by confirming that the second top page is different from the first top page and by waiting a predetermined amount of time for an image of the second top page to stabilize, and taking a still photograph of the second top page in response to detecting presence of the second top page. Detecting the pages may include determining that the camera is pointing at the stack of documents and a detected page is not obstructed.

    Custom drawings as content access identifiers
    Patent type: invention grant (in force)

    Publication No.: US09235768B1

    Publication date: 2016-01-12

    Application No.: US14064654

    Filing date: 2013-10-28

    CPC classification number: G06K9/00852 G06K9/00496 H04N1/00328

    Abstract: Providing access to digitally published data includes creating a note having at least a portion that is handwritten by a first user, converting handwriting of the note into a content access identifier that varies according to the portion that is handwritten by the first user, associating the content access identifier with the digitally published data, and making the digitally published data available to a second user by making the note available to the second user. The digitally published data may be written to a public database and/or a private database. A portion of the note may be pre-printed. A pre-printed distinguishing pattern on the note may indicate that handwritten content corresponds to a content access identifier. The pre-printed portion may be a regular dotted pattern. The note may have a known identifiable color and size.

    FAST IDENTIFICATION OF IMAGES IN DOCUMENTS
    Patent type: published application

    Publication No.: US20230326223A1

    Publication date: 2023-10-12

    Application No.: US18334264

    Filing date: 2023-06-13

    CPC classification number: G06V30/413 G06T7/60 G06T3/40 G06V30/414

    Abstract: Methods and systems for detecting images in documents are described. A method for determining whether a document is an image, implemented by an electronic device having one or more processors, includes partitioning the document into a plurality of cells. The method includes scaling each of the cells to a standardized number of pixels to provide a corresponding snippet for each of the cells, classifying the snippets, using a neural network, to determine a set of cells classified as text, and determining a volume of text for the document based on a sum of an amount of text in each cell of the set of cells. The method further includes, in response to a determination that the volume of text for the document is below a predetermined threshold, determining that the document is an image.
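
The partition, scale, classify, and sum pipeline in this abstract can be sketched as below. Several pieces are assumptions made for illustration: the document is a grayscale grid of integers, scaling uses nearest-neighbor sampling, the "amount of text" in a cell is approximated by its fraction of dark pixels, and `toy_classifier` is a hand-written stand-in for the neural network, not a trained model.

```python
from typing import Callable, List

Image = List[List[int]]  # grayscale rows, 0 = black, 255 = white


def partition(doc: Image, rows: int, cols: int) -> List[Image]:
    """Split the document into rows x cols cells (assumes doc is at least that large)."""
    h, w = len(doc), len(doc[0])
    cells = []
    for r in range(rows):
        for c in range(cols):
            top, bottom = r * h // rows, (r + 1) * h // rows
            left, right = c * w // cols, (c + 1) * w // cols
            cells.append([row[left:right] for row in doc[top:bottom]])
    return cells


def scale_nearest(cell: Image, size: int) -> Image:
    """Resample a cell to size x size pixels with nearest-neighbor sampling."""
    h, w = len(cell), len(cell[0])
    return [[cell[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]


def dark_fraction(snippet: Image) -> float:
    """Assumed proxy for the amount of text: share of pixels darker than mid-gray."""
    pixels = [p for row in snippet for p in row]
    return sum(p < 128 for p in pixels) / len(pixels)


def toy_classifier(snippet: Image) -> bool:
    """Stand-in for the neural network: 'text' if moderately dark."""
    return 0.05 <= dark_fraction(snippet) <= 0.6


def is_image_document(doc: Image, classify: Callable[[Image], bool],
                      rows: int = 2, cols: int = 2, size: int = 4,
                      min_text_volume: float = 0.5) -> bool:
    """Document is deemed an image when the summed text amount is below the threshold."""
    snippets = [scale_nearest(cell, size) for cell in partition(doc, rows, cols)]
    text_volume = sum(dark_fraction(s) for s in snippets if classify(s))
    return text_volume < min_text_volume
```

Scaling every cell to the same snippet size is what lets a single fixed-input classifier handle documents of any resolution.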

    EXTRACTING STRUCTURED DATA FROM HANDWRITTEN AND AUDIO NOTES

    Publication No.: US20230099963A1

    Publication date: 2023-03-30

    Application No.: US18077143

    Filing date: 2022-12-07

    Abstract: This application is directed to recognizing unstructured information based on hints provided by structured information. A computer system obtains unstructured information collected from a handwritten or audio source, and identifies one or more terms from the unstructured information. The one or more terms include a first term that is ambiguous. The computer system performs a recognition operation on the first term to derive a first plurality of candidate terms for the first term, and obtains first contextual information from an information template associated with the unstructured information. In accordance with the first contextual information, the computer system selects a first answer term from the first plurality of candidate terms, such that the first term is recognized as the first answer term.
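
One way to picture the candidate selection step is shown below. It is a sketch under assumptions not found in the abstract: the recognizer returns candidates with confidence scores, and the template's contextual information takes the form of a regular expression describing what the field may contain. The function name and data shapes are hypothetical.

```python
import re
from typing import List, Optional, Tuple


def select_answer(candidates: List[Tuple[str, float]],
                  field_pattern: str) -> Optional[str]:
    """Pick the highest-confidence candidate that fits the template field.

    candidates: (term, recognizer confidence) pairs for one ambiguous term.
    field_pattern: assumed form of the template hint, a regex the answer must match.
    Returns None when no candidate is consistent with the template.
    """
    matching = [(term, score) for term, score in candidates
                if re.fullmatch(field_pattern, term)]
    if not matching:
        return None
    return max(matching, key=lambda pair: pair[1])[0]
```

For a numeric template field, a handwritten token read as "lO5", "105", or "1O5" would resolve to "105" even if the recognizer ranked "lO5" highest on its own.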

    Building training data and similarity relations for semantic space

    Publication No.: US10755183B1

    Publication date: 2020-08-25

    Application No.: US15416611

    Filing date: 2017-01-26

    Abstract: Selecting data from a source text corpus for training a semantic data analysis system includes selecting an item of the text corpus, validating the item, extracting at least one section of the item, determining a length of each of the at least one section of the item, and subdividing each of the sections having a length greater than a predetermined amount into a plurality of fragments that are deemed to be similar. The predetermined amount may be approximately twice a size of a fragment. A fragment may have approximately 100 words or between 40 and 60 words. Fragments from different items may be deemed to be dissimilar. Sections having a length less than the predetermined amount may be ignored. Validating the item may include parsing editorial notes and other accompanying data. The source text corpus may be Wikipedia. The item may be an article.
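
The fragment construction described in this abstract can be sketched as follows. The code takes the predetermined amount to be exactly twice the fragment size, counts length in whitespace-separated words, drops any trailing remainder shorter than a fragment, and skips the validation step; these are all simplifying assumptions, and the function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def split_section(words: List[str], fragment_size: int) -> List[str]:
    """Split a section into fixed-size fragments; short sections are ignored."""
    if len(words) <= 2 * fragment_size:  # not longer than the predetermined amount
        return []
    starts = range(0, len(words) - fragment_size + 1, fragment_size)
    return [" ".join(words[i:i + fragment_size]) for i in starts]


def build_pairs(corpus: Dict[str, List[str]],
                fragment_size: int = 50) -> Tuple[List[Pair], List[Pair]]:
    """Return (similar, dissimilar) fragment pairs.

    corpus maps an item id (for example an article title) to its section texts.
    Fragments of the same section are deemed similar; fragments taken from
    different items are deemed dissimilar.
    """
    similar: List[Pair] = []
    by_item: Dict[str, List[str]] = {}
    for item_id, sections in corpus.items():
        for section in sections:
            fragments = split_section(section.split(), fragment_size)
            similar.extend(combinations(fragments, 2))
            by_item.setdefault(item_id, []).extend(fragments)
    dissimilar = [(a, b)
                  for (_, frags_a), (_, frags_b) in combinations(by_item.items(), 2)
                  for a in frags_a for b in frags_b]
    return similar, dissimilar
```

The default of 50 words sits inside the 40 to 60 word range the abstract mentions; the dissimilar list grows quadratically with corpus size, so a practical pipeline would likely sample from it.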

    Coordinated piecewise Bezier vectorization

    Publication No.: US10743035B2

    Publication date: 2020-08-11

    Application No.: US16279856

    Filing date: 2019-02-19

    Abstract: This application is directed to vectorizing a raster image in which an electronic device detects a contour of a component in the raster image, builds tangent vectors for each point of the contour and identifies a plurality of segmentation points on the contour. One or more points of sharp angle are identified on the contour in accordance with a determination that each point of sharp angle corresponds to two distinct tangent vectors and that an angle between the two distinct tangent vectors falls below a predefined threshold. A respective one of the segmentation points is positioned at each identified point of sharp angle. The electronic device approximates a piecewise smooth fitting curve (e.g., a piecewise Bezier curve) having two or more fitting segments to connect the plurality of segmentation points on the contour. The piecewise smooth fitting curve is thereby provided to vectorize the raster image.
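
The sharp-angle test can be illustrated on a closed polygonal contour. The sketch assumes that the two tangent vectors at a point run from that point toward its previous and next neighbors, so a straight stretch gives an angle of 180 degrees and a corner gives a smaller one. The abstract does not specify how tangents are built, so this convention, the threshold value, and the function name are all hypothetical. Curve fitting between the resulting segmentation points is not shown.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def sharp_points(contour: List[Point], max_angle_deg: float = 100.0) -> List[int]:
    """Return indices of contour points whose tangent angle is below the threshold.

    The contour is treated as closed, and consecutive points are assumed distinct.
    """
    n = len(contour)
    result = []
    for i, (x, y) in enumerate(contour):
        px, py = contour[i - 1]          # previous point (wraps around at i = 0)
        nx, ny = contour[(i + 1) % n]    # next point (wraps around at the end)
        ax, ay = px - x, py - y          # tangent toward the previous point
        bx, by = nx - x, ny - y          # tangent toward the next point
        cross = ax * by - ay * bx
        dot = ax * bx + ay * by
        angle = math.degrees(math.atan2(abs(cross), dot))  # in [0, 180]
        if angle < max_angle_deg:
            result.append(i)
    return result
```

Using `atan2` of the cross and dot products avoids normalizing the vectors and stays numerically stable near 0 and 180 degrees, where `acos` loses precision.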

    FAST IDENTIFICATION OF TEXT INTENSIVE PAGES FROM PHOTOGRAPHS

    Publication No.: US20190318163A1

    Publication date: 2019-10-17

    Application No.: US16455543

    Filing date: 2019-06-27

    Abstract: Methods and systems for training a neural network to distinguish between text documents and image documents are described. A corpus of text and image documents is obtained. A page of a text document is scanned by shifting a text window to a plurality of locations. In accordance with a determination that the text in the window at a respective location meets text line criteria, the text in the window is stored as a respective text snippet. A plurality of image windows are superimposed over at least one page of an image document. In accordance with a determination that the content of a respective image window meets image criteria, content of the image window is stored as a respective image snippet. The respective text snippet and the respective image snippet are provided to a classifier.
