-
Publication No.: US20210240757A1
Publication date: 2021-08-05
Application No.: US17170697
Application date: 2021-02-08
Applicant: EVERNOTE CORPORATION
Inventor: Dylan Marriott , Daniel Nicolae , Ruben Bakker , Alexander Pashintsev , Zdzislaw Pawel Losvik , Eugene Livshitz , Vitaly Glazkov , Boris Gorbatov , Ilya Buryak
IPC: G06F16/51 , G06F16/583 , G06F16/58 , G06F16/55
Abstract: This application is directed to a method for automatically identifying and transferring relevant image data. The method includes obtaining a plurality of content items from a personal content collection and determining attributes based on the plurality of content items. The method includes generating a plurality of relevance rules. The method further includes obtaining unclassified content items and determining for a first unclassified content item a plurality of aggregate relevance scores using the plurality of relevance rules. The method includes determining whether a first aggregate relevance score and/or a second aggregate relevance score satisfy a threshold score. The method includes, in accordance with a determination that the first aggregate relevance score and the second aggregate relevance score do not satisfy the threshold score, forgoing determining the attributes corresponding to the first unclassified content item, and storing the first unclassified content item in a candidate list of the personal content collection.
-
Publication No.: US10136011B2
Publication date: 2018-11-20
Application No.: US15438625
Application date: 2017-02-21
Applicant: EVERNOTE CORPORATION
Inventor: Alexander Pashintsev , Boris Gorbatov , Eugene Livshitz
Abstract: Automatically scanning multiple document sheets with a camera includes receiving a video stream while the camera is pointed at the multiple document sheets, detecting presence of a first top page of the multiple document sheets based on the video stream, taking a still photograph of the first top page in response to detecting presence of the first top page, detecting presence of a second top page based on the video stream by confirming that the second top page is different from the first top page and by waiting a predetermined amount of time for an image of the second top page to stabilize, and taking a still photograph of the second top page in response to detecting presence of the second top page. Detecting the pages may include determining that the camera is pointing at the stack of documents and a detected page is not obstructed.
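The capture loop described in this abstract can be sketched as a small state machine. This is a minimal illustration under stated assumptions, not the patented implementation: each video frame is reduced to a short feature vector of floats, the "predetermined amount of time" is modeled as a count of consecutive still frames, and the names `distance`, `capture_pages`, `stable_frames`, `still_tol`, and `new_page_tol` are hypothetical. The checks that the camera points at the stack and that the page is unobstructed are omitted.

```python
from typing import List, Sequence


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute difference between two frame feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def capture_pages(
    frames: Sequence[Sequence[float]],
    stable_frames: int = 3,      # assumed stand-in for the predetermined wait
    still_tol: float = 0.02,     # assumed: max change between still frames
    new_page_tol: float = 0.2,   # assumed: min change to count as a new page
) -> List[int]:
    """Return indices of frames at which a still photograph would be taken."""
    captured: List[int] = []
    previous = None
    stable_run = 0
    for index, frame in enumerate(frames):
        # Count how long the stream has been still.
        if previous is not None and distance(frame, previous) <= still_tol:
            stable_run += 1
        else:
            stable_run = 0
        previous = frame
        # A page is new if nothing was captured yet or it differs from the last capture.
        is_new = not captured or distance(frame, frames[captured[-1]]) > new_page_tol
        if is_new and stable_run >= stable_frames:
            captured.append(index)
    return captured
```

With these defaults, a page is photographed only after the stream has been still for three consecutive frames and the content differs from the previously captured page, so holding the camera over the same page does not produce duplicates.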
-
Publication No.: US09578195B1
Publication date: 2017-02-21
Application No.: US15002444
Application date: 2016-01-21
Applicant: Evernote Corporation
Inventor: Alexander Pashintsev , Boris Gorbatov , Eugene Livshitz
CPC classification number: H04N1/00782 , H04N1/00251 , H04N1/00689 , H04N2201/0084 , H04N2201/0096
Abstract: Automatically scanning multiple document sheets with a camera includes receiving a video stream while the camera is pointed at the multiple document sheets, detecting presence of a first top page of the multiple document sheets based on the video stream, taking a still photograph of the first top page in response to detecting presence of the first top page, detecting presence of a second top page based on the video stream by confirming that the second top page is different from the first top page and by waiting a predetermined amount of time for an image of the second top page to stabilize, and taking a still photograph of the second top page in response to detecting presence of the second top page. Detecting the pages may include determining that the camera is pointing at the stack of documents and a detected page is not obstructed.
-
Publication No.: US09235768B1
Publication date: 2016-01-12
Application No.: US14064654
Application date: 2013-10-28
Applicant: EVERNOTE CORPORATION
Inventor: Alexander Pashintsev , Keith Lang , Juan Carlos Jimenez , Eugene Livshitz
CPC classification number: G06K9/00852 , G06K9/00496 , H04N1/00328
Abstract: Providing access to digitally published data includes creating a note having at least a portion that is handwritten by a first user, converting handwriting of the note into a content access identifier that varies according to the portion that is handwritten by the first user, associating the content access identifier with the digitally published data, and making the digitally published data available to a second user by making the note available to the second user. The digitally published data may be written to a public database and/or a private database. A portion of the note may be pre-printed. A pre-printed distinguishing pattern on the note may indicate that handwritten content corresponds to a content access identifier. The pre-printed portion may be a regular dotted pattern. The note may have a known identifiable color and size.
-
Publication No.: US12020175B2
Publication date: 2024-06-25
Application No.: US17001311
Application date: 2020-08-24
Applicant: Evernote Corporation
Inventor: Eugene Livshitz , Alexander Pashintsev , Boris Gorbatov
IPC: G06F17/00 , G06F16/33 , G06F40/205 , G06N5/04 , G06N20/00
CPC classification number: G06N5/04 , G06F16/334 , G06F16/3344 , G06F40/205 , G06N20/00
Abstract: A method and system for selecting data from a source text corpus for training a semantic data analysis system. The method includes selecting an item of the text corpus, wherein the item includes at least one section. The method includes extracting a section of the at least one section of the item. The method also includes determining a length of the section of the at least one section of the item. Based on the length of the section being greater than a predetermined amount, the method includes subdividing the section into a plurality of fragments. Each fragment of the plurality of fragments is deemed to be similar to each other. Further, the method includes building a training set based on the plurality of fragments. The training set is used to train the semantic data analysis system.
-
Publication No.: US20230326223A1
Publication date: 2023-10-12
Application No.: US18334264
Application date: 2023-06-13
Applicant: Evernote Corporation
Inventor: Alexander Pashintsev , Boris Gorbatov , Eugene Livshitz , Vitaly Glazkov
IPC: G06V30/413 , G06T7/60 , G06T3/40 , G06V30/414
CPC classification number: G06V30/413 , G06T7/60 , G06T3/40 , G06V30/414
Abstract: Methods and systems for detecting images in documents are described. A method implemented by an electronic device having one or more processors for determining whether a document is an image includes partitioning a document into a plurality of cells. The method includes scaling each of the cells to a standardized number of pixels to provide a corresponding snippet for each of the cells, classifying the snippets, using a neural network, to determine a set of cells classified as text, and determining a volume of text for the document based on a sum of an amount of text in each cell of the set of cells. The method further includes in response to a determination that the volume of text for the document is below a predetermined threshold, determining that the document is an image.
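The decision rule in this abstract (partition, scale, classify, sum, threshold) can be sketched as follows. This is an illustrative sketch only: the document is assumed to be a grayscale pixel grid, the neural network is replaced by a caller-supplied `classify_is_text` callable, and the per-cell "amount of text" is assumed to be a dark-pixel count. The names `partition`, `scale`, `ink_amount`, `is_image_document`, and all default values are hypothetical.

```python
from typing import Callable, List

Image = List[List[int]]  # grayscale rows; 0 = black, 255 = white


def partition(doc: Image, rows: int, cols: int) -> List[Image]:
    """Split the document into rows x cols rectangular cells."""
    h, w = len(doc), len(doc[0])
    cells = []
    for r in range(rows):
        for c in range(cols):
            top, bottom = r * h // rows, (r + 1) * h // rows
            left, right = c * w // cols, (c + 1) * w // cols
            cells.append([row[left:right] for row in doc[top:bottom]])
    return cells


def scale(cell: Image, size: int) -> Image:
    """Nearest-neighbor resize of a cell to a size x size snippet."""
    h, w = len(cell), len(cell[0])
    return [[cell[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]


def ink_amount(snippet: Image) -> int:
    """Assumed measure of text amount: number of dark pixels."""
    return sum(1 for row in snippet for p in row if p < 128)


def is_image_document(
    doc: Image,
    classify_is_text: Callable[[Image], bool],  # stand-in for the neural network
    rows: int = 4,
    cols: int = 4,
    size: int = 8,
    threshold: int = 40,  # assumed text-volume threshold
) -> bool:
    """True when the summed text amount over text cells is below the threshold."""
    snippets = [scale(cell, size) for cell in partition(doc, rows, cols)]
    text_volume = sum(ink_amount(s) for s in snippets if classify_is_text(s))
    return text_volume < threshold
```

The document must be at least `rows` by `cols` pixels so that no cell is empty. Scaling every cell to the same snippet size is what lets a single fixed-input classifier handle documents of any resolution.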
-
Publication No.: US20230099963A1
Publication date: 2023-03-30
Application No.: US18077143
Application date: 2022-12-07
Applicant: Evernote Corporation
Inventor: Chris O'Neill , Andrew Malcolm , Alexander Pashintsev , Eugene Livshitz , Boris Gorbatov , Natalia Galaktionova , Ilya Buryak
IPC: G06F40/186 , G06F40/279
Abstract: This application is directed to recognizing unstructured information based on hints provided by structured information. A computer system obtains unstructured information collected from a handwritten or audio source, and identifies one or more terms from the unstructured information. The one or more terms includes a first term that is ambiguous. The computer system performs a recognition operation on the first term to derive a first plurality of candidate terms for the first term, and obtains first contextual information from an information template associated with the unstructured information. In accordance with the first contextual information, the computer system selects a first answer term from the first plurality of candidate terms, such that the first term is recognized as the first answer term.
-
Publication No.: US10755183B1
Publication date: 2020-08-25
Application No.: US15416611
Application date: 2017-01-26
Applicant: Evernote Corporation
Inventor: Eugene Livshitz , Alexander Pashintsev , Boris Gorbatov
IPC: G06N5/04 , G06N20/00 , G06F16/33 , G06F40/205
Abstract: Selecting data from a source text corpus for training a semantic data analysis system includes selecting an item of the text corpus, validating the item, extracting at least one section of the item, determining a length of each of the at least one section of the item, and subdividing each of the sections having a length greater than a predetermined amount into a plurality of fragments that are deemed to be similar. The predetermined amount may be approximately twice a size of a fragment. A fragment may have approximately 100 words or between 40 and 60 words. Fragments from different items may be deemed to be dissimilar. Sections having a length less than the predetermined amount may be ignored. Validating the item may include parsing editorial notes and other accompanying data. The source text corpus may be Wikipedia. The item may be an article.
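The fragment-based selection in this abstract can be illustrated with a short sketch. It assumes whitespace tokenization, a fragment size of 50 words (within the 40 to 60 range mentioned), and a threshold of exactly twice the fragment size; item validation is omitted. The function names `split_into_fragments` and `build_training_pairs` and the `(fragment, fragment, label)` pair format are hypothetical choices for illustration, not the patent's data model.

```python
from itertools import combinations
from typing import List, Tuple


def split_into_fragments(section: str, fragment_words: int = 50) -> List[str]:
    """Split a long section into consecutive fragments of about fragment_words words.

    Sections not longer than twice the fragment size are ignored (empty list).
    """
    words = section.split()
    if len(words) <= 2 * fragment_words:
        return []
    count = len(words) // fragment_words
    fragments = []
    for i in range(count):
        start = i * fragment_words
        # The last fragment absorbs any leftover words.
        end = (i + 1) * fragment_words if i < count - 1 else len(words)
        fragments.append(" ".join(words[start:end]))
    return fragments


def build_training_pairs(
    items: List[List[str]], fragment_words: int = 50
) -> List[Tuple[str, str, int]]:
    """Label fragment pairs: 1 = same section (similar), 0 = different items (dissimilar)."""
    pairs: List[Tuple[str, str, int]] = []
    per_item: List[List[str]] = []
    for sections in items:
        item_fragments: List[str] = []
        for section in sections:
            group = split_into_fragments(section, fragment_words)
            pairs.extend((a, b, 1) for a, b in combinations(group, 2))
            item_fragments.extend(group)
        per_item.append(item_fragments)
    for frags_a, frags_b in combinations(per_item, 2):
        pairs.extend((a, b, 0) for a in frags_a for b in frags_b)
    return pairs
```

Pairs of fragments from different sections of the same item are left unlabeled here, because the abstract only states that fragments of one section are similar and fragments from different items may be dissimilar.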
-
Publication No.: US10743035B2
Publication date: 2020-08-11
Application No.: US16279856
Application date: 2019-02-19
Applicant: Evernote Corporation
Inventor: Eugene Livshitz , Ilia Buriak , Natalia Galaktionova , Alexander Pashintsev , Boris Gorbatov
Abstract: This application is directed to vectoring a raster image in which an electronic device detects a contour of a component in the raster image, builds tangent vectors for each point of the contour and identifies a plurality of segmentation points on the contour. One or more points of sharp angle are identified on the contour in accordance with a determination that each point of sharp angle corresponds to two distinct tangent vectors and that an angle between the two distinct tangent vectors falls below a predefined threshold. A respective one of the segmentation points is positioned at each identified point of sharp angle. The electronic device approximates a piecewise smooth fitting curve (e.g., a piecewise Bezier curve) having two or more fitting segments to connect the plurality of segmentation points on the contour. The piecewise smooth fitting curve is thereby provided to vectorize the raster image.
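The sharp-angle test in this abstract can be sketched on a closed polygonal contour. The sketch rests on an interpretation that the abstract does not spell out: the two tangent vectors at a point are taken as one-sided directions pointing away from the point toward its previous and next neighbors, so a smooth point gives an angle near 180 degrees and a corner gives a small angle. The name `sharp_points`, the neighbor `step`, and the 120 degree default threshold are hypothetical. Curve fitting between the resulting segmentation points is not shown.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def sharp_points(
    contour: Sequence[Point],
    threshold_deg: float = 120.0,  # assumed predefined threshold
    step: int = 1,                 # assumed neighbor offset for one-sided tangents
) -> List[int]:
    """Return indices of contour points whose one-sided tangents meet at a sharp angle."""
    n = len(contour)
    result: List[int] = []
    for i in range(n):
        px, py = contour[i]
        bx, by = contour[(i - step) % n]  # neighbor behind (closed contour)
        fx, fy = contour[(i + step) % n]  # neighbor ahead
        v1 = (bx - px, by - py)
        v2 = (fx - px, fy - py)
        norm = math.hypot(*v1) * math.hypot(*v2)
        if norm == 0:
            continue  # duplicate points give no usable tangent
        cosine = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
        angle = math.degrees(math.acos(max(-1.0, min(1.0, cosine))))
        if angle < threshold_deg:
            result.append(i)
    return result
```

On a square sampled at its corners and edge midpoints, only the four corners (90 degrees) fall below the threshold, so a segmentation point would be placed at each corner and the straight edges would remain single smooth segments.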
-
Publication No.: US20190318163A1
Publication date: 2019-10-17
Application No.: US16455543
Application date: 2019-06-27
Applicant: EVERNOTE CORPORATION
Inventor: Alexander Pashintsev , Boris Gorbatov , Eugene Livshitz , Vitaly Glazkov
Abstract: Methods and systems for training a neural network to distinguish between text documents and image documents are described. A corpus of text and image documents is obtained. A page of a text document is scanned by shifting a text window to a plurality of locations. In accordance with a determination that the text in the window at a respective location meets text line criteria, the text in the window is stored as a respective text snippet. A plurality of image windows are superimposed over at least one page of an image document. In accordance with a determination that the content of a respective image window meets image criteria, content of the image window is stored as a respective image snippet. The respective text snippet and the respective image snippet are provided to a classifier.