-
Publication number: US09436883B2
Publication date: 2016-09-06
Application number: US14816943
Application date: 2015-08-03
Applicant: A9.com, Inc.
Inventor: Xiaofan Lin , Adam Wiggen Kraft , Yu Lou , Douglas Ryan Gray , Colin Jon Taylor
CPC classification number: G06K9/18 , G06K9/00456 , G06K9/00523 , G06K9/228 , G06K2209/01 , G06T7/11
Abstract: Various embodiments provide methods and systems for identifying text in an image by applying suitable text detection parameters during text detection. The suitable text detection parameters can be determined based on parameter metric feedback from one or more text identification subtasks, such as text detection, text recognition, preprocessing, character set mapping, pattern matching, and validation. In some embodiments, the image can be divided into one or more image regions by performing glyph detection on the image. The text detection parameters applied to each image region can be adjusted based on one or more parameter metrics measured in that region.
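The per-region adjustment described in this abstract can be illustrated with a minimal sketch. The abstract does not name a specific metric or adjustment rule, so the `contrast` metric, the `Region` type, the `base_threshold` parameter, and the linear scaling rule below are all hypothetical choices made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    contrast: float  # hypothetical parameter metric, assumed to lie in [0, 1]


def adjust_threshold(base_threshold: float, contrast: float) -> float:
    """Scale a detection threshold by the measured contrast (illustrative rule only).

    Low-contrast regions get a lower threshold so that faint glyphs are not missed.
    """
    return round(base_threshold * (0.5 + 0.5 * contrast), 3)


def parameters_per_region(regions, base_threshold=0.6):
    """Return one adjusted threshold per image region, keyed by region name."""
    return {r.name: adjust_threshold(base_threshold, r.contrast) for r in regions}


if __name__ == "__main__":
    regions = [Region("header", 1.0), Region("label", 0.5), Region("shadow", 0.0)]
    print(parameters_per_region(regions))
```

In a fuller pipeline, the metric would presumably come back from a later subtask (for example recognition confidence) rather than being supplied up front; that feedback loop is omitted here.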
-
Publication number: US09256795B1
Publication date: 2016-02-09
Application number: US13842433
Application date: 2013-03-15
Applicant: A9.com, Inc.
Inventor: Douglas Ryan Gray , Xiaofan Lin , Arnab Sanat Kumar Dhua , Yu Lou
IPC: G06K9/20
CPC classification number: G06K9/325 , G06F3/14 , G06F17/24 , G06K9/00671 , G06K9/2054 , G06K9/2072 , G06K9/6215 , G06K9/6857 , G06K9/723 , G06K2209/01
Abstract: Various embodiments enable the identification of semi-structured text entities in an image. Identifying text entities is a relatively simple problem when the text is stored in a computer and free of errors, but it is much more challenging when the source is the output of an optical character recognition (OCR) engine applied to a natural scene image. Accordingly, output from an OCR engine is analyzed to isolate a character string indicative of a text entity. Each character of the string is then assigned to a character class to produce a character class string, and the text entity of the string is identified based in part on a pattern of the character class string.
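The character-class idea in this abstract can be sketched as follows. The class letters (`D`, `A`, `S`, `P`), the two entity patterns, and the function names are assumptions made for this example; the patent's actual class set and patterns are not given in the abstract.

```python
import re


def char_class(ch: str) -> str:
    """Map one character to a coarse class: Digit, Alpha, Space, or Punctuation."""
    if ch.isdigit():
        return "D"
    if ch.isalpha():
        return "A"
    if ch.isspace():
        return "S"
    return "P"


def class_string(text: str) -> str:
    """Produce the character class string for a piece of OCR output."""
    return "".join(char_class(c) for c in text)


# Hypothetical patterns written over class strings, not over raw characters.
ENTITY_PATTERNS = {
    "phone": re.compile(r"P?D{3}P?S?D{3}[PS]?D{4}"),
    "date": re.compile(r"D{1,2}PD{1,2}PD{4}"),
}


def identify_entity(text: str):
    """Return the first entity type whose pattern matches the whole class string."""
    classes = class_string(text)
    for name, pattern in ENTITY_PATTERNS.items():
        if pattern.fullmatch(classes):
            return name
    return None


if __name__ == "__main__":
    for sample in ["(555) 123-4567", "12/25/2013", "hello"]:
        print(sample, "->", class_string(sample), "->", identify_entity(sample))
```

Matching on classes rather than exact characters means the pattern does not care which punctuation mark the OCR engine produced between digit groups, which is one plausible reason for the approach; a class pattern alone cannot separate every entity type, consistent with the abstract's "based in part on".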
-
Publication number: US09179061B1
Publication date: 2015-11-03
Application number: US14103758
Application date: 2013-12-11
Applicant: A9.com, Inc.
Inventor: Adam Wiggen Kraft , Kathy Wing Lam Ma , Xiaofan Lin , Arnab Sanat Kumar Dhua , Yu Lou
IPC: H04N5/232
CPC classification number: H04N5/23222 , G06F3/0482 , G06F3/04842 , G06F17/24 , G06F17/2715 , G06K9/18 , G06Q30/0603 , G06Q30/0625 , G06T7/194 , G06T7/70 , G06T15/00 , G06T15/08 , G06T2210/22 , G06T2215/16 , H04N1/00 , H04N7/183
Abstract: Various approaches provide for detecting and recognizing text to enable a user to perform various functions or tasks. For example, a user could point a camera at an object with text, in order to capture an image of that object. The camera can be integrated with a portable computing device that is capable of taking the image and processing the image (or providing the image for processing) to recognize, identify, and/or isolate the text in order to send the image of the object as well as recognized text to an application, function, or system, such as an electronic marketplace.
-
Publication number: US09043349B1
Publication date: 2015-05-26
Application number: US13688772
Application date: 2012-11-29
Applicant: A9.com, Inc.
Inventor: Xiaofan Lin , Arnab Sanat Kumar Dhua , Douglas Ryan Gray , Yu Lou
CPC classification number: G06K9/18 , G06F17/30253 , G06K9/325 , G06K9/6292 , G06K2209/01
Abstract: Various embodiments enable a device to perform tasks such as processing an image to recognize and locate text in the image, and providing the recognized text to an application executing on the device for performing a function (e.g., calling a number, opening an internet browser, etc.) associated with the recognized text. In at least one embodiment, processing the image includes substantially simultaneously or concurrently processing the image with at least two recognition engines, such as at least two optical character recognition (OCR) engines, running in a multithreaded mode. In at least one embodiment, the recognition engines can be tuned so that their respective processing speeds are roughly the same. Running multiple recognition engines concurrently keeps processing latency close to that of using only one recognition engine.
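The concurrent use of two recognition engines can be sketched with Python's standard thread pool. The two engine functions below are stand-ins that merely sleep and return fixed words; they are hypothetical placeholders, not real OCR calls, and the function names are assumptions for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def engine_a(image):
    """Placeholder for a first OCR engine (hypothetical)."""
    time.sleep(0.05)  # simulate recognition work
    return ["call", "555-0100"]


def engine_b(image):
    """Placeholder for a second OCR engine tuned to a similar speed (hypothetical)."""
    time.sleep(0.05)
    return ["ca11", "555-0100"]


def recognize_concurrently(image, engines):
    """Run every engine on the same image in its own thread and collect the results in order."""
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        futures = [pool.submit(engine, image) for engine in engines]
        return [future.result() for future in futures]


if __name__ == "__main__":
    start = time.perf_counter()
    results = recognize_concurrently(b"fake-image-bytes", [engine_a, engine_b])
    elapsed = time.perf_counter() - start
    print(results, f"{elapsed:.3f}s")
```

Because the engines run side by side, total wall-clock time is governed by the slower engine rather than the sum of both, which is why the abstract notes that tuning the engines to similar speeds matters.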
-
Publication number: US10963924B1
Publication date: 2021-03-30
Application number: US14203260
Application date: 2014-03-10
Applicant: A9.com, Inc.
Inventor: Douglas Ryan Gray , Arnab Sanat Kumar Dhua , Xiaofan Lin , Zhijiang Mark Lu
Abstract: A computing device can obtain data describing at least one document, the at least one document referencing at least one media object, wherein a portion of the at least one media object includes one or more characters. The computing device can obtain data describing the one or more characters in the at least one media object in the at least one document. The computing device can generate an updated copy of the at least one document that includes the data describing the one or more characters in the at least one media object. The computing device can present, on a display screen of the computing device and through an interface, the updated copy of the at least one document, wherein the one or more characters in the at least one media object are able to be selected or searched.
-
Publication number: US10445569B1
Publication date: 2019-10-15
Application number: US15251832
Application date: 2016-08-30
Applicant: A9.com, Inc.
Inventor: Xiaofan Lin , Son Dinh Tran
Abstract: Approaches provide for recognizing and locating text represented in image data. For example, image data that includes representations of text can be obtained. A width-focused recognition engine can be configured to analyze the image data to determine a base-set of words. The base-set of words can be associated with logical structure information that describes a geometric relationship between words in the base-set of words. A set of bounding boxes that includes one or more base words can be determined, as well as a confidence value for each base word. A depth-focused recognition engine can be configured to analyze the image data to determine a focused-set of words, the focused-set of words associated with a set of bounding boxes and confidence values for respective words. A set of merged words can be determined from a set of overlapping bounding boxes that overlap a threshold amount. The set of merged words can include at least a portion of the base-set of words and/or the focused-set of words and are selected based at least in part on respective confidence values of words in the set of overlapping bounding boxes. Thereafter, a final set of words that includes the merged set of words and appended words can be determined.
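The merge step described in this abstract can be sketched as follows. The abstract only says that bounding boxes "overlap a threshold amount"; using intersection-over-union (IoU) as the overlap measure, the 0.5 threshold, and the `Word` type are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    box: tuple  # (x1, y1, x2, y2) in pixels
    confidence: float


def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (assumed overlap measure)."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0


def merge_words(base, focused, threshold=0.5):
    """Keep the higher-confidence word where boxes overlap, then append unmatched focused words."""
    merged, used = [], set()
    for base_word in base:
        best = base_word
        for i, candidate in enumerate(focused):
            if i not in used and iou(base_word.box, candidate.box) >= threshold:
                used.add(i)
                if candidate.confidence > best.confidence:
                    best = candidate
        merged.append(best)
    # Words found only by the depth-focused engine are appended at the end.
    merged.extend(w for i, w in enumerate(focused) if i not in used)
    return merged


if __name__ == "__main__":
    base = [Word("he1lo", (0, 0, 10, 5), 0.6), Word("world", (12, 0, 22, 5), 0.9)]
    focused = [Word("hello", (0, 0, 10, 5), 0.95), Word("again", (30, 0, 40, 5), 0.8)]
    print([w.text for w in merge_words(base, focused)])
```

The logical structure information that the abstract associates with the base-set of words is not modeled here; the sketch only covers confidence-based selection among overlapping boxes.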
-
Publication number: US10121229B2
Publication date: 2018-11-06
Application number: US14984805
Application date: 2015-12-30
Applicant: A9.com, Inc.
Inventor: Douglas Ryan Gray , Colin Jon Taylor , Xiaofan Lin
IPC: G06T5/00 , H04N5/232 , H04N5/272 , G06F17/30 , G06K9/00 , G06T5/30 , G06T11/60 , G06T7/11 , G06T7/194
Abstract: Systems and approaches are provided for optimizing self-portraiture. The background of the self-portrait can be enhanced by image registration or stitching techniques of images captured using one or more conventional cameras. Multiple standard resolution images can be stitched together to generate a panoramic or a composite image of a higher resolution. Foreground elements, such as one or more representations of users, can also be enhanced in various ways. The representations of the users can be composited to exclude undesirable elements, such as image data of one of the users extending her arm to capture the self-portrait. An ideal pose of the users can automatically be selected and other image enhancements, such as histogram optimization, brightness and contrast optimization, color-cast correction, or reduction or removal of noise, can automatically be performed to minimize user effort in capturing self-portraits.
-
Publication number: US10038839B2
Publication date: 2018-07-31
Application number: US15611405
Application date: 2017-06-01
Applicant: A9.com, Inc.
Inventor: Adam Wiggen Kraft , Kathy Wing Lam Ma , Xiaofan Lin , Arnab Sanat Kumar Dhua , Yu Lou
IPC: H04N5/225 , H04N5/232 , G06T15/08 , G06T7/194 , H04N7/18 , G06F17/27 , G06K9/18 , G06Q30/06 , G06F17/24 , G06F3/0482 , G06T15/00 , H04N1/00 , G06T7/70 , G06F3/0484
CPC classification number: H04N5/23222 , G06F3/0482 , G06F3/04842 , G06F17/24 , G06F17/2715 , G06K9/18 , G06Q30/0603 , G06Q30/0625 , G06T7/194 , G06T7/70 , G06T15/00 , G06T15/08 , G06T2210/22 , G06T2215/16 , H04N1/00 , H04N7/183
Abstract: Various approaches provide for detecting and recognizing text to enable a user to perform various functions or tasks. For example, a user could point a camera at an object with text, in order to capture an image of that object. The camera can be integrated with a portable computing device that is capable of taking the image and processing the image (or providing the image for processing) to recognize, identify, and/or isolate the text in order to send the image of the object as well as recognized text to an application, function, or system, such as an electronic marketplace.
-
Publication number: US09934526B1
Publication date: 2018-04-03
Application number: US13929689
Application date: 2013-06-27
Applicant: A9.com, Inc.
Inventor: Arnab Sanat Kumar Dhua , Douglas Ryan Gray , Xiaofan Lin , Yu Lou , Adam Wiggen Kraft , Sunil Ramesh
CPC classification number: G06Q30/0623
Abstract: Various embodiments enable a process to automatically attempt to select, from an image frame, the most relevant words associated with products available for purchase from an electronic marketplace. For example, an image frame containing text can be obtained and analyzed with optical character recognition. The recognized words can then be preprocessed using various filtering and scoring techniques to narrow down a volume of text to a few relevant query terms. These query terms can then be sent to a search engine associated with the electronic marketplace to return relevant products to a user.
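The filtering and scoring step can be sketched as below. The abstract does not specify which filters or scores are used, so the stopword list, the minimum token length, the confidence cutoff, and the score (OCR confidence multiplied by text height) are all hypothetical choices for illustration.

```python
# Hypothetical stopword list; a real system would likely use a much larger one.
STOPWORDS = {"the", "and", "of", "for", "with", "new"}


def select_query_terms(words, max_terms=3, min_confidence=0.5):
    """Pick a few query terms from OCR output.

    `words` is a list of (text, ocr_confidence, height_px) tuples.
    Short tokens, stopwords, and low-confidence tokens are filtered out;
    the rest are ranked by confidence * height (an assumed relevance proxy).
    """
    scored = []
    for text, confidence, height in words:
        token = text.lower().strip(".,:;!?")
        if len(token) < 3 or token in STOPWORDS or confidence < min_confidence:
            continue
        scored.append((confidence * height, token))
    scored.sort(key=lambda item: item[0], reverse=True)

    terms = []
    for _, token in scored:
        if token not in terms:  # drop duplicates while preserving rank order
            terms.append(token)
    return terms[:max_terms]


if __name__ == "__main__":
    ocr_words = [
        ("ACME", 0.9, 40), ("the", 0.99, 12), ("Espresso", 0.8, 30),
        ("Maker", 0.85, 28), ("x1", 0.4, 10), ("for", 0.9, 10),
    ]
    print(select_query_terms(ocr_words))
```

The resulting list would then be joined into a query string for the marketplace search engine; that call is left out because the abstract does not describe its interface.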
-
Publication number: US09582913B1
Publication date: 2017-02-28
Application number: US14037275
Application date: 2013-09-25
Applicant: A9.com, Inc.
Inventor: Adam Wiggen Kraft , Arnab Sanat Kumar Dhua , Douglas Ryan Gray , Xiaofan Lin , Yu Lou , Sunil Ramesh , Colin Jon Taylor , David Creighton Mott
IPC: G06T11/60
CPC classification number: G06T11/60 , G06K9/00577 , G06T19/006
Abstract: Various embodiments enable a computing device to perform tasks such as highlighting, in an augmented reality view, words that are important to a user. For example, word lists can be generated and the user, by pointing a camera of a computing device at a volume of text, can cause words from the word list within the volume of text to be highlighted in a live field of view of the camera displayed thereon. Accordingly, users can quickly identify textual information that is meaningful to them in an augmented reality view, which helps them sift through real-world text.
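The word-list highlighting described in this abstract can be reduced to a small matching step. The sketch below assumes OCR output arrives as (text, bounding box) pairs for the current camera frame; that input shape and the function name are assumptions, and the actual drawing of highlights in the live view is not shown.

```python
def boxes_to_highlight(ocr_words, word_list):
    """Return the bounding boxes of recognized words that appear in the user's word list.

    `ocr_words` is a list of (text, (x1, y1, x2, y2)) pairs for one camera frame.
    Matching ignores case and surrounding punctuation.
    """
    wanted = {word.lower() for word in word_list}
    return [
        box
        for text, box in ocr_words
        if text.lower().strip(".,;:!?") in wanted
    ]


if __name__ == "__main__":
    frame_words = [
        ("Contains", (0, 0, 50, 10)),
        ("peanuts,", (55, 0, 100, 10)),
        ("milk", (105, 0, 130, 10)),
    ]
    print(boxes_to_highlight(frame_words, ["Peanuts", "gluten"]))
```

A live view would rerun this on each processed frame and draw an overlay at every returned box.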