-
Publication No.: US11861893B2
Publication date: 2024-01-02
Application No.: US17929468
Application date: 2022-09-02
Inventor: Chiayu Lin
IPC: G06V20/13 , G06T7/11 , G06T7/70 , G06V20/10 , G06V20/40 , G06V30/14 , G06F18/24 , G06V10/44 , G06V20/17 , G06V20/62 , G08C19/36 , G06V30/10
CPC classification number: G06V20/13 , G06F18/24 , G06T7/11 , G06T7/70 , G06V10/44 , G06V20/10 , G06V20/17 , G06V20/46 , G06V20/62 , G06V30/1444 , G06V30/1448 , G06T2207/10016 , G06V30/10 , G08C19/36
Abstract: According to one embodiment, a reading support system includes a processing device. The processing device includes an extractor and a type determiner. The extractor extracts a plurality of regions from a candidate region. The candidate region is a candidate of a region in which a meter is imaged. The regions respectively include a plurality of characters of the meter. The type determiner determines a type of the meter based on positions of the regions.
-
Publication No.: US20230394860A1
Publication date: 2023-12-07
Application No.: US17832642
Application date: 2022-06-04
Applicant: Zoom Video Communications, Inc.
Inventor: Renjie Tao , Ling Tsou
IPC: G06V30/19 , G06V10/62 , G06V20/62 , G06V20/40 , G06V10/70 , G06V30/14 , G06F16/783 , G06F16/738
CPC classification number: G06V30/19013 , G06V10/62 , G06V20/62 , G06V20/41 , G06V10/768 , G06V30/1444 , G06F16/7844 , G06F16/738 , H04L65/403
Abstract: Methods and systems provide for video-based search results within a communication session. In one embodiment, the system receives video content of a communication session with a number of participants; extracts, via optical character recognition ("OCR"), textual content from the frames of the video content, each piece of textual content including a timestamp representing a temporal location of the frame within the video content; receives, from a client device associated with a user, a request to search for specified text within the video content; in response to receiving the request, determines one or more matching pieces of textual content which match the specified text; and presents, to the client device, the matching pieces of textual content.
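The search step described in this abstract can be sketched as follows. This is a minimal illustration only, not the patented method: it assumes OCR has already run, and the names `OcrSnippet` and `search_snippets` are hypothetical. The abstract does not say how a match is decided, so the case-insensitive substring test here is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrSnippet:
    """Hypothetical record: one piece of text recovered from one video frame."""
    timestamp_s: float  # temporal location of the source frame, in seconds
    text: str           # text an OCR engine is assumed to have extracted


def search_snippets(snippets, query):
    """Return snippets whose text contains the query, ordered by timestamp.

    Matching is a case-insensitive substring test (an assumption).
    """
    needle = query.casefold()
    hits = [s for s in snippets if needle in s.text.casefold()]
    return sorted(hits, key=lambda s: s.timestamp_s)


# Made-up sample data for illustration.
snippets = [
    OcrSnippet(12.0, "Q3 Roadmap"),
    OcrSnippet(95.5, "Budget review"),
    OcrSnippet(40.0, "Roadmap risks"),
]

for match in search_snippets(snippets, "roadmap"):
    print(match.timestamp_s, match.text)
```

Returning the timestamp with each match is what would let a client jump to the frame in which the text appeared.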
-
Publication No.: US20230385298A1
Publication date: 2023-11-30
Application No.: US18203096
Application date: 2023-05-30
Applicant: Hank AI, Inc.
Inventor: Sergey A. Razin , Jack Neil , Samuel Hartzog , Stéphane Charette
IPC: G06F16/25 , G06F16/93 , G06V30/416 , G06V30/14
CPC classification number: G06F16/254 , G06F16/256 , G06V30/1444 , G06V30/416 , G06F16/93
Abstract: Embodiments of the innovation relate to a data extraction device, comprising a controller having a processor and memory. The controller is configured to receive an unstructured data file comprising a set of documents; apply the unstructured data file to a document identification model to identify a data element identifier and an associated data element of each document of the set of documents; apply an optical character recognition engine to the identified data element identifier and associated identified data element to generate a structured data element identifier and an associated structured data element, the structured data element identifier and the associated structured data element configured as machine-identifiable characters; embed the structured data element identifier and associated structured data element as metadata with the unstructured data file; and store the unstructured data file and metadata in a database.
-
Publication No.: US11823471B2
Publication date: 2023-11-21
Application No.: US17795446
Application date: 2021-01-20
Applicant: Microsoft Technology Licensing, LLC
Inventor: Wenping Hu , Qiang Huo
CPC classification number: G06V20/63 , G06V30/1444 , G06V30/19147 , G06V30/287 , G06V30/293 , G06V30/10
Abstract: According to implementations of the subject matter described herein, there is provided a solution for text recognition in an image. In this solution, a target text line area, which is expected to include a text to be recognized, is determined from an image. Probability distribution information of a character model element(s) present in the target text line area is determined using a single character model. The single character model is trained based on training text line areas and respective ground-truth texts in the training text line areas. Texts in the training text line areas are arranged in different orientations, and/or the ground-truth texts comprise texts related to various languages (e.g., texts related to a Latin language and an Eastern language). The text in the target text line area can be determined based on the determined probability distribution information. The single character model enables more efficient and convenient text recognition.
-
Publication No.: US11769465B1
Publication date: 2023-09-26
Application No.: US17453710
Application date: 2021-11-05
Applicant: Optum, Inc.
Inventor: Jon Kevin Muse , Gregory J. Boss , Komal Khatri
IPC: G09G5/10 , G06T11/60 , G06T7/20 , G06V30/14 , G09G5/02 , G09G5/30 , G06T11/00 , H04N1/60 , H04N5/57 , H04N9/64 , H04N9/69 , H04N9/73 , H04N9/77 , H04N13/144 , H04N19/167
CPC classification number: G09G5/10 , G06T7/20 , G06T11/60 , G06V30/1444 , G06T2207/10016 , G09G2320/0666 , G09G2320/0686 , G09G2354/00 , G09G2360/144 , G09G2360/16 , G09G2380/08
Abstract: A computing system includes a storage device and processing circuitry. The processing circuitry is configured to obtain an image frame that comprises a plurality of pixels that form a pixel array. Additionally, the processing circuitry is configured to determine that a region of the image frame belongs to a trigger content type. Based on determining that the region of the image frame belongs to the trigger content type, the processing circuitry is configured to modify the region of the image frame to adjust a luminance of pixels of the region of the image frame based in part on an ambient light level in a viewing area of the user; and output, for display by a display device in the viewing area of the user, a version of the image frame that contains the modified region.
-
Publication No.: US20230274081A1
Publication date: 2023-08-31
Application No.: US18106926
Application date: 2023-02-07
Applicant: Arlen Fan , Yuxin Ma , Ross Maciejewski
Inventor: Arlen Fan , Yuxin Ma , Ross Maciejewski
IPC: G06F40/169 , G06V30/14 , G06V30/19 , G06F40/30
CPC classification number: G06F40/169 , G06V30/1444 , G06V30/19173 , G06F40/30
Abstract: The present disclosure describes examples of a computer-implemented framework that helps to detect deception in charts and/or associated articles through textual and visual annotations.
-
Publication No.: US20230265640A1
Publication date: 2023-08-24
Application No.: US17679680
Application date: 2022-02-24
Applicant: Caterpillar Inc.
Inventor: Christopher Wright , Justin Steinlage
CPC classification number: E02F9/2033 , E02F9/2029 , E02F9/2253 , E02F9/265 , E02F9/262 , G06V20/58 , G06V30/1444 , G06V10/82 , G06T7/73 , G06T2207/10012 , G06T2207/10028 , G06T2207/20084 , G06T2207/30252
Abstract: A work machine, a first method of defining a virtual 3D exclusion zone, and a second method of preventing collisions involving a work machine are disclosed. The work machine comprises a body, an implement arm, an imaging assembly, and an electrohydraulic assembly configured to prevent the implement arm from intersecting a 3D exclusion zone. The first method comprises scanning a local environment, generating a virtual 3D representation, identifying key structures, and generating a virtual 3D exclusion zone encompassing the key structures. The second method comprises defining a virtual 3D exclusion zone, monitoring a motion of the work machine, and adjusting the motion to avoid an intersection between the work machine and the 3D exclusion zone. The 3D exclusion zone may be implemented for a number of machines, environments, and key structures without unduly removing control from a human operator.
-
Publication No.: US20230260302A1
Publication date: 2023-08-17
Application No.: US18300131
Application date: 2023-04-13
Applicant: PAYPAL, INC.
Inventor: Xiaodong Yu , Hewen Wang
IPC: G06V30/18 , G06Q20/40 , G06Q10/0833 , G06F16/901 , G06N3/04 , G06Q10/083 , G06V10/44 , G06V10/46 , G06F18/214 , G06V30/14
CPC classification number: G06V30/18181 , G06Q20/407 , G06Q10/0833 , G06F16/9024 , G06N3/04 , G06Q10/0838 , G06V10/457 , G06V10/464 , G06F18/214 , G06V30/1444
Abstract: Methods and systems are presented for extracting categorizable information from an image using a graph that models data within the image. Upon receiving an image, a data extraction system identifies characters in the image. The data extraction system then generates bounding boxes that enclose adjacent characters that are related to each other in the image. The data extraction system also creates connections between the bounding boxes based on locations of the bounding boxes. A graph is generated based on the bounding boxes and the connections such that the graph can accurately represent the data in the image. The graph is provided to a graph neural network that is configured to analyze the graph and produce an output. The data extraction system may categorize the data in the image based on the output.
-
Publication No.: US11721229B2
Publication date: 2023-08-08
Application No.: US16756468
Application date: 2019-09-11
Applicant: Hangzhou Dana Technology Inc.
Inventor: Fan Shi , Tao He , Huan Luo , Mingquan Chen
IPC: G06N3/08 , G06F16/245 , G09B7/02 , G06F40/284 , G06N3/04 , G09B19/02 , G06V30/40 , G06F18/21 , G06V30/14 , G06V30/148 , G06V30/19 , G06V10/82 , G06V30/413
CPC classification number: G09B7/02 , G06F16/245 , G06F18/21 , G06F40/284 , G06N3/04 , G06N3/08 , G06V10/82 , G06V30/1444 , G06V30/153 , G06V30/19173 , G06V30/40 , G06V30/413 , G09B19/025
Abstract: The present disclosure provides a question correction method and device for oral calculation questions. A feature vector is obtained for each question to be searched according to the token content in the stem of that question, and the feature vectors are then used to search the question bank for the target test paper that matches the test paper to be searched. For each question to be searched that is an oral calculation question, a second search is performed in the target test paper based on the feature vector of the question, with the minimum shortest editing distance as the search criterion. If the question type of the matched target question is also an oral calculation question, the question to be searched is determined to be an oral calculation question to be corrected; a preset oral calculation engine then calculates that question, and the calculation result is output as its answer. By applying the solution provided by the present disclosure, the accuracy of correction on oral calculation questions can be improved.
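The second search in this abstract picks the candidate with the smallest editing distance. A minimal sketch of that selection rule follows, assuming the standard Levenshtein distance and plain question strings as a stand-in for the feature vectors the abstract mentions. The names `edit_distance` and `closest_question` are hypothetical and do not come from the patent.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or keep when equal
            ))
        prev = curr
    return prev[-1]


def closest_question(query: str, candidates: list[str]) -> str:
    """Return the candidate with the smallest edit distance to the query."""
    return min(candidates, key=lambda c: edit_distance(query, c))


# Made-up OCR output and target-paper questions for illustration.
print(closest_question("3+5=", ["3+6=", "12-7=", "3x5=9"]))
```

When several candidates tie, `min` returns the first one in the list; a real system would need an explicit tie-breaking rule.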
-
Publication No.: US11720605B1
Publication date: 2023-08-08
Application No.: US17876069
Application date: 2022-07-28
Applicant: Intuit Inc.
Inventor: Tharathorn Rimchala , Yingxin Wang
IPC: G06F16/28 , G06V30/14 , G06F16/93 , G06F16/2457
CPC classification number: G06F16/287 , G06F16/24578 , G06F16/93 , G06V30/1444
Abstract: A visual-based classification model influenced by text features as a result of the outputs of a text-based classification model is disclosed. A system receives one or more documents to be classified based on one or more visual features and provides the one or more documents to a student classification model, which is a visual-based classification model. The system also classifies, by the student classification model, the one or more documents into one or more document types based on one or more visual features. The one or more visual features are generated by the student classification model that is trained based on important text identified by a teacher classification model for the one or more document types, with the teacher classification model being a text-based classification model. Generating training data and training the student classification model based on the training data are also described.
-