Leveraging image context for improved glyph classification
    11.
    Invention Grant
    Leveraging image context for improved glyph classification (Active)

    Publication Number: US09576196B1

    Publication Date: 2017-02-21

    Application Number: US14463746

    Application Date: 2014-08-20

    CPC classification number: G06K9/3258 G06K2209/01

    Abstract: A system that recognizes text or symbols in a captured image using machine learning models leverages context information about the image to improve accuracy. Contextual information is determined for the entire image, or for spatial regions of the image, and is provided to a machine learning model when determining whether a region contains text or symbols. Associating features describing the larger context with features extracted from regions that potentially contain text or symbolic content incrementally improves the results obtained with machine learning techniques.

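    Below is a minimal sketch of the idea described in the abstract: a coarse whole-image context descriptor is concatenated with features from a candidate region before a classifier decides whether the region contains text or symbols. The feature extractors, the toy data, and the use of scikit-learn's LogisticRegression are illustrative assumptions, not the patented implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def global_context_features(image: np.ndarray) -> np.ndarray:
    """Coarse whole-image descriptor: mean, contrast, and a 4-bin intensity histogram."""
    hist, _ = np.histogram(image, bins=4, range=(0, 255), density=True)
    return np.concatenate([[image.mean() / 255.0, image.std() / 255.0], hist])

def region_features(region: np.ndarray) -> np.ndarray:
    """Per-region descriptor: intensity statistics plus horizontal/vertical gradient energy."""
    gy, gx = np.gradient(region.astype(float))
    return np.array([region.mean() / 255.0, region.std() / 255.0,
                     np.abs(gx).mean(), np.abs(gy).mean()])

def build_feature_vector(image: np.ndarray, region: np.ndarray) -> np.ndarray:
    # Associate the larger image context with the candidate region's own features.
    return np.concatenate([region_features(region), global_context_features(image)])

# Toy data: random "images", fixed center crops as candidate regions, random labels.
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(64, 64)).astype(np.uint8) for _ in range(40)]
regions = [img[16:48, 16:48] for img in images]
labels = rng.integers(0, 2, size=40)

X = np.stack([build_feature_vector(img, reg) for img, reg in zip(images, regions)])
classifier = LogisticRegression(max_iter=1000).fit(X, labels)
print("region contains text/symbols?", bool(classifier.predict(X[:1])[0]))
```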

    Text orientation estimation in camera captured OCR
    13.
    Invention Grant
    Text orientation estimation in camera captured OCR (Active)

    Publication Number: US09224061B1

    Publication Date: 2015-12-29

    Application Number: US14464365

    Application Date: 2014-08-20

    CPC classification number: G06K9/3208 G06K9/3258 G06K2209/01

    Abstract: A system estimates text orientation in images captured with a handheld camera prior to detecting text in the image. Text orientation is estimated based on edges detected within the image, and the image is rotated based on the estimated orientation. Text detection and processing are then performed on the rotated image. Non-text features along the periphery of the image may be sampled to ensure that clutter does not undermine the orientation estimate.

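    A rough illustration of the approach in the abstract, under the assumption of OpenCV-style edge detection and Hough line fitting (the patent does not name a library): a dominant edge angle is estimated and the frame is rotated before text detection. The function names, thresholds, and rotation convention are hypothetical.

```python
import cv2
import numpy as np

def estimate_text_orientation(gray: np.ndarray) -> float:
    """Return an estimated rotation angle (degrees) from the dominant edge direction."""
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=60,
                            minLineLength=gray.shape[1] // 8, maxLineGap=10)
    if lines is None:
        return 0.0
    angles = [np.degrees(np.arctan2(y2 - y1, x2 - x1)) for x1, y1, x2, y2 in lines[:, 0]]
    # Use the median so a few cluttered, off-axis edges do not dominate the estimate.
    return float(np.median(angles))

def rotate_to_upright(gray: np.ndarray) -> np.ndarray:
    angle = estimate_text_orientation(gray)
    h, w = gray.shape
    rot = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    return cv2.warpAffine(gray, rot, (w, h), flags=cv2.INTER_LINEAR,
                          borderMode=cv2.BORDER_REPLICATE)

# frame = cv2.imread("captured_frame.png", cv2.IMREAD_GRAYSCALE)
# upright = rotate_to_upright(frame)   # then run text detection / OCR on `upright`
```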

    TASK-BASED IMAGE MASKING
    14.
    Invention Application

    Publication Number: US20220405528A1

    Publication Date: 2022-12-22

    Application Number: US17740533

    Application Date: 2022-05-10

    Abstract: Techniques for masking images based on a particular task are described. A system masks portions of an image that are not relevant to a particular task, thus reducing the amount of data used by applications for image processing tasks. For example, images to be processed by a hair color classification model are masked so that only the portions showing the person's hair are available for the model to analyze. The system configures different masker components to mask images for different tasks. A masker component can be implemented at a user device to mask images before they are sent to an application- or task-specific model.
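
    The following hedged sketch shows only the masking step described above: pixels that a hypothetical task mask (here, hair_mask) marks as irrelevant are blanked before the image is handed to the downstream model. The mask source and the model itself are placeholders, not the patented components.

```python
import numpy as np

def apply_task_mask(image: np.ndarray, task_mask: np.ndarray, fill_value: int = 0) -> np.ndarray:
    """Keep only pixels flagged as task-relevant; fill everything else with a constant."""
    masked = np.full_like(image, fill_value)
    masked[task_mask] = image[task_mask]
    return masked

# Example: a 4x4 RGB "image" where only the top rows are treated as hair pixels.
image = np.random.randint(0, 256, size=(4, 4, 3), dtype=np.uint8)
hair_mask = np.zeros((4, 4), dtype=bool)
hair_mask[:2, :] = True                      # pretend the top half is hair

masked_image = apply_task_mask(image, hair_mask)
# `masked_image` (not `image`) is what a user device would send to the
# hair-color classification model, reducing the data exposed and transferred.
```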

    Three-dimensional mesh generation
    15.
    Invention Grant

    Publication Number: US11417061B1

    Publication Date: 2022-08-16

    Application Number: US17159853

    Application Date: 2021-01-27

    Abstract: Devices and techniques are generally described for three dimensional mesh generation. In various examples, first two-dimensional (2D) image data representing a human may be received. In various further examples, bounding box data identifying a location of the human in the first 2D image data and joint data identifying locations of joints of the human may be received. Second 2D image data representing a cropped portion of the human may be generated using the bounding box data and the joint data. A three-dimensional (3D) mesh prediction model may be used to determine a pose, a shape, and a projection matrix for the human. The 3D mesh prediction model may be used to determine a transformed projection matrix for the portion of the human represented in the second 2D image data.
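
    As a hedged illustration of the cropping step described in the abstract, the sketch below combines a person bounding box with 2D joint locations to form a crop that contains all joints; the function name, margin, and data are assumptions, and the 3D mesh prediction model itself is not shown.

```python
import numpy as np

def crop_person(image: np.ndarray, bbox: tuple, joints: np.ndarray, margin: int = 16) -> np.ndarray:
    """bbox = (x0, y0, x1, y1); joints = (N, 2) array of (x, y) pixel coordinates."""
    x0, y0, x1, y1 = bbox
    # Expand the box so every detected joint (plus a margin) falls inside the crop.
    x0 = min(x0, int(joints[:, 0].min()) - margin)
    y0 = min(y0, int(joints[:, 1].min()) - margin)
    x1 = max(x1, int(joints[:, 0].max()) + margin)
    y1 = max(y1, int(joints[:, 1].max()) + margin)
    h, w = image.shape[:2]
    x0, y0 = max(0, x0), max(0, y0)
    x1, y1 = min(w, x1), min(h, y1)
    return image[y0:y1, x0:x1]

# The crop would then be passed to a 3D mesh prediction model that outputs a
# pose, a shape, and a projection matrix for the person (model not shown here).
image = np.zeros((480, 640, 3), dtype=np.uint8)
joints = np.array([[300, 120], [320, 200], [280, 260], [340, 400]])
crop = crop_person(image, (260, 100, 360, 420), joints)
print(crop.shape)
```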

    Task-based image masking
    16.
    Invention Grant

    Publication Number: US11334773B2

    Publication Date: 2022-05-17

    Application Number: US16913837

    Application Date: 2020-06-26

    Abstract: Techniques for masking images based on a particular task are described. A system masks portions of an image that are not relevant to a particular task, thus reducing the amount of data used by applications for image processing tasks. For example, images to be processed by a hair color classification model are masked so that only the portions showing the person's hair are available for the model to analyze. The system configures different masker components to mask images for different tasks. A masker component can be implemented at a user device to mask images before they are sent to an application- or task-specific model.

    Sharpness-based frame selection for OCR
    17.
    Invention Grant
    Sharpness-based frame selection for OCR (Active)

    Publication Number: US09576210B1

    Publication Date: 2017-02-21

    Application Number: US14500005

    Application Date: 2014-09-29

    Abstract: A system to select video frames for optical character recognition (OCR) based on feature metrics associated with blur and sharpness. A device captures a video frame including text characters. An edge detection filter is applied to the frame to determine gradient features in perpendicular directions. An "edge map" is created from the gradient features, and points along edges in the edge map are identified. An edge transition width is determined at each edge point based on the local intensity minimum and maximum on opposite sides of that edge point in the frame. Sharper edges have smaller edge transition widths than blurred edges. Statistics are computed from the edge transition widths and processed by a trained classifier to determine whether the frame is sharp enough for text processing.

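    The sketch below illustrates the edge-transition-width idea in a simplified, horizontal-scan-only form: at each strong edge the intensity profile is walked out to its local minimum and maximum, the transition width is recorded, and summary statistics are computed for a downstream classifier. The thresholds and names are assumptions, not the patented algorithm.

```python
import numpy as np

def edge_transition_widths(gray: np.ndarray, grad_thresh: float = 30.0) -> np.ndarray:
    gray = gray.astype(float)
    gx = np.diff(gray, axis=1)                 # horizontal intensity differences
    widths = []
    rows, cols = gx.shape
    for r in range(rows):
        for c in range(1, cols - 1):
            if abs(gx[r, c]) < grad_thresh:
                continue                       # not a strong edge point
            sign = np.sign(gx[r, c])
            # Walk left while the intensity keeps changing in the same direction ...
            left = c
            while left > 0 and sign * (gray[r, left] - gray[r, left - 1]) > 0:
                left -= 1
            # ... and right likewise, reaching the local minimum/maximum on each side.
            right = c + 1
            while right < cols and sign * (gray[r, right + 1] - gray[r, right]) > 0:
                right += 1
            widths.append(right - left)
    return np.asarray(widths)

def sharpness_statistics(gray: np.ndarray) -> dict:
    w = edge_transition_widths(gray)
    if w.size == 0:
        return {"mean": float("inf"), "p90": float("inf"), "count": 0}
    # Smaller widths mean sharper edges; a trained classifier would consume these
    # statistics to decide whether the frame is sharp enough for OCR.
    return {"mean": float(w.mean()), "p90": float(np.percentile(w, 90)), "count": int(w.size)}

frame = np.zeros((8, 16), dtype=np.uint8)
frame[:, 8:] = 255                             # a perfectly sharp vertical step edge
print(sharpness_statistics(frame))             # width 1 at every edge point
```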

    Sharpness-based frame selection for OCR
    18.
    Invention Grant
    Sharpness-based frame selection for OCR (Active)

    Publication Number: US09418316B1

    Publication Date: 2016-08-16

    Application Number: US14500208

    Application Date: 2014-09-29

    CPC classification number: G06K9/3258 G06K9/6231 G06K2209/01

    Abstract: A process for training and optimizing a system to select video frames for optical character recognition (OCR) based on feature metrics associated with blur and sharpness. A set of image frames is subjectively labelled by comparing each frame before and after binarization to determine to what degree text is recognizable in the binary image. A plurality of different sharpness feature metrics are generated from the original frame. A classifier is then trained using the feature metrics and the subjective labels. The feature metrics are then tested for accuracy and/or correlation with the subjective labelling data, and the set of feature metrics may be refined based on which metrics produce the best results.

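    A compact, assumption-laden sketch of the training-and-refinement loop described above: a classifier is trained on placeholder feature metrics and labels, each metric's correlation with the subjective labels is measured, and the feature set is refined by keeping the strongest metrics. The synthetic data and the choice of logistic regression are illustrative only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n_frames, n_features = 200, 6
metrics = rng.normal(size=(n_frames, n_features))          # placeholder sharpness feature metrics
labels = (metrics[:, 0] + 0.5 * metrics[:, 2]
          + rng.normal(scale=0.5, size=n_frames) > 0).astype(int)  # stand-in subjective labels

# Train a classifier on the feature metrics and the subjective labels.
clf = LogisticRegression(max_iter=1000)
accuracy = cross_val_score(clf, metrics, labels, cv=5).mean()

# Test each metric's correlation with the subjective labelling data, then keep
# only the strongest ones -- one simple way to "refine" the feature set.
correlations = np.array([abs(np.corrcoef(metrics[:, j], labels)[0, 1]) for j in range(n_features)])
keep = np.argsort(correlations)[::-1][:3]
refined_accuracy = cross_val_score(clf, metrics[:, keep], labels, cv=5).mean()

print(f"all features: {accuracy:.3f}, refined ({keep.tolist()}): {refined_accuracy:.3f}")
```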
