IMAGE DETECTION METHOD AND APPARATUS
    81.
    Published invention application

    Publication No.: US20240054760A1

    Publication date: 2024-02-15

    Application No.: US18378405

    Filing date: 2023-10-10

    Abstract: An image detection method and apparatus are disclosed. The method includes: performing feature extraction processing on an image to obtain a feature representation subset of the image comprising at least two sub-image features; generating attention weights corresponding to the at least two sub-image features; performing weighting aggregation processing on the at least two sub-image features according to the attention weights to obtain a first feature vector; performing clustering sampling processing on the at least two sub-image features to obtain at least two classification clusters comprising sampled sub-image features; determining a block sparse self-attention for each of the sampled sub-image features according to the at least two classification clusters and a block sparse matrix; determining a second feature vector according to at least two block sparse self-attentions respectively corresponding to the at least two classification clusters; and determining a classification result of the image according to the first feature vector and the second feature vector.
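
The attention-weighted aggregation step in this abstract can be illustrated with a minimal sketch. The abstract does not say how the attention weights are generated, so the dot-product scoring against a `scoring_vector` and the softmax normalization below are assumptions, and the names `softmax` and `attention_pool` are hypothetical. The clustering and block sparse self-attention steps are not covered.

```python
import math


def softmax(scores):
    """Normalize raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, scoring_vector):
    """Aggregate sub-image features into one vector using attention weights.

    features: list of equal-length feature vectors, one per sub-image.
    scoring_vector: assumed (hypothetical) learned vector used to score each
        feature; the patent abstract does not specify the scoring function.
    Returns (weights, pooled_vector).
    """
    scores = [sum(f * w for f, w in zip(feat, scoring_vector)) for feat in features]
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [sum(w * feat[i] for w, feat in zip(weights, features)) for i in range(dim)]
    return weights, pooled


if __name__ == "__main__":
    sub_image_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    weights, first_feature_vector = attention_pool(sub_image_features, [0.5, 2.0])
    print(weights, first_feature_vector)
```

Under these assumptions, sub-images whose features score higher contribute more to the first feature vector; with a zero scoring vector the result reduces to a plain average.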

    METHOD AND SYSTEM FOR DETERMINING COMPLETE ICON

    Publication No.: US20240046665A1

    Publication date: 2024-02-08

    Application No.: US18268630

    Filing date: 2021-12-22

    Abstract: A method for determining a complete icon includes: acquiring an image, and delineating determination regions in a peripheral region of the image; scanning and counting a number of pixels of a first color corresponding to an icon in the whole image and a number of pixels of a second color corresponding to an auxiliary identifier in each determination region; determining, if the number of pixels of the first color is less than or equal to a first threshold, or the number of pixels of the second color in one or more determination regions is less than or equal to a second threshold, that the icon is incomplete; and determining, if the number of pixels of the first color is greater than the first threshold, and the number of pixels of the second color in each of the determination regions is greater than the second threshold, that the icon is complete.
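
The two-threshold decision rule described in this abstract can be sketched as follows. The image representation (a list of rows of color labels), the rectangular region format, and the function name `is_icon_complete` are assumptions made for illustration; the abstract does not specify them.

```python
def is_icon_complete(image, regions, first_color, second_color,
                     first_threshold, second_threshold):
    """Return True if the icon is judged complete under the two-threshold rule.

    image: list of rows, each a list of color labels (assumed representation).
    regions: list of (top, left, bottom, right) rectangles, half-open, that
        delineate the determination regions near the image border.
    """
    # Count first-color (icon) pixels over the whole image.
    first_count = sum(pixel == first_color for row in image for pixel in row)
    if first_count <= first_threshold:
        return False

    # Every determination region must hold enough second-color pixels.
    for top, left, bottom, right in regions:
        second_count = sum(
            image[r][c] == second_color
            for r in range(top, bottom)
            for c in range(left, right)
        )
        if second_count <= second_threshold:
            return False
    return True


# Example: 1 marks the icon color, 2 marks the auxiliary identifier color.
EXAMPLE_IMAGE = [
    [2, 2, 0, 0, 2, 2],
    [2, 0, 1, 1, 0, 2],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [2, 0, 1, 1, 0, 2],
    [2, 2, 0, 0, 2, 2],
]
CORNER_REGIONS = [(0, 0, 2, 2), (0, 4, 2, 6), (4, 0, 6, 2), (4, 4, 6, 6)]
```

In this example each corner region acts as a determination region, so an image cropped on any side loses auxiliary-identifier pixels in at least one region and is reported as incomplete.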

    SYSTEM FOR OPTIMIZING VISION TRANSFORMER BLOCKS

    Publication No.: US20240046630A1

    Publication date: 2024-02-08

    Application No.: US18359774

    Filing date: 2023-07-26

    Abstract: A system for optimizing a vision transformer block for use with mobile vision transformers utilized for tasks such as image classification, segmentation, and object detection is disclosed. The system includes incorporating a 1×1 convolutional layer in place of a 3×3 convolutional layer in a fusion block of the vision transformer block to reduce constraints on scaling neural network size. Additionally, the system includes fusing local and global representations in the fusion block of the vision transformer block instead of fusing input features and global representations. Furthermore, the system includes fusing input features in the fusion block by adding the input features to the output of the 1×1 convolutional layer of the fusion block. Moreover, the system includes substituting a 3×3 convolutional layer in the local representation block of the vision transformer block with a depthwise-separable 3×3 convolutional layer. The optimized transformer block enhances image classification, segmentation, and object detection.
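
The effect of the two convolution substitutions on layer size can be illustrated with standard weight-count arithmetic. The abstract gives no channel counts, so the 64-channel figures below are assumed purely for illustration, bias terms are ignored, and the function names are hypothetical.

```python
def conv_params(kernel_size, in_channels, out_channels):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels


def depthwise_separable_params(kernel_size, in_channels, out_channels):
    """Weight count of a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = kernel_size * kernel_size * in_channels
    pointwise = in_channels * out_channels
    return depthwise + pointwise


if __name__ == "__main__":
    channels = 64  # assumed width; not taken from the patent
    print("3x3 standard:           ", conv_params(3, channels, channels))
    print("1x1 standard:           ", conv_params(1, channels, channels))
    print("3x3 depthwise-separable:", depthwise_separable_params(3, channels, channels))
```

For equal input and output widths, the 1×1 layer needs one ninth of the weights of the 3×3 layer, and the depthwise-separable variant grows far more slowly with channel count than a standard 3×3 convolution.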

    AUTOMATED DETECTION AND RECOMMENDATIONS RELATED TO ACCESSIBILITY FEATURE COMPLIANCE IN PHYSICAL ENVIRONMENTS

    Publication No.: US20240029428A1

    Publication date: 2024-01-25

    Application No.: US17870632

    Filing date: 2022-07-21

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining whether a physical environment includes structural features that comply with accessibility guidelines. In one example method, input data, which can include image data representing an image of a particular portion of a physical environment, can be received from a client device. The image data can be input to a trained accessibility feature detection model, which can be trained to detect a particular structural feature and determine whether it meets a first accessibility guideline for the particular structural feature. The data output by the model can be used to determine whether the image data includes the particular structural feature that meets the first accessibility guideline, and based on this determination, an accessibility report can be generated and provided for display on the client device.