REDUCING ARCHITECTURAL COMPLEXITY OF CONVOLUTIONAL NEURAL NETWORKS VIA CHANNEL PRUNING

    Publication number: US20190251441A1

    Publication date: 2019-08-15

    Application number: US15895795

    Filing date: 2018-02-13

    CPC classification number: G06N3/082 G06N3/0481

    Abstract: The architectural complexity of a neural network is reduced by selectively pruning channels. A cost metric for a convolution layer is determined. The cost metric indicates a resource cost per channel for the channels of the layer. Training the neural network includes, for channels of the layer, updating a channel-scaling coefficient based on the cost metric. The channel-scaling coefficient linearly scales the output of the channel. A constant channel is identified based on the channel-scaling coefficients. The neural network is updated by pruning the constant channel. Model weights are updated via a stochastic gradient descent of a training loss function evaluated on training data. The channel-scaling coefficients are updated via an iterative-thresholding algorithm that penalizes a batch normalization loss function based on the cost metric for the layer and a norm of the channel-scaling coefficients. When the layer is batch normalized, the channel-scaling coefficients are batch normalization scaling coefficients.
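
    The coefficient update described in this abstract can be illustrated with a minimal sketch. It assumes the norm is an L1 norm, so the thresholding step becomes ordinary soft-thresholding, and it assumes the threshold is the product of the learning rate, a penalty weight, and the layer's per-channel cost. The names `soft_threshold`, `ista_step`, `constant_channels`, `penalty`, and `layer_cost` are hypothetical and do not come from the patent.

```python
from typing import List


def soft_threshold(x: float, t: float) -> float:
    """Shrink x toward zero by t; values inside [-t, t] become exactly 0."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def ista_step(gammas: List[float], grads: List[float], lr: float,
              penalty: float, layer_cost: float) -> List[float]:
    """One iterative-thresholding update of channel-scaling coefficients.

    Assumption: a gradient step on the loss is followed by soft-thresholding
    whose strength grows with the layer's per-channel resource cost.
    """
    threshold = lr * penalty * layer_cost
    return [soft_threshold(g - lr * dg, threshold)
            for g, dg in zip(gammas, grads)]


def constant_channels(gammas: List[float], tol: float = 0.0) -> List[int]:
    """Indices of channels whose scaling coefficient has collapsed to zero."""
    return [i for i, g in enumerate(gammas) if abs(g) <= tol]


# Illustrative values only: the middle channel is driven to zero.
updated = ista_step([0.5, 0.01, -0.3], [0.0, 0.0, 0.0],
                    lr=0.1, penalty=1.0, layer_cost=0.5)
prunable = constant_channels(updated)
```

    A channel whose coefficient is exactly zero outputs a constant, which is why it is a candidate for pruning. A costlier layer gets a larger threshold under this assumption, so its channels are pruned more aggressively.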

    NEURAL NETWORK BASED FACE DETECTION AND LANDMARK LOCALIZATION

    Publication number: US20190147224A1

    Publication date: 2019-05-16

    Application number: US15815635

    Filing date: 2017-11-16

    Abstract: Approaches are described for determining facial landmarks in images. An input image is provided to at least one trained neural network that determines a face region (e.g., bounding box of a face) of the input image and initial facial landmark locations corresponding to the face region. The initial facial landmark locations are provided to a 3D face mapper that maps the initial facial landmark locations to a 3D face model. A set of facial landmark locations are determined from the 3D face model. The set of facial landmark locations are provided to a landmark location adjuster that adjusts positions of the set of facial landmark locations based on the input image. The input image is presented on a user device using the adjusted set of facial landmark locations.

    ACCURATE TAG RELEVANCE PREDICTION FOR IMAGE SEARCH

    Publication number: US20170236055A1

    Publication date: 2017-08-17

    Application number: US15094633

    Filing date: 2016-04-08

    Abstract: Embodiments of the present invention provide an automated image tagging system that can predict a set of tags, along with relevance scores, that can be used for keyword-based image retrieval, image tag proposal, and image tag auto-completion based on user input. Initially, during training, a clustering technique is utilized to reduce cluster imbalance in the data that is input into a convolutional neural network (CNN) for training feature data. In embodiments, the clustering technique can also be utilized to compute data point similarity that can be utilized for tag propagation (to tag untagged images). During testing, a diversity-based voting framework is utilized to overcome user tagging biases. In some embodiments, bigram re-weighting can down-weight a keyword that is likely to be part of a bigram based on a predicted tag set.
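
    One possible reading of the bigram re-weighting step is sketched below. It assumes that a keyword is down-weighted by a fixed factor whenever a bigram containing it also appears in the predicted tag set. The function name `reweight_bigrams`, the `factor` parameter, and the example tags are hypothetical. The abstract does not specify the actual rule.

```python
from typing import Dict, Iterable, Tuple


def reweight_bigrams(scores: Dict[str, float],
                     bigrams: Iterable[Tuple[str, str]],
                     factor: float = 0.5) -> Dict[str, float]:
    """Down-weight single keywords that are likely part of a predicted bigram.

    scores  : predicted tag -> relevance score
    bigrams : known (first, second) word pairs
    factor  : assumed multiplicative down-weight, applied at most once per word
    """
    adjusted = dict(scores)
    for first, second in bigrams:
        if f"{first} {second}" in scores:  # the bigram itself was predicted
            for word in (first, second):
                if word in adjusted:
                    adjusted[word] = scores[word] * factor
    return adjusted


# Illustrative values only.
predicted = {"hot dog": 0.9, "dog": 0.8, "park": 0.4}
adjusted = reweight_bigrams(predicted, [("hot", "dog")])
```

    In this example the score for "dog" is halved because "hot dog" is in the predicted set, while "hot dog" and "park" keep their scores.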

    SEARCHING UNTAGGED IMAGES WITH TEXT-BASED QUERIES

    Type: Invention application

    Status: Under examination (published)

    Publication number: US20170004383A1

    Publication date: 2017-01-05

    Application number: US14788113

    Filing date: 2015-06-30

    Abstract: In various implementations, a personal asset management application is configured to perform operations that facilitate searching multiple images with a simple text-based query, regardless of whether the images have characterizing tags associated with them. A first search is conducted by processing the text-based query to produce a first set of result images, which is then used to generate a visually-based query. A second search is conducted using that visually-based query. The second search can generate a second set of result images, each having visual similarity to at least one of the images in the first set of result images.
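
    The two-stage search in this abstract can be sketched as follows. The sketch assumes each image carries a precomputed feature vector, that the first search is a plain tag match, and that visual similarity is cosine similarity to the closest first-stage result. The names `cosine`, `two_stage_search`, and the dictionary keys are hypothetical, and the patent may define the visually-based query differently.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def two_stage_search(query: str, images: List[Dict], top_k: int = 3) -> List[str]:
    """Text search over tagged images, then visual search over all images."""
    # Stage 1: text-based query matches tagged images only.
    first = [img for img in images if query in img["tags"]]
    if not first:
        return []

    # Stage 2: score every image (tagged or not) by its best visual
    # similarity to any first-stage result.
    def score(img: Dict) -> float:
        return max(cosine(img["feature"], hit["feature"]) for hit in first)

    ranked = sorted(images, key=score, reverse=True)
    return [img["id"] for img in ranked[:top_k]]


# Illustrative data: only image "a" is tagged.
library = [
    {"id": "a", "tags": {"beach"}, "feature": [1.0, 0.0]},
    {"id": "b", "tags": set(), "feature": [0.9, 0.1]},
    {"id": "c", "tags": set(), "feature": [0.0, 1.0]},
]
hits = two_stage_search("beach", library, top_k=2)
```

    Here the untagged image "b" is returned alongside "a" because its features are close to those of the tagged result, which is the behavior the abstract describes for untagged images.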

    LEARNING IMAGE CATEGORIZATION USING RELATED ATTRIBUTES

    Type: Invention application

    Status: Granted

    Publication number: US20160034788A1

    Publication date: 2016-02-04

    Application number: US14447296

    Filing date: 2014-07-30

    CPC classification number: G06T7/33 G06K9/627 G06N3/0454

    Abstract: A first set of attributes (e.g., style) is generated through pre-trained single column neural networks and leveraged to regularize the training process of a regularized double-column convolutional neural network (RDCNN). Parameters of the first column (e.g., style) of the RDCNN are fixed during RDCNN training. Parameters of the second column (e.g., aesthetics) are fine-tuned while training the RDCNN, and the learning process is supervised by the label identified by the second column (e.g., aesthetics). Thus, features of the images may be leveraged to boost classification accuracy of other features by learning an RDCNN.
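
    The training arrangement in this abstract, where one column is fixed and the other is fine-tuned, can be illustrated with a minimal gradient-step sketch. It assumes plain gradient descent over flat parameter lists. The function `sgd_step`, the `frozen` argument, and the column names used as keys are hypothetical stand-ins for the two RDCNN columns.

```python
from typing import Dict, List, Set


def sgd_step(params: Dict[str, List[float]],
             grads: Dict[str, List[float]],
             lr: float,
             frozen: Set[str]) -> Dict[str, List[float]]:
    """Apply one gradient step, leaving frozen columns unchanged."""
    updated = {}
    for name, values in params.items():
        if name in frozen:
            # Fixed column (e.g., the pre-trained style column): copy as-is.
            updated[name] = list(values)
        else:
            # Fine-tuned column (e.g., aesthetics): ordinary descent step.
            updated[name] = [v - lr * g for v, g in zip(values, grads[name])]
    return updated


# Illustrative values only.
params = {"style": [1.0, 2.0], "aesthetics": [0.5, -0.5]}
grads = {"style": [1.0, 1.0], "aesthetics": [1.0, -1.0]}
new_params = sgd_step(params, grads, lr=0.1, frozen={"style"})
```

    Only the "aesthetics" parameters move, even though gradients exist for both columns. The fixed column still contributes its features to the forward pass, which is how it can regularize training of the other column.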
