-
Publication No.: US20180268548A1
Publication date: 2018-09-20
Application No.: US15458887
Filing date: 2017-03-14
Applicant: ADOBE SYSTEMS INCORPORATED
Inventor: Zhe Lin, Xin Lu, Xiaohui Shen, Jimei Yang, Chenxi Liu
Abstract: The invention is directed towards segmenting images based on natural language phrases. An image and an n-gram, including a sequence of tokens, are received. An encoding of image features and a sequence of token vectors are generated. A fully convolutional neural network identifies and encodes the image features. A word embedding model generates the token vectors. A recurrent neural network (RNN) iteratively updates a segmentation map based on combinations of the image feature encoding and the token vectors. The segmentation map identifies which pixels are included in an image region referenced by the n-gram. A segmented image is generated based on the segmentation map. The RNN may be a convolutional multimodal RNN. A separate RNN, such as a long short-term memory network, may iteratively update an encoding of semantic features based on the order of tokens. The first RNN may update the segmentation map based on the semantic feature encoding.
-
Publication No.: US10089742B1
Publication date: 2018-10-02
Application No.: US15458887
Filing date: 2017-03-14
Applicant: ADOBE SYSTEMS INCORPORATED
Inventor: Zhe Lin, Xin Lu, Xiaohui Shen, Jimei Yang, Chenxi Liu
Abstract: The invention is directed towards segmenting images based on natural language phrases. An image and an n-gram, including a sequence of tokens, are received. An encoding of image features and a sequence of token vectors are generated. A fully convolutional neural network identifies and encodes the image features. A word embedding model generates the token vectors. A recurrent neural network (RNN) iteratively updates a segmentation map based on combinations of the image feature encoding and the token vectors. The segmentation map identifies which pixels are included in an image region referenced by the n-gram. A segmented image is generated based on the segmentation map. The RNN may be a convolutional multimodal RNN. A separate RNN, such as a long short-term memory network, may iteratively update an encoding of semantic features based on the order of tokens. The first RNN may update the segmentation map based on the semantic feature encoding.
-
Publication No.: US20190035083A1
Publication date: 2019-01-31
Application No.: US16116609
Filing date: 2018-08-29
Applicant: Adobe Systems Incorporated
Inventor: Zhe Lin, Xin Lu, Xiaohui Shen, Jimei Yang, Chenxi Liu
CPC classification number: G06T7/11, G06F17/2715, G06F17/277, G06F17/2785, G06K9/00664, G06K9/6274, G06N3/0445, G06N3/0454, G06N3/084, G06T2207/20084, G06T2207/20101, G10L15/26
Abstract: The invention is directed towards segmenting images based on natural language phrases. An image and an n-gram, including a sequence of tokens, are received. An encoding of image features and a sequence of token vectors are generated. A fully convolutional neural network identifies and encodes the image features. A word embedding model generates the token vectors. A recurrent neural network (RNN) iteratively updates a segmentation map based on combinations of the image feature encoding and the token vectors. The segmentation map identifies which pixels are included in an image region referenced by the n-gram. A segmented image is generated based on the segmentation map. The RNN may be a convolutional multimodal RNN. A separate RNN, such as a long short-term memory network, may iteratively update an encoding of semantic features based on the order of tokens. The first RNN may update the segmentation map based on the semantic feature encoding.
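Illustrative sketch (not part of the patent record): the data flow described in the abstract can be approximated in a few lines of plain Python. Everything here is an assumption made for illustration: the function names (`embed`, `update_state`, `segment`), the hand-picked weights, the tiny 2x2 feature grid standing in for the output of a fully convolutional network, and the lookup table standing in for a word embedding model. The recurrent step is a simple per-pixel tanh update over the combined feature and token vectors, not the convolutional multimodal RNN or LSTM of the claims.

```python
import math


def embed(tokens, table):
    """Stand-in for a word embedding model: one vector per token."""
    return [table[token] for token in tokens]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def update_state(state, features, token_vec, w_feat, w_tok, w_state):
    """One recurrent step, applied independently at every pixel.

    A linear map over the combined (feature, token) input is written as
    two dot products, plus a weighted copy of the previous state.
    """
    return [
        [
            math.tanh(
                w_state * state[r][c]
                + dot(w_feat, features[r][c])
                + dot(w_tok, token_vec)
            )
            for c in range(len(features[r]))
        ]
        for r in range(len(features))
    ]


def segment(features, token_vecs, w_feat, w_tok, w_state=0.5):
    """Update the per-pixel state once per token, then threshold it."""
    state = [[0.0 for _ in row] for row in features]
    for token_vec in token_vecs:
        state = update_state(state, features, token_vec, w_feat, w_tok, w_state)
    # Segmentation map: 1 marks a pixel assigned to the referenced region.
    return [[1 if value > 0.0 else 0 for value in row] for row in state]


# Hypothetical 2x2 feature grid: left-column pixels carry feature [1, 0],
# right-column pixels carry feature [0, 1].
features = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, 1.0]],
]
# Hypothetical embedding table and hand-picked (not learned) weights.
table = {"red": [0.2, 0.0], "car": [0.0, 0.2]}
token_vecs = embed(["red", "car"], table)
mask = segment(features, token_vecs, w_feat=[1.0, -1.0], w_tok=[0.5, 0.5])
print(mask)  # [[1, 0], [1, 0]]
```

With these hand-picked weights the token term only shifts every pixel by the same amount, so the mask is decided by the image features. In a trained network, the learned weights are what would let the phrase select the region.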
-