Granted invention patent
- Patent title: Align-to-ground, weakly supervised phrase grounding guided by image-caption alignment
- Application number: US16855362; Filing date: 2020-04-22
- Publication (announcement) number: US11238631B2; Publication (announcement) date: 2022-02-01
- Inventors: Karan Sikka, Ajay Divakaran, Samyak Datta
- Applicant: SRI International
- Applicant address: US CA Menlo Park
- Patentee: SRI International
- Current patentee: SRI International
- Current patentee address: US CA Menlo Park
- Agency: Moser Taboada
- Main classification: G06T11/60
- IPC classifications: G06T11/60; G06F16/56; G06F16/51
Abstract:
A method, apparatus and system for visual grounding of a caption in an image include projecting at least two parsed phrases of the caption into a trained semantic embedding space, projecting extracted region proposals of the image into the trained semantic embedding space, aligning the extracted region proposals and the at least two parsed phrases, aggregating the aligned region proposals and the at least two parsed phrases to determine a caption-conditioned image representation and projecting the caption-conditioned image representation and the caption into a semantic embedding space to align the caption-conditioned image representation and the caption. The method, apparatus and system can further include a parser for parsing the caption into the at least two parsed phrases and a region proposal module for extracting the region proposals from the image.
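The sequence of steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not say how projection, alignment, or aggregation are computed, so the sketch assumes linear projections followed by L2 normalization, alignment of each phrase to its highest cosine-similarity region, and mean pooling of the aligned regions. All function names, the weight matrices (`W_text`, `W_img`, `W_cap`, `W_rep`), and the feature vectors are hypothetical, and the parser and region proposal module are assumed to have already produced the phrase and region features.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def l2_normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)

def project(W, x):
    # Assumed projection: a linear map (rows of W) followed by L2 normalization.
    return l2_normalize([dot(row, x) for row in W])

def ground_caption(phrase_feats, region_feats, W_text, W_img):
    """Project phrases and regions, align them, and aggregate the aligned regions.

    Returns (alignment, caption_conditioned), where alignment[i] is the index
    of the region matched to phrase i.
    """
    phrases = [project(W_text, p) for p in phrase_feats]
    regions = [project(W_img, r) for r in region_feats]

    # Assumed alignment rule: each phrase takes its most similar region.
    alignment = []
    for p in phrases:
        scores = [dot(p, r) for r in regions]
        alignment.append(max(range(len(regions)), key=scores.__getitem__))

    # Assumed aggregation rule: mean of the aligned region embeddings.
    dim = len(regions[0])
    pooled = [sum(regions[i][d] for i in alignment) / len(alignment)
              for d in range(dim)]
    return alignment, l2_normalize(pooled)

def caption_image_score(caption_feat, caption_conditioned, W_cap, W_rep):
    # Project the caption and the caption-conditioned image representation
    # into a shared space and compare them by cosine similarity.
    return dot(project(W_cap, caption_feat), project(W_rep, caption_conditioned))
```

In this sketch, `ground_caption` returns the phrase-to-region matches (the grounding) together with the caption-conditioned image representation, and `caption_image_score` provides the caption-level matching signal. Training the projections from image-caption pairs, which is where the weak supervision would come from, is outside the scope of the sketch.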