Abstract:
The present disclosure provides systems and methods to reduce the computational costs associated with convolutional neural networks. In addition, the present disclosure provides a class of efficient models, termed “MobileNets,” for mobile and embedded vision applications. MobileNets are based on a straightforward architecture that uses depthwise separable convolutions to build lightweight deep neural networks. The present disclosure further provides two global hyper-parameters that efficiently trade off latency against accuracy. These hyper-parameters allow the entity building the model to select an appropriately sized model for a particular application based on the constraints of the problem. MobileNets and the associated computational cost reduction techniques are effective across a wide range of applications and use cases.
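For illustration only, the following is a minimal sketch of a depthwise separable convolution block of the kind this abstract describes, assuming a PyTorch-style implementation. The class name, the BatchNorm/ReLU placement, and the specific multiplier values are assumptions for the example, not details taken from the disclosure; the usage lines show how a width multiplier and a resolution multiplier (one common reading of the two global hyper-parameters) could shrink the model and its input.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """One block: a 3x3 depthwise convolution (one filter per input channel)
    followed by a 1x1 pointwise convolution that mixes channels."""

    def __init__(self, in_channels: int, out_channels: int, stride: int = 1):
        super().__init__()
        # groups=in_channels makes the 3x3 convolution depthwise.
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size=3,
                                   stride=stride, padding=1,
                                   groups=in_channels, bias=False)
        self.bn1 = nn.BatchNorm2d(in_channels)
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1,
                                   bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.relu(self.bn1(self.depthwise(x)))
        return self.relu(self.bn2(self.pointwise(x)))

# Hypothetical hyper-parameter settings: alpha thins every layer's channel
# counts; rho shrinks the input resolution. Both trade accuracy for latency.
alpha, rho = 0.75, 0.5
block = DepthwiseSeparableConv(int(32 * alpha), int(64 * alpha))
x = torch.randn(1, int(32 * alpha), int(224 * rho), int(224 * rho))
y = block(x)  # shape: (1, 48, 112, 112)
```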
Abstract:
Methods, systems, and apparatus are provided for determining location information for images. In one aspect, a method includes obtaining landmark location data, and corresponding confidence scores, from content depicted in an image. The method also includes obtaining caption location data, and corresponding confidence scores, from user input, and obtaining metadata location data from data provided by an image capturing device. The method further includes identifying location pairs from the landmark, caption, and metadata location data, and generating, for each location pair, a geographic consistency score. Additionally, the method includes selecting a location pair based on the geographic consistency scores, and selecting an image location for the image from the selected location pair. Finally, the method includes determining an image location score based on a confidence score for one of the locations in the selected location pair, and associating the image location and the image location score with the image.
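As a rough sketch of the pipeline described above: the data structures, the haversine-distance-based consistency function, and the selection rules below are assumptions made for illustration, since the abstract does not specify any formulas; the source labels and the example confidence values are likewise hypothetical.

```python
import math
from dataclasses import dataclass
from itertools import combinations

@dataclass
class ScoredLocation:
    source: str        # "landmark", "caption", or "metadata"
    lat: float
    lon: float
    confidence: float  # confidence score for this location estimate

def haversine_km(a: ScoredLocation, b: ScoredLocation) -> float:
    """Great-circle distance between two locations, in kilometers."""
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = math.radians(b.lat - a.lat)
    dlam = math.radians(b.lon - a.lon)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def consistency(pair) -> float:
    # Hypothetical geographic consistency score: pairs of nearby
    # locations score close to 1, distant pairs close to 0.
    return 1.0 / (1.0 + haversine_km(*pair))

def resolve_image_location(locations):
    """Pick an image location and score from cross-source location pairs."""
    pairs = [p for p in combinations(locations, 2) if p[0].source != p[1].source]
    best = max(pairs, key=consistency)                  # most consistent pair
    chosen = max(best, key=lambda loc: loc.confidence)  # higher-confidence member
    return chosen, chosen.confidence                    # image location and score

estimates = [
    ScoredLocation("landmark", 48.8584, 2.2945, 0.9),  # landmark recognizer
    ScoredLocation("caption", 48.85, 2.29, 0.6),       # parsed from user caption
    ScoredLocation("metadata", 37.42, -122.08, 0.8),   # device GPS (inconsistent)
]
location, score = resolve_image_location(estimates)
```

In this toy run, the landmark and caption estimates agree geographically, so that pair wins and the higher-confidence landmark location becomes the image location, with its confidence serving as the image location score.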