-
Publication No.: US20180300623A1
Publication date: 2018-10-18
Application No.: US15489234
Filing date: 2017-04-17
Applicant: Microsoft Technology Licensing, LLC
Inventor: Zhengyou Zhang, Dinei Afonso Florencio, Sasa Junuzovic, Yinpeng Chen
Abstract: A central server receives a venue identification query from a client device in the venue and a test data set including information collected from the venue. The central server then queries a classifier to identify the venue based on the test data. The classifier returns an identity value (venue ID) and a confidence value for the venue ID. When the confidence value is less than a threshold value, the central server obtains additional data from the client device until the venue is identified. The central server associates the venue ID with the test data set, including the additional data, and adds the test data set to training data for the classifier.
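The query loop in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function `identify_venue`, the `toy_classifier`, the sample names, and the 0.8 threshold are all hypothetical, and the classifier is assumed to return a `(venue_id, confidence)` pair.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Assumed interface: a classifier maps collected samples to (venue ID, confidence).
Classifier = Callable[[List[str]], Tuple[str, float]]


def identify_venue(
    classify: Classifier,
    test_data: Iterable[str],
    additional_data: Iterable[str],
    threshold: float = 0.8,  # hypothetical value; the abstract names no number
) -> Tuple[Optional[str], List[str]]:
    """Query the classifier, pulling more client data while confidence is low."""
    data = list(test_data)
    venue_id, confidence = classify(data)
    extra = iter(additional_data)
    while confidence < threshold:
        try:
            data.append(next(extra))  # obtain one more sample from the client
        except StopIteration:
            return None, data  # client has nothing more to send
        venue_id, confidence = classify(data)
    return venue_id, data


def toy_classifier(data: List[str]) -> Tuple[str, float]:
    """Stand-in classifier: confidence simply grows with the amount of data."""
    return "venue-42", min(1.0, 0.3 * len(data))


training_data: List[Tuple[str, List[str]]] = []
venue_id, used = identify_venue(
    toy_classifier, ["wifi-scan"], ["photo", "audio-clip", "gps-fix"]
)
if venue_id is not None:
    # Associate the venue ID with the full test set and add it to training data.
    training_data.append((venue_id, used))
```

Returning `None` when the client runs out of data is a choice made for this sketch; the abstract only describes the case in which the venue is eventually identified.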
-
Publication No.: US20180247132A1
Publication date: 2018-08-30
Application No.: US15445416
Filing date: 2017-02-28
Applicant: Microsoft Technology Licensing, LLC
Inventor: Zicheng Liu, Yinpeng Chen, Sean E Anderson, Zhengyou Zhang
CPC classification number: G06K9/00771, G06K9/00369, G06K9/4642, G06T11/60, G06T2210/12, H04N5/23238
Abstract: Systems and methods for person counting are disclosed. A method may include retrieving an image frame from a plurality of image frames captured by a camera. The image frame may be split into a grid of a plurality of cells of pre-determined cell dimensions. The pre-determined cell dimensions may be based on dimensions of the retrieved image frame and reference dimensions of training images of a person detection classifier. At least a portion of the plurality of cells may be rearranged to generate a new image. The new image may be padded with at least one padding strip to adjust dimensions of the new image to the reference dimensions of the training images. Person detection may be performed using the new image and the person detection classifier to obtain a number of persons detected within the new image.
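The split, rearrange, and pad steps can be illustrated with a small sketch. The abstract does not say how cell dimensions are derived or how cells are rearranged, so the code below assumes one simple scheme: cut a wide frame into vertical strips as wide as the reference image, stack the strips top to bottom, and pad with a constant value. The function name `rearrange_and_pad` and all parameters are hypothetical.

```python
from typing import List

Image = List[List[int]]  # rows of pixel values (single channel, for brevity)


def rearrange_and_pad(frame: Image, ref_h: int, ref_w: int, pad_value: int = 0) -> Image:
    """Cut a wide frame into strips, stack them, and pad to (ref_h, ref_w)."""
    frame_w = len(frame[0])
    # Assumption: cell width equals the reference width, cell height equals
    # the frame height. The last strip may be narrower than ref_w.
    cells = [
        [row[x:x + ref_w] for row in frame]
        for x in range(0, frame_w, ref_w)
    ]
    new_image: Image = []
    for cell in cells:
        for row in cell:
            # Pad narrow strips on the right so every row has ref_w pixels.
            new_image.append(row + [pad_value] * (ref_w - len(row)))
    if len(new_image) > ref_h:
        raise ValueError("stacked cells exceed the reference height")
    # Add padding strips at the bottom until the reference height is reached.
    while len(new_image) < ref_h:
        new_image.append([pad_value] * ref_w)
    return new_image


wide_frame = [
    [1, 2, 3, 4, 5, 6],
    [7, 8, 9, 10, 11, 12],
]
stacked = rearrange_and_pad(wide_frame, ref_h=5, ref_w=4)
```

The resulting image would then be passed to the person detection classifier, which is outside the scope of this sketch.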
-
Publication No.: US09971490B2
Publication date: 2018-05-15
Application No.: US14597138
Filing date: 2015-01-14
Applicant: Microsoft Technology Licensing, LLC
Inventor: Yinpeng Chen, Zicheng Liu, Zhengyou Zhang
IPC: G06F3/048, G06F3/0484, G06F3/01, G06F3/0488, H04N13/02, H04N9/04, G06F21/32
CPC classification number: G06F3/04847, G06F3/017, G06F3/0484, G06F3/04842, G06F3/0488, G06F3/04883, G06F21/32, G06F2203/04101, H04N9/04, H04N13/271
Abstract: The description relates to interactions with a display device. In one example, the interactions can include detecting a user proximate to a display and detecting a non-touch control gesture performed by the user proximate to the display. The example can also include presenting a graphical user interface (GUI) on the display that includes options associated with the control gesture. The example can also include receiving user input selecting one of the options and receiving additional user input from the user to interact with the GUI via the selected one of the options.
-
Publication No.: US12148131B2
Publication date: 2024-11-19
Application No.: US17733634
Filing date: 2022-04-29
Applicant: Microsoft Technology Licensing, LLC
Inventor: Dongdong Chen, Xiyang Dai, Yinpeng Chen, Mengchen Liu, Lu Yuan
Abstract: The disclosure herein describes generating an inpainted image from a masked image using a patch-based encoder and an unquantized transformer. An image including a masked region and an unmasked region is received, and the received image is divided into a plurality of patches including masked patches. The plurality of patches is encoded into a plurality of feature vectors, wherein each patch is encoded to a feature vector. Using a transformer, a predicted token is generated for each masked patch using a feature vector encoded from the masked patch, and a quantized vector of the masked patch is determined using the generated predicted token and a masked patch-specific codebook. The determined quantized vector of the masked patch is included in a set of quantized vectors associated with the plurality of patches, and an output image is generated from the set of quantized vectors using a decoder.
-
Publication No.: US12106531B2
Publication date: 2024-10-01
Application No.: US17383362
Filing date: 2021-07-22
Applicant: Microsoft Technology Licensing, LLC
Inventor: Lijuan Wang, Zicheng Liu, Ying Jin, Hongli Deng, Kun Luo, Pei Yu, Yinpeng Chen
CPC classification number: G06V10/22, G06T7/70, G06V40/10, G06T2207/30196
Abstract: To improve the accuracy and efficiency of object detection through computer digital image analysis, the detection of some objects can inform the sub-portion of the digital image to which subsequent computer digital image analysis is directed to detect other objects. In such a manner object detection can be made more efficient by limiting the image area of a digital image that is analyzed. Such efficiencies can represent both computational efficiencies and communicational efficiencies arising due to the smaller quantity of digital image data that is analyzed. Additionally, the detection of some objects can render the detection of other objects more accurate by adjusting confidence thresholds based on the detection of those related objects. Relationships between objects can be utilized to inform both the image area on which subsequent object detection is performed and the confidence level of such subsequent object detection.
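The two ideas in this abstract, limiting the analyzed image area and adjusting confidence thresholds based on related detections, can be sketched as below. The helper names `search_region` and `accept_detection`, the margin of 0.5, and the threshold relief of 0.2 are hypothetical choices for illustration; the abstract does not specify how the region or the adjustment is computed.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in pixels


def search_region(anchor: Box, image_w: float, image_h: float, margin: float = 0.5) -> Box:
    """Sub-portion of the image to analyze next, grown around a detected object."""
    x0, y0, x1, y1 = anchor
    w, h = x1 - x0, y1 - y0
    # Expand the anchor box by a fraction of its size, clipped to the image.
    return (
        max(0, x0 - margin * w),
        max(0, y0 - margin * h),
        min(image_w, x1 + margin * w),
        min(image_h, y1 + margin * h),
    )


def accept_detection(
    confidence: float,
    base_threshold: float,
    related_object_found: bool,
    relief: float = 0.2,
) -> bool:
    """Lower the confidence threshold when a related object was already detected."""
    threshold = base_threshold - relief if related_object_found else base_threshold
    return confidence >= threshold


# Example: a person was detected, so a related object is searched for nearby
# and accepted at a lower confidence than it would be in isolation.
person_box: Box = (40, 40, 60, 80)
region = search_region(person_box, image_w=100, image_h=100)
```

Only the pixels inside `region` would be handed to the second detector, which is where the computational and communication savings described in the abstract would come from.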
-
Publication No.: US11238300B2
Publication date: 2022-02-01
Application No.: US16688956
Filing date: 2019-11-19
Applicant: Microsoft Technology Licensing, LLC
Inventor: Nikolaos Karianakis, Zicheng Liu, Yinpeng Chen
Abstract: An object re-identifier. For each of a plurality of frames of a video, a quality of the frame is assessed and a confidence that a previously-recognized object is present in the frame is determined. The determined confidence for the frame is weighted based on the assessed quality of the frame such that frames with higher relative quality are weighted more heavily than frames with lower relative quality. An overall confidence that the previously-recognized object is present in the video is assessed based on the weighted determined confidences.
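The quality weighting can be made concrete with a short sketch. It assumes the overall confidence is a quality-weighted mean of the per-frame confidences, which is one plausible reading of the abstract and not necessarily the claimed formula. The function name `overall_confidence` and the sample values are hypothetical.

```python
from typing import List, Tuple


def overall_confidence(frames: List[Tuple[float, float]]) -> float:
    """Combine per-frame (quality, confidence) pairs into a single confidence.

    Frames with higher assessed quality contribute more to the result.
    """
    total_quality = sum(quality for quality, _ in frames)
    if total_quality == 0:
        return 0.0  # no usable frames
    weighted = sum(quality * confidence for quality, confidence in frames)
    return weighted / total_quality


# A sharp frame (quality 3.0) outweighs a blurry one (quality 1.0).
video_frames = [(3.0, 0.8), (1.0, 0.4)]
score = overall_confidence(video_frames)  # approximately 0.7
```

An unweighted mean of the same two confidences would be 0.6, so the higher-quality frame pulls the result toward its own value.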
-
Publication No.: US20170090560A1
Publication date: 2017-03-30
Application No.: US14866534
Filing date: 2015-09-25
Applicant: Microsoft Technology Licensing, LLC
Inventor: Yinpeng Chen, Sasa Junuzovic, Zhengyou Zhang, Zicheng Liu
IPC: G06F3/01, G06K9/00, G06F3/041, G06F17/24, G06F3/0346
Abstract: The large display interaction implementations described herein combine mobile devices with people tracking to enable new interactions including making a non-touch-sensitive display touch-sensitive and allowing personalized interactions with the display. One implementation tracks one or more mobile computing device users relative to a large computer-driven display, and configures content displayed on the display based on a distance a given mobile computing device user is from the display. Another implementation personalizes user interactions with a large display. One or more mobile computing device users are tracked relative to a display. The identity of each of the one or more mobile computing device users is obtained. Content displayed on the display is configured based on a distance an identified mobile computing device user is from the display and the identity of the user that provides the content.
-
Publication No.: US12223412B2
Publication date: 2025-02-11
Application No.: US17123697
Filing date: 2020-12-16
Applicant: Microsoft Technology Licensing, LLC
Inventor: Yinpeng Chen, Xiyang Dai, Mengchen Liu, Dongdong Chen, Lu Yuan, Zicheng Liu, Ye Yu, Mei Chen, Yunsheng Li
Abstract: A computer device for automatic feature detection comprises a processor, a communication device, and a memory configured to hold instructions executable by the processor to instantiate a dynamic convolution neural network, receive input data via the communication network, and execute the dynamic convolution neural network to automatically detect features in the input data. The dynamic convolution neural network compresses the input data from an input space having a dimensionality equal to a predetermined number of channels into an intermediate space having a dimensionality less than the number of channels. The dynamic convolution neural network dynamically fuses the channels into an intermediate representation within the intermediate space and expands the intermediate representation from the intermediate space to an expanded representation in an output space having a higher dimensionality than the dimensionality of the intermediate space. The features in the input data are automatically detected based on the expanded representation.
-
Publication No.: US11989956B2
Publication date: 2024-05-21
Application No.: US17222879
Filing date: 2021-04-05
Applicant: Microsoft Technology Licensing, LLC
Inventor: Xiyang Dai, Yinpeng Chen, Bin Xiao, Dongdong Chen, Mengchen Liu, Lu Yuan, Lei Zhang
CPC classification number: G06V20/64, G02B27/0172, G06T3/06, G06T3/40
Abstract: Systems and methods for object detection generate a feature pyramid corresponding to image data, and rescale the feature pyramid to a scale corresponding to a median level of the feature pyramid, wherein the rescaled feature pyramid is a four-dimensional (4D) tensor. The 4D tensor is reshaped into a three-dimensional (3D) tensor having individual perspectives including scale features, spatial features, and task features corresponding to different dimensions of the 3D tensor. The 3D tensor is used with a plurality of attention layers to update a plurality of feature maps associated with the image data. Object detection is performed on the image data using the updated plurality of feature maps.
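The rescale and reshape steps can be sketched in plain Python. The sketch assumes nearest-neighbor resampling and a levels x height x width x channels layout, neither of which is stated in the abstract, and it omits the attention layers entirely. The function names `resize_nearest` and `pyramid_to_3d` are hypothetical.

```python
from typing import List

FeatureMap = List[List[List[float]]]  # height x width x channels


def resize_nearest(fmap: FeatureMap, out_h: int, out_w: int) -> FeatureMap:
    """Nearest-neighbor resize of one pyramid level (an assumed resampling method)."""
    in_h, in_w = len(fmap), len(fmap[0])
    return [
        [fmap[(y * in_h) // out_h][(x * in_w) // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def pyramid_to_3d(pyramid: List[FeatureMap]) -> List[List[List[float]]]:
    """Rescale all levels to the median level, then flatten space.

    Returns a tensor indexed as [level][spatial position][channel], so the
    three axes correspond to scale, spatial, and task (channel) features.
    """
    median = pyramid[len(pyramid) // 2]
    h, w = len(median), len(median[0])
    # 4D tensor: levels x height x width x channels.
    tensor_4d = [resize_nearest(level, h, w) for level in pyramid]
    # 3D tensor: merge height and width into one spatial axis of size h * w.
    return [[pixel for row in level for pixel in row] for level in tensor_4d]


# Toy pyramid with one channel per position and levels of size 4x4, 2x2, 1x1.
level0 = [[[float(4 * y + x)] for x in range(4)] for y in range(4)]
level1 = [[[10.0], [11.0]], [[12.0], [13.0]]]
level2 = [[[99.0]]]
tensor_3d = pyramid_to_3d([level0, level1, level2])
```

Here the median level is the 2x2 map, so the result has 3 levels, 4 spatial positions, and 1 channel.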
-
Publication No.: US10678326B2
Publication date: 2020-06-09
Application No.: US14866534
Filing date: 2015-09-25
Applicant: Microsoft Technology Licensing, LLC
Inventor: Yinpeng Chen, Sasa Junuzovic, Zhengyou Zhang, Zicheng Liu
IPC: G09G1/00, G06F3/01, G06F3/0346, G06F3/041, G06K9/00, G08C17/00, H04N21/466, H04N21/422, H04N21/414, H04M1/725, G06F40/169
Abstract: The large display interaction implementations described herein combine mobile devices with people tracking to enable new interactions including making a non-touch-sensitive display touch-sensitive and allowing personalized interactions with the display. One implementation tracks one or more mobile computing device users relative to a large computer-driven display, and configures content displayed on the display based on a distance a given mobile computing device user is from the display. Another implementation personalizes user interactions with a large display. One or more mobile computing device users are tracked relative to a display. The identity of each of the one or more mobile computing device users is obtained. Content displayed on the display is configured based on a distance an identified mobile computing device user is from the display and the identity of the user that provides the content.