-
Publication No.: US20160358015A1
Publication Date: 2016-12-08
Application No.: US15238413
Filing Date: 2016-08-16
Applicant: A9.com, Inc.
Inventor: Arnab Sanat Kumar Dhua , Gautam Bhargava , Douglas Ryan Gray , Sunil Ramesh , Colin Jon Taylor
IPC: G06K9/00
CPC classification number: G06K9/00288 , G06F16/784 , G06K9/00221 , G06K9/00228 , G06K9/00261 , G06K9/00711
Abstract: Disclosed are various embodiments for detection of cast members in video content such as movies, television shows, and other programs. Data indicating cast members who appear in a video program is obtained. Each cast member is associated with a reference image depicting a face of the cast member. A frame is obtained from the video program, and a face is detected in the frame. The frame can correspond to a scene in the video program. The detected face in the frame is recognized as being a particular cast member based at least in part on the reference image depicting the cast member. An association between the cast member and the frame is generated in response to the detected face in the frame being recognized as the cast member.
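A minimal sketch of the recognition-and-association step described above, assuming face embeddings have already been extracted from the reference images and the detected face. The embedding vectors, names, similarity threshold, and frame number here are illustrative stand-ins, not details from the patent:

```python
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def recognize_face(face_embedding, reference_embeddings, threshold=0.8):
    """Match a detected face against per-cast-member reference embeddings."""
    best_name, best_score = None, threshold
    for name, ref in reference_embeddings.items():
        score = cosine_similarity(face_embedding, ref)
        if score > best_score:
            best_name, best_score = name, score
    return best_name

# Toy reference embeddings standing in for features of each cast member's face.
rng = np.random.default_rng(0)
refs = {"actor_a": rng.normal(size=64), "actor_b": rng.normal(size=64)}

# A detected face close to actor_a's reference is recognized as actor_a, and an
# association (cast member, frame index) is recorded for that scene's frame.
detected = refs["actor_a"] + 0.05 * rng.normal(size=64)
associations = []
match = recognize_face(detected, refs)
if match is not None:
    associations.append((match, 42))  # hypothetical frame 42
```

If no reference scores above the threshold, the face is left unassociated rather than forced onto the nearest cast member.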
-
Publication No.: US20160133299A1
Publication Date: 2016-05-12
Application No.: US14997351
Filing Date: 2016-01-15
Applicant: A9.com, Inc.
Inventor: Ismet Zeki Yalniz , Adam Carlson , Douglas Ryan Gray , Colin Jon Taylor
IPC: G11B27/30 , G11B27/34 , G11B27/036
CPC classification number: G11B27/3072 , G11B27/031 , G11B27/036 , G11B27/10 , G11B27/3081 , G11B27/34
Abstract: Various embodiments identify differences between frame sequences of a video. For example, to determine a difference between two versions of a video, a fingerprint of each frame of the two versions is generated. From the fingerprints, a run-length encoded representation of each version is generated. The fingerprints which appear only once (i.e., unique fingerprints) in the entire video are identified from each version and compared to identify matching unique fingerprints across versions. The matching unique fingerprints are sorted and filtered to determine split points, which are used to align the two versions of the video. Accordingly, each version is segmented into smaller frame sequences using the split points. Once segmented, the individual frames of each segment are aligned across versions using a dynamic programming algorithm. After aligning the segments at a frame level, the segments are reassembled to generate a global alignment output.
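The split-point idea above can be sketched compactly: fingerprints that occur exactly once in each version act as anchors for aligning the two versions. The per-frame fingerprint strings below are toy values; a real system would hash frame content, and the frame-level step inside each segment would use a dynamic-programming alignment not shown here:

```python
from collections import Counter

def unique_fingerprints(fps):
    """Map each fingerprint that appears exactly once to its frame index."""
    counts = Counter(fps)
    return {fp: i for i, fp in enumerate(fps) if counts[fp] == 1}

def split_points(fps_a, fps_b):
    """Anchor positions where a fingerprint is unique in both versions."""
    ua, ub = unique_fingerprints(fps_a), unique_fingerprints(fps_b)
    shared = sorted(set(ua) & set(ub), key=lambda fp: ua[fp])
    return [(ua[fp], ub[fp]) for fp in shared]

# Toy per-frame fingerprints; version B inserts an extra frame "x".
a = ["f1", "f2", "f2", "f3", "f4"]
b = ["f1", "x", "f2", "f2", "f3", "f4"]
anchors = split_points(a, b)
```

Each consecutive pair of anchors bounds a segment in both versions; the repeated fingerprint "f2" never becomes an anchor because it is ambiguous.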
-
Publication No.: US09292739B1
Publication Date: 2016-03-22
Application No.: US14105084
Filing Date: 2013-12-12
Applicant: A9.com, Inc.
Inventor: Douglas Ryan Gray , Colin Jon Taylor , Xiaofan Lin , Adam Wiggen Kraft , Yu Lou , Arnab Sanat Kumar Dhua
CPC classification number: G06K9/033 , G06K9/228 , G06K9/6292 , G06K2009/2045 , G06K2209/01
Abstract: Various embodiments enable text aggregation from multiple image frames of text. Accordingly, in order to stitch newly scanned areas of a document together, text in a respective image is recognized and analyzed using an algorithm to identify pairs of corresponding words in other images. Upon identifying a minimum number of matching pairs between two respective images, a mapping between the same can be determined based at least in part on a geometric correspondence between respective identified pairs. Based on this mapping, the recognized text of the two images can be merged by adding words of one image to the other using the matching word pairs as alignment data points.
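A simplified sketch of the merge step, assuming OCR has already produced a word-to-position mapping per image. For brevity this uses a single 1-D offset estimated from the first matching pair, where the patent describes a geometric correspondence over the matched pairs; the word lists and the minimum-match count are illustrative:

```python
def merge_recognized_text(words_a, words_b, min_matches=2):
    """Merge two OCR results using matching word pairs as alignment anchors.

    Each input maps a recognized word to its position in that scan.
    """
    matches = [(words_a[w], words_b[w]) for w in words_a if w in words_b]
    if len(matches) < min_matches:
        return None  # not enough overlap to align the two scans
    # Positional offset between the scans, estimated from a matched pair.
    offset = matches[0][0] - matches[0][1]
    merged = dict(words_a)
    for word, pos in words_b.items():
        merged.setdefault(word, pos + offset)  # add words only seen in scan B
    return [w for w, _ in sorted(merged.items(), key=lambda kv: kv[1])]

# Two overlapping scans of the same line of text.
scan_a = {"the": 0, "quick": 1, "brown": 2, "fox": 3}
scan_b = {"brown": 0, "fox": 1, "jumps": 2, "over": 3}
merged_line = merge_recognized_text(scan_a, scan_b)
```

Returning `None` below the match threshold mirrors the abstract's requirement of a minimum number of matching pairs before any mapping is attempted.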
-
Publication No.: US09247129B1
Publication Date: 2016-01-26
Application No.: US14015884
Filing Date: 2013-08-30
Applicant: A9.com, Inc.
Inventor: Douglas Ryan Gray , Colin Jon Taylor , Xiaofan Lin
IPC: H04N5/232
CPC classification number: G06T5/00 , G06F17/30256 , G06K9/00221 , G06T5/30 , G06T7/11 , G06T7/194 , G06T11/60 , G06T2207/10004 , G06T2207/20221 , G06T2207/30201 , H04N5/23222 , H04N5/23238 , H04N5/272
Abstract: Systems and approaches are provided for optimizing self-portraiture. The background of a self-portrait can be enhanced using image registration or stitching of images captured by one or more conventional cameras. Multiple standard-resolution images can be stitched together to generate a panoramic or composite image of higher resolution. Foreground elements, such as one or more representations of users, can also be enhanced in various ways. The representations of the users can be composited to exclude undesirable elements, such as image data of one of the users extending her arm to capture the self-portrait. An ideal pose of the users can be selected automatically, and other image enhancements, such as histogram optimization, brightness and contrast optimization, color-cast correction, or noise reduction or removal, can be performed automatically to minimize user effort in capturing self-portraits.
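One common way to composite multiple captures while excluding a transient element (such as an extended arm that appears in only one frame) is a per-pixel median across the frame stack. This is an illustrative technique, not necessarily the specific compositing method claimed by the patent:

```python
import numpy as np

def composite_without_transients(frames):
    """Per-pixel median across captures suppresses elements that appear in
    only a minority of frames (e.g., an arm reaching toward the camera)."""
    stack = np.stack(frames).astype(float)
    return np.median(stack, axis=0).astype(frames[0].dtype)

# Three toy 2x2 grayscale captures; the second has a transient bright pixel.
f1 = np.full((2, 2), 100, dtype=np.uint8)
f2 = f1.copy()
f2[0, 0] = 255  # transient element present in one frame only
f3 = f1.copy()
clean = composite_without_transients([f1, f2, f3])
```

Because the bright pixel appears in only one of three frames, the median keeps the background value at that location.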
-
Publication No.: US20230298073A1
Publication Date: 2023-09-21
Application No.: US18200806
Filing Date: 2023-05-23
Applicant: A9.com, Inc.
Inventor: Douglas Ryan Gray , Arnab Sanat Kumar Dhua , Xiaofan Lin , Zhijiang Mark Lu
IPC: G06Q30/0241 , G06F40/14
CPC classification number: G06Q30/0277 , G06F40/14
Abstract: A computing device can obtain data describing at least one document, the at least one document referencing at least one media object, wherein a portion of the at least one media object includes one or more characters. The computing device can obtain data describing the one or more characters in the at least one media object in the at least one document. The computing device can generate an updated copy of the at least one document that includes the data describing the one or more characters in the at least one media object. The computing device can present, on a display screen of the computing device and through an interface, the updated copy of the at least one document, wherein the one or more characters in the at least one media object are able to be selected or searched.
-
Publication No.: US20210174401A1
Publication Date: 2021-06-10
Application No.: US17170010
Filing Date: 2021-02-08
Applicant: A9.com, Inc.
Inventor: Douglas Ryan Gray , Arnab Sanat Kumar Dhua , Xiaofan Lin , Zhijiang Mark Lu
Abstract: A computing device can obtain data describing at least one document, the at least one document referencing at least one media object, wherein a portion of the at least one media object includes one or more characters. The computing device can obtain data describing the one or more characters in the at least one media object in the at least one document. The computing device can generate an updated copy of the at least one document that includes the data describing the one or more characters in the at least one media object. The computing device can present, on a display screen of the computing device and through an interface, the updated copy of the at least one document, wherein the one or more characters in the at least one media object are able to be selected or searched.
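The update step described in this abstract can be sketched as attaching recognized character data to each media object in a copy of the document, so a renderer can make that text selectable and searchable. The document schema, field names, and OCR results below are hypothetical:

```python
def add_text_layer(document, ocr_results):
    """Return an updated copy of the document in which each media object
    carries the characters recognized within it; the original is untouched."""
    updated = {**document, "media": []}
    for obj in document["media"]:
        text = ocr_results.get(obj["id"], "")
        updated["media"].append({**obj, "text": text})
    return updated

# Hypothetical document referencing two media objects, one containing text.
doc = {"title": "brochure", "media": [{"id": "img1"}, {"id": "img2"}]}
ocr = {"img1": "SALE 50% OFF"}
updated_doc = add_text_layer(doc, ocr)
```

An interface presenting `updated_doc` could then index or highlight the `"text"` field of each media object instead of treating the image as opaque pixels.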
-
Publication No.: US10956784B2
Publication Date: 2021-03-23
Application No.: US16222318
Filing Date: 2018-12-17
Applicant: A9.com, Inc.
Inventor: Douglas Ryan Gray , Alexander Li Honda , Edward Hsiao
Abstract: An image creation and editing tool can use the data produced from training a neural network to add stylized representations of an object to an image. An object classification will correspond to an object representation, and pixel values for the object representation can be added to, or blended with, the pixel values of an image in order to add a visualization of a type of object to the image. Such an approach can be used to add stylized representations of objects to existing images or create new images based on those representations. The visualizations can be used to create patterns and textures as well, as may be used to paint or fill various regions of an image. Such patterns can enable regions to be filled where image data has been deleted, such as to remove an undesired object, in a way that appears natural for the contents of the image.
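The pixel-blending operation described above ("added to, or blended with, the pixel values of an image") can be sketched as a simple alpha blend of an object representation into an image region. The arrays, region coordinates, and alpha value are illustrative; a real tool would use representations derived from the trained network:

```python
import numpy as np

def blend_object(image, obj, top, left, alpha=0.6):
    """Blend an object representation's pixel values into an image region,
    weighting by alpha so the underlying image still shows through."""
    h, w = obj.shape[:2]
    region = image[top:top + h, left:left + w].astype(float)
    blended = alpha * obj.astype(float) + (1.0 - alpha) * region
    image[top:top + h, left:left + w] = blended.astype(image.dtype)
    return image

# Stamp a toy 2x2 object representation into a blank 4x4 grayscale canvas.
canvas = np.zeros((4, 4), dtype=np.uint8)
stamp = np.full((2, 2), 200, dtype=np.uint8)
result = blend_object(canvas, stamp, top=1, left=1)
```

Tiling the same call across a region would produce the pattern-and-texture fills the abstract mentions, e.g. for painting over an area where an undesired object was removed.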
-
Publication No.: US10157332B1
Publication Date: 2018-12-18
Application No.: US15174628
Filing Date: 2016-06-06
Applicant: A9.com, Inc.
Inventor: Douglas Ryan Gray , Alexander Li Honda , Edward Hsiao
Abstract: An image creation and editing tool can use the data produced from training a neural network to add stylized representations of an object to an image. An object classification will correspond to an object representation, and pixel values for the object representation can be added to, or blended with, the pixel values of an image in order to add a visualization of a type of object to the image. Such an approach can be used to add stylized representations of objects to existing images or create new images based on those representations. The visualizations can be used to create patterns and textures as well, as may be used to paint or fill various regions of an image. Such patterns can enable regions to be filled where image data has been deleted, such as to remove an undesired object, in a way that appears natural for the contents of the image.
-
Publication No.: US09875258B1
Publication Date: 2018-01-23
Application No.: US14973578
Filing Date: 2015-12-17
Applicant: A9.com, Inc.
Inventor: Edward Hsiao , Douglas Ryan Gray
CPC classification number: G06F17/30277 , G06F17/30253 , G06F17/30256 , G06F17/30259 , G06F17/30271 , G06F17/30554 , G06F17/30864 , G06F17/30976 , G06N3/08
Abstract: Approaches are disclosed for using machine learning to generate search strings and refinements based on a specific item represented in an image. For example, a classifier trained on descriptions of images can be provided. An image that includes a representation of an item of interest is obtained. The image is analyzed using the classifier to determine a first term representing a visual characteristic of the image. The image is then analyzed again to determine a second term representing another visual characteristic, based at least in part on the first term. Additional terms can be determined to generate a description of the image, including characteristics of the item of interest. Based on the determined characteristics of the item of interest, a search query and one or more refinements can be generated.
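The term-by-term loop above, where each new term is conditioned on the terms already chosen, can be sketched as follows. The lookup-table "classifier" and the example terms are toy stand-ins for the trained model described in the abstract:

```python
def describe_item(features, classify, max_terms=4):
    """Build a description term by term; each call to the classifier is
    conditioned on the terms already chosen for this image."""
    terms = []
    for _ in range(max_terms):
        term = classify(features, tuple(terms))
        if term is None:  # classifier has no further term to add
            break
        terms.append(term)
    return terms

# Toy classifier: a lookup from (image features, terms so far) to a next term.
rules = {
    ("img123", ()): "dress",
    ("img123", ("dress",)): "red",
    ("img123", ("dress", "red")): "sleeveless",
}
query_terms = describe_item("img123", lambda f, t: rules.get((f, t)))
```

The accumulated terms can then seed a search query ("red sleeveless dress"), with later terms offered as refinements.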
-
Publication No.: US09436883B2
Publication Date: 2016-09-06
Application No.: US14816943
Filing Date: 2015-08-03
Applicant: A9.com, Inc.
Inventor: Xiaofan Lin , Adam Wiggen Kraft , Yu Lou , Douglas Ryan Gray , Colin Jon Taylor
CPC classification number: G06K9/18 , G06K9/00456 , G06K9/00523 , G06K9/228 , G06K2209/01 , G06T7/11
Abstract: Various embodiments provide methods and systems for identifying text in an image by applying suitable text detection parameters during text detection. The suitable parameters can be determined based on parameter-metric feedback from one or more text identification subtasks, such as text detection, text recognition, preprocessing, character set mapping, pattern matching, and validation. In some embodiments, the image can be divided into one or more image regions by performing glyph detection on the image. The text detection parameters applied to each region can then be adjusted based on one or more parameter metrics measured in that region.
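One simple form of the feedback loop above: per region, nudge a detection threshold based on a metric reported by a downstream subtask, here the fraction of detections that pass validation. The metric choice, step size, and bounds are illustrative assumptions, not values from the patent:

```python
def adjust_detection_threshold(threshold, validation_rate,
                               target=0.9, step=0.05, lo=0.1, hi=0.95):
    """Adjust a region's text-detection threshold using downstream feedback.

    If too few detections validate, raise the threshold (be more selective);
    if nearly all validate, lower it to recover text that was missed.
    """
    if validation_rate < target:
        return min(hi, round(threshold + step, 4))
    return max(lo, round(threshold - step, 4))

# Per-region thresholds produced by glyph detection, tuned independently.
regions = {"header": 0.5, "body": 0.5}
regions["header"] = adjust_detection_threshold(regions["header"],
                                               validation_rate=0.4)
regions["body"] = adjust_detection_threshold(regions["body"],
                                             validation_rate=0.95)
```

Running detection again with the updated per-region thresholds closes the loop between detection and the validating subtasks.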