-
Publication No.: US12211274B2
Publication Date: 2025-01-28
Application No.: US17716996
Application Date: 2022-04-08
Applicant: The Toronto-Dominion Bank
Inventor: Junwei Ma , Satya Krishna Gorti , Maksims Volkovs , Guangwei Yu
IPC: G06V20/40 , G06V10/764 , G06V10/774 , G06V10/82 , G06V20/70
Abstract: A video localization system localizes actions in videos based on a classification model and an actionness model. The classification model is trained to make predictions of which segments of a video depict an action and to classify the actions in the segments. The actionness model predicts whether any action is occurring in each segment, rather than predicting a particular type of action. This reduces the likelihood that the video localization system over-relies on contextual information in localizing actions in video. Furthermore, the classification model and the actionness model are trained based on weakly-labeled data, thereby reducing the cost and time required to generate training data for the video localization system.
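The abstract above describes two per-segment signals: class predictions from the classification model and a class-agnostic "is any action happening" score from the actionness model. The abstract does not say how the two are combined at inference time, so the following is only a minimal sketch under assumed choices: the scores are multiplied, compared to a fixed threshold, and consecutive active segments are merged into intervals. The function name `localize_actions`, the multiplication rule, and the `threshold` parameter are hypothetical and are not taken from the patent.

```python
from typing import List, Tuple


def localize_actions(
    class_scores: List[List[float]],
    actionness: List[float],
    threshold: float = 0.5,
) -> List[Tuple[int, int, int]]:
    """Return (class_index, first_segment, last_segment) intervals.

    class_scores[t][c] is the assumed probability of class c in segment t.
    actionness[t] is the assumed probability that any action occurs in segment t.
    Fusing by multiplication is an illustrative assumption, not the patented method.
    """
    num_classes = len(class_scores[0]) if class_scores else 0
    detections: List[Tuple[int, int, int]] = []
    for c in range(num_classes):
        start = None
        for t, (scores, a) in enumerate(zip(class_scores, actionness)):
            # A segment counts as active only if both signals agree.
            active = scores[c] * a >= threshold
            if active and start is None:
                start = t
            elif not active and start is not None:
                detections.append((c, start, t - 1))
                start = None
        if start is not None:
            # Close an interval that runs to the final segment.
            detections.append((c, start, len(actionness) - 1))
    return detections
```

In this sketch, a segment whose class score is high but whose actionness is low is suppressed, which illustrates the kind of over-reliance on context that the abstract says the actionness model is meant to reduce.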
-
Publication No.: US20250131718A1
Publication Date: 2025-04-24
Application No.: US18988381
Application Date: 2024-12-19
Applicant: THE TORONTO-DOMINION BANK
Inventor: Junwei Ma , Satya Krishna Gorti , Maksims Volkovs , Guangwei Yu
IPC: G06V20/40 , G06V10/764 , G06V10/774 , G06V10/82 , G06V20/70
Abstract: A video localization system localizes actions in videos based on a classification model and an actionness model. The classification model is trained to make predictions of which segments of a video depict an action and to classify the actions in the segments. The actionness model predicts whether any action is occurring in each segment, rather than predicting a particular type of action. This reduces the likelihood that the video localization system over-relies on contextual information in localizing actions in video. Furthermore, the classification model and the actionness model are trained based on weakly-labeled data, thereby reducing the cost and time required to generate training data for the video localization system.
-
Publication No.: US12223549B2
Publication Date: 2025-02-11
Application No.: US17747819
Application Date: 2022-05-18
Applicant: THE TORONTO-DOMINION BANK
Inventor: Jean-Christophe Bouëtté , Jimmy Lévesque , Marc Poulin , Satya Krishna Gorti , Keyu Long , Nicolas Gervais , Jennifer Bouchard
Abstract: A data processing system comprising: inputting a tiled image of a vehicle including four different angle views of the vehicle combined into a single image to a first machine learning model (e.g. CNN), the model trained based on historical image data to predict a first likelihood of total loss vehicle; inputting a multi-fusion of images each into a second set of machine learning models; the multi-fusion of images including a set of separate and distinct images for each of the views input separately into the second set of machine learning models, and extracting features to predict a second likelihood of total loss vehicle; inputting tabular data relating to the vehicle into a third machine learning model to predict a third likelihood of total loss vehicle for the vehicle; and aggregating the first, second and third likelihood of total loss vehicle to determine the overall likelihood of total loss.
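The abstract above ends by aggregating three separately predicted total-loss likelihoods (tiled-image model, multi-view image models, tabular model) into one overall likelihood. It does not state the aggregation rule, so the sketch below assumes a simple weighted average purely for illustration. The function name `aggregate_likelihoods` and the `weights` parameter are hypothetical.

```python
from typing import Optional, Sequence


def aggregate_likelihoods(
    likelihoods: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Combine per-model total-loss likelihoods into one score.

    A weighted average is an assumed aggregation rule; the patent abstract
    only says the likelihoods are aggregated.
    """
    if not likelihoods:
        raise ValueError("at least one likelihood is required")
    if weights is None:
        # Default assumption: every model contributes equally.
        weights = [1.0] * len(likelihoods)
    if len(weights) != len(likelihoods):
        raise ValueError("weights and likelihoods must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * p for w, p in zip(weights, likelihoods)) / total_weight
```

Other rules, such as a learned meta-model over the three outputs, would fit the same description equally well.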
-
Publication No.: US20240386326A1
Publication Date: 2024-11-21
Application No.: US18665288
Application Date: 2024-05-15
Applicant: The Toronto-Dominion Bank
Inventor: Guangwei YU , Maksims Volkovs , Satya Krishna Gorti , Baiju Hasmukhrai Devani
IPC: G06N20/00
Abstract: The disclosed embodiments include computer-implemented processes and systems that establish configurable pipelines for training and deploying machine-learning processes in distributed computing environments. By way of example, an apparatus may execute sequentially a plurality of application engines within a training pipeline in accordance with first configuration data, and the executed application engines may cause the at least one processor to perform operations that train a machine-learning process based on corresponding ones of a plurality of partitioned datasets. Based on artifact data associated with the sequential execution of the application engines, the apparatus may generate elements of explainability data that characterize the training of the machine-learning process within the training pipeline and in accordance with second configuration data, and transmit the explainability data to a computing system. The computer system may generate at least a portion of the second configuration data.
-
Publication No.: US20230351753A1
Publication Date: 2023-11-02
Application No.: US17894738
Application Date: 2022-08-24
Applicant: THE TORONTO-DOMINION BANK
Inventor: Satya Krishna Gorti , Junwei Ma , Guangwei Yu , Maksims Volkovs , Keyvan Golestan Irani , Noël Vouitsis
IPC: G06V20/40
Abstract: A text-video recommendation model determines relevance of a text to a video in a text-video pair (e.g., as a relevance score) with a text embedding and a text-conditioned video embedding. The text-conditioned video embedding is a representation of the video used for evaluating the relevance of the video to the text, where the representation itself is a function of the text it is evaluated for. As such, the input text may be used to weigh or attend to different frames of the video in determining the text-conditioned video embedding. The representation of the video may thus differ for different input texts for comparison. The text-conditioned video embedding may be determined in various ways, such as with a set of the most-similar frames to the input text (the top-k frames) or may be based on an attention function based on query, key, and value projections.
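The top-k variant mentioned in the abstract above can be sketched as follows: the frames most similar to the input text are selected and pooled into a video embedding, so the same video yields different embeddings for different texts. This is a minimal illustration only. It assumes cosine similarity for frame ranking, mean pooling over the selected frames, and precomputed embeddings of equal dimension; the names `cosine`, `topk_video_embedding`, and `relevance` are hypothetical.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def topk_video_embedding(
    text_emb: Sequence[float],
    frame_embs: Sequence[Sequence[float]],
    k: int,
) -> List[float]:
    """Mean-pool the k frames most similar to the text (assumed pooling rule)."""
    if k < 1 or not frame_embs:
        raise ValueError("need k >= 1 and at least one frame")
    ranked = sorted(frame_embs, key=lambda f: cosine(text_emb, f), reverse=True)
    top = ranked[:k]
    dim = len(text_emb)
    return [sum(f[i] for f in top) / len(top) for i in range(dim)]


def relevance(
    text_emb: Sequence[float],
    frame_embs: Sequence[Sequence[float]],
    k: int = 2,
) -> float:
    """Relevance score between the text and its text-conditioned video embedding."""
    return cosine(text_emb, topk_video_embedding(text_emb, frame_embs, k))
```

The attention-based variant in the abstract would replace the hard top-k selection with softmax weights computed from query, key, and value projections; that variant is not shown here.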
-
Publication No.: US20220335718A1
Publication Date: 2022-10-20
Application No.: US17716996
Application Date: 2022-04-08
Applicant: The Toronto-Dominion Bank
Inventor: Junwei Ma , Satya Krishna Gorti , Maksims Volkovs , Guangwei Yu
IPC: G06V20/40 , G06V10/764 , G06V10/774 , G06V20/70 , G06V10/82
Abstract: A video localization system localizes actions in videos based on a classification model and an actionness model. The classification model is trained to make predictions of which segments of a video depict an action and to classify the actions in the segments. The actionness model predicts whether any action is occurring in each segment, rather than predicting a particular type of action. This reduces the likelihood that the video localization system over-relies on contextual information in localizing actions in video. Furthermore, the classification model and the actionness model are trained based on weakly-labeled data, thereby reducing the cost and time required to generate training data for the video localization system.