-
1.
Publication No.: US20230273940A1
Publication date: 2023-08-31
Application No.: US17682187
Filing date: 2022-02-28
Inventors: Guanghua Shu, Taesik Na, Zhihong Xu, Wideet Shende, Manmeet Singh, Tejaswi Tenneti, Reza Sadri
IPC classification: G06F16/28, G06F16/22, G06F16/2455, G06F11/34
CPC classification: G06F16/283, G06F16/2228, G06F16/24556, G06F16/285, G06F11/3409
Abstract: An online system maintains item embeddings for items. As the number of items maintained by the online system increases, maintaining a single index of the item embeddings becomes increasingly difficult. To increase scalability, the online system partitions the item embeddings into multiple indices, with each index corresponding to a value of a specific attribute that the online system maintains for items. For example, the online system generates indices that each correspond to a different warehouse offering items. To expedite retrieval of item embeddings, the online system allocates each index to one of a number of shards. When the online system receives a query, it determines an embedding for the query and retrieves an index from a shard based on metadata received with the query. Based on distances between the embedding for the query and the item embeddings in the retrieved index, the online system selects one or more items.
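
The partition-then-shard flow described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the method claimed in the application: the shard count, the CRC32-based allocation policy, the brute-force Euclidean search, and every name (`NUM_SHARDS`, `shard_for`, `build_shards`, `search`) are hypothetical.

```python
import math
from collections import defaultdict
from zlib import crc32

NUM_SHARDS = 4  # hypothetical shard count


def shard_for(warehouse_id):
    """Allocate an index to a shard with a stable hash (assumed policy)."""
    return crc32(warehouse_id.encode("utf-8")) % NUM_SHARDS


def build_shards(items):
    """Partition (item_id, warehouse_id, embedding) rows into one index per
    warehouse, then place each index on a shard."""
    shards = [defaultdict(list) for _ in range(NUM_SHARDS)]
    for item_id, warehouse_id, embedding in items:
        shards[shard_for(warehouse_id)][warehouse_id].append((item_id, embedding))
    return shards


def search(shards, query_embedding, warehouse_id, k=2):
    """Use the query metadata (warehouse_id) to fetch a single index, then
    rank only that index's items by distance to the query embedding."""
    index = shards[shard_for(warehouse_id)].get(warehouse_id, [])
    ranked = sorted(index, key=lambda entry: math.dist(query_embedding, entry[1]))
    return [item_id for item_id, _ in ranked[:k]]


# Made-up example data: three items in warehouse w1, one in w2.
items = [
    ("apple", "w1", [1.0, 0.0]),
    ("pear", "w1", [0.9, 0.1]),
    ("soap", "w1", [0.0, 1.0]),
    ("apple_w2", "w2", [1.0, 0.0]),
]
shards = build_shards(items)
print(search(shards, [1.0, 0.0], "w1"))  # -> ['apple', 'pear']
```

Because the lookup touches only the index named by the query metadata, the search cost depends on the size of one warehouse's catalog rather than on the full item set.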
-
2.
Publication No.: US20240362523A1
Publication date: 2024-10-31
Application No.: US18140203
Filing date: 2023-04-27
Inventors: Guanghua Shu, Reza Sadri, Jacob Jensen, Sahil Khanna
IPC classification: G06N20/00
CPC classification: G06N20/00
Abstract: A system maintains a data store for managing machine-learning (ML) models and the features used by those models. The system generates a graph that includes a node for each model and a node for each feature, with edges linking each model to the features it uses. For a new model to be trained, the system receives a proposed feature corresponding to a node in the graph and identifies one or more candidate features corresponding to nodes in the graph, based in part on relevancy scores between the proposed feature and other features corresponding to nodes in the graph. The system presents in a user interface a suggestion to use one or more candidate features with the new model. Responsive to receiving a user selection of at least one candidate feature, the system causes the new model to be trained using the selected candidate feature and the proposed feature.
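
A minimal sketch of the graph-based suggestion step follows. The abstract does not say how relevancy scores are computed, so this example assumes a Jaccard similarity over the sets of models that use each feature; that choice, and the names `build_graph`, `relevancy`, and `suggest_features`, are hypothetical.

```python
from collections import defaultdict


def build_graph(model_features):
    """Store the model-feature edges as a map: feature -> models that use it."""
    feature_models = defaultdict(set)
    for model, features in model_features.items():
        for feature in features:
            feature_models[feature].add(model)
    return dict(feature_models)


def relevancy(feature_models, a, b):
    """Assumed relevancy score: Jaccard overlap of the models using a and b."""
    models_a = feature_models.get(a, set())
    models_b = feature_models.get(b, set())
    union = models_a | models_b
    return len(models_a & models_b) / len(union) if union else 0.0


def suggest_features(feature_models, proposed, top_n=2):
    """Rank the other feature nodes by relevancy to the proposed feature and
    return the best-scoring ones as candidates."""
    scores = {
        feature: relevancy(feature_models, proposed, feature)
        for feature in feature_models
        if feature != proposed
    }
    ranked = sorted(scores, key=lambda f: (-scores[f], f))
    return [f for f in ranked[:top_n] if scores[f] > 0]


# Made-up registry: which features each existing model uses.
graph = build_graph({
    "m1": {"price", "rating"},
    "m2": {"price", "rating", "stock"},
    "m3": {"stock", "views"},
})
print(suggest_features(graph, "price"))  # -> ['rating', 'stock']
```

In a full system, the returned candidates would be shown in the user interface, and the new model would be trained on the proposed feature plus whichever candidates the user selects.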
-
3.
Publication No.: US20240362455A1
Publication date: 2024-10-31
Application No.: US18140210
Filing date: 2023-04-27
Inventors: Guanghua Shu, Reza Sadri, Jacob Jensen, Sahil Khanna
Abstract: A feature management system (the “system”) receives information about a new machine learning (ML) model to be trained. The information includes metadata about the new model. The system applies a trained feature prediction model to the information about the new model and to metadata about a plurality of features. The feature prediction model is trained to predict, for each of the plurality of features, a probability that the feature should be selected as an input feature for the new model. The system identifies one or more candidate features based on the probability scores output by the feature prediction model. The system presents in a user interface a suggestion to use the one or more candidate features with the new model. The system selects at least one candidate feature and causes the new model to be trained using a set of input features, including the selected candidate feature.
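
The scoring step in this abstract can be illustrated with a small stand-in for the feature prediction model. Nothing here comes from the application itself: the logistic form, the fixed weights (which a real system would learn from past models), the metadata fields `domain` and `tags`, and the 0.5 threshold are all assumptions chosen to keep the sketch short.

```python
import math

# Hypothetical weights standing in for a trained feature prediction model.
WEIGHTS = {"bias": -2.0, "same_domain": 2.5, "tag_overlap": 1.5}


def predict_probability(model_meta, feature_meta):
    """Return an assumed probability that the feature suits the new model."""
    same_domain = 1.0 if model_meta["domain"] == feature_meta["domain"] else 0.0
    tag_overlap = len(set(model_meta["tags"]) & set(feature_meta["tags"]))
    logit = (
        WEIGHTS["bias"]
        + WEIGHTS["same_domain"] * same_domain
        + WEIGHTS["tag_overlap"] * tag_overlap
    )
    return 1.0 / (1.0 + math.exp(-logit))


def candidate_features(model_meta, features, threshold=0.5):
    """Score every known feature and keep those at or above the threshold,
    highest probability first."""
    scored = [
        (name, predict_probability(model_meta, meta))
        for name, meta in features.items()
    ]
    kept = [(name, p) for name, p in scored if p >= threshold]
    return sorted(kept, key=lambda pair: -pair[1])


# Made-up metadata for a new model and three registered features.
new_model = {"domain": "search", "tags": ["query", "ranking"]}
features = {
    "query_length": {"domain": "search", "tags": ["query"]},
    "cart_size": {"domain": "checkout", "tags": ["basket"]},
    "click_rate": {"domain": "search", "tags": ["ranking", "query"]},
}
candidates = candidate_features(new_model, features)
print([name for name, _ in candidates])  # -> ['click_rate', 'query_length']
```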
-
4.
Publication No.: US20230252032A1
Publication date: 2023-08-10
Application No.: US17666531
Filing date: 2022-02-07
Inventors: Taesik Na, Zhihong Xu, Guanghua Shu, Tejaswi Tenneti, Haixun Wang
IPC classification: G06F16/2457, G06F16/242
CPC classification: G06F16/24578, G06F16/2438
Abstract: An online system maintains various items, values for different attributes of the items, and an item embedding for each item. When the online system receives a query for retrieving one or more items, the online system generates an embedding for the query. Based on measures of similarity between the embedding for the query and the item embeddings, the online system selects a set of items. The online system identifies a specific attribute of items and generates a whitelist of values for the specific attribute based on measures of similarity between the item embeddings for items in the selected set and the embedding for the query. The online system removes from the selected set those items whose values for the specific attribute fall outside the whitelist, leaving items more likely to be relevant to the query.
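
The retrieve-then-filter idea in this abstract can be sketched as below. The abstract leaves the whitelist rule open, so this example assumes the whitelist is the set of attribute values held by the few most similar items; that rule, the cosine similarity measure, and the names `cosine` and `retrieve_with_whitelist` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_with_whitelist(query_emb, items, attribute, k=4, whitelist_from_top=2):
    """Select the k most similar items, build a whitelist of attribute values
    from the closest few (assumed rule), and drop items outside it."""
    ranked = sorted(
        items, key=lambda item: cosine(query_emb, item["embedding"]), reverse=True
    )
    selected = ranked[:k]
    whitelist = {item[attribute] for item in selected[:whitelist_from_top]}
    kept = [item["id"] for item in selected if item[attribute] in whitelist]
    return kept, whitelist


# Made-up catalog with a "category" attribute and 2-d embeddings.
catalog = [
    {"id": "milk", "category": "dairy", "embedding": [1.0, 0.0]},
    {"id": "yogurt", "category": "dairy", "embedding": [0.9, 0.1]},
    {"id": "milk_soap", "category": "personal care", "embedding": [0.6, 0.8]},
    {"id": "cheese", "category": "dairy", "embedding": [0.7, 0.3]},
    {"id": "hammer", "category": "tools", "embedding": [0.0, 1.0]},
]
kept, whitelist = retrieve_with_whitelist([1.0, 0.0], catalog, "category")
print(kept)  # -> ['milk', 'yogurt', 'cheese']
```

Here `milk_soap` is close enough to be retrieved in the top four, but it is removed because its category is not among the categories of the best matches.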
-