Selection of outlier-detection programs specific to dataset meta-features

    Publication No.: US11762730B2

    Publication Date: 2023-09-19

    Application No.: US17150890

    Application Date: 2021-01-15

    Applicant: Adobe Inc.

    Inventor: Ryan Rossi

    Abstract: Embodiments described herein involve selecting outlier-detection programs that are specific to meta-features of datasets. For instance, a computing system constructs a performance vector from a U vector and a reference V matrix. Vector elements of the performance vector identify estimated performance values of various outlier-detection programs with respect to an input dataset. The U vector is generated using meta-features of the input dataset. The reference V matrix is generated from a training process in which performance values of the various outlier-detection programs with respect to training input datasets are used to obtain the reference V matrix via a UV decomposition. The computing system selects an outlier-detection program having a greater estimated performance value in the performance vector as compared to other outlier-detection programs' respective estimated performance values.
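
As a rough illustration of the selection step in this abstract, the sketch below builds a performance vector as the product of a U vector and a reference V matrix, then picks the detector with the highest estimated score. The detector names, the numeric values, and the helper names `estimate_performance` and `select_detector` are hypothetical; how the U vector is derived from meta-features and how V is trained are assumed to be given.

```python
def estimate_performance(u, V):
    """Return one estimated score per detector: the dot product of u with each row of V."""
    return [sum(ui * vi for ui, vi in zip(u, row)) for row in V]


def select_detector(u, V, names):
    """Pick the detector whose estimated performance value is largest."""
    scores = estimate_performance(u, V)
    best = max(range(len(scores)), key=scores.__getitem__)
    return names[best], scores


# Hypothetical values: a 2-dimensional U vector for the input dataset and
# one row of the reference V matrix per candidate detector.
u = [0.5, 1.0]
V = [
    [1.0, 0.0],  # detector_a
    [0.0, 1.0],  # detector_b
    [1.0, 1.0],  # detector_c
]
names = ["detector_a", "detector_b", "detector_c"]

best, scores = select_detector(u, V, names)
print(best, scores)
```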

    Graph convolutional networks with motif-based attention

    Publication No.: US11544535B2

    Publication Date: 2023-01-03

    Application No.: US16297024

    Application Date: 2019-03-08

    Applicant: Adobe Inc.

    Abstract: Various embodiments describe techniques for making inferences from graph-structured data using graph convolutional networks (GCNs). The GCNs use various pre-defined motifs to filter and select adjacent nodes for graph convolution at individual nodes, rather than merely using edge-defined immediate-neighbor adjacency for information integration at each node. In certain embodiments, the graph convolutional networks use attention mechanisms to select a motif from multiple motifs and select a step size for each respective node in a graph, in order to capture information from the most relevant neighborhood of the respective node.
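
To make the idea of motif-filtered neighborhoods concrete, the sketch below keeps, for each node, only those neighbors that share a triangle with it. The triangle is just one possible pre-defined motif, and the function name and toy graph are hypothetical. The attention mechanism that chooses among motifs and step sizes is not shown.

```python
from itertools import combinations


def triangle_motif_neighbors(adj):
    """For each node, keep only the neighbors that co-occur with it in a triangle."""
    motif = {node: set() for node in adj}
    for node, nbrs in adj.items():
        for a, b in combinations(sorted(nbrs), 2):
            if b in adj[a]:  # node, a, and b are mutually connected
                motif[node].update((a, b))
    return motif


# Hypothetical undirected graph: a-b-c form a triangle, d hangs off a.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}

motif_adj = triangle_motif_neighbors(adj)
print(motif_adj)
```

In this toy graph, node `d` is an ordinary neighbor of `a` but drops out of the triangle-based neighborhood, so a convolution over `motif_adj` would aggregate only from `b` and `c`.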

    KNOWLEDGE-DERIVED SEARCH SUGGESTION

    Publication No.: US20220253477A1

    Publication Date: 2022-08-11

    Application No.: US17170520

    Application Date: 2021-02-08

    Applicant: ADOBE INC.

    Abstract: The present disclosure describes systems and methods for information retrieval. Embodiments of the disclosure provide a retrieval network that leverages external knowledge to provide reformulated search query suggestions, enabling more efficient network searching and information retrieval. For example, a search query from a user (e.g., a query mention of a knowledge graph entity that is included in a search query from a user) may be added to a knowledge graph as a surrogate entity via entity linking. Embedding techniques are then invoked on the updated knowledge graph (e.g., the knowledge graph that includes additional edges between surrogate entities and other entities of the original knowledge graph), and entities neighboring the surrogate entity are retrieved based on the embedding (e.g., based on a computed distance between the surrogate entity and candidate entities in the embedding space). Search results can then be ranked and displayed based on relevance to the neighboring entity.
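
The retrieval step in this abstract, finding entities close to the surrogate entity in the embedding space, can be sketched as a nearest-neighbor lookup. The sketch assumes embeddings have already been computed. The entity names, the 2-dimensional vectors, and the choice of Euclidean distance are all hypothetical.

```python
import math


def nearest_entities(embeddings, surrogate, k=2):
    """Return the k entities closest to the surrogate entity by Euclidean distance."""
    target = embeddings[surrogate]
    scored = [
        (math.dist(target, vec), name)
        for name, vec in embeddings.items()
        if name != surrogate
    ]
    return [name for _, name in sorted(scored)[:k]]


# Hypothetical embeddings of an updated knowledge graph; "query:jaguar speed"
# stands in for the surrogate entity added for the user's query.
embeddings = {
    "query:jaguar speed": [0.0, 0.0],
    "jaguar (animal)": [1.0, 0.0],
    "cheetah": [0.0, 2.0],
    "jaguar (car brand)": [5.0, 5.0],
}

neighbors = nearest_entities(embeddings, "query:jaguar speed", k=2)
print(neighbors)
```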

    Systems and methods for estimating typed graphlets in large data

    Publication No.: US11343325B2

    Publication Date: 2022-05-24

    Application No.: US17008339

    Application Date: 2020-08-31

    Applicant: Adobe Inc.

    Abstract: A system and method are disclosed for fast, accurate, and scalable typed graphlet estimation. The system and method utilize typed edge sampling and typed path sampling to estimate typed graphlet counts in large graphs in a small fraction of the computing time of existing systems. The obtained unbiased estimates of typed graphlets are highly accurate, and have applications in the analysis, mining, and predictive modeling of massive real-world networks. During operation, the system obtains a dataset indicating nodes and edges of a graph. The system samples a portion of the graph and counts a number of graph features in the sampled portion of the graph. The system then computes an occurrence frequency of a typed graphlet pattern and a total number of typed graphlets associated with the typed graphlet pattern in the graph.
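
A minimal sketch of sampling-based estimation, assuming plain uniform edge sampling and typed triangles as the only pattern: each edge is kept with probability `p`, triangles touching a kept edge are counted by type, and the totals are rescaled by `1 / (3 * p)` because every triangle has three edges. This is a simplified stand-in, not the typed edge and typed path sampling procedure of the patent, and the function name and toy graph are hypothetical.

```python
import random
from collections import Counter


def estimate_typed_triangles(adj, node_type, p, seed=0):
    """Estimate typed triangle counts from a uniform sample of edges."""
    rng = random.Random(seed)
    sampled = Counter()
    for u in adj:
        for v in adj[u]:
            # Visit each undirected edge once and keep it with probability p.
            if u < v and rng.random() < p:
                for w in adj[u] & adj[v]:
                    key = tuple(sorted(node_type[x] for x in (u, v, w)))
                    sampled[key] += 1
    # Each triangle is reachable from 3 edges, each kept with probability p.
    return {key: count / (3 * p) for key, count in sampled.items()}


# Hypothetical typed graph: nodes 1-2-3 form a triangle, node 4 hangs off 3.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
node_type = {1: "user", 2: "user", 3: "item", 4: "item"}

exact = estimate_typed_triangles(adj, node_type, p=1.0)
print(exact)
```

With `p=1.0` every edge is kept and the estimate equals the exact count, which makes the rescaling easy to check by hand. Smaller values of `p` trade accuracy for speed.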

    System for identifying typed graphlets

    Publication No.: US11170048B2

    Publication Date: 2021-11-09

    Application No.: US16451956

    Application Date: 2019-06-25

    Applicant: Adobe Inc.

    Abstract: A system is disclosed for identifying and counting typed graphlets in a heterogeneous network. A methodology implementing techniques for the disclosed system according to an embodiment includes identifying typed k-node graphlets occurring between any two selected nodes of a heterogeneous network, wherein the nodes are connected by one or more edges. The identification is based on combinatorial relationships between (k−1)-node typed graphlets occurring between the two selected nodes of the heterogeneous network. Identification of 3-node typed graphlets is based on computation of typed triangles, typed 3-node stars, and typed 3-paths associated with each edge connecting the selected nodes. The method further includes maintaining a count of the identified k-node typed graphlets and storing those graphlets with non-zero counts. The identified graphlets are employed for applications including visitor stitching, user profiling, outlier detection, and link prediction.
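
The per-edge counting of 3-node typed graphlets can be sketched as follows. For a selected edge (u, v), a third node adjacent to both endpoints closes a typed triangle, while a third node adjacent to only one endpoint forms an open 3-node pattern centered at that endpoint. The labels `wedge_at_u` and `wedge_at_v`, the use of a sorted type tuple as the key, and the toy graph are assumptions made for illustration. How these map onto the patent's "typed 3-node stars" and "typed 3-paths" is not specified here.

```python
from collections import Counter


def typed_three_node_counts(adj, node_type, u, v):
    """Count typed 3-node patterns that contain the edge (u, v)."""
    counts = {"triangle": Counter(), "wedge_at_u": Counter(), "wedge_at_v": Counter()}
    for w in (adj[u] | adj[v]) - {u, v}:
        key = tuple(sorted(node_type[x] for x in (u, v, w)))
        if w in adj[u] and w in adj[v]:
            counts["triangle"][key] += 1    # w closes a triangle with u and v
        elif w in adj[u]:
            counts["wedge_at_u"][key] += 1  # open pattern centered at u
        else:
            counts["wedge_at_v"][key] += 1  # open pattern centered at v
    return counts


# Hypothetical heterogeneous graph with two node types.
adj = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1}}
node_type = {1: "user", 2: "user", 3: "item", 4: "item"}

counts = typed_three_node_counts(adj, node_type, 1, 2)
print(counts)
```

Because `Counter` only materializes keys that were incremented, the result naturally stores only typed patterns with non-zero counts.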

    DYNAMICALLY DETERMINING SCHEMA LABELS USING A HYBRID NEURAL NETWORK ENCODER

    Publication No.: US20210232908A1

    Publication Date: 2021-07-29

    Application No.: US16751755

    Application Date: 2020-01-24

    Applicant: Adobe Inc.

    Abstract: Systems, methods, and non-transitory computer-readable media are disclosed for dynamically determining schema labels for columns regardless of information availability within the columns. For example, the disclosed systems can identify a column that contains an arbitrary amount of information (e.g., a header-only column, a cell-only column, or a whole column). Additionally, the disclosed systems can generate a vector embedding for an arbitrary input column by selectively using a header neural network and/or a cell neural network based on whether the column includes a header label and/or whether the column includes a populated column cell. Furthermore, the disclosed systems can compare the column vector embedding to schema vector embeddings of candidate schema labels in a d-dimensional space to determine a schema label for the column.
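
The selective-encoder logic described in this abstract can be sketched as below. The two encoders are toy stand-ins for the header and cell neural networks, averaging the two embeddings is an assumed way to combine them, and cosine similarity is an assumed comparison. None of these choices is taken from the patent, and all names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def embed_column(header, cells, header_encoder, cell_encoder):
    """Use whichever encoders apply to the available parts of the column."""
    parts = []
    if header:
        parts.append(header_encoder(header))
    if cells:
        parts.append(cell_encoder(cells))
    if not parts:
        raise ValueError("column has neither a header nor populated cells")
    return [sum(values) / len(parts) for values in zip(*parts)]  # assumed: average


def pick_label(column_vec, schema_vecs):
    """Return the schema label whose embedding is most similar to the column."""
    return max(schema_vecs, key=lambda label: cosine(column_vec, schema_vecs[label]))


# Toy encoders standing in for the neural networks (hypothetical).
def header_encoder(header):
    return [1.0, 0.0] if "mail" in header.lower() else [0.0, 1.0]


def cell_encoder(cells):
    frac = sum("@" in cell for cell in cells) / len(cells)
    return [frac, 1.0 - frac]


schema_vecs = {"email": [1.0, 0.0], "phone": [0.0, 1.0]}

header_only = pick_label(embed_column("Email", [], header_encoder, cell_encoder), schema_vecs)
cell_only = pick_label(embed_column(None, ["555-1234"], header_encoder, cell_encoder), schema_vecs)
print(header_only, cell_only)
```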

    Open-domain trending hashtag recommendations

    Publication No.: US12050647B2

    Publication Date: 2024-07-30

    Application No.: US17877469

    Application Date: 2022-07-29

    Applicant: Adobe Inc.

    CPC classification number: G06F16/9024 G06N3/045 G06Q50/01

    Abstract: Techniques for recommending hashtags, including trending hashtags, are disclosed. An example method includes accessing a graph. The graph includes video nodes representing videos, historical hashtag nodes representing historical hashtags, and edges indicating associations among the video nodes and the historical hashtag nodes. A trending hashtag is identified. An edge is added to the graph between a historical hashtag node representing a historical hashtag and a trending hashtag node representing the trending hashtag, based on a semantic similarity between the historical hashtag and the trending hashtag. A new video node representing a new video is added to the video nodes of the graph. A graph neural network (GNN) is applied to the graph, and the GNN predicts a new edge between the trending hashtag node and the new video node. The trending hashtag is recommended for the new video based on prediction of the new edge.
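
The graph-augmentation step in this abstract, linking a trending hashtag to semantically similar historical hashtags, can be sketched as follows. The sketch assumes hashtag embeddings are already available and uses a cosine-similarity threshold as the linking rule. The vectors, the threshold, and the function names are hypothetical, and the GNN link-prediction step is not shown.

```python
import math


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def link_trending_hashtag(edges, historical_vecs, trending, trending_vec, threshold=0.8):
    """Add an edge from each sufficiently similar historical hashtag to the trending one."""
    for tag, vec in historical_vecs.items():
        if cosine(vec, trending_vec) >= threshold:
            edges.add((tag, trending))
    return edges


# Hypothetical graph edges (video -> historical hashtag) and hashtag embeddings.
edges = {("video_1", "#soccer"), ("video_2", "#baking")}
historical_vecs = {"#soccer": [1.0, 0.0], "#baking": [0.0, 1.0]}

edges = link_trending_hashtag(edges, historical_vecs, "#worldcup", [0.9, 0.1])
print(sorted(edges))
```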

    Trait expansion techniques in binary matrix datasets

    Publication No.: US11899693B2

    Publication Date: 2024-02-13

    Application No.: US17677323

    Application Date: 2022-02-22

    Applicant: Adobe Inc.

    CPC classification number: G06F16/285

    Abstract: A cluster generation system identifies data elements, from a first binary record, that each have a particular value and correspond to respective binary traits. A candidate description function describing the binary traits is generated, the candidate description function including a model factor that describes the data elements. Responsive to determining that a second record has additional data elements having the particular value and corresponding to the respective binary traits, the candidate description function is modified to indicate that the model factor describes the additional elements. The candidate description function is also modified to include a correction factor describing an additional binary trait excluded from the respective binary traits. Based on the modified candidate description function, the cluster generation system generates a data summary cluster, which includes a compact representation of the binary traits of the data elements and additional data elements.

    Facilitating generation and presentation of advanced insights

    Publication No.: US11829705B1

    Publication Date: 2023-11-28

    Application No.: US17949903

    Application Date: 2022-09-21

    Applicant: ADOBE INC.

    CPC classification number: G06F40/106 G06F40/40

    Abstract: Methods, computer systems, computer-storage media, and graphical user interfaces are provided for facilitating generation and presentation of insights. In one implementation, a set of data is used to generate a data visualization. A candidate insight associated with the data visualization is generated, the candidate insight being generated in text form based on a text template and comprising a descriptive insight, a predictive insight, an investigative insight, or a prescriptive insight. A set of natural language insights is generated via a machine learning model. The natural language insights represent the candidate insight in a text style that is different from the text template. A natural language insight having the text style corresponding with a desired text style is selected for presenting the candidate insight and, thereafter, the selected natural language insight and data visualization are provided for display via a graphical user interface.

    Generating visual data stories

    Publication No.: US11775582B2

    Publication Date: 2023-10-03

    Application No.: US18069561

    Application Date: 2022-12-21

    Applicant: Adobe Inc.

    Abstract: This disclosure describes one or more embodiments of systems, non-transitory computer-readable media, and methods that intelligently and automatically analyze input data and generate visual data stories depicting graphical visualizations from data insights determined from the input data. For example, the disclosed systems automatically extract data insights utilizing an in-depth statistical analysis of dataset groups from data-attribute categories within the input data. Based on the data insights, the disclosed systems can automatically generate exportable visual data stories to visualize the data insights, provide textual or audio-based natural language summaries of the data insights, and animate such data insights in videos. In some embodiments, the disclosed systems generate a visual-data-story graph comprising nodes representing visual data stories and edges representing similarities between the visual data stories. Based on the visual-data-story graph, the disclosed systems can select a relevant visual data story to display on a graphical user interface.
