Virtual Browser For Retail Crawling Engine

    Publication No.: US20230096332A1

    Publication date: 2023-03-30

    Application No.: US17486556

    Filing date: 2021-09-27

    Abstract: A computer system extracts product data from a website and correlates product records from multiple sources to one another as corresponding to the same product. A website is crawled efficiently by rendering webpages using a virtual browser that ignores blacklisted elements, extracts data from objects without rendering, and suppresses retrieval of remote resources. Data is extracted according to engine control statements including a selector and extractor. A website may be crawled repeatedly and changes in extracted data may be detected and flagged. Engine control statements may be automatically changed in response to detecting a change in the configuration of the website. Images of product records may be correlated with one another by first comparing text of the product records and selecting images for comparison based on composition. Images are compared using a machine learning model. Images determined to be similar may be presented to a human for a correlation decision.
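
As a rough illustration of the virtual-browser idea in this abstract, the sketch below walks a page, skips blacklisted elements entirely, and records remote resource URLs instead of fetching them. The blacklist contents, the `REMOTE_ATTRS` table, and the `LightweightPage` class are all hypothetical names chosen for this example; the patent does not specify them, and the sketch assumes well-formed HTML.

```python
from html.parser import HTMLParser

# Hypothetical configuration: which elements to ignore and which
# attributes point at remote resources that should not be retrieved.
BLACKLIST = {"script", "style", "aside"}
REMOTE_ATTRS = {"img": "src", "link": "href"}


class LightweightPage(HTMLParser):
    """Collects visible text while ignoring blacklisted subtrees."""

    def __init__(self):
        super().__init__()
        self.text = []          # text found outside blacklisted elements
        self.suppressed = []    # remote URLs noted but never fetched
        self._skip_depth = 0    # > 0 while inside a blacklisted element

    def handle_starttag(self, tag, attrs):
        if tag in BLACKLIST:
            self._skip_depth += 1
            return
        if self._skip_depth:
            return
        attr = REMOTE_ATTRS.get(tag)
        url = dict(attrs).get(attr) if attr else None
        if url:
            self.suppressed.append(url)

    def handle_endtag(self, tag):
        if tag in BLACKLIST and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.text.append(data.strip())


def render_lightweight(html):
    """Return (visible text chunks, suppressed resource URLs)."""
    page = LightweightPage()
    page.feed(html)
    page.close()
    return page.text, page.suppressed
```

A real crawler would need a far more complete notion of "rendering"; the point here is only that skipped subtrees and unfetched resources cost no network or layout work.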

    Data correlation system and method

    Publication No.: US11907310B2

    Publication date: 2024-02-20

    Application No.: US17486567

    Filing date: 2021-09-27

    Abstract: A computer system extracts product data from a website and correlates product records from multiple sources to one another as corresponding to the same product. A website is crawled efficiently by rendering webpages using a virtual browser that ignores blacklisted elements, extracts data from objects without rendering, and suppresses retrieval of remote resources. Data is extracted according to engine control statements including a selector and extractor. A website may be crawled repeatedly and changes in extracted data may be detected and flagged. Engine control statements may be automatically changed in response to detecting a change in the configuration of the website. Images of product records may be correlated with one another by first comparing text of the product records and selecting images for comparison based on composition. Images are compared using a machine learning model. Images determined to be similar may be presented to a human for a correlation decision.
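
The text-first step of the correlation described above could look roughly like the following sketch, which shortlists record pairs by title similarity before any image comparison is attempted. The record layout (`id`, `title`), the use of `difflib.SequenceMatcher`, and the 0.6 threshold are assumptions made for illustration; the abstract does not say how text is compared, and the machine-learning image comparison is not modeled here.

```python
from difflib import SequenceMatcher
from itertools import product


def text_similarity(a, b):
    """Case-insensitive similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_pairs(source_a, source_b, threshold=0.6):
    """Return (id_a, id_b, score) for pairs worth an image comparison.

    The threshold is a hypothetical tuning knob, not a value from the patent.
    """
    pairs = []
    for rec_a, rec_b in product(source_a, source_b):
        score = text_similarity(rec_a["title"], rec_b["title"])
        if score >= threshold:
            pairs.append((rec_a["id"], rec_b["id"], score))
    # Most similar pairs first, so later stages can stop early if needed.
    return sorted(pairs, key=lambda pair: -pair[2])
```

Filtering on text first keeps the number of image comparisons, which are the more expensive step, small.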

    Dynamic Filter Recommendations
    Invention application

    Publication No.: US20210182287A1

    Publication date: 2021-06-17

    Application No.: US16712590

    Filing date: 2019-12-12

    Applicant: The Yes Platform

    Abstract: A user preference hierarchy is determined from user response to images. Images may be tagged using machine learning models trained to determine values for images. Products are clustered according to product vectors. Images of products within a cluster are clustered according to composition and groups of images are selected from image clusters for soliciting feedback regarding user preference for products of a cluster. Feedback is used to train a user preference model to estimate affinity for a product vector. A user may provide feedback regarding a price point and products are weighted according to a distribution about the price point. The distribution may be asymmetrical according to direction of movement of the price point. Filters may be dynamically defined and presented to a user based on popularity and frequency of occurrence of attribute-value pairs of search results and based on feedback regarding the search results.
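
One way to picture the dynamic filter step is the sketch below, which ranks attribute-value pairs by how often they occur in the current search results, scaled by a popularity weight. The function name `recommend_filters`, the multiplicative scoring rule, and the `popularity` mapping are assumptions for illustration only; the abstract names popularity and frequency as inputs but does not give a formula, and feedback on search results is not modeled here.

```python
from collections import Counter


def recommend_filters(results, popularity=None, top_k=3):
    """Rank attribute-value pairs found in search results.

    results:    list of dicts mapping attribute name to value
    popularity: optional {(attribute, value): weight}; missing pairs get 1.0
    """
    popularity = popularity or {}
    counts = Counter(
        (attr, value) for record in results for attr, value in record.items()
    )
    total = len(results)
    scored = [
        (pair, (count / total) * popularity.get(pair, 1.0))
        for pair, count in counts.items()
    ]
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return [pair for pair, _ in scored[:top_k]]
```

For example, with three results where two are blue and two are long-sleeved, and long sleeves carry a popularity weight of 2.0, the pair `("sleeve", "long")` would be offered as the first filter.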

    DYNAMIC FILTER RECOMMENDATIONS
    Invention publication

    Publication No.: US20230342365A1

    Publication date: 2023-10-26

    Application No.: US18341589

    Filing date: 2023-06-26

    Abstract: A user preference hierarchy is determined from user response to images. Images may be tagged using machine learning models trained to determine values for images. Products are clustered according to product vectors. Images of products within a cluster are clustered according to composition and groups of images are selected from image clusters for soliciting feedback regarding user preference for products of a cluster. Feedback is used to train a user preference model to estimate affinity for a product vector. A user may provide feedback regarding a price point and products are weighted according to a distribution about the price point. The distribution may be asymmetrical according to direction of movement of the price point. Filters may be dynamically defined and presented to a user based on popularity and frequency of occurrence of attribute-value pairs of search results and based on feedback regarding the search results.

    Data Correlation System And Method

    Publication No.: US20230096058A1

    Publication date: 2023-03-30

    Application No.: US17486567

    Filing date: 2021-09-27

    Abstract: A computer system extracts product data from a website and correlates product records from multiple sources to one another as corresponding to the same product. A website is crawled efficiently by rendering webpages using a virtual browser that ignores blacklisted elements, extracts data from objects without rendering, and suppresses retrieval of remote resources. Data is extracted according to engine control statements including a selector and extractor. A website may be crawled repeatedly and changes in extracted data may be detected and flagged. Engine control statements may be automatically changed in response to detecting a change in the configuration of the website. Images of product records may be correlated with one another by first comparing text of the product records and selecting images for comparison based on composition. Images are compared using a machine learning model. Images determined to be similar may be presented to a human for a correlation decision.

    Data Extraction Approach For Retail Crawling Engine

    Publication No.: US20230095711A1

    Publication date: 2023-03-30

    Application No.: US17486562

    Filing date: 2021-09-27

    Abstract: A computer system extracts product data from a website and correlates product records from multiple sources to one another as corresponding to the same product. A website is crawled efficiently by rendering webpages using a virtual browser that ignores blacklisted elements, extracts data from objects without rendering, and suppresses retrieval of remote resources. Data is extracted according to engine control statements including a selector and extractor. A website may be crawled repeatedly and changes in extracted data may be detected and flagged. Engine control statements may be automatically changed in response to detecting a change in the configuration of the website. Images of product records may be correlated with one another by first comparing text of the product records and selecting images for comparison based on composition. Images are compared using a machine learning model. Images determined to be similar may be presented to a human for a correlation decision.
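
The pairing of a selector with an extractor might be sketched as follows. The statement layout (a `field` name, a `(tag, class)` selector, and a `("text", None)` or `("attr", name)` extractor) is a hypothetical format invented for this example; the patent's actual engine control statement syntax is not given in the abstract. The sketch assumes well-formed HTML and takes the first match per field.

```python
from html.parser import HTMLParser


class StatementRunner(HTMLParser):
    """Applies hypothetical control statements to one HTML document."""

    def __init__(self, statements):
        super().__init__()
        self.statements = statements
        self.record = {}
        self._capture = None  # (field, tag, depth) while collecting text

    def handle_starttag(self, tag, attrs):
        if self._capture is not None:
            field, cap_tag, depth = self._capture
            if tag == cap_tag:  # nested element of the same type
                self._capture = (field, cap_tag, depth + 1)
            return
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        for statement in self.statements:
            field = statement["field"]
            sel_tag, sel_class = statement["selector"]
            if field in self.record or tag != sel_tag or sel_class not in classes:
                continue
            kind, arg = statement["extractor"]
            if kind == "attr":
                self.record[field] = attr_map.get(arg)
            else:  # "text": collect character data until the element closes
                self.record[field] = ""
                self._capture = (field, tag, 1)
                break

    def handle_endtag(self, tag):
        if self._capture is None:
            return
        field, cap_tag, depth = self._capture
        if tag == cap_tag:
            self._capture = None if depth == 1 else (field, cap_tag, depth - 1)

    def handle_data(self, data):
        if self._capture is not None:
            self.record[self._capture[0]] += data


def extract(html, statements):
    """Run every statement against the page and return one product record."""
    runner = StatementRunner(statements)
    runner.feed(html)
    runner.close()
    return {k: v.strip() if isinstance(v, str) else v
            for k, v in runner.record.items()}
```

Keeping statements as data rather than code is what would make it possible to rewrite them automatically when a site's layout changes, as the abstract suggests.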

    Cluster and image-based feedback system

    Publication No.: US11386301B2

    Publication date: 2022-07-12

    Application No.: US16563230

    Filing date: 2019-09-06

    Applicant: The Yes Platform

    Abstract: Images are tagged with values in an image data hierarchy that is most subjective at its top level and least subjective at its bottom level, such as a hierarchy including style, type, and features for clothing. A user preference hierarchy is determined from user response to images that are tagged. Tagged images may be generated by processing them with machine learning models trained to determine values for images. Product records including images and other data are analyzed to generate attribute vectors that are encoded to generate product vectors. Products are clustered according to their product vectors. Images of products within a cluster are clustered according to composition and groups of images are selected from image clusters for soliciting feedback regarding user preference for products of a cluster. Feedback is used to train a user preference model to estimate user affinity for a product having a given product vector.

    Price-Based User Feedback System
    Invention application

    Publication No.: US20210118020A1

    Publication date: 2021-04-22

    Application No.: US16658979

    Filing date: 2019-10-21

    Applicant: The Yes Platform

    IPC classification: G06Q30/02

    Abstract: A user preference hierarchy is determined from user response to images that are tagged. Tagged images may be generated by processing them with machine learning models trained to determine values for images. Product records including images and other data are analyzed to generate attribute vectors that are encoded to generate product vectors. Products are clustered according to their product vectors. Images of products within a cluster are clustered according to composition and groups of images are selected from image clusters for soliciting feedback regarding user preference for products of a cluster. Feedback is used to train a user preference model to estimate affinity for a product having a given product vector. A user may provide feedback regarding a price point and products are weighted according to a distribution having a highest value at the price point. The distribution may be asymmetrical according to direction of movement of the price point.
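
The price weighting described in this abstract can be pictured with a two-sided bell curve that peaks at the price point and falls off at different rates above and below it. The sketch below is only one possible reading: the split-Gaussian shape, the `narrow` and `wide` widths, and the rule that the wider side follows the direction in which the user last moved the price point are all assumptions, not details taken from the patent.

```python
import math


def price_weight(price, price_point, direction, narrow=10.0, wide=30.0):
    """Weight between 0 and 1 that is highest (1.0) at price_point.

    direction: +1 if the user last moved the price point up,
               -1 if down, 0 if unchanged (hypothetical encoding).
    """
    if direction > 0:
        # Assumed: moving the price up signals tolerance for higher prices.
        sigma_below, sigma_above = narrow, wide
    elif direction < 0:
        sigma_below, sigma_above = wide, narrow
    else:
        sigma_below = sigma_above = narrow
    sigma = sigma_above if price >= price_point else sigma_below
    z = (price - price_point) / sigma
    return math.exp(-0.5 * z * z)
```

With a price point of 100 and an upward move, a product at 120 receives a larger weight than one at 80, while the product priced exactly at 100 always receives the maximum weight.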

    Dynamic filter recommendations
    Granted invention patent

    Publication No.: US11727014B2

    Publication date: 2023-08-15

    Application No.: US16712590

    Filing date: 2019-12-12

    Abstract: A user preference hierarchy is determined from user response to images. Images may be tagged using machine learning models trained to determine values for images. Products are clustered according to product vectors. Images of products within a cluster are clustered according to composition and groups of images are selected from image clusters for soliciting feedback regarding user preference for products of a cluster. Feedback is used to train a user preference model to estimate affinity for a product vector. A user may provide feedback regarding a price point and products are weighted according to a distribution about the price point. The distribution may be asymmetrical according to direction of movement of the price point. Filters may be dynamically defined and presented to a user based on popularity and frequency of occurrence of attribute-value pairs of search results and based on feedback regarding the search results.