Product synthesis from multiple sources
    3.
    Granted invention patent
    Product synthesis from multiple sources (legal status: in force)

    Publication number: US08352473B2

    Publication date: 2013-01-08

    Application number: US12764676

    Filing date: 2010-04-21

    IPC classification: G06Q10/00 G06Q30/00

    Abstract: Methods and systems for automatically synthesizing product information from multiple data sources into an on-line catalog are disclosed, and in particular, for automatically synthesizing the product information based on attribute-value pairs. Information for a product may be obtained, via entity extraction, feed ingestion, and other mechanisms, from a plurality of structured and unstructured data sources having different taxonomies and schemas. Product information may additionally or alternatively be obtained or derived based on popularity data. The product information may be cleansed, segmented and normalized. The product information may be clustered so closest products, attribute names and attribute values are associated. A representative value for an attribute name may be determined, and the on-line catalog may be updated so that entries are comprehensive, meaningful and useful to a catalog user. Updates from at least 500 million different data sources may be scheduled to occur as frequently as several times daily.
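
The normalization and representative-value steps named in the abstract can be illustrated with a minimal sketch. This is not the patented method: the synonym table, the helper names (`normalize_name`, `normalize_value`, `synthesize`), the sample offers, and the choice of a simple majority vote as the "representative value" are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical synonym table: maps source-specific attribute names
# (different schemas) onto one canonical attribute name.
NAME_SYNONYMS = {"colour": "color", "mfr": "manufacturer", "brand": "manufacturer"}


def normalize_name(name: str) -> str:
    """Lower-case, collapse whitespace, then apply the synonym table."""
    key = " ".join(name.strip().lower().split())
    return NAME_SYNONYMS.get(key, key)


def normalize_value(value: str) -> str:
    """Lower-case and collapse whitespace so equal values compare equal."""
    return " ".join(value.strip().lower().split())


def synthesize(records):
    """Merge attribute-value pairs from several sources for one product.

    Assumption: the representative value for an attribute is simply the
    most frequent normalized value across sources (majority vote).
    """
    votes = defaultdict(Counter)
    for record in records:
        for name, value in record.items():
            votes[normalize_name(name)][normalize_value(value)] += 1
    return {name: counter.most_common(1)[0][0] for name, counter in votes.items()}


# Three hypothetical source records describing the same product.
offers = [
    {"Colour": "Black ", "Brand": "Acme"},
    {"color": "black", "mfr": "ACME"},
    {"Color": "Jet Black", "Manufacturer": "acme"},
]
catalog_entry = synthesize(offers)
print(catalog_entry)
```

A production system would presumably replace the fixed synonym table with the clustering of attribute names and values that the abstract mentions; the majority vote here only stands in for whatever selection rule is actually used.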


    PRODUCT SYNTHESIS FROM MULTIPLE SOURCES
    4.
    Invention patent application
    PRODUCT SYNTHESIS FROM MULTIPLE SOURCES (legal status: in force)

    Publication number: US20110264598A1

    Publication date: 2011-10-27

    Application number: US12764676

    Filing date: 2010-04-21

    IPC classification: G06Q10/00 G06Q30/00

    Abstract: Methods and systems for automatically synthesizing product information from multiple data sources into an on-line catalog are disclosed, and in particular, for automatically synthesizing the product information based on attribute-value pairs. Information for a product may be obtained, via entity extraction, feed ingestion, and other mechanisms, from a plurality of structured and unstructured data sources having different taxonomies and schemas. Product information may additionally or alternatively be obtained or derived based on popularity data. The product information may be cleansed, segmented and normalized. The product information may be clustered so closest products, attribute names and attribute values are associated. A representative value for an attribute name may be determined, and the on-line catalog may be updated so that entries are comprehensive, meaningful and useful to a catalog user. Updates from at least 500 million different data sources may be scheduled to occur as frequently as several times daily.
