Flexible and scalable artificial intelligence and analytics platform with advanced content analytics and data ingestion

    Publication No.: US12236288B2

    Publication date: 2025-02-25

    Application No.: US18339245

    Filing date: 2023-06-22

    Abstract: Disclosed is a flexible and scalable artificial intelligence and analytics platform with advanced content analytics and content ingestion. Disparate contents can be ingested into a content analytics system of the platform through a content ingestion pipeline operated by a sophisticated text mining engine. Prior to persistence, editorial metadata can be extracted and semantic metadata inferred to gain insights across the disparate contents. The editorial metadata and the semantic metadata can be dynamically mapped, as the disparate contents are crawled from disparate sources, to an internal ingestion pipeline document conforming to a uniform mapping schema that specifies master metadata of interest. For persistence, the semantic metadata in the internal ingestion pipeline document can be mapped to metadata tables conforming to a single common data model of a central repository. In this way, ingested metadata can be leveraged across the platform, for instance, for trend analysis, mood detection, model building, etc.
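
    The two mapping steps in this abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the patent: the master fields, the per-source field maps, the toy_infer stand-in for the text mining engine, and the (doc_id, kind, value) row layout standing in for the common data model.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical master metadata of interest (the "uniform mapping schema").
MASTER_FIELDS = ("title", "author", "published", "source")

# Hypothetical per-source maps: source field name -> master field name.
SOURCE_FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "rss": {"title": "title", "dc:creator": "author", "pubDate": "published"},
    "social": {"text": "title", "user": "author", "created_at": "published"},
}

Semantic = Dict[str, List[str]]


def toy_infer(text: str) -> Semantic:
    """Keyword-based stand-in for a text mining engine (illustration only)."""
    entities = [w for w in text.split() if w.istitle()]
    sentiment = ["positive"] if "gain" in text.lower() else ["neutral"]
    return {"entities": entities, "sentiment": sentiment}


def to_pipeline_document(
    source: str, record: Dict[str, str], infer: Callable[[str], Semantic]
) -> Dict[str, dict]:
    """Map one crawled record to an internal pipeline document."""
    editorial: Dict[str, Optional[str]] = {name: None for name in MASTER_FIELDS}
    editorial["source"] = source
    for src_field, master_field in SOURCE_FIELD_MAPS[source].items():
        if src_field in record:
            editorial[master_field] = record[src_field]
    # Semantic metadata is inferred from the text, not copied from the source.
    semantic = infer(editorial["title"] or "")
    return {"editorial": editorial, "semantic": semantic}


def to_table_rows(doc_id: int, doc: Dict[str, dict]) -> List[Tuple[int, str, str]]:
    """Flatten semantic metadata into rows for a single shared table layout."""
    return [
        (doc_id, kind, value)
        for kind, values in doc["semantic"].items()
        for value in values
    ]


rss_doc = to_pipeline_document(
    "rss",
    {"title": "Acme gains", "dc:creator": "Ann", "pubDate": "2023-06-22"},
    toy_infer,
)
social_doc = to_pipeline_document(
    "social", {"text": "hello world", "user": "bob"}, toy_infer
)
rss_rows = to_table_rows(1, rss_doc)
```

    Because both sources end up in the same document shape, downstream code such as to_table_rows needs no source-specific branches, which is the property the abstract attributes to the uniform schema.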

    Systems and methods for intelligent content filtering and persistence

    Publication No.: US11803600B2

    Publication date: 2023-10-31

    Application No.: US17510583

    Filing date: 2021-10-26

    CPC classification number: G06F16/9535 G06F16/9558 G06F40/295 G06F40/30

    Abstract: A source content processor receives content from a crawler and calls a text mining engine. The text mining engine mines the content and provides metadata about the content. The source content processor applies a source content filtering rule to the content utilizing the metadata from the text mining engine. The source content filtering rule is previously built based on at least one of a named entity, a category, or a sentiment. The source content processor determines whether to persist the content according to a result from applying the source content filtering rule to the content and either stores the content in a data store or deletes the content from the data ingestion pipeline such that the content is not persisted anywhere. Embodiments disclosed herein can significantly reduce the amount of irrelevant content passing through the data ingestion pipeline, prior to data persistence.
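
    The filter-then-persist flow described in this abstract might look roughly like the following sketch. The names Metadata, FilterRule, process, and toy_mine are hypothetical, the keyword-based miner is only a stand-in for a real text mining engine, and a plain dictionary stands in for the data store. The rule semantics (every non-empty criterion must match) are also an assumption; the abstract says only that a rule is built from at least one of a named entity, a category, or a sentiment.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


@dataclass(frozen=True)
class Metadata:
    entities: FrozenSet[str]
    categories: FrozenSet[str]
    sentiment: str  # assumed labels: "positive", "neutral", "negative"


@dataclass(frozen=True)
class FilterRule:
    # An empty criterion is ignored; at least one should be set in practice.
    entities: FrozenSet[str] = frozenset()
    categories: FrozenSet[str] = frozenset()
    sentiments: FrozenSet[str] = frozenset()

    def keep(self, md: Metadata) -> bool:
        """Return True when every non-empty criterion matches the metadata."""
        if self.entities and not (self.entities & md.entities):
            return False
        if self.categories and not (self.categories & md.categories):
            return False
        if self.sentiments and md.sentiment not in self.sentiments:
            return False
        return True


def toy_mine(text: str) -> Metadata:
    """Keyword-based stand-in for a text mining engine (illustration only)."""
    lower = text.lower()
    entities = frozenset({"Acme"}) if "acme" in lower else frozenset()
    categories = frozenset({"finance"}) if "earnings" in lower else frozenset()
    if "loss" in lower:
        sentiment = "negative"
    elif "gain" in lower:
        sentiment = "positive"
    else:
        sentiment = "neutral"
    return Metadata(entities, categories, sentiment)


def process(
    key: str,
    content: str,
    mine: Callable[[str], Metadata],
    rule: FilterRule,
    store: Dict[str, str],
) -> bool:
    """Persist content only if the rule accepts it; otherwise drop it."""
    if rule.keep(mine(content)):
        store[key] = content
        return True
    return False  # nothing is written, so the content is persisted nowhere


store: Dict[str, str] = {}
rule = FilterRule(entities=frozenset({"Acme"}), sentiments=frozenset({"positive"}))
kept = process("a", "Acme reports earnings gain", toy_mine, rule, store)
dropped = process("b", "Acme reports a loss", toy_mine, rule, store)
```

    Filtering before the write, instead of cleaning up afterward, is what keeps rejected content out of the store entirely.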
