Abstract:
Disclosed are methods, apparatus, systems, and computer-readable storage media for identifying a topic for a text. In some implementations, one or more servers maintain a plurality of data entries in one or more database tables storing text data. Each data entry of a first portion of the data entries includes a text sequence, a topic, and a text-to-topic association score indicating a number of times that the text sequence appears in a processed text associated with the topic. Each data entry of a second portion of the data entries includes a total word score indicating a number of times that a respective text sequence appears in one or more processed texts. The one or more servers may receive an incoming text and identify a topic for the incoming text by processing the text sequences of the incoming text in relation to the data entries in the database tables.
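The two kinds of data entries described above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the abstract does not state how the scores are combined, so the sketch assumes that a text sequence is a single lowercase word and that a topic is chosen by summing, for each word, the ratio of its text-to-topic association score to its total word score. The names `association`, `totals`, `add_processed_text`, and `identify_topic` are hypothetical, and in-memory dictionaries stand in for the database tables.

```python
from collections import Counter

# Hypothetical in-memory stand-ins for the database tables.
# First portion: text sequence -> {topic: text-to-topic association score}.
association = {}
# Second portion: text sequence -> total word score.
totals = Counter()


def tokenize(text):
    # Assumption: a text sequence is one lowercase, whitespace-separated word.
    return text.lower().split()


def add_processed_text(text, topic):
    """Update both score tables from a text whose topic is already known."""
    for seq in tokenize(text):
        association.setdefault(seq, Counter())[topic] += 1
        totals[seq] += 1


def identify_topic(text):
    """Return the best-scoring topic for an incoming text, or None."""
    scores = Counter()
    for seq in tokenize(text):
        total = totals.get(seq, 0)
        if total == 0:
            continue  # sequence never seen in any processed text
        for topic, count in association.get(seq, {}).items():
            # Assumed scoring rule: share of this sequence's occurrences
            # that fall under the topic.
            scores[topic] += count / total
    if not scores:
        return None
    return max(scores, key=scores.get)
```

Under these assumptions, a word that appears only in texts of one topic contributes a full point to that topic, while a word spread evenly across topics contributes little to any of them.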