Abstract:
A method and system for automatically identifying an optimal set of attributes of entities included in a networked system. Entity types are ranked based on information gain. A first classification accuracy relative to a first entity type is determined. The first entity type is the top-ranked entity type or a first aggregate entity type. A second entity type is selected based on the ranking. A database join of a first set of attributes associated with the first entity type and a second set of attributes associated with the second entity type is performed. A second classification accuracy relative to a second aggregate entity type generated by the join is determined. In response to determining that the second classification accuracy is not greater than the first classification accuracy, an optimal set of attributes contributing to a problem in the networked system is identified as the first set of attributes.
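The selection loop described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function name select_attributes, the gain_by_type and attrs_by_type dictionaries, and the accuracy callback are all hypothetical, a set union stands in for the database join, and the choice to keep iterating while accuracy improves is an assumption drawn from the reference to a "first aggregate entity type".

```python
from typing import Callable, Dict, FrozenSet, Set, Tuple


def select_attributes(
    gain_by_type: Dict[str, float],
    attrs_by_type: Dict[str, Set[str]],
    accuracy: Callable[[FrozenSet[str]], float],
) -> Tuple[Set[str], float]:
    """Greedy selection of attributes, stopping when accuracy stops improving.

    gain_by_type  -- hypothetical precomputed information gain per entity type
    attrs_by_type -- hypothetical mapping of entity type to its attributes
    accuracy      -- hypothetical callback returning classification accuracy
                     for a given set of attributes
    """
    # Rank entity types by information gain, highest first.
    ranked = sorted(gain_by_type, key=gain_by_type.get, reverse=True)

    # Start from the top-ranked entity type (the "first entity type").
    current = set(attrs_by_type[ranked[0]])
    best = accuracy(frozenset(current))

    for entity_type in ranked[1:]:
        # Set union is a stand-in for the database join that produces
        # the aggregate entity type.
        joined = current | set(attrs_by_type[entity_type])
        score = accuracy(frozenset(joined))

        # Not greater than the previous accuracy: keep the first set.
        if score <= best:
            return current, best

        # Assumption: on improvement, the aggregate becomes the new
        # "first" entity type and the loop continues.
        current, best = joined, score

    return current, best


# Toy inputs, invented purely for illustration.
gains = {"server": 0.9, "application": 0.5, "database": 0.2}
attrs = {
    "server": {"cpu_load"},
    "application": {"version"},
    "database": {"pool_size"},
}
toy_scores = {
    frozenset({"cpu_load"}): 0.70,
    frozenset({"cpu_load", "version"}): 0.85,
    frozenset({"cpu_load", "version", "pool_size"}): 0.85,
}
chosen, chosen_accuracy = select_attributes(gains, attrs, toy_scores.__getitem__)
print(sorted(chosen), chosen_accuracy)
```

With these toy inputs, joining the database attributes does not raise the accuracy above that of the server and application aggregate, so the loop stops and returns the attributes of that aggregate.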