Abstract:
This paper reviews the basic ideas and current research status of the K-nearest neighbor (KNN) method and proposes an improvement that addresses its low classification accuracy when the classes in a data set are imbalanced. The improved method introduces class representativeness and sample representativeness, so that the nearest neighbors selected during the similarity calculation are more representative of their classes, which reduces the false positive rate. Experiments demonstrate the effectiveness of the improved method.