Multi-Class Text Classification using Hyperparameter Tuned Models
Many applications, such as chatbots, websites, and business analytics, require text classification. Text classification is the automatic assignment of text to predefined classes. This study aims to classify text documents into multiple classes. The proposed system comprises four steps: dataset format creation, exploratory data analysis, feature engineering, and model training. The hyperparameters of SVM, KNN, and multinomial logistic regression are tuned using random search and grid search. Classification is performed using 3-fold cross-validation with fifty iterations. Experimental results show that the methods classify the documents effectively. Train and test accuracy are calculated for each model, and the model with the smallest difference between train and test accuracy is selected as the best model. It is observed that SVM outperforms the other classifiers. Additionally, misclassified articles are counted and analyzed.
Keywords - Feature Extraction, Hyperparameter Tuning, Natural Language Processing, Chi-Squared Test, SVM.
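
The tuning and selection procedure summarized in the abstract can be sketched as follows. This is a minimal illustration using scikit-learn, not the study's actual code: the toy corpus, the TF-IDF features, the search ranges, the pairing of each classifier with random or grid search, and the reading of "fifty iterations" as fifty random-search draws are all assumptions made for the example.

```python
import numpy as np
from scipy.stats import loguniform
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import (GridSearchCV, RandomizedSearchCV,
                                     train_test_split)
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Hypothetical toy corpus standing in for the real documents.
topics = {
    "sport": ["match", "goal", "team", "league", "coach", "score"],
    "tech": ["software", "chip", "phone", "network", "data", "code"],
    "politics": ["vote", "election", "minister", "policy", "party", "law"],
}
common = ["new", "today", "report"]  # words shared by all classes (noise)
rng = np.random.default_rng(0)
texts, labels = [], []
for label, words in topics.items():
    for _ in range(30):
        texts.append(" ".join(rng.choice(words + common, size=5)))
        labels.append(label)

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.3, stratify=labels, random_state=0)


def make_pipeline(classifier):
    """TF-IDF features followed by the given classifier."""
    return Pipeline([("tfidf", TfidfVectorizer()), ("clf", classifier)])


# Assumed search spaces; the abstract does not list the tuned values.
searches = {
    "SVM": RandomizedSearchCV(
        make_pipeline(SVC()),
        {"clf__C": loguniform(1e-2, 1e2), "clf__kernel": ["linear", "rbf"]},
        n_iter=50, cv=3, random_state=0),
    "KNN": GridSearchCV(
        make_pipeline(KNeighborsClassifier()),
        {"clf__n_neighbors": [1, 3, 5, 7]},
        cv=3),
    "LogReg": RandomizedSearchCV(
        make_pipeline(LogisticRegression(max_iter=1000)),
        {"clf__C": loguniform(1e-2, 1e2)},
        n_iter=50, cv=3, random_state=0),
}

results = {}
for name, search in searches.items():
    search.fit(X_train, y_train)  # 3-fold CV over the training split
    train_acc = search.score(X_train, y_train)
    test_acc = search.score(X_test, y_test)
    results[name] = {"train": train_acc, "test": test_acc,
                     "gap": abs(train_acc - test_acc)}
    print(name, search.best_params_, results[name])

# Selection rule from the abstract: smallest train/test accuracy difference.
best_name = min(results, key=lambda n: results[n]["gap"])
print("Selected model:", best_name)
```

On real data the gap criterion is usually read alongside the test accuracy itself, since a model can have a small gap while being inaccurate on both splits.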