This information is not captured by the survey's closed-ended questions. Because it is free text, it is unstructured, and the University with which this project was developed does not generally use it in a systematic or quantitative way.
The objective of this project was to apply natural language processing (NLP), machine learning and data mining methods to the free-text comments in a university's course and teacher evaluations, in order to automatically classify the comments by their degree of polarity (positive-negative) and by the dimensions and aspects of pedagogy they address. In addition, topic models were implemented to analyze the topics students raise in their comments. The classification makes it possible to generate quantitative indicators of course and teacher quality for the relevant dimensions of teaching, to give feedback to teachers, and to adjust courses according to those indicators. The topic models identify recurrent topics that students raise in their free-text comments but that the closed part of the survey does not cover, and for which no quantitative metrics are therefore generated.
We worked with about 7,000 comments that students made about courses and teachers over one year. Of these, 2,000 were manually labeled with their polarity (positive-negative, on a scale of 0 to 3) and classified into 20 predefined, non-mutually exclusive categories corresponding to the aspects of teaching each comment addresses. The labeling was done together with the University's Faculty of Education. The texts were preprocessed, and topic models were built using Latent Dirichlet Allocation (LDA). The texts were then vectorized using Bag of Words and LDA, and several machine learning models (Naive Bayes, logistic regression (Logit), SVM, boosted trees) were trained and compared on their predictive performance to select the best one.
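As a rough illustration of the Bag of Words plus Naive Bayes step, the following is a minimal sketch using only the Python standard library. It is not the project's actual pipeline: the toy English comments, the two-class labels, the tokenizer, and the names `tokenize`, `bag_of_words` and `NaiveBayes` are all assumptions made for the example, and the real comments, the 0 to 3 polarity scale and the 20 categories are not reproduced here.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (assumed preprocessing)."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(text):
    """Represent a comment as a token -> count mapping (Bag of Words)."""
    return Counter(tokenize(text))


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # comments per class
        self.token_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            bow = bag_of_words(text)
            self.token_counts[label].update(bow)
            self.vocab.update(bow)
        return self

    def predict(self, text):
        bow = bag_of_words(text)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for token, freq in bow.items():
                if token in self.vocab:  # tokens never seen in training are ignored
                    score += freq * math.log((counts[token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labeled comments, invented for this sketch.
train_texts = [
    "The teacher explains clearly and is very helpful",
    "Great course, clear and useful classes",
    "The classes were boring and the teacher was always late",
    "Confusing explanations and unfair exams",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("clear and helpful teacher"))
print(model.predict("boring classes and unfair exams"))
```

In a setting like the one described, the same labeled set would typically be split into training and held-out portions so that the candidate models can be compared on data they have not seen.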
Automated classification of comments according to their degree of polarity (positive-negative). Generation of quantitative indicators of course and teacher quality for the relevant dimensions of teaching, and identification of recurrent themes that students address in their free-text comments.
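One simple way such indicators could be derived from classified comments is to average the polarity per course and teaching dimension. The sketch below assumes this aggregation; the source does not state how the indicators were actually computed, and the course names, dimension names, scores and the function `polarity_indicators` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical classifier output: (course, teaching dimension, polarity on the 0-3 scale).
predictions = [
    ("Course A", "clarity", 3),
    ("Course A", "clarity", 2),
    ("Course A", "assessment", 1),
    ("Course B", "clarity", 0),
    ("Course B", "assessment", 1),
    ("Course B", "assessment", 3),
]


def polarity_indicators(rows):
    """Average polarity for every (course, dimension) pair."""
    grouped = defaultdict(list)
    for course, dimension, polarity in rows:
        grouped[(course, dimension)].append(polarity)
    return {key: mean(values) for key, values in grouped.items()}


indicators = polarity_indicators(predictions)
for (course, dimension), score in sorted(indicators.items()):
    print(f"{course} | {dimension}: {score:.2f}")
```

Reporting the number of comments behind each average alongside the score would help readers judge how much weight an indicator deserves.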