Comparative Analysis of Linear Regression and Decision Tree for Predicting QS World University Rankings 2025 Scores

Dyah Puspita Sari; Hafiyyan Putra Pratama

doi:10.62712/juktisi.v5i1.1094

Authors

Dyah Puspita Sari, Universitas Pendidikan Indonesia
Hafiyyan Putra Pratama, Universitas Pendidikan Indonesia


Keywords:

Decision Tree, Linear Regression, Machine Learning, QS Rankings, Supervised Learning

Abstract

World university ranking systems have become a key global benchmark for assessing the quality of higher education institutions, their research productivity, and their academic excellence. The QS World University Rankings 2025 dataset provides a comprehensive set of evaluation indicators, including academic reputation, employer reputation, faculty-student ratio, citations per faculty, and several internationalization indicators. This study compares machine learning regression models for predicting a university's Overall Score from these indicators. Two supervised learning models are applied: Linear Regression and Decision Tree Regressor. The dataset, comprising 1,503 entries and 28 columns, goes through a thorough preprocessing stage: missing values are handled with median imputation, outliers are detected with the IQR method, categorical variables are encoded with LabelEncoder, and features are normalized with StandardScaler. The data are split 80:20 into training and test sets. The evaluation metrics are Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and the coefficient of determination (R²). The results show that Linear Regression significantly outperforms the Decision Tree, achieving an R² of 0.9985, an MAE of 0.3662, and an RMSE of 0.7427. 5-fold cross-validation confirms the stability of the Linear Regression model, with a mean R² of 0.9374 ± 0.0668. Feature importance analysis identifies Academic Reputation Score as the most influential predictor of Overall Score, consistent with the correlation analysis (r = 0.90).
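To make the IQR outlier rule and the four evaluation metrics named in the abstract concrete, the following is a minimal sketch using only the Python standard library. It is not the authors' code: the function names `iqr_bounds` and `regression_metrics`, the fence multiplier of 1.5, the "inclusive" quartile method, and the toy numbers are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from statistics import mean, quantiles


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) IQR fences for a list of numbers.

    k=1.5 is the common convention; the abstract does not state the
    multiplier or quartile method actually used (assumption).
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, RMSE and R^2 for paired targets and predictions."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = mean(abs(e) for e in errors)
    mse = mean(e * e for e in errors)
    y_bar = mean(y_true)
    ss_res = sum(e * e for e in errors)             # residual sum of squares
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all targets are identical")
    return {
        "MAE": mae,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "R2": 1.0 - ss_res / ss_tot,
    }


# Toy values for illustration only; they are not taken from the QS dataset.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 100]
low, high = iqr_bounds(scores)
outliers = [s for s in scores if s < low or s > high]

metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
print(outliers, metrics)
```

RMSE is the square root of MSE and is therefore in the same units as the Overall Score, which is why it is usually read alongside MAE, while R² reports the share of variance in the target that the predictions explain.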
