Unsupervised Clustering of Handwritten Essay Answer Images Using Vision Transformer
DOI:
https://doi.org/10.62712/juktisi.v4i2.517
Keywords:
Image Clustering, DeepCluster, Vision Transformer, Convolutional Neural Network, Handwritten-essay, Computer Vision
Abstract
This study explores deep clustering methods for automatically grouping handwritten essay answer sheets by their visual patterns. Features were extracted with three backbone models: ResNet-50, Vision Transformer (ViT-Base), and TrOCR. The features were then clustered with two unsupervised algorithms: K-means (k = 5) and HDBSCAN (minimum cluster size = 10). To improve clustering performance, a deep clustering approach was implemented in which K-means is applied iteratively to refine the feature representations. Evaluation was both quantitative, using the Silhouette Score, Davies-Bouldin Index, and Calinski-Harabasz Score, and qualitative, through t-SNE visualizations and inspection of cluster contents. The ViT and TrOCR backbones outperformed the CNN-based ResNet-50, achieving higher cluster cohesion and separation. Notably, the final clustering result using ViT with HDBSCAN reached a Silhouette Score of 0.772, a Davies-Bouldin Index of 0.369, and a Calinski-Harabasz Score of 408.006. The findings indicate that transformer-based vision models are more effective than the CNN baseline for unsupervised grouping of handwritten visual data. This approach can help educators grade faster and more objectively, and may serve as a foundation for future automated essay evaluation systems that integrate OCR and NLP techniques.
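The clustering and scoring step described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn and uses synthetic 64-dimensional vectors from `make_blobs` in place of real backbone embeddings. The helper name `cluster_and_score`, the feature dimensionality, and the random seed are hypothetical choices for illustration and are not taken from the study.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)


def cluster_and_score(features, k=5, seed=0):
    """Cluster feature vectors with K-means and compute three internal metrics.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(features)
    scores = {
        # Silhouette: in [-1, 1], higher means tighter, better-separated clusters.
        "silhouette": silhouette_score(features, labels),
        # Davies-Bouldin: >= 0, lower is better.
        "davies_bouldin": davies_bouldin_score(features, labels),
        # Calinski-Harabasz: > 0, higher is better.
        "calinski_harabasz": calinski_harabasz_score(features, labels),
    }
    return labels, scores


# Synthetic stand-in for image embeddings (assumption: NOT the study's data).
features, _ = make_blobs(n_samples=300, n_features=64, centers=5, random_state=0)
labels, scores = cluster_and_score(features, k=5)
print(scores)
```

With a density-based method such as HDBSCAN, points labeled as noise (label -1) are usually excluded before computing these metrics, since they do not form a cluster. Whether the study did so is not stated in the abstract.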
Copyright (c) 2025 Mohamad Asyqari Anugrah, Yaya Wihardi, Rani Megasari

This work is licensed under a Creative Commons Attribution 4.0 International License.