Detecting Data Leakage in Cloud Storage Using Decision Tree Classification
Abstract
Data leakage in cloud storage systems poses a significant security threat, potentially leading to unauthorized access, loss of sensitive information, and operational disruptions. This research proposes a classification model for detecting potential data leakage incidents using the Decision Tree algorithm. The dataset, obtained from the Kaggle public repository, contains user activity logs representing both normal and anomalous behaviors in cloud storage environments. Several preprocessing steps were applied to improve model quality, including handling missing values, removing outliers, and converting categorical data into numerical form. Hyperparameter optimization was performed using GridSearchCV to determine the best configuration for the Decision Tree classifier. Experimental results demonstrate that the optimized model achieved high classification performance, with an accuracy of 70.84%, a precision of 55% for the data leakage class, and an F1-score of 40%. The analysis also highlights the significance of certain features, such as multi-factor authentication usage and access to confidential data, in predicting potential leakage events. This study provides a theoretical contribution by establishing a robust methodology for applying Decision Tree algorithms to a novel cloud security dataset, offering a scalable and interpretable framework for automated threat detection.
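The tuning and evaluation steps described in the abstract can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' code: the synthetic data from `make_classification` stands in for the Kaggle activity logs, the parameter grid and the `scoring="f1"` choice are assumptions, and the names `param_grid`, `metrics`, and `importances` are hypothetical.

```python
# Illustrative sketch only: synthetic data replaces the Kaggle activity logs,
# and the parameter grid is an assumption, not the grid used in the study.
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, f1_score, precision_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Imbalanced binary problem: class 1 plays the role of "data leakage".
X, y = make_classification(
    n_samples=1000, n_features=8, n_informative=4,
    weights=[0.7, 0.3], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Hypothetical search space over common Decision Tree hyperparameters.
param_grid = {
    "criterion": ["gini", "entropy"],
    "max_depth": [3, 5, 10, None],
    "min_samples_leaf": [1, 5, 10],
}

# Exhaustive grid search with 5-fold cross-validation on the training split.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    scoring="f1",  # assumed objective: F1 on the positive (leakage) class
    cv=5,
)
search.fit(X_train, y_train)

# Evaluate the refitted best tree on the held-out split.
y_pred = search.predict(X_test)
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision_leakage": precision_score(y_test, y_pred, pos_label=1, zero_division=0),
    "f1_leakage": f1_score(y_test, y_pred, pos_label=1, zero_division=0),
}

# Impurity-based importances, one value per input feature.
importances = search.best_estimator_.feature_importances_

print(search.best_params_)
print(metrics)
print(importances)
```

On real logs, the columns of `X` would be the preprocessed activity features (for example, an encoded multi-factor authentication flag), and `importances` is one way to see which of them the fitted tree relies on most.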


Copyright (c) 2025 Journal of Information Systems and Informatics

This work is licensed under a Creative Commons Attribution 4.0 International License.