LoDaPro: Combining Local Detail and Global Projection for Improved Image Quality Assessment Using Efficient-Net and Vision Transformer

  • Peter Sackitey, Kwame Nkrumah University of Science and Technology, Ghana
  • Patrick Sackitey, University of Education, Ghana
Keywords: Image Quality Assessment (IQA), LoDaPro, Efficient-Net, Distortion Detection, Vision Transformers.

Abstract

Image Quality Assessment (IQA) is crucial in fields such as digital imaging and telemedicine, where both intricate local details and overall scene composition shape human perception. Existing methods often prioritize either local or global features, yielding incomplete quality assessments. This paper introduces LoDaPro (Local Detail and Global Projection), a hybrid deep learning framework that integrates EfficientNet for precise local detail extraction with a Vision Transformer (ViT) for comprehensive global context modelling. The resulting balanced feature representation supports a more thorough, human-centered evaluation of image quality. Evaluated on the KonIQ-10k and TID2013 benchmark datasets, LoDaPro attained a validation SRCC of 0.91 and a PLCC of 0.92, exceeding the predictive accuracy of prominent IQA methods. These results demonstrate LoDaPro's capacity to learn the intricate relationship between image content and perceived quality, providing robust and generalizable performance across diverse image quality contexts.
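The abstract describes a dual-branch design: an EfficientNet branch for local detail and a ViT branch for global context, fused into a single quality score. The following is a minimal PyTorch sketch of that idea; the specific backbones (efficientnet_b0, vit_b_16), the concatenation-based fusion, and the small regression head are illustrative assumptions, not the authors' exact configuration.

import torch
import torch.nn as nn
from torchvision import models


class LoDaProSketch(nn.Module):
    """Sketch of a dual-branch IQA model: local detail + global context."""

    def __init__(self):
        super().__init__()
        # Local-detail branch: EfficientNet with its classifier removed.
        eff = models.efficientnet_b0(weights=None)
        eff.classifier = nn.Identity()      # now outputs a 1280-d feature vector
        self.local_branch = eff

        # Global-context branch: ViT with its classification head removed.
        vit = models.vit_b_16(weights=None)
        vit.heads = nn.Identity()           # now outputs the 768-d [CLS] embedding
        self.global_branch = vit

        # Fuse both representations and regress a scalar quality score.
        self.head = nn.Sequential(
            nn.Linear(1280 + 768, 512),
            nn.ReLU(),
            nn.Linear(512, 1),
        )

    def forward(self, x):
        local_feat = self.local_branch(x)    # (B, 1280) fine-grained detail
        global_feat = self.global_branch(x)  # (B, 768) scene-level context
        fused = torch.cat([local_feat, global_feat], dim=1)
        return self.head(fused).squeeze(-1)  # (B,) predicted quality scores


# ViT-B/16 expects 224x224 inputs, so both branches share one resized image.
model = LoDaProSketch()
scores = model(torch.randn(2, 3, 224, 224))  # tensor of shape (2,)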
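The reported SRCC and PLCC are the Spearman rank and Pearson linear correlations between predicted scores and human mean opinion scores (MOS). A short sketch of how these metrics are conventionally computed, using placeholder arrays:

import numpy as np
from scipy.stats import spearmanr, pearsonr

mos = np.array([3.1, 4.5, 2.2, 3.8, 4.9])   # ground-truth opinion scores (illustrative)
pred = np.array([3.0, 4.4, 2.5, 3.6, 4.8])  # model predictions (illustrative)

srcc, _ = spearmanr(pred, mos)  # monotonic (rank-order) agreement
plcc, _ = pearsonr(pred, mos)   # linear agreement
print(f"SRCC={srcc:.3f}, PLCC={plcc:.3f}")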



Published
2025-09-30
How to Cite
Sackitey, P., & Sackitey, P. (2025). LoDaPro: Combining Local Detail and Global Projection for Improved Image Quality Assessment Using Efficient-Net and Vision Transformer. Journal of Information Systems and Informatics, 7(3), 2553-2569. https://doi.org/10.51519/journalisi.v7i3.1186