A Novel Degraded Document Binarization Model through Vision Transformer Network
References
Gatos, 2009, ICDAR 2009 document image binarization contest, 1375
Nafchi, 2014, Phase-based binarization of ancient document images: Model and applications, IEEE Trans. Image Process., 23, 2916, 10.1109/TIP.2014.2322451
Hedjam, 2015, Influence of color-to-gray conversion on the performance of document image binarization: Toward a novel optimization problem, IEEE Trans. Image Process., 24, 3637, 10.1109/TIP.2015.2442923
Otsu, 1979, A threshold selection method from gray-level histograms, IEEE Trans. Syst. Man Cybern., 9, 62, 10.1109/TSMC.1979.4310076
Bardozzo, 2021, Sugeno integral generalization applied to improve adaptive image binarization, Inf. Fusion, 68, 37, 10.1016/j.inffus.2020.10.020
Hammouda, 2008, Distributed collaborative web document clustering using cluster keyphrase summaries, Inf. Fusion, 9, 465, 10.1016/j.inffus.2006.12.001
Bhunia, 2020, Indic handwritten script identification using offline-online multi-modal deep network, Inf. Fusion, 57, 1, 10.1016/j.inffus.2019.10.010
Sauvola, 2000, Adaptive document image binarization, Pattern Recognit., 33, 225, 10.1016/S0031-3203(99)00055-2
Wolf, 2002, Text localization, enhancement and binarization in multimedia documents, 1037
Su, 2013, Robust document image binarization technique for degraded document images, IEEE Trans. Image Process., 22, 1408, 10.1109/TIP.2012.2231089
S. Lu, B. Su, C.L. Tan, Document image binarization using background estimation and stroke edges, Int. J. Doc. Anal. Recognit. (IJDAR), 2010, pp. 303–314.
Jia, 2018, Degraded document image binarization using structural symmetry of strokes, Pattern Recognit., 74, 225, 10.1016/j.patcog.2017.09.032
Lelore, 2013, FAIR: a fast algorithm for document image restoration, IEEE Trans. Pattern Anal. Mach. Intell., 35, 2039, 10.1109/TPAMI.2013.63
Mitianoudis, 2015, Document image binarization using local features and Gaussian mixture modeling, Image Vis. Comput., 38, 33, 10.1016/j.imavis.2015.04.003
N.R. Howe, Document binarization with automatic parameter tuning, Int. J. Doc. Anal. Recognit. (IJDAR), 2013, pp. 247–258.
Bhowmik, 2018, GiB: A game theory inspired binarization technique for degraded document images, IEEE Trans. Image Process., 28, 1443, 10.1109/TIP.2018.2878959
Salehani, 2020, MSdB-NMF: MultiSpectral document image binarization framework via non-negative matrix factorization approach, IEEE Trans. Image Process., 29, 9099, 10.1109/TIP.2020.3023613
Guo, 2019, Nonlinear edge-preserving diffusion with adaptive source for document images binarization, Appl. Math. Comput., 351, 8
Du, 2021, Nonlinear diffusion equation with selective source for binarization of degraded document images, Appl. Math. Model., 99, 243, 10.1016/j.apm.2021.06.023
Zhang, 2020, Selective diffusion involving reaction for binarization of bleed-through document images, Appl. Math. Model., 81, 844, 10.1016/j.apm.2020.01.020
Rabelo, 2011, A multi-layer perceptron approach to threshold documents with complex background, 2523
M.Z. Afzal, J. Pastor-Pellicer, F. Shafait, T.M. Breuel, A. Dengel, M. Liwicki, Document image binarization using LSTM: A sequence learning approach, in: 2015 International Workshop on Historical Document Imaging and Processing, 2015, pp. 79–84.
Westphal, 2018, Document image binarization using recurrent neural networks, 263
Pastor-Pellicer, 2015, Insights on the use of convolutional neural networks for document image binarization, 115
Tensmeyer, 2017, Document image binarization with fully convolutional neural networks, 99
Peng, 2017, Using convolutional encoder–decoder for document image binarization, 708
Vo, 2018, Binarization of degraded document images based on hierarchical deep supervised network, Pattern Recognit., 74, 568, 10.1016/j.patcog.2017.08.025
Ayyalasomayajula, 2018, PDNET: Semantic segmentation integrated with a primal–dual network for document binarization, Pattern Recognit. Lett., 121, 52, 10.1016/j.patrec.2018.05.011
He, 2019, DeepOtsu: Document enhancement and binarization using iterative deep learning, Pattern Recognit., 91, 379, 10.1016/j.patcog.2019.01.025
Calvo-Zaragoza, 2019, A selectional auto-encoder approach for document image binarization, Pattern Recognit., 86, 37, 10.1016/j.patcog.2018.08.011
Huang, 2020, Binarization of degraded document images with global-local U-Nets, Optik, 203, 10.1016/j.ijleo.2019.164025
He, 2021, CT-Net: Cascade T-shape deep fusion networks for document binarization, Pattern Recognit., 118, 10.1016/j.patcog.2021.108010
Kang, 2021, Complex image processing with less data-document image binarization by integrating multiple pre-trained U-Net modules, Pattern Recognit., 109, 10.1016/j.patcog.2020.107577
Souibgui, 2022, DE-GAN: A conditional generative adversarial network for document enhancement, IEEE Trans. Pattern Anal. Mach. Intell., 44, 1180, 10.1109/TPAMI.2020.3022406
Zhao, 2019, Document image binarization with cascaded generators of conditional generative adversarial networks, Pattern Recognit., 96, 10.1016/j.patcog.2019.106968
De, 2020, Document image binarization using dual discriminator generative adversarial networks, IEEE Signal Process. Lett., 27, 1090, 10.1109/LSP.2020.3003828
Kumar, 2021, UDBNET: Unsupervised document binarization network via adversarial game, 7817
Jemni, 2022, Enhance to read better: a multi-task adversarial network for handwritten document image enhancement, Pattern Recognit., 123
Peng, 2019, Document binarization via multi-resolutional attention model with DRD loss, 45
Guo, 2020, Multi-scale multi-attention network for moiré document image binarization, Signal Process., Image Commun., 90
Castellanos, 2021, Unsupervised neural domain adaptation for document image binarization, Pattern Recognit., 119, 10.1016/j.patcog.2021.108099
F.J. Castellanos, A.J. Gallego, J. Calvo-Zaragoza, Unsupervised domain adaptation for document analysis of music score images, in: 2021 International Society for Music Information Retrieval Conference, 2021, pp. 81–87.
Ronneberger, 2015, U-Net: Convolutional networks for biomedical image segmentation, 234
A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, N. Houlsby, An image is worth 16x16 words: Transformers for image recognition at scale, in: 2021 International Conference on Learning Representations, 2021.
Yang, 2021, Orthogonal nonnegative matrix factorization using a novel deep autoencoder network, Knowl.-Based Syst., 227, 10.1016/j.knosys.2021.107236
Yang, 2021, A novel deep quantile matrix completion model for top-n recommendation, Knowl.-Based Syst., 228, 10.1016/j.knosys.2021.107302
P. Isola, J.Y. Zhu, T. Zhou, A.A. Efros, Image-to-image translation with conditional adversarial networks, in: 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 1125–1134.
Suh, 2022, Two-stage generative adversarial networks for binarization of color document images, Pattern Recognit., 130, 10.1016/j.patcog.2022.108810
Ganin, 2016, Domain-adversarial training of neural networks, J. Mach. Learn. Res., 17, 2030
Basu, 2020, U-Net versus Pix2Pix: a comparative study on degraded document image binarization, J. Electron. Imaging, 29, 10.1117/1.JEI.29.6.063019
Vaswani, 2017, Attention is all you need, 5998
S. Zheng, J. Lu, H. Zhao, X. Zhu, Z. Luo, Y. Wang, Y. Fu, J. Feng, T. Xiang, P.H.S. Torr, L. Zhang, Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers, in: 2021 IEEE Conference on Computer Vision and Pattern Recognition, 2021, pp. 6881–6890.
Chen, 2021
S. Li, X. Sui, X. Luo, X. Xu, Y. Liu, R.S.M. Goh, Medical image segmentation using squeeze-and-expansion transformers, in: 2021 International Joint Conference on Artificial Intelligence, 2021, pp. 807–815.
Atienza, 2021, Vision transformer for fast and efficient scene text recognition, 319
Z. Raisi, M.A. Naiel, G. Younes, S. Wardell, J.S. Zelek, Transformer-based text detection in the wild, in: 2021 IEEE Conference on Computer Vision and Pattern Recognition, 2021, pp. 3162–3171.
Carion, 2020, End-to-end object detection with transformers, 213
Han, 2023, A survey on vision transformer, IEEE Trans. Pattern Anal. Mach. Intell., 45, 87, 10.1109/TPAMI.2022.3152247
K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
Pratikakis, 2010, H-DIBCO 2010-handwritten document image binarization competition, 727
I. Pratikakis, B. Gatos, K. Ntirogiannis, ICDAR 2011 document image binarization contest, in: 2011 International Conference on Document Analysis and Recognition, 2011, pp. 1506–1510.
Pratikakis, 2012, ICFHR 2012 competition on handwritten document image binarization, 817
I. Pratikakis, B. Gatos, K. Ntirogiannis, ICDAR 2013 document image binarization contest, in: 2013 International Conference on Document Analysis and Recognition, 2013, pp. 1471–1476.
Ntirogiannis, 2014, ICFHR 2014 competition on handwritten document image binarization, 809
Pratikakis, 2016, ICFHR 2016 handwritten document image binarization contest, 619
I. Pratikakis, K. Zagoris, G. Barlas, B. Gatos, ICDAR 2017 competition on document image binarization, in: 2017 International Conference on Document Analysis and Recognition, 2017, pp. 1395–1403.
Pratikakis, 2018, ICFHR 2018 competition on handwritten document image binarization, 489
Bera, 2021, A non-parametric binarization method based on ensemble of clustering algorithms, Multimedia Tools Appl., 80, 7653, 10.1007/s11042-020-09836-z
F. Deng, Z. Wu, Z. Lu, M.S. Brown, BinarizationShop: a user-assisted software suite for converting old documents to black-and-white, in: 2010 Annual Joint Conference on Digital Libraries, 2010, pp. 255–258.
H.Z. Nafchi, S.M. Ayatollahi, R.F. Moghaddam, M. Cheriet, An efficient ground truthing tool for binarization of historical manuscripts, in: 2013 International Conference on Document Analysis and Recognition, 2013, pp. 807–811.
R. Hedjam, H.Z. Nafchi, R.F. Moghaddam, M. Kalacska, M. Cheriet, ICDAR 2015 contest on multispectral text extraction (MS-TEx 2015), in: 2015 International Conference on Document Analysis and Recognition, 2015, pp. 1181–1185.
R. Mondal, D. Chakraborty, B. Chanda, Learning 2D morphological network for old document image binarization, in: 2019 International Conference on Document Analysis and Recognition, 2019, pp. 65–70.
Chattopadhay, 2018, Grad-CAM++: Generalized gradient-based visual explanations for deep convolutional networks, 839