A comprehensive analysis of the diverse aspects inherent to image data stream classification

Knowledge and Information Systems - Volume 64 - Pages 2215-2238 - 2022
Mateus C. de Lima¹, Yan Stivaletti e Souza¹, Elaine R. Faria¹, Maria Camila N. Barioni¹
¹ Faculdade de Computação, Universidade Federal de Uberlândia (UFU), Uberlândia, Brazil

Abstract

Image data stream classification poses several challenges, including the evolution of the concepts of known classes (concept drift) and the emergence of new classes (open set). Many studies on image data stream classification focus on the classifier but leave other important issues unexplored, such as evaluation methods specific to data stream scenarios, the evolution of the image feature descriptor, and the updating of the decision model under the characteristics of real application environments. This article aims to help close these gaps through an experimental study that applies a new evaluation method for image stream classification and examines the important issues connected to this task. To this end, algorithms from the literature were evaluated to identify how their performance degrades in real-world scenarios. The experiments explored the refinement of the feature descriptor and the updating of the model in the presence of concept drift and open set, as well as the use of latency and active learning strategies. The results show that the more realistic the experimental conditions, the greater the degradation of the results.
