Regularized Simple Graph Convolution (SGC) for improved interpretability of large datasets
Abstract
Classification of data points that correspond to complex entities, such as people or journal articles, is an ongoing research task. Notable applications include recommendation systems that predict customer behavior from features or past purchases and, in academia, the labeling of relevant research papers to reduce the reading time required. Many features can be extracted, and the resulting large datasets are a challenge to process with complex machine learning methodologies. There is also the question of how the results are presented and how the parameterizations can be interpreted beyond the classification accuracies. This work shows how the network information contained in an adjacency matrix allows improved classification of entities through their associations, and how the framework of the SGC provides an expressive and fast approach. The proposed regularized SGC incorporates shrinkage upon three different aspects of the projection vectors, namely the number of parameters, the size of the parameters, and the directions between the vectors, to produce more meaningful interpretations.
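As a rough illustration of the ideas summarized above, the following minimal sketch propagates node features through a normalized adjacency matrix, as in SGC, and evaluates a three-part shrinkage penalty on the projection vectors. It is not the authors' implementation. The function names, the weights `l1`, `l2` and `ortho`, and in particular the choice of an L1 term for the number of parameters, a squared L2 term for their size, and a squared off-diagonal Gram term for the directions between vectors are all assumptions made for this example.

```python
import numpy as np


def sgc_features(adj, features, k=2):
    """Return S^k X, where S = D^{-1/2} (A + I) D^{-1/2} (hypothetical helper)."""
    a_tilde = adj + np.eye(adj.shape[0])        # add self-loops
    deg = a_tilde.sum(axis=1)                   # degrees including self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetric normalization via broadcasting over rows and columns.
    s = a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    out = features
    for _ in range(k):                          # k propagation steps
        out = s @ out
    return out


def shrinkage_penalty(theta, l1=0.0, l2=0.0, ortho=0.0):
    """Assumed three-part penalty; columns of theta are the projection vectors."""
    sparsity = np.abs(theta).sum()              # encourages fewer nonzero parameters
    size = (theta ** 2).sum()                   # encourages smaller parameters
    gram = theta.T @ theta                      # inner products between vectors
    off_diag = gram - np.diag(np.diag(gram))
    overlap = (off_diag ** 2).sum()             # encourages distinct directions
    return l1 * sparsity + l2 * size + ortho * overlap


# Toy usage: a 3-node path graph with one-hot features and 2 classes.
adj = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])
x = np.eye(3)
theta = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [0.0, 0.0]])
logits = sgc_features(adj, x, k=2) @ theta      # linear classifier on S^k X
penalty = shrinkage_penalty(theta, l1=1.0, l2=1.0, ortho=1.0)
```

In a training loop, a penalty of this kind would be added to the classification loss, so that the fitted projection vectors are sparser, smaller and less overlapping, and therefore easier to inspect.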