Automated extraction of attributes from natural language attribute-based access control (ABAC) Policies

Cybersecurity - Volume 2 - Pages 1-25 - 2019
Manar Alohaly1,2, Hassan Takabi1, Eduardo Blanco1
1Department of Computer Science and Engineering, University of North Texas, Denton, USA
2College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia

Abstract

The National Institute of Standards and Technology (NIST) has identified natural language policies as the preferred expression of policy and implicitly called for an automated translation of ABAC natural language access control policies (NLACPs) to a machine-readable form. To study the automation process, we take the hierarchical ABAC model as our reference model, since it better reflects the requirements of real-world organizations. This paper therefore focuses on two questions: how can we automatically infer the hierarchical structure of an ABAC model given NLACPs, and how can we extract and define the set of authorization attributes based on the resulting structure? To address these questions, we propose an approach built upon recent advancements in natural language processing and machine learning techniques. For such a solution, the lack of appropriate data often poses a bottleneck. We therefore separate the primary contributions of this work into: (1) developing a practical framework to extract the authorization attributes of a hierarchical ABAC system from natural language artifacts, and (2) generating a set of realistic synthetic NLACPs to evaluate the proposed framework. Our experimental results are promising: on average, we achieved an F1-score of 0.96 when extracting the attribute values of subjects and 0.91 when extracting the attribute values of objects from natural language access control policies.
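
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an F1-score can be computed for one policy sentence by comparing extracted attribute values against a gold annotation. The function name `f1_score` and the example attribute values are hypothetical and chosen only for illustration; the abstract does not describe the paper's evaluation code or how scores are averaged.

```python
def f1_score(predicted, gold):
    """Return the F1-score of a set of predicted values against a gold set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        # Covers empty inputs and the no-overlap case without dividing by zero.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical subject attribute values for a single policy sentence.
gold_values = {"senior", "registered", "full-time"}
predicted_values = {"senior", "registered", "part-time"}

score = f1_score(predicted_values, gold_values)
print(f"F1 = {score:.2f}")
```

Here two of the three predicted values match the gold annotation, so precision and recall are both 2/3, and so is their harmonic mean.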
