Automated system for construction specification review using natural language processing

Advanced Engineering Informatics - Volume 51 - Page 101495 - 2022
Seonghyeon Moon1,2, Gitaek Lee1, Seokho Chi1,2
1Department of Civil and Environmental Engineering, Seoul National University, Seoul 08826, Republic of Korea
2Institute of Construction and Environmental Engineering, Seoul National University, Seoul, 08826, Republic of Korea
