Big data actionable intelligence architecture

Journal of Big Data - Volume 7 - Pages 1-19 - 2020
Tian J. Ma (1,2), Rudy J. Garcia (1,2), Forest Danford (1,2), Laura Patrizi (1,2), Jennifer Galasso (1,2), Jason Loyd (1,2)
(1) Sandia National Laboratories, Albuquerque, USA
(2) Livermore, USA

Abstract

The amount of data produced by sensors, social and digital media, and the Internet of Things (IoT) is increasing rapidly each day. Decision makers often must sift through a sea of Big Data, drawing on information from a variety of sources, to determine a course of action. This can be a very difficult and time-consuming task. The information from any given source can be redundant, conflicting, or incomplete. For near-real-time applications, there is insufficient time for a human to interpret all the information from the different sources. In this project, we developed a near-real-time, data-agnostic software architecture that can use several disparate sources to autonomously generate Actionable Intelligence with a human in the loop. We demonstrated our solution on a traffic prediction exemplar problem.
