Big data actionable intelligence architecture
Abstract
The amount of data produced by sensors, social and digital media, and the Internet of Things (IoT) is increasing rapidly each day. Decision makers often need to sift through a sea of Big Data and combine information from a variety of sources to determine a course of action. This can be a very difficult and time-consuming task. The information from any given data source can be redundant, conflicting, and/or incomplete. For near-real-time applications, there is insufficient time for a human to interpret all the information from the different sources. In this project, we developed a near-real-time, data-agnostic software architecture that can use several disparate sources to autonomously generate Actionable Intelligence with a human in the loop. We demonstrated our solution on a traffic prediction exemplar problem.