Hybrid Cloud Approaches for Large-Scale Medicaid Data Engineering Using AWS and Hadoop
DOI:
https://doi.org/10.63282/3050-9246.IJETCSIT-V3I1P103
Keywords:
Hybrid Cloud, Medicaid Data, AWS, Hadoop, Data Engineering, Big Data Processing, ETL Pipelines, Data Security, Compliance, Scalability, Distributed Computing, Data Storage, Cloud Computing, Healthcare Analytics, Hybrid Architecture, Real-Time Data Processing, Batch Processing, Data Lakes, Data Warehouses, Cost Optimization
Abstract
Medicaid programs generate large volumes of complex, sensitive data that require scalable, secure, and fast processing. Conventional on-premises infrastructure often struggles with the volume and variety of this data, which makes cloud-based solutions increasingly attractive. This paper examines a hybrid cloud model that pairs Hadoop's distributed computing capabilities with the flexibility and managed services of AWS. By combining on-premises Hadoop clusters with AWS services such as S3, EMR, and Redshift, organizations can balance cost efficiency, performance, and adherence to strict regulatory requirements. The findings highlight major challenges, including security concerns, data transfer latency, and interoperability between cloud and on-premises systems. We present best practices for improving data ingestion, storage, and processing while ensuring that governance and access policies are maintained. Using real-world Medicaid data scenarios, this study shows how hybrid architectures improve data analytics, reporting, and machine learning capabilities, enabling faster insights and better decision-making. By combining the reliability of Hadoop with the agility of AWS, organizations can preserve their existing investments while improving their Medicaid data infrastructure. This approach helps meet the growing need for real-time data processing, enhanced security, and affordable scalability, and thereby supports better healthcare outcomes.