ETL Techniques for Structured and Unstructured Data
DOI:
https://doi.org/10.63282/3050-9246.IJETCSIT-V3I4P102
Keywords:
ETL (Extract, Transform, Load), Structured Data, Unstructured Data, Data Integration, Data Transformation, Data Extraction, Data Loading
Abstract
ETL (Extract, Transform, Load) is among the most important concepts in data management, particularly as organizations grow in both the volume and the variety of data they need to operate. This paper compares ETL methodologies for structured and unstructured data, examining their differences, the problems each presents, and recommendations for addressing them. Structured data is organized in a tabular form of rows and columns, whereas unstructured data lacks such a format. The evaluation emphasizes the tools required to extract data from multiple sources, the transformations needed to make the data consistent and of good quality, and the methods used to load it into the target system. Drawing on real-world cases and examples, and commenting on trends in the evolution of ETL and prospects for further advances, the paper attempts to describe the field comprehensively. Managers, data analysts, engineers, and IT specialists concerned with extending the use of diverse data for business purposes will find this research useful.