Data Centralisation (ETL)

Extract-Transform-Load (ETL) using leading technologies (Python, Hadoop, JavaScript, Tableau, etc.)

Data is extracted from unformatted or unstructured sources (which are not optimised for analytics) and moved to a central host for analysis. The exact steps in that process differ from one ETL tool to the next, but the end result is the same. The basic ETL process encompasses data extraction, transformation, and loading. While the abbreviation implies a neat, three-step process – extract, transform, load – this simple definition doesn’t capture:

  • The movement of data

  • The overlap between each of these stages

  • How any new technologies are changing this flow
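The three stages above can be sketched in a few lines of Python. This is a minimal, hypothetical example (the CSV source, table name, and column names are invented for illustration): a loosely formatted source is extracted, cleaned into consistent types, and loaded into SQLite standing in for the central analytics host.

```python
import csv
import io
import sqlite3

# Hypothetical raw source: CSV text that is not optimised for analytics
# (inconsistent whitespace and casing).
RAW_CSV = """order_id,amount,currency
1, 19.99 ,usd
2, 5.00 ,USD
3, 12.50 ,usd
"""

def extract(raw_text):
    """Extract: read rows out of the loosely formatted source."""
    return list(csv.DictReader(io.StringIO(raw_text)))

def transform(rows):
    """Transform: coerce types and normalise values for analysis."""
    return [
        (int(r["order_id"]),
         float(r["amount"].strip()),
         r["currency"].strip().upper())
        for r in rows
    ]

def load(records, conn):
    """Load: write the cleaned records into the central host."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER, amount REAL, currency TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
```

Even in this toy pipeline the overlap between stages is visible: the `strip()` calls could arguably live in either extraction or transformation, which is exactly why the neat three-step picture breaks down in practice.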

Traditional ETL process

Historically, the ETL process has followed a rigid sequence: data is extracted from source systems, transformed in a dedicated staging area, and then loaded into an OLAP warehouse, where it is pre-aggregated into summaries for reporting.

However, data is now frequently analysed in raw form rather than from preloaded OLAP summaries. This has led to the development of lightweight, flexible, and transparent ETL systems, whose processes are outlined below:
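One way to picture the lightweight approach is an "extract-load-transform" sketch, in which raw records are landed untouched and only shaped at query time. This is a hypothetical illustration (the event payloads and table name are invented), not a description of any particular tool:

```python
import json
import sqlite3

# Hypothetical raw event payloads, landed as-is so analysts can
# query the raw form directly instead of a preloaded OLAP summary.
RAW_EVENTS = [
    '{"user": "a", "action": "click", "ms": 120}',
    '{"user": "b", "action": "view",  "ms": 300}',
    '{"user": "a", "action": "click", "ms": 180}',
]

conn = sqlite3.connect(":memory:")

# Load comes first: store the untouched payloads in a raw staging table.
conn.execute("CREATE TABLE raw_events (payload TEXT)")
conn.executemany(
    "INSERT INTO raw_events VALUES (?)",
    [(e,) for e in RAW_EVENTS],
)

# Transform on read: parse the raw JSON only at analysis time.
rows = conn.execute("SELECT payload FROM raw_events").fetchall()
clicks = sum(1 for (p,) in rows if json.loads(p)["action"] == "click")
```

Because the raw payloads are kept, the transformation logic can change without re-extracting anything, which is the transparency and flexibility the newer systems aim for.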