Activity 2: Designing an extraction, loading and transformation strategy 06/12/2023 – 13/12/2023

Material
SQL Data Manipulation Language

Objective: The research question is how to transform raw data to adapt to a database relational schema and load data to populate it.

Theory background: 

  • The notion of consistency and a consistent database
  • Extraction, loading and transformation (ETL) processes [PDF]
  • Database (consistent) state

ToDo

  1. Using the SQL data definition language, create a script to insert some example tuples in the database. Remember to respect a specific order according to how you defined foreign keys. Otherwise, you will have errors. Query your schema if necessary, as seen in Activity 1.
  2. Propose an algorithm in pseudo-code for transforming raw data into your schema. Input is raw data, and output is an SQL script with the insertion commands.
  3. Implement and test your algorithm. 
  4. Run the resulting script on the DBMS for populating your database schema.

Deliverables

  • SQL script for inserting tuples in the relational database implemented in Activity 1 06/12/2023
  • Program transforming raw data into instances (tuples) of your relational schema -13/12/2023
  • Test on the created database 13/12/2023