Hands-On

Some of the following hands-on exercises are adapted from L. Igual and S. Seguí, Introduction to Data Science: A Python Approach to Concepts, Techniques and Applications, Undergraduate Topics in Computer Science Series, Springer, 2017.

Material

  • Create a Kaggle account: https://www.kaggle.com
  • Log in to your Kaggle account (https://www.kaggle.com/) and follow the instructions given in class.
  • Prepare your Kaggle environment following the instructions here
  • Have a Google account to use Colab: https://colab.research.google.com

Activities

  1. Getting started with the data science ecosystem [HO-1]
  2. Creating a metadata repository on Apache Atlas [HO-2a] [HO-2b]
  3. Identifying outliers with classification techniques [HO-3]