HO-1: Getting acquainted with notebooks for developing data science projects

Objective

  1. Understand the different tools available in a WIDE for developing data science projects.
  2. Recall basic concepts of Descriptive Statistics and apply them to data exploration tasks.

Material

ToDo

  1. Create a project on Azure Notebooks for the work of this lecture, using the material available on GitHub.
  2. Have a look at the notebook that implements the analysis for Understanding the Gender Divide at Work.
  3. Go through the notebook while executing it, and note how descriptive analytics concepts are used to explore the data.
    • Explain the use of statistical measures for understanding data collections.
    • Explain the importance of computing data distributions for the attributes of the data sets.
    • Explain why it is important to observe outliers in the analysis. How do you technically detect outliers?
    • What is the strategy for measuring the relative risk of early promotion? How did you process the data to evaluate this risk?
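
As a reminder of the concepts named in the questions above, here is a minimal sketch using only the Python standard library. It is not the notebook's own code: the data values, the function names (summarize, iqr_outliers, relative_risk), and the choice of Tukey's 1.5 × IQR rule for outliers are all assumptions made for illustration. The notebook may use different libraries and a different strategy.

```python
import statistics

# Hypothetical toy data (made up for illustration only):
# years until first promotion for a small group of employees.
years_to_promotion = [2.0, 2.5, 3.0, 3.0, 3.5, 4.0, 4.5, 12.0]


def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule).

    This is one common convention for detecting outliers, assumed here.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def relative_risk(events_a, total_a, events_b, total_b):
    """Risk of the event in group A divided by the risk in group B."""
    return (events_a / total_a) / (events_b / total_b)


stats = summarize(years_to_promotion)
outliers = iqr_outliers(years_to_promotion)

# Hypothetical counts: 30 of 100 people in group A and 15 of 100 in
# group B were promoted early.
rr = relative_risk(30, 100, 15, 100)

print(stats)
print("outliers:", outliers)
print("relative risk:", rr)
```

In this toy data the mean is pulled well above the median by a single large value, which is why looking at the distribution and its outliers matters before trusting a summary statistic. A relative risk above 1 means the event is more frequent in group A than in group B.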

To Hand In

  1. In the notebook that you analysed, add a Markdown section at the end of the document in which you explain the use of descriptive analytics in data exploration.
  2. Create a figure of the pipeline you extracted and add it to your notebook.
  3. Send the modified notebook to genoveva.vargas@gmail.com