Some of the following hands-on exercises are modified versions of those proposed in L. Igual and S. Seguí, Introduction to Data Science: A Python Approach to Concepts, Techniques and Applications, Undergraduate Topics in Computer Science Series, Springer, 2017.
1. If you prefer to use your own computer with a self-contained Data Science environment, follow these two steps (requires medium technical skills):
- Download the Anaconda installer that matches your machine and OS.
- Install Anaconda following the instructions for your OS (Windows, macOS).
2. Online, with nothing to install (basic technical skills, recommended):
- Create a Kaggle account at https://www.kaggle.com
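Whichever option you choose, a quick way to confirm that the environment is ready is to check that the usual libraries can be imported. The sketch below is only an illustration: the package list is an assumption (the hands-on may import other libraries), and the helper name `check_environment` is hypothetical.

```python
import importlib.util
import sys

# Assumed package list; adjust it to what the hands-on actually import.
PACKAGES = ["numpy", "pandas", "matplotlib", "sklearn"]


def check_environment(packages):
    """Return a dict mapping each package name to True if it can be imported."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, available in check_environment(PACKAGES).items():
        print(f"{name:12s} {'OK' if available else 'missing'}")
```

Run it in an Anaconda prompt or in a Kaggle notebook cell; any package reported as missing has to be installed before starting the corresponding hands-on.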
Understanding the content of data collections: a quantitative view
Keep in mind that data analytics is about making sense of data, which requires acquiring Data Literacy. Have a look at this reference for background:
Michael Bowen, Anthony Bartley, The Basics of Data Literacy: Helping Your Students (and You!) Make Sense of Data, NSTA Press, Arlington, Virginia
- Getting acquainted with a Data Lab [data]
- First steps into data analytics [Let us be Holmes and find the murderer]
Some of the following hands-on exercises will be done in Python. Here is a memento (quick reference) of the language [PDF].
We are going to use Kaggle to perform this exercise. Access your Kaggle account (https://www.kaggle.com/) and follow the instructions given in class.
Prepare your Kaggle environment following the instructions here
- Getting started with the data science ecosystem [HO-1] Tabular operators Pandas-SQL [CheatSheet]
- Exploring data collections using descriptive statistics [HO-2] [PDF]
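As a warm-up for these two hands-on, the sketch below shows the kind of tabular operators (selection, projection, grouping) and descriptive statistics they deal with. It is not taken from HO-1 or HO-2: the data is invented, and it deliberately uses only the standard library (`sqlite3`, `statistics`) so that it runs anywhere. The Pandas expressions in the comments are the usual equivalents, assuming a DataFrame `df` with the same columns.

```python
import sqlite3
import statistics

# Toy data invented for illustration: (name, city, age)
ROWS = [
    ("Ana", "Lyon", 23),
    ("Ben", "Lyon", 31),
    ("Cleo", "Paris", 27),
    ("Dan", "Paris", 45),
    ("Eve", "Paris", 36),
]


def build_table(rows):
    """Load the rows into an in-memory SQLite table named people."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (name TEXT, city TEXT, age INTEGER)")
    conn.executemany("INSERT INTO people VALUES (?, ?, ?)", rows)
    return conn


def older_than(conn, min_age):
    # Selection + projection.
    # Pandas equivalent: df.loc[df["age"] > min_age, "name"]
    cur = conn.execute(
        "SELECT name FROM people WHERE age > ? ORDER BY name", (min_age,)
    )
    return [name for (name,) in cur]


def mean_age_by_city(conn):
    # Grouping + aggregation.
    # Pandas equivalent: df.groupby("city")["age"].mean()
    cur = conn.execute(
        "SELECT city, AVG(age) FROM people GROUP BY city ORDER BY city"
    )
    return dict(cur.fetchall())


def describe(values):
    """A few descriptive statistics, similar in spirit to Series.describe()."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


conn = build_table(ROWS)
print(older_than(conn, 30))
print(mean_age_by_city(conn))
print(describe([age for _, _, age in ROWS]))
```

The same questions can be asked either in SQL or with Pandas operators; the hands-on compare both styles on real data collections.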
Data analysis using artificial intelligence techniques
Imbalanced Data in Classification Cheat Sheet
- Classification: unsupervised learning, light version [HO-3] [PDF]
- Comparing clustering algorithms, long version [HO-4] (for Data Science and Computing Science lectures)
For lectures aimed at a Data Science or Computing Science audience:
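To give an idea of what a clustering algorithm does before opening the hands-on, here is a minimal pure-Python sketch of k-means (Lloyd's algorithm) on 2-D points. It is an illustration only, not the code of HO-3 or HO-4: the points are invented, the centroids are seeded with the first k points (a simplifying assumption), and `inertia` is just one possible score for comparing clustering results.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means (Lloyd's algorithm) on 2-D points.

    Centroids start at the first k points, a deliberately simple and
    deterministic choice; libraries usually use smarter seeding.
    """
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    (
                        sum(x for x, _ in members) / len(members),
                        sum(y for _, y in members) / len(members),
                    )
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


def inertia(points, labels, centroids):
    """Sum of squared distances to the assigned centroid (lower is tighter)."""
    return sum(
        math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels)
    )


# Two visually separated groups of invented points.
POINTS = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

for k in (1, 2, 3):
    labels, centroids = kmeans(POINTS, k)
    print(k, labels, round(inertia(POINTS, labels, centroids), 3))
```

Comparing the inertia obtained for several values of k is one simple way to discuss how many clusters the data supports; the hands-on go further and compare different algorithms.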
- Supervised learning [HO-7]
- Network analysis: 5 graph operations on social networks [HO-8]
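As a preview of network analysis, the sketch below implements a few common operations on a small friendship graph stored as a dictionary of neighbour sets. Which five operations HO-8 covers is not stated here, so the selection (degree, mutual friends, shortest path, density) is an assumption, and the network itself is invented.

```python
from collections import deque

# Toy undirected friendship network, invented for illustration.
EDGES = [
    ("ana", "ben"),
    ("ana", "cleo"),
    ("ben", "cleo"),
    ("cleo", "dan"),
    ("dan", "eve"),
]


def build_graph(edges):
    """Build an adjacency dictionary: node -> set of neighbours."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree(graph, node):
    """Number of direct connections of a node."""
    return len(graph[node])


def mutual_friends(graph, a, b):
    """Nodes connected to both a and b."""
    return graph[a] & graph[b]


def shortest_path(graph, start, goal):
    """Breadth-first search; returns a list of nodes, or None if unreachable."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk back through the predecessors to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(graph[node]):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


def density(graph):
    """Share of possible edges that are present (undirected graph)."""
    n = len(graph)
    m = sum(len(neighbours) for neighbours in graph.values()) // 2
    return 2 * m / (n * (n - 1)) if n > 1 else 0.0


graph = build_graph(EDGES)
print(degree(graph, "cleo"))
print(mutual_friends(graph, "ana", "ben"))
print(shortest_path(graph, "ana", "eve"))
print(density(graph))
```

Graph libraries provide these operations ready-made; writing them once by hand helps to see what they compute.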
Prediction using inferential statistics
Towards Data Analytics at Scale
- https://github.com/jbsneto-ppgsc-ufrn/spark-tutorial
- Azure Machine Learning Gallery https://gallery.azure.ai