**Introduction** [PDF]

- Data-centric sciences: principles & common aspects
- Digital data collections: characteristics & properties
- Data science: big data, data analytics algorithms & tools

**From centralized to high scale WIDES, data science laboratories to artificial intelligence studios** [PDF]

- In-house data analytics environments: Jupyter
- Targeting large scale: Zeppelin, Spark [PDF]
- Data science virtual machines: cloud solutions
- Data science labs: Colab, Kaggle, Azure Notebooks
- Artificial intelligence execution environments & studios: TensorFlow, Caffe, Azure ML Studio

**Designing experiment environments**

- Data labs: data collections, quality, & profiling
- Architectural settings: from in-house to large-scale experiments
- Parallel execution platforms & environments
- Multi-core programming
- Externalising computation

**Data engineering** [PDF] [Data]

a. Data formats, transformations, distribution

b. Studying data quality

c. Statistical properties

d. Techniques for adjusting data and building data samples

e. An overview of applied mathematics for machine learning

**Designing data science pipelines** [PDF]

- Linear regression
- Simple linear regression
- Multiple linear regression & polynomial regression
- Sparse model

- Logistic regression
- Supervised & unsupervised learning projects
- Learning curves
- Training, validation & test
- Two learning models
- Support vector machines
- Random forest

- Clustering
- Assessing clustering: metrics
- Techniques taxonomy

- Graph processing: network science
- Background on networks and graphs
- Graph operations
- Similarity & distances
