- Introduction [PDF]
- Data-centric sciences: principles & common aspects
- Digital data collections: characteristics & properties
- Data science: big data, data analytics algorithms & tools
- From centralized to high-scale WIDES, data science laboratories to artificial intelligence studios [PDF]
- In-house data analytics environments: Jupyter
- Targeting large scale: Zeppelin, Spark [PDF]
- Data science virtual machines: cloud solutions
- Data science labs: CoLab, Kaggle, Azure Notebooks
- Artificial intelligence execution environments & studios: TensorFlow, Caffe, Azure ML Studio
- Designing experiment environments
- Data labs: data collections, quality, & profiling
- Architectural settings: from in-house to large-scale experiments
- Parallel execution platforms & environments
- Multi-core programming
- Externalising computation
- Data engineering [PDF] [Data]
a. Data formats, transformations, distribution
b. Studying data quality
c. Statistical properties
d. Techniques for adjusting data and building data samples
e. An overview of applied mathematics for machine learning
- Designing data science pipelines [PDF]
- Linear regression
- Simple linear regression
- Multiple linear regression & polynomial regression
- Sparse models
- Logistic regression
- Supervised & unsupervised learning projects
- Learning curves
- Training, validation & test
- Two learning models
- Support vector machines
- Random forest
- Clustering
- Assessing clustering: metrics
- Techniques taxonomy
- Graph processing: network science
- Background on networks and graphs
- Graph operations
- Similarity & distances