Complete syllabus here: LIS-4102
- Introduction: dealing with data at scale [slides] [YouTube] [YouTube-2]
- Datafication and data properties
- Data-centric applications at scale
- Computing centres: hardware and resources delivery
- Distributed data management and storage
- Cluster-based data stores [slides] [YouTube] [YouTube-2]
- [MongoExamples] [slides] [slides-2] [slides-3]
- Graph databases [slides] [YouTube]
- Cypher [YouTube]
- [Polyglot UseCase]
- Non-functional properties: concurrency, eventual consistency, …
- Distributed archival systems [slides]
- Distributed File Systems
- Data Labs
- Data Lakes
- Big data processing and analysis: parallel programming models [YouTube] [YouTube] [YouTube] [YouTube] [YouTube] [YouTube] [YouTube]
- MapReduce: families of algorithms and patterns [slides – part A] [slides – part B] [Glossary]
- Data flow-based models: operators, data representation, management [slides-part A] [slides-part B]
- Spark programming use case [exercise]
- Ecosystems for massive data management and processing
- High-performance architectures: cluster, HPC, cloud, fog, edge, just-in-time architectures [slides]