NoSQL data stores: expressing queries using MapReduce
- Downloading CouchDB: http://couchdb.apache.org [cURL for Windows]
- Building a document database: using CouchDB [Ex-1] [Ex1-answers]
- Querying a document database [Ex-2] [answers on request]
Sharding data for load balancing and availability
- Sharding MongoDB
- Exercise [Ex1-2Do2Handin] [cities]
- Mongo reference guide [MongoDB-sharding-guide]
Data sanitation with Pig
- Installing Pig
- Dealing with network behavior data collections [pdf]
Data analytics with Hadoop
- Environment: Hadoop on Hortonworks
- Counting words and other summarization challenges [AllData]
- Counting words: first approach [pdf] [WordCount Example]
- Counting with combiners as an optimization: understanding some principles of the MapReduce model [pdf] [MapReduce-book-final] [code examples]
- Some interesting MapReduce patterns: see the challenges section [patterns reference]
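
To give a feel for the word-counting topic and the combiner optimization listed above, here is a minimal sketch in plain Python. It is not Hadoop code and is not taken from the course material: the function names (map_phase, combine, reduce_phase, word_count) and the idea of modelling each input split as a list of lines are assumptions made for illustration only. It simulates a map step, an optional per-split combiner, and a reduce step, and reports how many pairs would cross the shuffle.

```python
from collections import Counter, defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated word in the line."""
    return [(word.lower(), 1) for word in line.split()]


def combine(pairs):
    """Pre-aggregate the pairs produced on one split (the combiner step)."""
    local = Counter()
    for word, count in pairs:
        local[word] += count
    return list(local.items())


def reduce_phase(pairs):
    """Sum the counts of every word across all splits."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def word_count(splits, use_combiner=True):
    """Return (word totals, number of pairs sent through the shuffle).

    `splits` is a list of input splits, each a list of text lines
    (a hypothetical stand-in for the blocks one mapper would read).
    """
    shuffled = []
    for split in splits:
        pairs = [pair for line in split for pair in map_phase(line)]
        if use_combiner:
            pairs = combine(pairs)
        shuffled.extend(pairs)
    return reduce_phase(shuffled), len(shuffled)


if __name__ == "__main__":
    demo = [["a b a", "b a"], ["a c"]]
    print(word_count(demo, use_combiner=False))
    print(word_count(demo, use_combiner=True))
```

The totals are the same with or without the combiner; only the number of intermediate pairs changes. That works here because addition is associative and commutative, which is the usual condition for reusing reduce logic as a combiner.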