Challenges

Installing Hadoop in

  1. CH-1: Declarative approach for map-reduce
    1. First steps on counting: counting Facebook contacts using Pig [ pdf ]
  2. CH-2: Counting words and other summarization challenges
    1. Counting with some optimizations using combiners: understanding some principles of the map reduce model [ pdf ]
    2. Hints on the first question [ pdf ]
  3. CH-3: More intensive summarization 
    1. Median and standard deviation [ pdf ]
    2. Inverted index summarizations [ pdf ]
  4. CH-4: Filtering patterns: Everyone has to program challenge 4b, and each remaining challenge can be addressed by at most two groups. Hurry to choose your challenge and announce your choice!
    1. Filtering [ pdf ]
    2. Bloom [ pdf ]
    3. Top ten [ pdf ]
    4. Distinct [ pdf ]
  5. CH-5: Data organization patterns: Choose one of the following challenges. Each one can be assigned to only one person/group. For your chosen challenge, propose an alternative practical example where the pattern you implemented can apply.
    1. Structured to hierarchical [ pdf ]
    2. Partitioning [ pdf ]
    3. Binning [ pdf ]
    4. Total order sorting [ pdf ]
    5. Shuffling [ pdf ]
  6. CH-6: Join patterns: Choose one of the following challenges. Each one, except number 1, can be assigned to only one person/group. For your chosen challenge, propose an alternative practical example where the pattern you implemented can apply.
    1. Reduce side join classic and with bloom filter [ pdf ]
    2. Replicated join [ pdf ]
    3. Composite join [ pdf ]
    4. Cartesian product [ pdf ]