Find the “murderer” using data analytics

This exercise was prepared by Prof. Géraldine Castel, University Grenoble Alpes, ILCEA4 Lab, for the THATCamp event (http://ugainvalence2018.thatcamp.org).

Objective

Get an intuitive feel for how statistics and other language-processing tools can be used to analyse text and discover knowledge.

Material

ToDo

Get some extra points for the final exam!

Write your answers and observations in a Word (or equivalent) document and send it to genoveva.vargas@imag.fr.

You can work individually or in a team. The file name must include your name or the names of all team members. The same information must also appear in an Authors section inside the document.

Submission period: Monday 8th October to Monday 15th October 2018

  1. Click on Upload and upload the Doyle-Baskervilles file. Then click on Reveal.
  2. Hover over the menu above the word cloud to find the button Define options for this tool.
  3. In the Stopwords section, choose English, then Edit List. Add to the list the words in the cloud that do not help you understand the plot. Then click on Confirm. Look at the cloud again and work out where the story takes place and who the main characters are.
  4. Type ‘Baskerville’ in the box in the bottom-right corner of the screen and, if necessary, select it from the pop-up list. Then click on Contexts.
  5. What more can you find out about the plot and the characters? Double-clicking a line shows a wider context, and the full extract appears in the central panel.
  6. Then go to Trends, just above. Type ‘Holmes’ in the box underneath the graph. What do you notice?
  7. Now delete ‘Holmes’ from the box and instead, type the names Stapleton, Barrymore and Mortimer.
  8. Looking at the graph, which suspect is the first to be eliminated by the investigators? Who do you think is the murderer in the end?
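
For the curious: the three views used in the steps above (word cloud with stopwords, Contexts, Trends) can be approximated in a few lines of Python. This is only a minimal sketch under stated assumptions, not how the tool itself is implemented: the function names (tokenize, top_words, contexts, trends), the tiny stopword list and the sample sentences are all made up for illustration, and the real tool very likely tokenizes and segments the text differently.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; a real English list is much longer).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "on", "at",
             "was", "is", "it", "that", "he", "his", "with"}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def top_words(tokens, stopwords=STOPWORDS, n=10):
    """Word-cloud data: the n most frequent tokens that are not stopwords."""
    return Counter(t for t in tokens if t not in stopwords).most_common(n)

def contexts(tokens, keyword, width=5):
    """Keyword-in-context lines: up to `width` tokens on each side of every hit."""
    keyword = keyword.lower()
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}")
    return lines

def trends(tokens, keyword, segments=10):
    """Relative frequency of `keyword` in consecutive slices of the text.

    The text is cut into at most `segments` slices of equal size
    (the last slice may be shorter).
    """
    keyword = keyword.lower()
    size = max(1, -(-len(tokens) // segments))  # ceiling division
    result = []
    for start in range(0, len(tokens), size):
        chunk = tokens[start:start + size]
        result.append(chunk.count(keyword) / len(chunk))
    return result

# Made-up sample sentences standing in for the uploaded file.
sample = (
    "Sir Charles Baskerville was found dead on the moor. "
    "Holmes sent Watson to Baskerville Hall. "
    "Stapleton walked on the moor with the hound."
)
tokens = tokenize(sample)

print(top_words(tokens, n=5))
for line in contexts(tokens, "Baskerville", width=2):
    print(line)
print(trends(tokens, "Holmes", segments=2))
```

To try it on the full novel, read the file into a string and pass it to tokenize in place of the sample. A name whose relative frequency drops to zero in the later slices corresponds to a line that falls flat in the Trends graph.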