MongoDB creating and querying document databases

Objective

Get acquainted with the design and querying of document databases using MongoDB. Mastering and having a critical view of documents databases collections, attribute types and references. Understanding the querying expression principles and then having a look to performance issues.

Material

  • MongoDB server
  • Restaurants database [download]

Preparing the hands on environment

  • Connecting to the MongoServer
  • Loading data in the database

ToDo: Querying the database

It is up to you now. All data is loaded. You will now query the database you just created.

Question 1: List all the restaurants in the collection, sorted in increasing order of names.

Question 2:  List all restaurants with “Italian” cuisine and display the name, zip code and geographic coordinates for each of them. Also, make sure the answer is ordered according to the sorting key (increasing postcode, decreasing name).

Question 3: List all Italian restaurants with the postcode “10075” for which the telephone number is provided in the database. Display their name, zip code and phone number.

Question 4: Find all restaurants with at least a score of 50.

Question 5:  List all restaurants that are either Italian or have the postcode “10075”.

Question 6:  List all restaurants with a zip code of “10075” or “10098” with either Italian or American cuisine with a score of at least 50.

Question 7: List all restaurants with at least one score concerning customer service (C)price (P) or quality (Q). Simply display the names, cuisine and zip code.

Updates

Question 8: Change the type of cuisine in the restaurant “Juni” to “American (new)”. In addition, record the date and time of the system in a “lastModified” field at the time the change is made. If there are several restaurants with the same name, only the first one must be modified.

Question 9: Change the address of the restaurant whose id is “41156888” to “East 31st Street”.

Question 10: Change the type of cuisine of all restaurants with a postcode “10016” and the type of cuisine “Other”into “Cuisine to be determined”. In addition, record the date and time of the system in a “lastModified”field at the time the change is made.

Question 11: Replace all information about the restaurant with an ID “41154403” with the following information:

            "name" : "Vella 2",
     "address" : {
       "city" : "1480",
       "street" : "2 Avenue",
       "zipcode" : "10075"
     }

Complex querying (aggregation)

Question 12: List the types of cuisine represented in the database. For each one, display the number of associated restaurants. Order the result by decreasing number of restaurants.

Question 13: Show, for each zip code, the number of Italian restaurants with this zip code. Order the result by decreasing number of restaurants.

Question 14: Consider Italian restaurants whose identifier (restaurant_id) is higher or equal to “41205309” and has an “averagePrice” attribute. Calculate the average of these average prices. Then repeat the same operation by calculating the average by zip code.

References between collections and joins

All this is very interesting, but our collection of restaurants is very lonely now. We will now work with several collections, linked together by references (in the manner of referential integrity constraints, or foreign keys, in relational databases).

Question 15: Create a new collection called “comments” in the same database.

Question 16: Insert three documents into the previously created collection, using the following pattern:

Some useful details:

  • Restaurant identifiers must match existing restaurants in the restaurants collection ;
  • You need to provide feedback from different customers, and for different restaurants;
  • The type attribute can take only the “positive” or “negative” values.

Question 17: List all the comments in your database. Each comment must also contain all the information about the restaurant to which it relates.

Question 18: Insert seven other documents into the comments collection, following the pattern described above, and following these rules:

  • Restaurant identifiers must match existing restaurants in the restaurants collection ;
  • At least one of the restaurants must have several comments;
  • At least one of the customers must have commented several times;
  • At least one of the customers must have commented several times on the same restaurant;
  • The type attribute can take only the “positive” or “negative” values.

Question 19: Find the list of restaurants with reviews, and display only the name and comment list for each restaurant. Several strategies are possible.

Extra work: Indexing

Let us now deal with the efficiency aspects of our database. We will create indexes to optimize access to our restaurant collection.

Question 20: Create an increasing index on the cuisine attribute of the restaurants collection.

Question 21: Create another index for the restaurants collection, consisting of the cuisine attribute (increasing) and the zipcode attribute (decreasing).

Question 22: List all indexes created on the restaurants collection.

Question 23: Use the explain method to display the execution plan for the query that returns all Italian restaurants. What information is provided by the system?

Question 24: Same question but adding the parameter “executionStats” in the explain method.

Question 25: Drop the two indexes you previously created and then re-display the statistics on the query execution plan that returns all Italian restaurants. What do you see?

This exercice has been proposed by Prof. Christophe Bobineau, ENSIMAG, Grenoble INP, France