Polyglot Data Management on the Cloud

MyNet Use Case – 1

Context

To evaluate the course on Cloud Computing and Big Data (LIS 4102), you will address use cases for which you have to propose a technical specification of a possible solution. The specification should be complete and technical enough to let development engineers implement it. The objective is to end the course with a book of cloud services specifications and a final solution for the proposed project MyNet. The book can show how you can use some of your technical skills for building systems, and it might help complete your resume.

General description

Integrated Social Data aims to develop an application that integrates a user's contact information and posts from several social networks. The application provides a global view of this information and querying functions that give the user different views of the posts. For instance, a user can organize the posts by the geographical region of their authors or by period, count the posts submitted by a specific contact or contact group, or retrieve the posts that include images published by a particular contact on a given date.
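The example views listed above can be sketched in Python as a minimal illustration. The `Post` fields, the sample data and the function names are assumptions made for this sketch; they are not part of the MyNet specification.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Post:
    """Hypothetical integrated post record (field names are assumptions)."""
    author: str
    region: str      # geographical region of the author
    network: str     # social network the post comes from
    published: date
    has_image: bool


# Made-up sample data, for illustration only.
POSTS = [
    Post("ana", "Europe", "Twitter", date(2021, 3, 1), True),
    Post("ana", "Europe", "Facebook", date(2021, 3, 1), False),
    Post("ben", "Americas", "LinkedIn", date(2021, 3, 2), True),
    Post("ana", "Europe", "Twitter", date(2021, 4, 5), True),
]


def group_by_region(posts):
    """Organize the posts by the geographical region of their authors."""
    groups = defaultdict(list)
    for post in posts:
        groups[post.region].append(post)
    return dict(groups)


def posts_in_period(posts, start, end):
    """Keep the posts published between start and end (both inclusive)."""
    return [p for p in posts if start <= p.published <= end]


def count_by_author(posts):
    """Count the posts submitted by each contact."""
    return Counter(p.author for p in posts)


def image_posts(posts, author, day):
    """Retrieve the posts with images published by a contact on a given date."""
    return [p for p in posts
            if p.author == author and p.published == day and p.has_image]
```

Each function corresponds to one of the example views; in a polyglot solution the same operations would be delegated to the underlying stores rather than computed in memory.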

The user will also have a global view of her contacts’ network: contacts can be related to each other by the relation “is contact”, but they can also be connected by a shared property that is inherent to their profiles or expressed explicitly by the user (e.g., family, colleagues, acquaintances, classmates).
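One possible way to represent such a network is a graph with labelled edges. The sketch below is an assumption-laden illustration: the class `ContactGraph`, its methods and the sample contacts are hypothetical, and a real solution might rely on a graph store instead.

```python
from collections import defaultdict


class ContactGraph:
    """Undirected graph whose edges carry a relation label (hypothetical design)."""

    def __init__(self):
        # (person, label) -> set of persons related to `person` by `label`
        self._edges = defaultdict(set)

    def relate(self, a, b, label="is contact"):
        """Record a symmetric relation between two persons."""
        self._edges[(a, label)].add(b)
        self._edges[(b, label)].add(a)

    def related(self, person, label="is contact"):
        """Return the persons related to `person` by the given label."""
        return set(self._edges.get((person, label), set()))

    def labels_between(self, a, b):
        """Return every relation label that connects a and b."""
        return {label for (person, label), others in self._edges.items()
                if person == a and b in others}


# Made-up example network.
graph = ContactGraph()
graph.relate("user", "ana")                # plain "is contact" relation
graph.relate("user", "ana", "family")      # relation expressed by the user
graph.relate("user", "ben", "colleagues")  # shared profile property
graph.relate("ana", "ben")
```

Here `graph.related("ana")` returns the contacts of `ana`, and `graph.labels_between("user", "ana")` returns both the "is contact" and the "family" labels.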

The objective is to design and specify the application MyNet, which will integrate users' social network information and store relevant posts or contents. These data can be organized by topic or date, and the content will retain information about authorship and provenance so that it can be shared on the web.

Integrated Social Data is asking you to develop its application MyNet for integrating and managing social networks data. 

Use case 1 topics

  • Polyglot persistence in a multi-cloud environment: building a multi-database on cluster-based data management stores
  • Sharding databases on the cloud
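As a reminder of what sharding involves, the following minimal sketch routes keys to shards with a hash function. The shard names and the `ShardedStore` class are assumptions for illustration; each shard is an in-memory dictionary standing in for a database node on some cloud.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index with a stable hash (same key, same shard)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedStore:
    """Key-value store split across several shards (hypothetical design)."""

    def __init__(self, shard_names):
        self.shard_names = list(shard_names)
        # One dictionary per shard; a real deployment would use remote stores.
        self.shards = {name: {} for name in self.shard_names}

    def _shard(self, key):
        return self.shard_names[shard_for(key, len(self.shard_names))]

    def put(self, key, value):
        self.shards[self._shard(key)][key] = value

    def get(self, key):
        return self.shards[self._shard(key)].get(key)


# Hypothetical shard names, one per cloud provider.
store = ShardedStore(["cloud-a", "cloud-b", "cloud-c"])
store.put("contact:ana", {"name": "Ana"})
```

Modulo-based routing is the simplest option, but it moves most keys when the number of shards changes; consistent hashing is a common alternative when shards are added or removed often.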

To Do

Design and specify a cloud-aware polyglot database solution based on services. We rely on your ability to imagine an original app that would use this solution and provide an integrated global view of a person's social network contacts. Attention! The person wants to see an integrated set of contacts, but she/he also wants to know whether each one is a private, furtive or professional contact.

The solution must be simple, but it must have the following characteristics:

  1. Construct a polyglot database, with an ETL process that populates it by interacting with existing data services (e.g., LinkedIn, Facebook, Twitter, Deezer, Tinder, TikTok, Snapchat).
  2. Deploy the polyglot database on several clouds.
  3. Export services that give access to it for manipulating and querying data.
  4. Define a global interface to explore (query) the polyglot database. How can global queries work on the different stores composing your polyglot store? Include the operations to be processed such that subqueries on the stores of your solution can use:
       • Selection, projection style queries
       • Aggregation and group by style queries
       • Join or correlation style queries
  5. Design and specify a consolidation strategy when data are updated on the polyglot database. What are the BASE guarantees that your solution ensures? How do the internal processes of the system ensure them?
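To make the idea of a global query over several stores concrete, here is a minimal sketch of a mediator that combines subqueries. Everything in it is an assumption for illustration: the two in-memory structures stand in for a document store and a key-value store, and the field names and functions are hypothetical.

```python
from collections import Counter

# Stand-in for a document store holding posts (made-up data).
DOCUMENT_STORE = [
    {"post_id": 1, "author_id": "c1", "network": "Twitter", "text": "hello"},
    {"post_id": 2, "author_id": "c2", "network": "LinkedIn", "text": "hiring"},
    {"post_id": 3, "author_id": "c1", "network": "Facebook", "text": "photo"},
]

# Stand-in for a key-value store holding contacts (made-up data).
CONTACT_STORE = {
    "c1": {"name": "Ana", "category": "private"},
    "c2": {"name": "Ben", "category": "professional"},
}


def select_project(rows, predicate, fields):
    """Selection (predicate) followed by projection (fields)."""
    return [{f: row[f] for f in fields} for row in rows if predicate(row)]


def count_group_by(rows, field):
    """Aggregation: count the rows per distinct value of `field`."""
    return dict(Counter(row[field] for row in rows))


def join_posts_with_contacts(posts, contacts):
    """Join: look up each post's author in the contact store."""
    joined = []
    for post in posts:
        contact = contacts.get(post["author_id"])
        if contact is not None:
            joined.append({**post, **contact})
    return joined


def posts_per_category():
    """Global query: number of posts per contact category.

    Subquery 1 runs on the document store, subquery 2 is a key lookup
    on the contact store, and the mediator joins and aggregates.
    """
    posts = select_project(DOCUMENT_STORE, lambda r: True,
                           ["post_id", "author_id"])
    joined = join_posts_with_contacts(posts, CONTACT_STORE)
    return count_group_by(joined, "category")
```

In this sketch the join and the aggregation run in the mediator because neither store can see the other's data; pushing selections and projections down to each store keeps the amount of transferred data small.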

To Hand In

Share in your public space a document with the specification. The document must:

  • Characterize the polyglot database using UML, justify the choice of the models used for storing the data, and describe the ETL process (UML activity or sequence diagram) used for feeding it and the functional architecture.
  • Explain which services can be specified (API and functional logic) for accessing the polyglot database and how they can be deployed on one or several clouds (functional architecture as studied).
  • Describe the architecture of the application that can use your polyglot solution.
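The ETL process mentioned in the deliverables could follow the extract, transform, load steps sketched below. This is only an illustration under stated assumptions: the raw records, their field names (`urn`, `fullName`, `handle`, `display`) and the `Contact` type are invented for the sketch and do not reflect the actual APIs of any social network.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contact:
    """Hypothetical unified contact record."""
    contact_id: str
    name: str
    network: str


def extract(records):
    """Extract: a real pipeline would call a data service here.
    In this sketch `records` is simply an in-memory list (assumption)."""
    return list(records)


def transform(raw, network, id_field, name_field):
    """Transform: map a network-specific record to the unified model."""
    return Contact(
        contact_id=f"{network}:{raw[id_field]}",
        name=raw[name_field].strip().title(),
        network=network,
    )


def load(contacts, store):
    """Load: write the unified contacts into the target store (a dict here)."""
    for contact in contacts:
        store[contact.contact_id] = contact
    return store


def run_etl(sources, store):
    """Run extract, transform and load for every configured source."""
    for network, (records, id_field, name_field) in sources.items():
        raw_records = extract(records)
        load((transform(r, network, id_field, name_field)
              for r in raw_records), store)
    return store


# Made-up sources: network -> (raw records, id field, name field).
SOURCES = {
    "linkedin": ([{"urn": "42", "fullName": " ana lopez "}], "urn", "fullName"),
    "twitter": ([{"handle": "ben", "display": "ben ito"}], "handle", "display"),
}
```

Calling `run_etl(SOURCES, {})` returns a dictionary keyed by identifiers such as `linkedin:42`; prefixing the identifier with the network name keeps track of the provenance of each contact.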

Expected quality

The report must be well and logically organised. Grammar and spelling must be as correct as possible. Do not hesitate to use Grammarly or similar systems for verifying English. Consider sentence structure (subject verb object), and avoid adjectives, adverbs, and excessive connectors.

Diversity and Inclusion

Consider using inclusive language in your report (even if it sometimes takes more characters to write).

++ Inclusion and diversity in writing https://dbdni.github.io/pages/inclusivewriting.html 

Use suitable fonts and font sizes. If you use images or icons, including human characters, make sure that you avoid gender, race, and socioeconomic stereotypes. Also, consider the size, clarity, and quality of the images.