PROPOSED BY GENOVEVA VARGAS SOLAR
TECHNICAL ASSISTANCE: JUAN CARLOS CASTREJÓN
Given a data collection gathered from different social networks and stored in NoSQL systems (Neo4J and Mongo), possibly under a strategy combining sharding and replication techniques, extend the UnQL pivot query language, considering the following:
- Data processing operators adapted to query different data models (graph, document). Example: a query spanning Neo4J and Mongo. What about Join, Union …
- Assuming concurrent CRUD operations on the stores, can you expect query results to be consistent? How can you tag your results, or implement a sharding strategy, in order to determine whether results are consistent?
- Querying data represented in different models: how can you exploit the structure of the different stores when expressing queries? Should you provide adapted operators, or give generic operators and then rewrite queries?
- Polyglot solutions normally tend to solve some data processing issues in the application code. This can be costly. Discuss the challenges that must be addressed to ensure that your queries scale as the collection grows.
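To make the questions above concrete, here is a minimal Python sketch of generic Union and Join operators over rows coming from two stores, with each result tagged by the stores and read times it depends on. Everything in it is an assumption made for illustration: the names `Tagged`, `scan`, `union`, `hash_join` and `read_skew` are hypothetical and are not part of UnQL or of the provided code base, and plain in-memory lists stand in for the Neo4J and Mongo stores.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Tagged:
    """A row plus its provenance: the stores it was read from, and when."""
    row: Dict[str, Any]
    reads: Tuple[Tuple[str, float], ...]  # (store name, read timestamp in seconds)


def scan(store: str, rows: Iterable[Dict[str, Any]], read_at: float) -> List[Tagged]:
    """Wrap raw rows from one store (hypothetical stand-in for a real driver call)."""
    return [Tagged(dict(r), ((store, read_at),)) for r in rows]


def union(left: List[Tagged], right: List[Tagged]) -> List[Tagged]:
    """Generic union: rows keep their own shape and their own tags."""
    return list(left) + list(right)


def hash_join(left: List[Tagged], right: List[Tagged],
              left_key: str, right_key: str) -> List[Tagged]:
    """Generic equi-join: build a hash index on the right input, probe with the left."""
    index: Dict[Any, List[Tagged]] = {}
    for r in right:
        index.setdefault(r.row[right_key], []).append(r)
    out: List[Tagged] = []
    for l in left:
        for r in index.get(l.row[left_key], []):
            # The joined row depends on both reads, so it carries both tags.
            out.append(Tagged({**l.row, **r.row}, l.reads + r.reads))
    return out


def read_skew(t: Tagged) -> float:
    """Time gap between the oldest and newest read behind a result row."""
    stamps = [ts for _, ts in t.reads]
    return max(stamps) - min(stamps)


# Illustrative data only: graph edges flattened to rows, and documents.
graph_rows = [{"user": "ana", "follows": "luis"}, {"user": "luis", "follows": "ana"}]
doc_rows = [{"author": "ana", "text": "hello"}, {"author": "eva", "text": "hi"}]

g = scan("neo4j", graph_rows, read_at=100.0)
d = scan("mongo", doc_rows, read_at=100.5)
joined = hash_join(g, d, "user", "author")
everything = union(g, d)
```

A caller could treat a joined row whose `read_skew` exceeds some chosen bound as possibly inconsistent under concurrent CRUD operations. This is only one way of tagging results; it does not by itself guarantee consistency.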
EXPECTED RESULTS
- Give the principle of your proposal through a partial programming solution of the operators of your UnQL extension. Detail the query evaluation process if you want your solution to scale.
- We ask you to sketch the solution on the polyglot database that we provide, consisting of Mongo and Neo4J stores.
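As one illustration of why the evaluation process matters for scaling, the following Python sketch contrasts filtering in the application code with pushing the filter down to each shard. It is a sketch under stated assumptions, not the expected solution: `FakeShard`, `filter_in_application` and `filter_pushed_down` are hypothetical names, and the in-memory shard only counts how many rows it hands back instead of talking to a real store.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


class FakeShard:
    """Hypothetical in-memory stand-in for one shard of a store."""

    def __init__(self, rows: List[Row]) -> None:
        self.rows = rows
        self.transferred = 0  # rows shipped from the shard to the application

    def find(self, field: Optional[str] = None, value: Any = None) -> List[Row]:
        """Return all rows, or only those where row[field] == value."""
        hits = [r for r in self.rows if field is None or r.get(field) == value]
        self.transferred += len(hits)
        return hits


def filter_in_application(shards: List[FakeShard], field: str, value: Any) -> List[Row]:
    """Fetch everything, then filter in application code."""
    rows = [r for s in shards for r in s.find()]
    return [r for r in rows if r.get(field) == value]


def filter_pushed_down(shards: List[FakeShard], field: str, value: Any) -> List[Row]:
    """Send the predicate to each shard so only matching rows are shipped."""
    return [r for s in shards for r in s.find(field, value)]


def make_shards() -> List[FakeShard]:
    """Build two small illustrative shards."""
    return [
        FakeShard([{"author": "ana", "n": 1}, {"author": "eva", "n": 2}]),
        FakeShard([{"author": "ana", "n": 3}, {"author": "luis", "n": 4}]),
    ]


app_shards = make_shards()
app_result = filter_in_application(app_shards, "author", "ana")
app_cost = sum(s.transferred for s in app_shards)

push_shards = make_shards()
push_result = filter_pushed_down(push_shards, "author", "ana")
push_cost = sum(s.transferred for s in push_shards)
```

Both strategies return the same rows, but the application-side version ships every row from every shard, so its transfer cost grows with the collection rather than with the size of the answer.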
TECHNICAL SUPPORT: AVOID STARTING FROM SCRATCH
- Use our existing code base as the starting point for your solution: https://github.com/jccastrejon/edbt-unql
- Technical requirements: VMware player 5