Polyglot meets Xperanto | Big Data Fest

PROPOSED BY GENOVEVA VARGAS SOLAR

TECHNICAL ASSISTANCE: JUAN CARLOS CASTREJÓN

Given a data collection drawn from different social networks and stored on NoSQL systems (Neo4j and MongoDB), possibly following a strategy that combines sharding and replication, extend the UnQL pivot query language by considering:

Data processing operators adapted to querying different data models (graph, document). For example, what does a query spanning Neo4j and MongoDB look like, and what about Join, Union, …?
Assuming concurrent CRUD operations on the stores, can you expect query results to be consistent? How can you tag your results, or implement a sharding strategy, to determine whether results are consistent?
Querying data represented in different models: how can you exploit the structure of the different stores when expressing queries? Do you provide adapted operators, or give generic operators and then rewrite queries?
Polyglot solutions normally tend to solve some data processing issues in the application code, which can be costly. Discuss the challenges to address to ensure that your queries scale as the collection grows.
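As a rough illustration of the questions above, the sketch below shows one way generic Join and Union operators could work over records from a graph store and a document store once both are normalized to flat dictionaries, with each result tagged by the versions it was read at. Everything here is an assumption made for illustration: the sample records, the field names (`user`, `_version`), and the functions `hash_join`, `union` and `is_consistent` are hypothetical and are not part of UnQL or of the provided code base. In-memory lists stand in for Neo4j and MongoDB query results.

```python
from collections import defaultdict

# In-memory stand-ins for query results (hypothetical data).
# Each record carries a `_version` tag: the store snapshot it was read from.
graph_rows = [  # e.g. flattened "follows" edges from the graph store
    {"user": "ana", "follows": "bob", "_version": 7},
    {"user": "bob", "follows": "eve", "_version": 7},
]
doc_rows = [  # e.g. profile documents from the document store
    {"user": "ana", "city": "Lyon", "_version": 3},
    {"user": "bob", "city": "Puebla", "_version": 4},
]


def hash_join(left, right, key):
    """Equi-join two record lists on `key`, keeping both version tags."""
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    out = []
    for l in left:
        for r in index.get(l[key], []):
            # Merge the two records, dropping the per-store version field.
            merged = {k: v for k, v in {**r, **l}.items() if k != "_version"}
            # Tag the result with (left version, right version).
            merged["_versions"] = (l["_version"], r["_version"])
            out.append(merged)
    return out


def union(left, right, key):
    """Sorted set union of the `key` values found in either source."""
    return sorted({row[key] for row in left} | {row[key] for row in right})


def is_consistent(row, expected):
    """True if the result was read at the expected snapshot versions."""
    return row["_versions"] == expected


joined = hash_join(graph_rows, doc_rows, "user")
```

Under these assumptions, a caller that knows which snapshot versions it expects can filter `joined` with `is_consistent` and discard or re-read the rows that were affected by a concurrent write. Note that this join runs in application code, which is precisely the scaling concern raised above.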

EXPECTED RESULTS

Present the principle of your proposal through a partial programming solution for the operators of your UnQL extension, and detail the query evaluation process if you want your solution to scale.
We ask you to sketch the solution on the polyglot database that we provide, which consists of MongoDB and Neo4j stores.
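One aspect of a query evaluation process that scales is deciding where a selection is evaluated. The sketch below contrasts filtering in application code with pushing the predicate down to the store, and counts how many rows each plan ships to the application. It is a minimal sketch under stated assumptions: the `Store` class, its `scan` method and the two plan functions are hypothetical, the rows live in memory, and nothing here reflects the actual API of MongoDB, Neo4j or the provided code base.

```python
class Store:
    """Hypothetical wrapper around one data store; rows live in memory here."""

    def __init__(self, rows):
        self.rows = rows
        self.transferred = 0  # rows shipped to the application so far

    def scan(self, predicate=None):
        # With a predicate, filtering happens "inside" the store,
        # so only matching rows count as transferred.
        result = [r for r in self.rows if predicate is None or predicate(r)]
        self.transferred += len(result)
        return result


def naive_plan(store, predicate):
    """Fetch everything, then filter in application code."""
    return [r for r in store.scan() if predicate(r)]


def pushdown_plan(store, predicate):
    """Let the store evaluate the predicate before shipping rows."""
    return store.scan(predicate)


def in_lyon(row):
    return row["city"] == "Lyon"


# Hypothetical collection: 1000 users, 10 of them in Lyon.
rows = [
    {"user": f"u{i}", "city": "Lyon" if i % 100 == 0 else "Other"}
    for i in range(1000)
]

naive_store, pushed_store = Store(rows), Store(rows)
naive_result = naive_plan(naive_store, in_lyon)
pushed_result = pushdown_plan(pushed_store, in_lyon)
```

Both plans return the same rows, but the naive plan transfers the whole collection while the pushdown plan transfers only the matches, and that gap widens as the collection grows.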

TECHNICAL SUPPORT: AVOID STARTING FROM SCRATCH

Build your solution on our base code, available at https://github.com/jccastrejon/edbt-unql
Technical requirements: VMware Player 5