LATAM Faculty Summit 2012 @ Riviera Maya, Mexico

Three main trending topics were addressed at this summit (http://research.microsoft.com/en-US/events/latamfacsum2012/agenda.aspx): (1) human-device interaction, (2) data and information management from different perspectives, and (3) applications. Here are comments on two lectures, full of wisdom, that offered promising and exciting visions of database research:

Information Management via CrowdSourcing, Hector Garcia-Molina, Professor, Stanford University, United States

Following the wisdom of the crowd, connected and available in all sorts of forums and social networks, it is possible to ask people to help answer questions (i.e., queries): Which is the best seafood restaurant in Riviera Maya? People willing to perform simple tasks, sometimes in exchange for a few cents, can take part in the search for an answer, for instance by providing their opinion, information or collective knowledge.

The very simple principle for answering this question is to ask the crowd and then apply a strategy for classifying the opinions and deciding which answer is acceptable. Another possibility is to combine the crowd’s answers with other information coming from classic databases, search engines or other data providers and then build an answer (e.g., a ranked set of recommendations).
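The two strategies described above can be illustrated with a minimal sketch. It is not the method presented in the talk: the function names, the majority threshold, the 0 to 5 rating scale and the blending weight are all assumptions chosen for illustration. The first function picks an answer by simple majority vote; the second blends the crowd's vote share with ratings from a hypothetical classic database to produce a ranked list.

```python
from collections import Counter


def aggregate_votes(votes, min_votes=3, min_share=0.5):
    """Return the majority answer, or None when there is no clear winner.

    min_votes and min_share are illustrative thresholds, not values
    taken from the talk.
    """
    if len(votes) < min_votes:
        return None
    answer, count = Counter(votes).most_common(1)[0]
    # Accept the top answer only if it holds a strict majority share.
    return answer if count / len(votes) > min_share else None


def rank_with_database(votes, db_ratings, crowd_weight=0.5):
    """Rank candidates by blending crowd vote share with database ratings.

    db_ratings maps a candidate name to a rating assumed to lie in [0, 5].
    """
    counts = Counter(votes)
    total = len(votes)
    scores = {}
    for name in set(counts) | set(db_ratings):
        crowd_share = counts[name] / total if total else 0.0
        db_score = db_ratings.get(name, 0.0) / 5.0  # normalise to [0, 1]
        scores[name] = crowd_weight * crowd_share + (1 - crowd_weight) * db_score
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda n: (-scores[n], n))


# Hypothetical crowd answers and database ratings.
votes = ["A", "A", "B", "A", "C"]
db_ratings = {"A": 4.0, "B": 5.0, "D": 3.0}
best = aggregate_votes(votes)
ranking = rank_with_database(votes, db_ratings)
```

A real system would also have to weigh the reliability of each contributor and the cost of asking more people, which is where the research challenges listed next come in.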

Interesting research challenges must be addressed if query evaluation is “crowdsourced”: (a) the classification of the answers, for instance by considering their provenance, and the generation of opinion trends; (b) the combination with information from more classic data providers, for instance by redefining some classic relational operators (e.g., join) or maybe defining new ones; (c) answering queries efficiently (query optimisation) even though answers can arrive continuously over long periods of time; and certainly others, if the problem is studied in its full complexity …

Data-Intensive Discoveries in Science: the Fourth Paradigm, Alex Szalay, Alumni Centennial Professor, Department of Physics and Astronomy, The Johns Hopkins University, United States

Long before the emergence of the buzzwords “Big Data” and “Open Data”, data experts met scientists and decided to build huge databases and make them available to the community through front ends. Perhaps the best-known examples are the SDSS project (http://skyserver.sdss.org/) and the WorldWide Telescope (http://www.worldwidetelescope.org/).

Thanks to these ambitious projects, the notion of the Internet scientist emerged, and such scientists started making discoveries by exploiting these databases[1]. The surprise is that non-scientists also started accessing these data as a hobby or for learning purposes. The project, initially led by Jim Gray, has grown and today touches other areas, such as biology, cancer research, the social and human sciences and even HPC observation (!). The scientific community, old and young researchers alike, certainly has a role to play in populating and exploiting these democratized deposits of potential knowledge.

[1] The term “Internet scientist” is borrowed from A. Szalay, who used it in his keynote presentation.