After lively discussions sparked by the emergence of the buzzword “Big Data”, it seems that industry and academia have reached a shared understanding of the properties of data (volume, velocity, variety, veracity and value), the resources and know-how it requires, and the opportunities it opens. Indeed, new applications promising fundamental changes in society, industry and science include face recognition, machine translation, digital assistants, self-driving cars, ad-serving, chat-bots, personalized healthcare, smart industry and more.
The first lesson of the era of “Big Data” is that it is possible to access and exploit representative “samples” of available data collections, thanks to the availability of the resources needed to store them and to run resource-hungry processing tasks on them. The second lesson is that computer science and mathematics must build synergies with other sciences in order to exploit this newly available “value”. The consequence is the emergence of “new” data-centric sciences: data science, digital humanities, social data science, network science and computational science. These sciences, with their new requirements and challenges, call for revisiting, from new perspectives, the foundations of databases, artificial intelligence and the other disciplines used to address them.
This novel, multidisciplinary and data-centric scientific movement promises new and not yet imagined applications that rely on massive amounts of evolving data, which need to be cleaned, integrated and analysed for modelling purposes. Yet data management issues are not usually perceived as central.
This lecture explores the key challenges and opportunities for data management in this new scientific world, and discusses how data-centric communities can best contribute to these exciting domains. Even if the motivation is not purely academic, the large sums of money being devoted to related applications are driving industry and academia to explore these directions.