{"id":212,"date":"2018-11-04T22:33:00","date_gmt":"2018-11-04T22:33:00","guid":{"rendered":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/?page_id=212"},"modified":"2019-11-15T20:27:00","modified_gmt":"2019-11-15T20:27:00","slug":"knowledge-control","status":"publish","type":"page","link":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/knowledge-control\/","title":{"rendered":"KNOWLEDGE CONTROL"},"content":{"rendered":"<ul>\n<li>Define the notion of &#8220;Datification&#8221;. In which way is it a revolution with respect to smart environments?<\/li>\n<li>Define the characteristics of data-centric sciences. What is the role of data for them? What are the two components that make them a new generation of experimental sciences?<\/li>\n<li>Define the notion of Big Data. In your opinion, how does this notion open new challenges for data management?<\/li>\n<li>Give 5 properties that characterise Big Data. Explain in which way they are challenging for managing data.<\/li>\n<li>In the case of your domain of expertise, how does Big Data open novel possibilities or problems\/challenges?<\/li>\n<li>In terms of multi-databases used for storing data collections, what are the challenges related to query rewriting in such a setting? Design an example of data collections stemming from a smart building or a smart city quarter that can be stored in different databases and that are then queried in the spirit of distributed queries.<\/li>\n<li style=\"list-style-type: none;\">\n<h3>Data science issues<\/h3>\n<ul>\n<li>Describe the general methodology of data science. What is its objective?<\/li>\n<li>What is a Web IDE? What does IDE stand for? 
What is a notebook?<\/li>\n<li>Give a general description of a Data Science virtual machine.<\/li>\n<li>Give the general functional architecture showing how Azure Notebooks communicates with GitHub and with the Python interpreter in the setting used for experimenting in the lab sessions.<\/li>\n<\/ul>\n<h3>Defining a tabular view of a data collection<\/h3>\n<ul>\n<li>What is a DataFrame? Define a DataFrame that shows the energy consumption readings of home appliances when they are used, according to the following schema:<\/li>\n<\/ul>\n<pre>&lt;applianceName, initialdate, initialhour, finaldate, finalhour, consumedWatts&gt;<\/pre>\n<h3>Manipulating data<\/h3>\n<ul>\n<li>Consider the operations that can be applied on top of tabular data structures like <strong>projection<\/strong> (retrieving a subset of columns\/attributes), <strong>selection<\/strong> (retrieving a subset of records) and <strong>filter<\/strong> (retrieving a subset of records given a condition). Which operators provided by Pandas implement these operations for a DataFrame? What is the result type? Give examples, in particular of the way null values can be filtered.<\/li>\n<\/ul>\n<ul>\n<li>Which aggregation functions can be applied to DataFrames, and what is the role of the parameters axis and inplace often used together with these functions?<\/li>\n<li>What is the form of the expressions for adding columns to a DataFrame? And rows? 
How can rows or columns be deleted?<\/li>\n<li>How can default values be added to attributes containing missing or null values?<\/li>\n<li>Give an example of the use of the groupby() method applied to a DataFrame.<\/li>\n<li>How are the manipulation operators associated with DataFrames related to, and useful for, implementing Data Science processes?<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<h3>Descriptive Statistics<\/h3>\n<ul>\n<li>What is the role of descriptive statistics with regard to the analysis of data collections?<\/li>\n<li>What type of questions can be answered using descriptive statistics? What mathematical tools are used for that?<\/li>\n<li>Which methods are provided by Pandas for getting acquainted with the content of data collections in a quantitative manner?<\/li>\n<li>How is the shape attribute used for analysing data in a DataFrame?<\/li>\n<li>What issues have to be considered in order to be able to apply statistics to raw data collections?<\/li>\n<li>What is the role of the generation of graphics in the application of descriptive statistics for analysing data?<\/li>\n<li>What strategies are used for dealing with dirty data when applying descriptive statistics functions?<\/li>\n<li>Why can it be important to know the distribution of the values of a given attribute in a data analytics process?<\/li>\n<\/ul>\n<h3>Unsupervised learning<\/h3>\n<ul>\n<li>What is unsupervised learning? Explain its general principle.<\/li>\n<li>What type of questions can unsupervised learning methods answer? 
Give examples or use cases.<\/li>\n<li>Describe the general principle of the K-Means clustering algorithm.<\/li>\n<li>Explain which measures can be used for assessing the result of applying such an algorithm to data.<\/li>\n<li>What is the role of visualising the results of the K-Means algorithm applied to a data collection?<\/li>\n<\/ul>\n\n\n<h3 class=\"wp-block-heading\">Inferential statistics<\/h3>\n\n\n\n<ul class=\"wp-block-list\"><li>Explain the principle of linear regression and give an example.<\/li><li>What can linear regression be used for?<\/li><li>Which criteria on the data should be considered when deciding whether linear regression can be applied?<\/li><li>Define a pipeline that gives the general steps to be implemented to solve a prediction problem using linear regression.<\/li><li>What are the scores used for assessing linear regression results?<\/li><li>What does it mean to bootstrap the standard error of the mean?<\/li><li>What are confidence intervals and p-values?<\/li><\/ul>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Define the notion of &ldquo;Datification&rdquo;. In which way is it a revolution with respect to smart environments? Define the characteristics of data-centric sciences. What is the role of data for them? What are the two components that make them a new generation of experimental sciences? Define the notion of Big Data. 
In your opinion [&hellip;]<\/p>\n","protected":false},"author":11,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"page-templates\/full-width.php","meta":{"footnotes":""},"class_list":["post-212","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-json\/wp\/v2\/pages\/212","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-json\/wp\/v2\/users\/11"}],"replies":[{"embeddable":true,"href":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-json\/wp\/v2\/comments?post=212"}],"version-history":[{"count":20,"href":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-json\/wp\/v2\/pages\/212\/revisions"}],"predecessor-version":[{"id":389,"href":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-json\/wp\/v2\/pages\/212\/revisions\/389"}],"wp:attachment":[{"href":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-json\/wp\/v2\/media?parent=212"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}