{"id":11,"date":"2018-09-16T11:00:05","date_gmt":"2018-09-16T11:00:05","guid":{"rendered":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/?page_id=11"},"modified":"2025-12-07T16:35:36","modified_gmt":"2025-12-07T16:35:36","slug":"hands-on","status":"publish","type":"page","link":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/hands-on\/","title":{"rendered":"HANDS-ON-old"},"content":{"rendered":"<p>Some of the following hands-on exercises are modified versions of the ones proposed in L. Igual and S. Segu\u00ed, <em>Introduction to Data Science: A Python approach to concepts, techniques and applications<\/em>, Undergraduate topics in Computer Science Series, Springer, 2017.<\/p>\n<h3>Understanding data collections content: a quantitative vision<\/h3>\n<pre>Keep in mind that performing data analytics means making sense of data, which requires acquiring data literacy. Have a look at this reference for background.<\/pre>\n<p><strong>Michel Bowen, Anthony Bartley, <a href=\"https:\/\/drive.google.com\/file\/d\/117xsuTUIOtad5M6KQ66cUiUBPQ5BFJAq\/view?usp=sharing\">The Basics of Data Literacy: making your students (and you!) make sense of data<\/a>, NSTA Press, Arlington, Virginia<\/strong><\/p>\n<h2><span style=\"color: #339966;\">Digital humanities program<\/span><\/h2>\n<ol>\n<li>Getting acquainted with a Data Lab [<a href=\"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-content\/uploads\/sites\/42\/2020\/09\/archive.zip\">data<\/a>]<\/li>\n<li>First steps into data analytics <a href=\"http:\/\/vargas-solar.com\/data-centric-smart-everything\/find-the-murderer-using-data-analytics\/\" target=\"_blank\" rel=\"noopener noreferrer\">[Let us be Holmes and find the murderer]<\/a><\/li>\n<\/ol>\n<h2><span style=\"color: #339966;\">Engineering master and undergraduate programs stepping into data science and (big) data<\/span><\/h2>\n<h3>1. 
Characterising data collections according to their V-properties (Desk)<\/h3>\n<ul>\n<li>What is the smell of the city? [<a href=\"http:\/\/vargas-solar.com\/data-centric-smart-everything\/what-does-your-city-smell-like\/\" target=\"_blank\" rel=\"noopener\">DE-1<\/a>] \u00a0(<span style=\"color: #ff00ff;\"><strong>ENSE3<\/strong><\/span>)<\/li>\n<li>Data collection campaign: building a smell cartography [<a href=\"http:\/\/vargas-solar.com\/data-centric-smart-everything\/data-collection-campaign-building-a-smell-cartography\/\">DE-1<\/a>]\u00a0(<span style=\"color: #00ccff;\"><strong>EGI BD Industry 4.0<\/strong><\/span>)<\/li>\n<\/ul>\n<h3 class=\"O1\">Working environment settings: from in-house to large-scale experiments<\/h3>\n<h3><span style=\"color: #3366ff;\">Data and experimental lab: Kaggle &amp; Colab<\/span><\/h3>\n<ul>\n<li>Access your Kaggle account (<a href=\"https:\/\/www.kaggle.com\/\">https:\/\/www.kaggle.com\/<\/a>)\u00a0<\/li>\n<li>Prepare your Kaggle environment following the instructions <a href=\"http:\/\/vargas-solar.com\/data-centric-smart-everything\/getting-started-with-kaggle\/\" target=\"_blank\" rel=\"noopener noreferrer\">here<\/a><\/li>\n<li>Create a Gmail account to use Colab (<a href=\"https:\/\/colab.research.google.com\/\">https:\/\/colab.research.google.com\/<\/a>) and follow the instructions given in class.<\/li>\n<li><span style=\"font-size: 14px; color: #0000ff;\"><span style=\"color: #000000;\">Working locally on your computer (why not?)<\/span> If you want to use your own computer <span style=\"color: #ff0000;\">outside the course<\/span>, <\/span><span style=\"color: #0000ff;\">set up a self-contained data science environment in two steps (requires medium technical skills):<\/span>\n<ol>\n<li>Download <a href=\"https:\/\/www.anaconda.com\/download\/#macos\" target=\"_blank\" rel=\"noopener noreferrer\">Anaconda<\/a> according to the characteristics of your machine and OS.<\/li>\n<li>Install Anaconda following the 
instructions according to your OS (<a href=\"https:\/\/docs.anaconda.com\/anaconda\/install\/windows\/\" target=\"_blank\" rel=\"noopener noreferrer\">Windows<\/a>, <a href=\"https:\/\/docs.anaconda.com\/anaconda\/install\/mac-os\/\" target=\"_blank\" rel=\"noopener noreferrer\">MacOS<\/a>).<\/li>\n<\/ol>\n<\/li>\n<\/ul>\n<h4><strong>Useful information: cheat sheets<\/strong><\/h4>\n<ol>\n<li>Some of the following hands-on exercises will be done in Python, so here is a memento of the language [<a href=\"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-content\/uploads\/sites\/42\/2018\/09\/2.Python-memento.pdf\" target=\"_blank\" rel=\"noopener noreferrer\">PDF<\/a>]<\/li>\n<li>Imbalanced Data in Classification <a href=\"https:\/\/drive.google.com\/file\/d\/118cArj4IiQcETQmFZi8LvcOGjX_WZfBg\/view?usp=sharing\">Cheat Sheet<\/a><\/li>\n<\/ol>\n<h3>2. Data exploration and preparation<\/h3>\n<ol>\n<li>Getting started with the data science ecosystem [<a href=\"http:\/\/vargas-solar.com\/data-centric-smart-everything\/ho-1-bis-exploring-datasets-getting-acquainted-with-tables-manipulation\/\">HO-1Bis<\/a>] [<a href=\"http:\/\/vargas-solar.com\/data-centric-smart-everything\/getting-started-with-the-data-science-ecosystem\/\" target=\"_blank\" rel=\"noopener noreferrer\">HO-1<\/a>] [<a href=\"https:\/\/www.kaggle.com\/code\/gevargas\/ho-1-data-exploration-quantitative-2024\">K-Notebook<\/a>]\u00a0(<span style=\"color: #ff00ff;\"><strong>ENSE3<\/strong><\/span>)<br \/>[<a href=\"https:\/\/www.kaggle.com\/code\/gevargas\/ho1-egi-table-operations-r\">K-Notebook in R<\/a>]\u00a0(<span style=\"color: #00ccff;\"><strong>EGI BD Industry 4.0<\/strong><\/span>)<br \/>Tabular operators Pandas-SQL [<a href=\"https:\/\/drive.google.com\/file\/d\/117eorzirjS719ghJjL4Q7fDNSGFEEwcw\/view?usp=sharing\">CheatSheet<\/a>]\u00a0<\/li>\n<li>Exploring data collections using descriptive statistics [<a 
href=\"http:\/\/vargas-solar.com\/data-centric-smart-everything\/hands-on\/exploring-data-collections-using-descriptive-statistics\/\" target=\"_blank\" rel=\"noopener noreferrer\">HO-2<\/a>] [<a href=\"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-content\/uploads\/sites\/42\/2019\/10\/3.Descriptive-statistics.pdf\">PDF<\/a>] <span style=\"color: #ff0000;\"><strong><em>(for Data Science study programs)<\/em><\/strong><\/span><\/li>\n<li>Classification for data exploration: <em><strong>Unsupervised learning<\/strong><\/em>, light version <span style=\"color: #ff6600;\">Google Colab<\/span>\u00a0[<a href=\"http:\/\/vargas-solar.com\/data-centric-smart-everything\/hands-on\/a-step-forward-for-discovering-knowledge-using-unsupervised-learning\/\" target=\"_blank\" rel=\"noopener noreferrer\">HO-3<\/a>][<a href=\"https:\/\/gist.github.com\/gevargas\/32adebd2bf48c77c2b7d0daa48876b7c\">GIST<\/a>](<span style=\"color: #ff00ff;\"><strong>ENSE3<\/strong><\/span>)\n<ol>\n<li>Comparing clustering algorithms, long version [<a href=\"http:\/\/vargas-solar.com\/data-centric-smart-everything\/unsupervised-learning-comparing-clustering-methods\/\">HO-4<\/a>] <span style=\"color: #ff0000;\"><strong><em>(for Data Science &amp; Computing Science programs)<\/em><\/strong><\/span><\/li>\n<\/ol>\n<\/li>\n<li>Dealing with <strong>bias<\/strong>: example on personal data using <span style=\"color: #ff6600;\">Google Colab<\/span> [<a href=\"https:\/\/gist.github.com\/gevargas\/2876c671b46f511a81b78905d4406e07\">HO-4b<\/a>] (<span style=\"color: #ff00ff;\"><strong>ENSE3<\/strong><\/span>)<\/li>\n<\/ol>\n<h3>3. 
Network analysis: modelling and discovering knowledge using graphs<\/h3>\n<ol>\n<li>Network Analysis: 5 graph operations on social networks [<a href=\"http:\/\/vargas-solar.com\/data-centric-smart-everything\/network-analysis\/\" target=\"_blank\" rel=\"noopener noreferrer\">HO-8<\/a>][<a href=\"https:\/\/www.kaggle.com\/code\/gevargas\/top-graph-algorithms\">K-Notebook<\/a>]\u00a0(<span style=\"color: #ff00ff;\"><strong>ENSE3<\/strong><\/span>)<\/li>\n<\/ol>\n<h3>4. Prediction using inferential statistics<\/h3>\n<ol>\n<li>Linear regression\u00a0<a href=\"http:\/\/vargas-solar.com\/data-centric-smart-everything\/statistical-inference\/\" target=\"_blank\" rel=\"noopener noreferrer\">[HO-5]<\/a> <span style=\"color: #ff0000;\"><strong><em>(for Data Science &amp; Computing Science programs)<\/em><\/strong><\/span><\/li>\n<li>Logistic regression [<a href=\"http:\/\/vargas-solar.com\/data-centric-smart-everything\/logistic-regression\/\" target=\"_blank\" rel=\"noopener noreferrer\">HO-6<\/a>] [<a href=\"https:\/\/gist.github.com\/gevargas\/bf760a656075ee56b53b83659efcc1ed\">GIST<\/a>]\u00a0(<span style=\"color: #ff00ff;\"><strong>ENSE3<\/strong><\/span>)<\/li>\n<\/ol>\n<h3>6. Towards Data Analytics at Scale (<span style=\"color: #ff0000;\">for Data Engineering Program<\/span>)<\/h3>\n<ol>\n<li><a href=\"https:\/\/github.com\/jbsneto-ppgsc-ufrn\/spark-tutorial\">https:\/\/github.com\/jbsneto-ppgsc-ufrn\/spark-tutorial\u00a0<\/a><\/li>\n<li>Azure Machine Learning Gallery <a href=\"https:\/\/gallery.azure.ai\">https:\/\/gallery.azure.ai<\/a><\/li>\n<\/ol>\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Some of the following hands-on exercises are modified versions of the ones proposed in L. Igual and S. 
Segu&iacute;, Introduction to Data Science: A Python approach to concepts, techniques and applications, Undergraduate topics in Computer Science Series, Springer, 2017 Understanding data collections content: a quantitative vision Keep in mind that for performing data analytics [&hellip;]<\/p>\n","protected":false},"author":11,"featured_media":0,"parent":0,"menu_order":3,"comment_status":"closed","ping_status":"closed","template":"page-templates\/full-width.php","meta":{"footnotes":""},"class_list":["post-11","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-json\/wp\/v2\/pages\/11","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-json\/wp\/v2\/users\/11"}],"replies":[{"embeddable":true,"href":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-json\/wp\/v2\/comments?post=11"}],"version-history":[{"count":83,"href":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-json\/wp\/v2\/pages\/11\/revisions"}],"predecessor-version":[{"id":587,"href":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-json\/wp\/v2\/pages\/11\/revisions\/587"}],"wp:attachment":[{"href":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-json\/wp\/v2\/media?parent=11"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}