{"id":310,"date":"2019-10-08T11:54:23","date_gmt":"2019-10-08T11:54:23","guid":{"rendered":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/?page_id=310"},"modified":"2022-03-01T13:04:12","modified_gmt":"2022-03-01T13:04:12","slug":"network-analysis","status":"publish","type":"page","link":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/network-analysis\/","title":{"rendered":"Network Analysis"},"content":{"rendered":"\n<h4 class=\"wp-block-heading\">Objective<\/h4>\n\n\n\n<p>Learn how to explore networks represented as graphs using five well-known graph operations. The exercise uses a Facebook network data set.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Material<\/h4>\n\n\n\n<ul class=\"wp-block-list\"><li>Kaggle account<\/li><li>Python notebook<\/li><li>Facebook_Social_Network dataset available in Kaggle: <a rel=\"noreferrer noopener\" href=\"https:\/\/www.kaggle.com\/roshansharma\/facebook-social-network\" target=\"_blank\">https:\/\/www.kaggle.com\/roshansharma\/facebook-social-network<\/a><\/li><\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">ToDo<\/h4>\n\n\n\n<p>1. Search for the data set in the Kaggle Datasets Explorer and add it to your notebook's running environment.<\/p>\n\n\n\n<p>2. Locate the added dataset as follows:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"428\" src=\"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-content\/uploads\/sites\/42\/2019\/10\/Capture-d\u2019\u00e9cran-2019-10-08-\u00e0-13.02.47-1024x428.png\" alt=\"\" class=\"wp-image-311\" srcset=\"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-content\/uploads\/sites\/42\/2019\/10\/Capture-d\u2019\u00e9cran-2019-10-08-\u00e0-13.02.47-1024x428.png 1024w, http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-content\/uploads\/sites\/42\/2019\/10\/Capture-d\u2019\u00e9cran-2019-10-08-\u00e0-13.02.47-300x126.png 300w, 
http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-content\/uploads\/sites\/42\/2019\/10\/Capture-d\u2019\u00e9cran-2019-10-08-\u00e0-13.02.47-768x321.png 768w, http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-content\/uploads\/sites\/42\/2019\/10\/Capture-d\u2019\u00e9cran-2019-10-08-\u00e0-13.02.47-624x261.png 624w, http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-content\/uploads\/sites\/42\/2019\/10\/Capture-d\u2019\u00e9cran-2019-10-08-\u00e0-13.02.47.png 1778w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>3. We start with a first look at the Python libraries for working with networks. We import networkx, which provides the graph data structure and its associated operations, and plotly, which generates graphics.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"105\" src=\"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-content\/uploads\/sites\/42\/2019\/10\/Capture-d\u2019\u00e9cran-2019-10-08-\u00e0-13.04.54-1024x105.png\" alt=\"\" class=\"wp-image-313\" srcset=\"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-content\/uploads\/sites\/42\/2019\/10\/Capture-d\u2019\u00e9cran-2019-10-08-\u00e0-13.04.54-1024x105.png 1024w, http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-content\/uploads\/sites\/42\/2019\/10\/Capture-d\u2019\u00e9cran-2019-10-08-\u00e0-13.04.54-300x31.png 300w, http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-content\/uploads\/sites\/42\/2019\/10\/Capture-d\u2019\u00e9cran-2019-10-08-\u00e0-13.04.54-768x79.png 768w, http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-content\/uploads\/sites\/42\/2019\/10\/Capture-d\u2019\u00e9cran-2019-10-08-\u00e0-13.04.54-624x64.png 624w, 
http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-content\/uploads\/sites\/42\/2019\/10\/Capture-d\u2019\u00e9cran-2019-10-08-\u00e0-13.04.54.png 1770w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>We first create a network that links cities, using the distance between them as the edge weight.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">edgelist = [['Mannheim', 'Frankfurt', 85], <br>\n            ['Mannheim', 'Karlsruhe', 80], <br>\n            ['Erfurt', 'Wurzburg', 186], <br>\n            ['Munchen', 'Nurnberg', 167], <br>\n            ['Munchen', 'Augsburg', 84], <br>\n            ['Munchen', 'Kassel', 502], <br>\n            ['Nurnberg', 'Stuttgart', 183], <br>\n            ['Nurnberg', 'Wurzburg', 103], <br>\n            ['Nurnberg', 'Munchen', 167], <br>\n            ['Stuttgart', 'Nurnberg', 183], <br>\n            ['Augsburg', 'Munchen', 84], <br>\n            ['Augsburg', 'Karlsruhe', 250], <br>\n            ['Kassel', 'Munchen', 502], <br>\n            ['Kassel', 'Frankfurt', 173], <br>\n            ['Frankfurt', 'Mannheim', 85], <br>\n            ['Frankfurt', 'Wurzburg', 217], <br>\n            ['Frankfurt', 'Kassel', 173], <br>\n            ['Wurzburg', 'Nurnberg', 103], <br>\n            ['Wurzburg', 'Erfurt', 186], <br>\n            ['Wurzburg', 'Frankfurt', 217], <br>\n            ['Karlsruhe', 'Mannheim', 80], <br>\n            ['Karlsruhe', 'Augsburg', 250],<br>\n            [\"Mumbai\", \"Delhi\",400],<br>\n            [\"Delhi\", \"Kolkata\",500],<br>\n            [\"Kolkata\", \"Bangalore\",600],<br>\n            [\"TX\", \"NY\",1200],<br>\n            [\"ALB\", \"NY\",800]]<\/pre>\n\n\n\n<p>We then build a graph from this edge list.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">g = nx.Graph()<br>\nfor edge in edgelist: g.add_edge(edge[0],edge[1], weight = edge[2])<\/pre>\n\n\n\n<p><strong>Question 1: Find the distinct continents and their cities in this graph.<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">for i, x in 
enumerate(nx.connected_components(g)): print(\"cc\"+str(i)+\":\",x)<\/pre>\n\n\n\n<p><strong>Question 2: Find the shortest path between Stuttgart and Frankfurt and its length.<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">print(nx.shortest_path(g, 'Stuttgart','Frankfurt',weight='weight')) <br>\nprint(nx.shortest_path_length(g, 'Stuttgart','Frankfurt',weight='weight'))<\/pre>\n\n\n\n<p><strong>Question 3: Find the shortest paths between all pairs of cities in the graph.<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">for x in nx.all_pairs_dijkstra_path(g,weight='weight'): print(x)<\/pre>\n\n\n\n<h4 class=\"wp-block-heading\">Minimum spanning tree<\/h4>\n\n\n\n<p><strong>Question 4: We need to connect all the cities in the graph we have using the minimum amount of wire\/pipe. How do we do this?<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"># nx.minimum_spanning_tree(g) returns an instance of type Graph<\/pre>\n\n\n\n<pre class=\"wp-block-preformatted\">nx.draw_networkx(nx.minimum_spanning_tree(g))<\/pre>\n\n\n\n<h4 class=\"wp-block-heading\">PageRank<\/h4>\n\n\n\n<p>PageRank assigns a score to each page based on the number and quality of the links pointing to it. It can be used wherever we want to estimate the importance of nodes in a network.<\/p>\n\n\n\n<p>In a Twitter-like network, for example, we can create a link between two users if user A follows user B, and a link between a user and a tweet if the user tweets or retweets it.<\/p>\n\n\n\n<p>For this, we finally load the Facebook network data set:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">fb = nx.read_edgelist('\/kaggle\/input\/facebook-social-network\/facebook-combined.txt', create_using = nx.Graph(), nodetype = int)<\/pre>\n\n\n\n<p>... 
and let us plot the network to see what it looks like:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">pos = nx.spring_layout(fb)<\/pre>\n\n\n\n<pre class=\"wp-block-preformatted\">import warnings \nwarnings.filterwarnings('ignore')\nwarnings.simplefilter('ignore')\nimport matplotlib.pyplot as plt\n\nplt.style.use('fivethirtyeight') \nplt.rcParams['figure.figsize'] = (20, 15)\nplt.axis('off')\n\nnx.draw_networkx(fb, pos, with_labels = False, node_size = 35) \nplt.show()<\/pre>\n\n\n\n<p>Let us compute the PageRank of all the nodes.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">pageranks = nx.pagerank(fb)<br>\nprint(pageranks)<\/pre>\n\n\n\n<p>Let us now sort the PageRank results in descending order:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">import operator<br>\nsorted_pagerank = sorted(pageranks.items(), <br>\nkey=operator.itemgetter(1),reverse = True)<br>\nprint(sorted_pagerank)<\/pre>\n\n\n\n<p><strong>Question 5: Create a subgraph with the most influential users.<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">first_degree_connected_nodes = list(fb.neighbors(3437)) <br>\nsecond_degree_connected_nodes = []<\/pre>\n\n\n\n<pre class=\"wp-block-preformatted\">for x in first_degree_connected_nodes:\n\n    second_degree_connected_nodes+=list(fb.neighbors(x)) \n\nsecond_degree_connected_nodes.remove(3437) \nsecond_degree_connected_nodes = list(set(second_degree_connected_nodes))\n\nsubgraph_3437 = nx.subgraph(fb, first_degree_connected_nodes + second_degree_connected_nodes)\n\npos = nx.spring_layout(subgraph_3437)<\/pre>\n\n\n\n<p>We then visualise the subgraph, painting the most influential user (node 3437) in yellow:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">import matplotlib.pyplot as plt\n\nnode_color = ['yellow' if v == 3437 else 'red' for v in subgraph_3437]\n\nnode_size = [1000 if v == 3437 else 35 for v in subgraph_3437] \n\nplt.style.use('fivethirtyeight') \n\nplt.rcParams['figure.figsize'] = (20, 15)\n\nplt.axis('off')\n\nnx.draw_networkx(subgraph_3437, pos, with_labels = False, 
node_color=node_color,node_size=node_size )\n\nplt.show()<\/pre>\n\n\n\n<h4 class=\"wp-block-heading\">Centrality measures<\/h4>\n\n\n\n<p><strong><em>Betweenness Centrality:<\/em><\/strong> It quantifies how many times a particular node lies on the shortest path between two other nodes.<\/p>\n\n\n\n<p><strong><em>Degree Centrality:<\/em><\/strong> It is simply the number of connections of a node.<\/p>\n\n\n\n<p>Here is the code for computing the betweenness centrality of the subgraph.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">pos = nx.spring_layout(subgraph_3437) \n\nbetweennessCentrality = nx.betweenness_centrality(subgraph_3437,normalized=True, endpoints=True)\n\nnode_size = [v * 10000 for v in betweennessCentrality.values()] \n\nplt.figure(figsize=(20,20))\n\nnx.draw_networkx(subgraph_3437, pos=pos, with_labels=False, node_size=node_size )\n\nplt.axis('off') <\/pre>\n\n\n\n<p>This notebook was proposed in https:\/\/towardsdatascience.com\/data-scientists-the-five-graph-algorithms-that-you-should-know-30f454fa5513<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Objective Learn how to explore networks represented by graphs using 5 well known graph operations that you should know. Therefore, the exercise uses a facebook network data set. 
Material Kaggle account Python notebook Facebook_Social_Network dataset available in Kaggle: https:\/\/www.kaggle.com\/roshansharma\/facebook-social-network ToDo 1 Upload the data set searching on the Kaggle Datasets Explorer and add it to [&hellip;]<\/p>\n","protected":false},"author":11,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"page-templates\/full-width.php","meta":{"footnotes":""},"class_list":["post-310","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-json\/wp\/v2\/pages\/310","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-json\/wp\/v2\/users\/11"}],"replies":[{"embeddable":true,"href":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-json\/wp\/v2\/comments?post=310"}],"version-history":[{"count":15,"href":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-json\/wp\/v2\/pages\/310\/revisions"}],"predecessor-version":[{"id":447,"href":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-json\/wp\/v2\/pages\/310\/revisions\/447"}],"wp:attachment":[{"href":"http:\/\/vargas-solar.com\/data-centric-smart-everything\/wp-json\/wp\/v2\/media?parent=310"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}