{"id":54,"date":"2015-11-25T14:14:38","date_gmt":"2015-11-25T14:14:38","guid":{"rendered":"http:\/\/vargas-solar.com\/big-data-analytics\/?page_id=54"},"modified":"2015-12-02T22:14:35","modified_gmt":"2015-12-02T22:14:35","slug":"technical","status":"publish","type":"page","link":"http:\/\/vargas-solar.com\/big-data-analytics\/technical\/","title":{"rendered":"TECHNICAL GUIDE"},"content":{"rendered":"<h3>Technical Requirements<\/h3>\n<ul>\n<li>Hadoop Ecosystem\n<ul>\n<li>Option 1:\u00a0<a href=\"http:\/\/hortonworks.com\/products\/hortonworks-sandbox\/#install\" target=\"_blank\">Hortonworks Virtual Machine<\/a>\u00a0(local\u00a0or\u00a0<a href=\"http:\/\/hortonworks.com\/blog\/hortonworks-sandbox-with-hdp-2-3-is-now-available-on-microsoft-azure-gallery\/\" target=\"_blank\">cloud installation<\/a>)<\/li>\n<li>Option 2:\u00a0<a href=\"https:\/\/azure.microsoft.com\/en-us\/documentation\/articles\/hdinsight-hadoop-emulator-get-started\/\" target=\"_blank\">HDInsight\u00a0Emulator<\/a>\u00a0(Windows only)<\/li>\n<\/ul>\n<\/li>\n<li><a title=\"\" href=\"http:\/\/www.oracle.com\/technetwork\/java\/javase\/downloads\/\" target=\"_blank\">Java JDK<\/a>\u00a07 or later<\/li>\n<li><a href=\"https:\/\/hadoop.apache.org\/docs\/stable\/hadoop-project-dist\/hadoop-common\/FileSystemShell.html\" target=\"_blank\">Hadoop HDFS commands<\/a><\/li>\n<\/ul>\n<h3>FAQ<\/h3>\n<ul>\n<li>I get &#8220;<em>AccessControlException: Permission denied: user=root, access=WRITE<\/em>&#8221; when running Hadoop\n<pre><strong>Create a folder in HDFS and set your user as the owner (<a href=\"http:\/\/www.zackriesland.com\/2014\/12\/permission-denied-by-hdfs\/\">explanation<\/a>)<\/strong>\r\n$ su hdfs hadoop fs -mkdir \/user\/root\r\n$ su hdfs hadoop fs -chown root \/user\/root<\/pre>\n<\/li>\n<li>Hadoop\u00a0does not find my Mapper\/Reducer classes (<em>ClassNotFoundException<\/em>)\n<pre><strong>Set the path to the jar containing your main class\r\n<\/strong>$ export 
HADOOP_CLASSPATH=\/path\/to\/jar\/myjar.jar:$HADOOP_CLASSPATH<\/pre>\n<\/li>\n<li>I get\u00a0&#8220;<em>ImportError<\/em>&#8221; when importing the\u00a0module\u00a0<em>numpy<\/em>\u00a0in the PySpark shell\n<pre><strong>Install <em>easy_install<\/em> and <em>pip<\/em>. Then install <em>numpy<\/em><\/strong>\r\n$ wget https:\/\/bitbucket.org\/pypa\/setuptools\/raw\/bootstrap\/ez_setup.py\r\n$ python ez_setup.py\r\n$ easy_install pip\r\n$ pip install numpy<\/pre>\n<\/li>\n<li><strong><a href=\"http:\/\/www.scala-sbt.org\/\">sbt<\/a><\/strong>\u00a0is not installed on the Hortonworks sandbox\n<pre><strong>Install it as follows<\/strong>\r\n$ curl https:\/\/bintray.com\/sbt\/rpm\/rpm | \\\r\n  sudo tee \/etc\/yum.repos.d\/bintray-sbt-rpm.repo\r\n$ sudo yum install sbt\r\n$ sbt -v sbtVersion           # Takes a long time<\/pre>\n<\/li>\n<li>How to compile\u00a0&amp; execute\u00a0Spark Streaming programs\u00a0(written in Scala)\n<pre><strong>Execute the following commands inside the <a href=\"http:\/\/vargas-solar.com\/big-data-analytics\/wp-content\/uploads\/sites\/35\/2015\/11\/spark-streaming.zip\">project template<\/a><\/strong>\r\n$ sbt assembly     # Compiles and builds the jar for Spark\r\n$ spark-submit --class Tutorial target\/scala-2.10\/Tutorial.jar<\/pre>\n<\/li>\n<li>How to reduce Spark&#8217;s\u00a0level of\u00a0verbosity\n<pre><strong>Replace the content of <em>\/etc\/spark\/2.3.2.0-2950\/0\/log4j.properties<\/em> with:\r\n\r\n<\/strong># Set everything to be logged to the console\r\nlog4j.rootCategory=WARN, console\r\nlog4j.appender.console=org.apache.log4j.ConsoleAppender\r\nlog4j.appender.console.target=System.err\r\nlog4j.appender.console.layout=org.apache.log4j.PatternLayout\r\nlog4j.appender.console.layout.ConversionPattern=%d{yy\/MM\/dd HH:mm:ss} %p %c{1}: %m%n\r\n\r\n# Settings to quiet third party logs that are too 
verbose\r\nlog4j.logger.org.eclipse.jetty=WARN\r\nlog4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO\r\nlog4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO<\/pre>\n<\/li>\n<li>I\u00a0get &#8220;ERROR\u00a0401:\u00a0<em>Authentication credentials<\/em>&#8221; when executing the\u00a0Twitter streaming example\n<pre><strong>1. Ensure that you have set a valid consumer key\/secret and access token\/secret<\/strong>\r\n<strong>2. Synchronize your system clock (Hortonworks)<\/strong>\r\n$ sudo yum install ntp\r\n$ sudo ntpdate ntp.ubuntu.com<\/pre>\n<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Technical Requirements Hadoop Ecosystem Option 1:&nbsp;Hortonworks Virtual Machine&nbsp;(local&nbsp;or&nbsp;cloud installation) Option 2:&nbsp;HDInsight&nbsp;Emulator&nbsp;(Windows only) Java JDK&nbsp;7 or later Hadoop HDFS commands FAQ I get &ldquo;AccessControlException: Permission denied: user=root, access=WRITE&rdquo; when running Hadoop Create a folder in HDFS and set your user as the owner (explanation) $ su hdfs hadoop fs -mkdir \/user\/root $ su hdfs hadoop fs 
[&hellip;]<\/p>\n","protected":false},"author":11,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-54","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"http:\/\/vargas-solar.com\/big-data-analytics\/wp-json\/wp\/v2\/pages\/54","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/vargas-solar.com\/big-data-analytics\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"http:\/\/vargas-solar.com\/big-data-analytics\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"http:\/\/vargas-solar.com\/big-data-analytics\/wp-json\/wp\/v2\/users\/11"}],"replies":[{"embeddable":true,"href":"http:\/\/vargas-solar.com\/big-data-analytics\/wp-json\/wp\/v2\/comments?post=54"}],"version-history":[{"count":32,"href":"http:\/\/vargas-solar.com\/big-data-analytics\/wp-json\/wp\/v2\/pages\/54\/revisions"}],"predecessor-version":[{"id":193,"href":"http:\/\/vargas-solar.com\/big-data-analytics\/wp-json\/wp\/v2\/pages\/54\/revisions\/193"}],"wp:attachment":[{"href":"http:\/\/vargas-solar.com\/big-data-analytics\/wp-json\/wp\/v2\/media?parent=54"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}