Run spark-1.6.0 on Yarn

Contents
1. Conventions
2. Install Scala
2.1. Download
2.2. Installation
2.3. Set environment variables
3. Install Spark
3.1. Download
3.2. Install
3.3. Configuration
3.3.1. Modify conf/spark-env.sh
4. Start Spark
4.1. Run the built-in example
4.2. Spark SQL CLI
5. Integrate with Hive
6. Common Errors
6.1. Error 1: unknown queue: thequeue
6.2. SPARK_CLASSPATH was detected
7. Related documents
1. Conventions
This article assumes that Hadoop 2.7.1 is installed in /data/hadoop/current and Spark 1.6.0 in /data/hadoop/spark, where /data/hadoop/spark is a soft link to the directory Spark was unpacked into.
Spark's official website is http://spark.apache.org/. (Shark's official website is http://shark.cs.berkeley.edu/. Shark has become a module of Spark and no longer needs to be installed separately.)
This article runs Spark in cluster mode; client mode is not covered.
2. Install Scala
Martin Odersky of the Ecole Polytechnique Fédérale de Lausanne (EPFL) began designing Scala in 2001, building on the work done on Funnel.
Scala is a multi-paradigm programming language designed to combine the features of pure object-oriented programming and functional programming. It runs on the Java virtual machine (JVM), is compatible with existing Java programs, and can call Java class libraries. Scala includes a compiler and class libraries and is released under the BSD license.
2.1. Download
Spark is developed in Scala, so install Scala on each node before installing Spark. Scala's official website is http://www.scala-lang.org/, and the download page is http://www.scala-lang.org/download/. This article uses the binary package scala-2.11.7.tgz.
2.2. Installation
This article installs Scala as the root user (a non-root user also works; plan this in advance) into /data/scala, where /data/scala is a soft link to /data/scala-2.11.7.
Installation is simple: upload scala-2.11.7.tgz to the /data directory and decompress it there.
Next, create the soft link: ln -s /data/scala-2.11.7 /data/scala
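The unpack-and-link steps above can be sketched as a small shell function. This is a minimal sketch, not the author's script: install_scala is a hypothetical helper name, and it assumes the tarball unpacks into a directory named after the tarball itself (scala-2.11.7.tgz into scala-2.11.7).

```shell
#!/bin/sh
# Hypothetical helper: unpack a Scala tarball under a prefix and point
# <prefix>/scala at the unpacked directory.
install_scala() {
    tarball="$1"    # e.g. /data/scala-2.11.7.tgz
    prefix="$2"     # e.g. /data
    # Assumption: the archive's top-level directory matches the tarball name.
    version_dir="$(basename "$tarball" .tgz)"
    tar -xzf "$tarball" -C "$prefix" || return 1
    # -f replaces an existing link; -n avoids descending into an old link target.
    ln -sfn "$prefix/$version_dir" "$prefix/scala"
}

# Usage, run as a user that can write to /data:
#   install_scala /data/scala-2.11.7.tgz /data
```

Keeping the version in the real directory name and linking /data/scala to it means a later upgrade only needs the link to be repointed.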
2.3. Set environment variables
After Scala is installed, add its bin directory to the PATH environment variable, for example by editing the /etc/profile file.
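The lines added to /etc/profile for Scala might look like the following. This is a sketch under assumptions: SCALA_HOME is a variable name chosen here for illustration, and /data/scala is the soft link created during installation.

```shell
# Assumed additions to /etc/profile (SCALA_HOME is a hypothetical name).
export SCALA_HOME=/data/scala
# Put Scala's bin directory ahead of the existing PATH entries.
export PATH="$SCALA_HOME/bin:$PATH"
```

After editing the file, running `. /etc/profile` in the current shell (or logging in again) picks up the change, and `scala -version` should then report the installed version.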
3. Install Spark
3.3.1. Modify conf/spark-env.sh
In Spark's conf directory, make a copy of spark-env.sh.template named spark-env.sh, and then add the following content:
HADOOP_CONF_DIR=/data/hadoop/current/etc/hadoop
YARN_CONF_DIR=/data/hadoop/current/etc/hadoop
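The copy-and-append step for spark-env.sh can be sketched as follows. setup_spark_env is a hypothetical helper name, not part of Spark, and the export keyword is added defensively here; the source lists the two variables without it.

```shell
#!/bin/sh
# Hypothetical helper: create spark-env.sh from the template (if needed)
# and append the Hadoop and Yarn configuration directories.
setup_spark_env() {
    conf_dir="$1"       # e.g. /data/hadoop/spark/conf
    hadoop_conf="$2"    # e.g. /data/hadoop/current/etc/hadoop
    # Copy the template only when spark-env.sh does not exist yet.
    [ -f "$conf_dir/spark-env.sh" ] \
        || cp "$conf_dir/spark-env.sh.template" "$conf_dir/spark-env.sh" \
        || return 1
    {
        echo "export HADOOP_CONF_DIR=$hadoop_conf"
        echo "export YARN_CONF_DIR=$hadoop_conf"
    } >> "$conf_dir/spark-env.sh"
}

# Usage:
#   setup_spark_env /data/hadoop/spark/conf /data/hadoop/current/etc/hadoop
```

Note that the helper appends on every call, so running it twice would duplicate the two lines.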
4. Start Spark
4.1. Run the built-in example
./bin/spark-submit --class org.apache.spark.examples.SparkPi \
    --master yarn --deploy-mode cluster \
    --driver-memory 4g --executor-memory 2g --executor-cores 1 \
    --queue default \
    lib/spark-examples*.jar 10
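In cluster mode the driver runs inside a Yarn container, so the result line printed by SparkPi ends up in the container logs rather than on the submitting terminal. The following is a minimal sketch for retrieving it, assuming Yarn log aggregation is enabled and that spark-submit prints the application id in its client log, which it typically does. extract_app_id and submit.log are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the first Yarn application id found on stdin.
extract_app_id() {
    grep -o 'application_[0-9]*_[0-9]*' | head -n 1
}

# Usage sketch (not executed here):
#   ./bin/spark-submit ... 2>&1 | tee submit.log
#   app_id="$(extract_app_id < submit.log)"
#   yarn logs -applicationId "$app_id" | grep "Pi is roughly"
```

The `2>&1` matters because the client log is usually written to standard error, not standard output.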
4.2. Spark SQL CLI
./bin/spark-sql --master yarn
Why can the Spark SQL CLI run only in client mode? It is interactive, and you need to see its output on your own terminal, which cluster mode cannot provide: in cluster mode the driver runs inside the ApplicationMaster, and the machine that hosts the ApplicationMaster is chosen dynamically by Yarn.
5. Integrate with Hive
Integrating Spark with Hive takes only the following steps:
1) Add HIVE_HOME to spark-env.sh, for example: export HIVE_HOME=/data/hadoop/hive
2) Copy Hive's hive-site.xml and hive-log4j.properties files to Spark's conf directory.
Afterwards, run spark-sql again to enter the Spark SQL CLI, and run the command show tables to see the tables created in Hive.
Example:
./spark-sql --master yarn --driver-class-path /data/hadoop/hive/lib/mysql-connector-java-5.1.38-bin.jar
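The two integration steps can be sketched as one shell function. integrate_hive is a hypothetical helper name, and the sketch assumes Hive keeps both files in its conf directory under HIVE_HOME, which the source does not state.

```shell
#!/bin/sh
# Hypothetical helper: record HIVE_HOME in spark-env.sh and copy Hive's
# configuration files into Spark's conf directory.
integrate_hive() {
    hive_home="$1"     # e.g. /data/hadoop/hive
    spark_conf="$2"    # e.g. /data/hadoop/spark/conf
    # Step 1: add HIVE_HOME to spark-env.sh.
    echo "export HIVE_HOME=$hive_home" >> "$spark_conf/spark-env.sh" || return 1
    # Step 2: copy the two files (assumed to live in $hive_home/conf).
    cp "$hive_home/conf/hive-site.xml" \
       "$hive_home/conf/hive-log4j.properties" \
       "$spark_conf/"
}

# Usage:
#   integrate_hive /data/hadoop/hive /data/hadoop/spark/conf
```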
6. Common Errors
6.1. Error 1: unknown queue: thequeue
Running the following command reports the error "unknown queue: thequeue":
./bin/spark-submit --class org.apache.spark.examples.SparkPi \
    --master yarn --deploy-mode cluster \
    --driver-memory 4g --executor-memory 2g --executor-cores 1 \
    --queue thequeue \
    lib/spark-examples*.jar 10
To fix it, change "--queue thequeue" to "--queue default" (or to another queue that exists in your Yarn scheduler configuration).
6.2. SPARK_CLASSPATH was detected
SPARK_CLASSPATH was detected (set to '/data/hadoop/hive/lib/mysql-connector-java-5.1.38-bin.jar:').
This is deprecated in Spark 1.0+.
Please instead use:
- ./spark-submit with --driver-class-path to augment the driver classpath
- spark.executor.extraClassPath to augment the executor classpath
This message means that setting the environment variable SPARK_CLASSPATH in spark-env.sh is no longer recommended. Use the recommended method instead:
./spark-sql --master yarn --driver-class-path /data/hadoop/hive/lib/mysql-connector-java-5.1.38-bin.jar
7. Related documents
"HBase-0.98.0 Distributed Installation Guide"
"Hive0.12.0 Installation Guide"
"ZooKeeper-3.4.6 Distributed Installation Guide"
"Hadoop2.3.0 Source Code Reverse Engineering"
"Compiling Hadoop-2.4.0 on Linux"
"Accumulo-1.5.1 Installation Guide"
"Drill1.0.0 Installation Guide"
"Shark0.9.1 Installation Guide"
For more, see the author's technology blog: http://aquester.culog.cn.
