
Running spark-1.6.0 on Yarn

Jul 12, 2016 am 08:58 AM

Directory

1. Agreement
2. Install Scala
2.1. Download
2.2. Installation
2.3. Set environment variables
3. Install Spark
3.1. Download
3.2. Installation
3.3. Configuration
3.3.1. Modify conf/spark-env.sh
4. Start Spark
4.1. Run the built-in example
4.2. Spark SQL CLI
5. Integrate with Hive
6. Common Errors
6.1. Error 1: unknown queue: thequeue
6.2. SPARK_CLASSPATH was detected
7. Related documents

1. Agreement

This article assumes that Hadoop 2.7.1 is installed in /data/hadoop/current and that Spark 1.6.0 is installed in /data/hadoop/spark, where /data/hadoop/spark is a soft link to /data/hadoop/spark-1.6.0-bin-hadoop2.6.

Spark's official website is http://spark.apache.org/. (Shark's official website is http://shark.cs.berkeley.edu/; Shark has become a module of Spark and no longer needs to be installed separately.)

This article runs Spark in cluster mode and does not cover client mode.

2. Install Scala

Martin Odersky of the École Polytechnique Fédérale de Lausanne (EPFL) began designing Scala in 2001, building on his earlier work on Funnel.

Scala is a multi-paradigm programming language, designed to integrate various features of pure object-oriented programming and functional programming. It runs on the Java virtual machine JVM, is compatible with existing Java programs, and can call Java class libraries. Scala includes a compiler and class libraries and is released under the BSD license.

2.1. Download

Spark is developed in Scala, so before installing Spark, Scala must be installed on each node. Scala's official website is http://www.scala-lang.org/, and the download page is http://www.scala-lang.org/download/. This article uses the binary package scala-2.11.7.tgz.

2.2. Installation

This article installs Scala in /data/scala as the root user (a non-root user also works; planning this in advance is recommended), where /data/scala is a soft link to /data/scala-2.11.7.

The installation is very simple: upload scala-2.11.7.tgz to the /data directory, then unpack it there: tar xzf scala-2.11.7.tgz.

Next, create a soft link: ln -s /data/scala-2.11.7 /data/scala.
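The unpack-and-link steps above can be sketched as follows. This is only a sketch: it runs in a throwaway directory created with mktemp as a stand-in for /data, and an empty directory stands in for the unpacked tarball.

```shell
# Scratch-directory sketch of section 2.2 (stand-in paths; the real
# directory is /data and the real tarball is scala-2.11.7.tgz).
set -e
data=$(mktemp -d)                     # stand-in for /data
mkdir "$data/scala-2.11.7"            # stand-in for: tar xzf scala-2.11.7.tgz
ln -s "$data/scala-2.11.7" "$data/scala"
readlink "$data/scala"                # prints .../scala-2.11.7
```

With the real paths, the commands are simply tar xzf scala-2.11.7.tgz followed by ln -s /data/scala-2.11.7 /data/scala.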

2.3. Set environment variables

After Scala is installed, you need to add it to the PATH environment variable. You can directly modify the /etc/profile file and add the following content:

export SCALA_HOME=/data/scala

export PATH=$SCALA_HOME/bin:$PATH
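The profile change can be sketched and checked in a throwaway shell like this; a temp file stands in for /etc/profile, and /data/scala is the path this article uses.

```shell
# Sketch of section 2.3: append the exports to a profile file and source it.
# A temp file stands in for /etc/profile.
profile=$(mktemp)
cat >> "$profile" <<'EOF'
export SCALA_HOME=/data/scala
export PATH=$SCALA_HOME/bin:$PATH
EOF
. "$profile"
echo "$SCALA_HOME"                    # prints /data/scala
```

After editing the real /etc/profile, run `. /etc/profile` (or log in again) so the new PATH takes effect.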

3. Install Spark

Spark is installed as a non-root user. This article installs it as the hadoop user.

3.1. Download

This article uses the binary package, which is recommended; otherwise you will have to deal with compilation. The download URL is http://spark.apache.org/downloads.html. This article downloads spark-1.6.0-bin-hadoop2.6.tgz, which can run directly on YARN.

3.2. Installation

1) Upload spark-1.6.0-bin-hadoop2.6.tgz to the directory /data/hadoop

2) Unpack it: tar xzf spark-1.6.0-bin-hadoop2.6.tgz

3) Create a soft link: ln -s spark-1.6.0-bin-hadoop2.6 spark

To run Spark on YARN, you do not need to install Spark on every machine; installing it on a single machine is enough. However, Spark can only be launched from a machine where it is installed, for the simple reason that the files that invoke Spark are needed.

3.3. Configuration

3.3.1. Modify conf/spark-env.sh

You can make a copy of spark-env.sh.template and then add the following content to it:

HADOOP_CONF_DIR=/data/hadoop/current/etc/hadoop

YARN_CONF_DIR=/data/hadoop/current/etc/hadoop
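The copy-and-append step can be sketched as below. A scratch directory stands in for the real location, /data/hadoop/spark/conf, where Spark ships spark-env.sh.template.

```shell
# Sketch of section 3.3.1 in a scratch directory.
set -e
conf=$(mktemp -d)                     # stand-in for /data/hadoop/spark/conf
touch "$conf/spark-env.sh.template"   # stand-in for the shipped template
cp "$conf/spark-env.sh.template" "$conf/spark-env.sh"
cat >> "$conf/spark-env.sh" <<'EOF'
HADOOP_CONF_DIR=/data/hadoop/current/etc/hadoop
YARN_CONF_DIR=/data/hadoop/current/etc/hadoop
EOF
grep -c CONF_DIR "$conf/spark-env.sh"   # prints 2
```

Both variables point spark-submit at the Hadoop/Yarn client configuration, which is how Spark finds the ResourceManager.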


4. Start Spark

Since Spark runs on Yarn, there is no separate step of starting Spark. Instead, when the command spark-submit is executed, the job is scheduled and run by Yarn.

4.1. Run the built-in example

./bin/spark-submit --class org.apache.spark.examples.SparkPi \
--master yarn --deploy-mode cluster \
--driver-memory 4g \
--executor-memory 2g \
--executor-cores 1 \
--queue default \
lib/spark-examples*.jar 10

Run output:

16/02/03 16:08:33 INFO yarn.Client: Application report for application_1454466109748_0007 (state: RUNNING)
16/02/03 16:08:34 INFO yarn.Client: Application report for application_1454466109748_0007 (state: RUNNING)
16/02/03 16:08:35 INFO yarn.Client: Application report for application_1454466109748_0007 (state: RUNNING)
16/02/03 16:08:36 INFO yarn.Client: Application report for application_1454466109748_0007 (state: RUNNING)
16/02/03 16:08:37 INFO yarn.Client: Application report for application_1454466109748_0007 (state: RUNNING)
16/02/03 16:08:38 INFO yarn.Client: Application report for application_1454466109748_0007 (state: RUNNING)
16/02/03 16:08:39 INFO yarn.Client: Application report for application_1454466109748_0007 (state: RUNNING)
16/02/03 16:08:40 INFO yarn.Client: Application report for application_1454466109748_0007 (state: FINISHED)
16/02/03 16:08:40 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 10.225.168.251
ApplicationMaster RPC port: 0
queue: default
start time: 1454486904755
final status: SUCCEEDED
tracking URL: http://hadoop-168-254:8088/proxy/application_1454466109748_0007/
user: hadoop
16/02/03 16:08:40 INFO util.ShutdownHookManager: Shutdown hook called
16/02/03 16:08:40 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-7fc8538c-8f4c-4d8d-8731-64f5c54c5eac

4.2. Spark SQL CLI

Running spark-sql brings up the Spark SQL CLI interactive interface. To run it on Yarn, specify --master yarn. Note that --deploy-mode cluster is not supported; that is, the CLI can only run on Yarn in client mode:

./bin/spark-sql --master yarn

Why can the Spark SQL CLI only run in client mode? It is easy to understand: since the CLI is interactive and you need to see its output, cluster mode cannot serve here, because in cluster mode the machine on which the ApplicationMaster runs is determined dynamically by Yarn.

5. Integrate with Hive

Integrating Spark with Hive is very simple; only the following steps are needed:

1) Add HIVE_HOME to spark-env.sh, for example: export HIVE_HOME=/data/hadoop/hive

2) Copy Hive's hive-site.xml and hive-log4j.properties files to Spark's conf directory.

After that, run spark-sql again to enter Spark's SQL CLI, and run the command show tables to see the tables created in Hive.

Example:

./spark-sql --master yarn --driver-class-path /data/hadoop/hive/lib/mysql-connector-java-5.1.38-bin.jar
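The two integration steps above can be sketched as follows; a scratch directory stands in for the real layout (the real paths per this article are /data/hadoop/hive for HIVE_HOME and /data/hadoop/spark/conf).

```shell
# Scratch-directory sketch of the two Hive-integration steps.
set -e
root=$(mktemp -d)
mkdir -p "$root/hive/conf" "$root/spark/conf"
touch "$root/hive/conf/hive-site.xml" "$root/hive/conf/hive-log4j.properties"
export HIVE_HOME="$root/hive"                       # step 1: set HIVE_HOME
cp "$HIVE_HOME/conf/hive-site.xml" \
   "$HIVE_HOME/conf/hive-log4j.properties" \
   "$root/spark/conf/"                              # step 2: copy the config files
ls "$root/spark/conf"
```

With hive-site.xml in Spark's conf directory, spark-sql can locate the Hive metastore and list the tables created in Hive.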

6. Common Errors

6.1. Error 1: unknown queue: thequeue

Running:

./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --driver-memory 4g --executor-memory 2g --executor-cores 1 --queue thequeue lib/spark-examples*.jar 10

reports the error below. Just change "--queue thequeue" to "--queue default".

16/02/03 15:57:36 INFO yarn.Client: Application report for application_1454466109748_0004 (state: FAILED)
16/02/03 15:57:36 INFO yarn.Client:
client token: N/A
diagnostics: Application application_1454466109748_0004 submitted by user hadoop to unknown queue: thequeue
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: thequeue
start time: 1454486255907
final status: FAILED
tracking URL: http://hadoop-168-254:8088/proxy/application_1454466109748_0004/
user: hadoop
16/02/03 15:57:36 INFO yarn.Client: Deleting staging directory .sparkStaging/application_1454466109748_0004
Exception in thread "main" org.apache.spark.SparkException: Application application_1454466109748_0004 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1029)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1076)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/02/03 15:57:36 INFO util.ShutdownHookManager: Shutdown hook called
16/02/03 15:57:36 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-54531ae3-4d02-41be-8b9e-92f4b0f05807


6.2. SPARK_CLASSPATH was detected

SPARK_CLASSPATH was detected (set to '/data/hadoop/hive/lib/mysql-connector-java-5.1.38-bin.jar:').
This is deprecated in Spark 1.0+.
Please instead use:
- ./spark-submit with --driver-class-path to augment the driver classpath
- spark.executor.extraClassPath to augment the executor classpath

This means that setting the environment variable SPARK_CLASSPATH in spark-env.sh is no longer recommended; it can be replaced with the recommended method below:

./spark-sql --master yarn --driver-class-path /data/hadoop/hive/lib/mysql-connector-java-5.1.38-bin.jar

7. Related documents

"HBase-0.98.0 Distributed Installation Guide"
"Hive 0.12.0 Installation Guide"
"ZooKeeper-3.4.6 Distributed Installation Guide"
"Hadoop 2.3.0 Source Code Reverse Engineering"
"Compiling Hadoop-2.4.0 on Linux"
"Accumulo-1.5.1 Installation Guide"
"Drill 1.0.0 Installation Guide"
"Shark 0.9.1 Installation Guide"

For more, follow the technology blog: http://aquester.culog.cn.


