How to test Debian Hadoop

Apr 12, 2025 pm 10:03 PM
apache red

How to test Debian Hadoop

This article guides you to install and test Apache Hadoop on your Debian system. The following steps will provide detailed description of the configuration process and verification methods.

Step 1: Install Java

Make sure that the system has Java 8 or higher installed. Install OpenJDK 8 using the following command:

 sudo apt update
sudo apt install openjdk-8-jdk
Copy after login

Verify installation:

 java -version
Copy after login

Step 2: Download and decompress Hadoop

Download the latest version of Hadoop from the official website of Apache Hadoop and unzip it to the specified directory (for example /usr/local/hadoop ):

 wget https://downloads.apache.org/hadoop/common/hadoop-3.3.1/hadoop-3.3.1.tar.gz
tar -xzvf hadoop-3.3.1.tar.gz -C /usr/local/hadoop
``` (Please replace `hadoop-3.3.1` with the actual version number)


**Step 3: Configure environment variables**

Edit the `~/.bashrc` file and add the following environment variables:

```bash
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Copy after login

Make the changes take effect:

 source ~/.bashrc
Copy after login

Step 4: Configure Hadoop configuration file

Modify the configuration file in the Hadoop directory:

  • core-site.xml :
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode:9000</value>
  </property>
</configuration>
Copy after login
  • hdfs-site.xml :
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/usr/local/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/usr/local/hadoop/dfs/data</value>
  </property>
</configuration>
Copy after login
  • mapred-site.xml :
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
Copy after login
  • yarn-site.xml :
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
Copy after login

Step 5: Format HDFS

On the NameNode node, execute the following command to format HDFS:

 hdfs namenode -format
Copy after login

Step 6: Start Hadoop Service

Start NameNode and DataNode on the NameNode node:

 start-dfs.sh
Copy after login

Start YARN on the ResourceManager node:

 start-yarn.sh
Copy after login

Step 7: Local Mode Testing

Switch to Hadoop user:

 su - hadoop
Copy after login

Create input directories and files:

 mkdir ~/input
vi ~/input/data.txt
Copy after login

Enter test data (for example "Hello World", "Hello Hadoop"), save and exit.

Run WordCount example:

 hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar wordcount ~/input/data.txt ~/output
``` (Please adjust it according to the actual jar package file name)

View the results:

```bash
ls ~/output
cat ~/output/part-r-00000
Copy after login

The correct output indicates that Hadoop local mode is running successfully. Please note that the above steps assume that you are testing in a stand-alone environment. For cluster environments, corresponding configuration modifications are required. Be sure to refer to the official Hadoop documentation for more detailed and latest configuration information.

The above is the detailed content of How to test Debian Hadoop. For more information, please follow other related articles on the PHP Chinese website!

Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Using Dicr/Yii2-Google to integrate Google API in YII2 Using Dicr/Yii2-Google to integrate Google API in YII2 Apr 18, 2025 am 11:54 AM

VprocesserazrabotkiveB-enclosed, Мнепришлостольностьсясзадачейтерациигооглапидляпапакробоглесхетсigootrive. LEAVALLYSUMBALLANCEFRIABLANCEFAUMDOPTOMATIFICATION, ČtookazaLovnetakProsto, Kakaožidal.Posenesko

How to use the Redis cache solution to efficiently realize the requirements of product ranking list? How to use the Redis cache solution to efficiently realize the requirements of product ranking list? Apr 19, 2025 pm 11:36 PM

How does the Redis caching solution realize the requirements of product ranking list? During the development process, we often need to deal with the requirements of rankings, such as displaying a...

Title: How to use Composer to solve distributed locking problems Title: How to use Composer to solve distributed locking problems Apr 18, 2025 am 08:39 AM

Summary Description: Distributed locking is a key tool for ensuring data consistency when developing high concurrency applications. This article will start from a practical case and introduce in detail how to use Composer to install and use the dino-ma/distributed-lock library to solve the distributed lock problem and ensure the security and efficiency of the system.

What should I do if the Redis cache of OAuth2Authorization object fails in Spring Boot? What should I do if the Redis cache of OAuth2Authorization object fails in Spring Boot? Apr 19, 2025 pm 08:03 PM

In SpringBoot, use Redis to cache OAuth2Authorization object. In SpringBoot application, use SpringSecurityOAuth2AuthorizationServer...

Why is the return value empty when using RedisTemplate for batch query? Why is the return value empty when using RedisTemplate for batch query? Apr 19, 2025 pm 10:15 PM

Why is the return value empty when using RedisTemplate for batch query? When using RedisTemplate for batch query operations, you may encounter the returned results...

What is the reason why the browser does not respond after the WebSocket server returns 401? How to solve it? What is the reason why the browser does not respond after the WebSocket server returns 401? How to solve it? Apr 19, 2025 pm 02:21 PM

The browser's unresponsive method after the WebSocket server returns 401. When using Netty to develop a WebSocket server, you often encounter the need to verify the token. �...

In a multi-node environment, how to ensure that Spring Boot's @Scheduled timing task is executed only on one node? In a multi-node environment, how to ensure that Spring Boot's @Scheduled timing task is executed only on one node? Apr 19, 2025 pm 10:57 PM

The optimization solution for SpringBoot timing tasks in a multi-node environment is developing Spring...

See all articles