How to test Debian Hadoop
This article walks you through installing and testing Apache Hadoop on a Debian system. The steps below describe the configuration process and how to verify each stage.
**Step 1: Install Java**

Make sure the system has Java 8 or later installed. Install OpenJDK 8 with:

```bash
sudo apt update
sudo apt install openjdk-8-jdk
```

Verify the installation:

```bash
java -version
```
**Step 2: Download and extract Hadoop**

Download the latest release from the official Apache Hadoop website and extract it to a directory of your choice (for example `/usr/local/hadoop`):

```bash
wget https://downloads.apache.org/hadoop/common/hadoop-3.3.1/hadoop-3.3.1.tar.gz
tar -xzvf hadoop-3.3.1.tar.gz -C /usr/local/hadoop
```

(Replace `hadoop-3.3.1` with the actual version number.)

**Step 3: Configure environment variables**

Edit the `~/.bashrc` file and add the following environment variables:

```bash
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
```
Make the changes take effect:

```bash
source ~/.bashrc
```
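A quick sanity check can confirm the variables resolved before moving on. The sketch below repeats the two export lines only so that it is self-contained; in practice they come from `~/.bashrc`:

```shell
# Minimal sanity check for the Hadoop environment variables.
# The exports are repeated here only to make the snippet self-contained.
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

echo "HADOOP_HOME=$HADOOP_HOME"
case ":$PATH:" in
  *":$HADOOP_HOME/bin:"*) echo "PATH: ok" ;;
  *) echo "PATH: Hadoop bin directory missing" ;;
esac
```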
**Step 4: Edit the Hadoop configuration files**

Modify the configuration files under the Hadoop directory (`$HADOOP_HOME/etc/hadoop`):
- `core-site.xml`:

```xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode:9000</value>
  </property>
</configuration>
```
- `hdfs-site.xml`:

```xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/usr/local/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/usr/local/hadoop/dfs/data</value>
  </property>
</configuration>
```
- `mapred-site.xml`:

```xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```
- `yarn-site.xml`:

```xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
```
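One detail that often trips up a first run: the Hadoop daemons read `JAVA_HOME` from `etc/hadoop/hadoop-env.sh`, not from your login shell. A sketch of setting it, assuming the install directory from Step 2 and the OpenJDK 8 path Debian uses on amd64:

```shell
# Config fragment (adjust both paths to your system): point the Hadoop
# daemons at the JDK by appending to etc/hadoop/hadoop-env.sh.
echo 'export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64' \
  >> /usr/local/hadoop/etc/hadoop/hadoop-env.sh
```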
**Step 5: Format HDFS**

On the NameNode host, execute the following command to format HDFS:

```bash
hdfs namenode -format
```
**Step 6: Start the Hadoop services**

Start the NameNode and DataNodes from the NameNode host:

```bash
start-dfs.sh
```

Start YARN from the ResourceManager host:

```bash
start-yarn.sh
```
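A simple way to verify the daemons came up is `jps` from the JDK. The hypothetical check loop below prints one status line per daemon; the daemon names assume the single-node layout described here, and a "NOT running" line means that service failed to start (check the Hadoop logs directory):

```shell
# Check whether the expected Hadoop daemons are visible via jps.
for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
  if jps 2>/dev/null | grep -q "$d"; then
    echo "$d: running"
  else
    echo "$d: NOT running"
  fi
done
```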
**Step 7: Local mode testing**

Switch to the Hadoop user:

```bash
su - hadoop
```

Create an input directory and file:

```bash
mkdir ~/input
vi ~/input/data.txt
```

Enter some test data (for example "Hello World" and "Hello Hadoop"), then save and exit.
Run the WordCount example:

```bash
hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar wordcount ~/input/data.txt ~/output
```

(Adjust the path to match the actual jar file name.)

View the results:

```bash
ls ~/output
cat ~/output/part-r-00000
```
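The expected contents of `part-r-00000` for the sample input can be cross-checked locally with plain coreutils. This hypothetical one-liner mimics WordCount's tab-separated `word<TAB>count` output format (the `/tmp` path is illustrative):

```shell
# Emulate WordCount on the sample data with sort/uniq, producing
# the same "word<TAB>count" lines Hadoop writes to part-r-00000.
printf 'Hello World\nHello Hadoop\n' > /tmp/data.txt
tr ' ' '\n' < /tmp/data.txt | sort | uniq -c | awk '{printf "%s\t%s\n", $2, $1}'
# → Hadoop  1
#   Hello   2
#   World   1
```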
If the expected word counts appear, Hadoop is running successfully in local mode. Note that the steps above assume a single-node test environment; a cluster deployment requires corresponding configuration changes. Refer to the official Hadoop documentation for more detailed and up-to-date configuration information.
The above is the detailed content of How to test Debian Hadoop. For more information, please follow other related articles on the PHP Chinese website!
