
URL: https://www.geeksforgeeks.org/data-analysis/mapreduce-program-weather-data-analysis-for-analyzing-hot-and-cold-days/


MapReduce Program - Weather Data Analysis For Analyzing Hot And Cold Days

Last Updated : 11 Aug, 2025

In this article, we demonstrate how a MapReduce program can process large-scale weather datasets to identify temperature extremes. By harnessing Hadoop's parallel processing capabilities, the program efficiently pinpoints hot and cold days, an essential step for climate trend analysis, anomaly detection, and building reliable forecasting systems.

Problem Statement

Analyze semi-structured weather data collected by sensors globally. We will focus on temperature values (maximum and minimum) and identify hot days (temperature > 30°C) and cold days (temperature < 15°C) using MapReduce.

Dataset Overview

We use weather data from NOAA's National Centers for Environmental Information (NCEI), available in a line-based ASCII text format. Each line of a file holds one record with fields such as Date, Latitude, Longitude, Max Temp and Min Temp.

File name: CRND0103-2020-AK_Fairbanks_11_NE.txt. The file can be downloaded from the NCEI data archive.

Step-by-Step Implementation

This section walks you through the implementation of the MapReduce program to extract hot and cold days from large-scale weather data using Hadoop.

Step 1: Understand Data Format

Below is a sample of the dataset, in which columns 6 and 7 hold the maximum and minimum temperature, respectively.

👁 minnimum-and-maximum-temprature-field-in-dataset
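
As a minimal sketch of how one record could be read, the snippet below splits a line on whitespace and picks out the date and the maximum and minimum temperatures from columns 6 and 7. Several details are assumptions made for illustration and should be checked against the dataset's documentation: the class name WeatherRecord, the date sitting in column 2, the whitespace-based splitting, the made-up sample line, and the -9999.0 missing-value sentinel.

```java
// Hypothetical helper: parses one whitespace-delimited weather record.
final class WeatherRecord {
    // Assumed sentinel for a missing reading; verify against the dataset docs.
    static final double MISSING = -9999.0;

    final String date;     // yyyyMMdd, assumed to be column 2
    final double maxTemp;  // column 6
    final double minTemp;  // column 7

    WeatherRecord(String date, double maxTemp, double minTemp) {
        this.date = date;
        this.maxTemp = maxTemp;
        this.minTemp = minTemp;
    }

    // Returns the parsed record, or null if the line is too short or not numeric.
    static WeatherRecord parse(String line) {
        String[] cols = line.trim().split("\\s+");
        if (cols.length < 7) {
            return null;
        }
        try {
            // Columns are 1-based in the text, so column 6 is index 5.
            return new WeatherRecord(cols[1],
                    Double.parseDouble(cols[5]),
                    Double.parseDouble(cols[6]));
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // True only when both temperature fields carry real readings.
    boolean hasTemperatures() {
        return maxTemp != MISSING && minTemp != MISSING;
    }

    public static void main(String[] args) {
        // Made-up sample line in the assumed layout (not copied from the real file).
        String sample = "26494 20200101  2.424 -147.51   64.97   -18.8   -21.8   -20.3";
        WeatherRecord r = parse(sample);
        System.out.println(r.date + " max=" + r.maxTemp + " min=" + r.minTemp);
    }
}
```

Splitting on whitespace keeps the sketch short; a fixed-width file can also be read by character offsets, which is stricter but depends on the exact column positions.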

Step 2: Set Up Java Project

Create a project in Eclipse with the steps below:

Open Eclipse -> File -> New -> Java Project -> name it MyProject -> select "Use an execution environment" -> choose JavaSE-1.8 -> Next -> Finish.

👁 create-java-project

In this project, create a Java class named MyMaxMin, then click Finish.

👁 create-java-class

Step 3: Java Source Code

Copy the program's source code into the MyMaxMin Java class.
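
As a rough illustration of the per-record logic a class like MyMaxMin might apply in its map step, here is a Hadoop-free sketch. It is not the article's actual listing. The class name HotColdSketch, the key format, the sample lines, and the -9999.0 missing-value sentinel are all hypothetical. It also assumes that "hot" is tested against the daily maximum and "cold" against the daily minimum, using the 30°C and 15°C thresholds from the problem statement. A TreeMap stands in for the sorted key/value stream that Hadoop would hand to a reducer.

```java
import java.util.Map;
import java.util.TreeMap;

// Hadoop-free sketch of a map step for hot/cold day detection (hypothetical names).
final class HotColdSketch {
    static final double HOT_ABOVE = 30.0;   // hot day: max temperature > 30 degrees C
    static final double COLD_BELOW = 15.0;  // cold day: min temperature < 15 degrees C
    static final double MISSING = -9999.0;  // assumed missing-value sentinel

    // Emits zero, one, or two (key, value) pairs for a single input line.
    static void map(String line, Map<String, Double> emitted) {
        String[] cols = line.trim().split("\\s+");
        if (cols.length < 7) {
            return; // skip malformed lines
        }
        try {
            String date = cols[1];                     // assumed: column 2 is yyyyMMdd
            double max = Double.parseDouble(cols[5]);  // column 6
            double min = Double.parseDouble(cols[6]);  // column 7
            if (max != MISSING && max > HOT_ABOVE) {
                emitted.put("Hot Day " + date, max);
            }
            // Without the MISSING check, -9999.0 would be reported as a cold day.
            if (min != MISSING && min < COLD_BELOW) {
                emitted.put("Cold Day " + date, min);
            }
        } catch (NumberFormatException e) {
            // skip lines whose temperature fields are not numeric
        }
    }

    public static void main(String[] args) {
        // Made-up sample lines in the assumed layout.
        String[] lines = {
            "26494 20200710  2.424 -147.51   64.97    31.2    14.1    22.6",
            "26494 20200711  2.424 -147.51   64.97    24.0    16.5    20.2",
            "26494 20200101  2.424 -147.51   64.97   -18.8   -21.8   -20.3",
            "26494 20200102  2.424 -147.51   64.97 -9999.0 -9999.0 -9999.0"
        };
        Map<String, Double> emitted = new TreeMap<String, Double>();
        for (String line : lines) {
            map(line, emitted);
        }
        for (Map.Entry<String, Double> e : emitted.entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

In a real Hadoop job the same checks would sit inside a Mapper's map method and write to the job context instead of a Map.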

Step 4: Add External JARs

To ensure imported packages work correctly, you need to add external JAR files to your project. Download the Hadoop Common and Hadoop MapReduce Core JAR files that match your installed Hadoop version.

Check your Hadoop version with the command below:

hadoop version

👁 check-hadoop-version

Now, to add the external JARs to MyProject:

Right-click MyProject -> Build Path -> Configure Build Path -> Add External JARs -> select the JARs from their download location -> Apply and Close.

👁 adding-external-jar-files-to-our-project

Step 5: Export Project as JAR

Now export the project as a JAR file.

Right-click MyProject -> Export -> Java -> JAR file -> Next -> choose your export destination -> Next.

👁 export-java-MyProject

Set the main class to MyMaxMin by clicking Browse, then click Finish -> OK.

👁 select-main-class

Step 6: Start Hadoop Services

Start HDFS and YARN daemons:

start-dfs.sh
start-yarn.sh

Step 7: Move Dataset to HDFS

Command:

hdfs dfs -put /path/to/CRND0103-2020-AK_Fairbanks_11_NE.txt /

To verify:

hdfs dfs -ls /

👁 copying-the-dataset-to-our-HDFS

Step 8: Run the MapReduce Job

Now run the JAR file with the command below; the results are written to the MyOutput directory in HDFS.

Syntax:

hadoop jar /path/to/Project.jar /input_file_in_HDFS /output_directory

Example:

hadoop jar /home/user/Documents/Project.jar /CRND0103-2020-AK_Fairbanks_11_NE.txt /MyOutput

👁 running-our-jar-file-for-analysis
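
When the JAR's manifest names a main class (as set in Step 5), the arguments after the JAR path are passed to that class's main method, so args[0] would be the input path and args[1] the output directory. The sketch below shows one way a driver might validate them before configuring the job. The JobArgs class and its usage message are hypothetical, and no Hadoop classes are involved.

```java
// Hypothetical argument holder for a driver invoked as:
//   hadoop jar Project.jar <input path> <output directory>
final class JobArgs {
    final String inputPath;
    final String outputDir;

    private JobArgs(String inputPath, String outputDir) {
        this.inputPath = inputPath;
        this.outputDir = outputDir;
    }

    // Fails fast with a usage message unless exactly two arguments are given.
    static JobArgs parse(String[] args) {
        if (args == null || args.length != 2) {
            throw new IllegalArgumentException(
                    "Usage: hadoop jar Project.jar <input path> <output directory>");
        }
        return new JobArgs(args[0], args[1]);
    }

    public static void main(String[] args) {
        // Same paths as the example command above.
        JobArgs parsed = parse(new String[] {
            "/CRND0103-2020-AK_Fairbanks_11_NE.txt", "/MyOutput"
        });
        System.out.println("input=" + parsed.inputPath + " output=" + parsed.outputDir);
    }
}
```

Hadoop's file output format refuses to start a job whose output directory already exists, so remove /MyOutput or choose a new directory name before rerunning.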

Step 9: View Output

After the MapReduce job completes, you can check the final results through the Hadoop web interface.

Visit (50070 is the default NameNode web UI port on Hadoop 2.x; Hadoop 3.x uses 9870):

http://localhost:50070/

Then navigate to: Utilities -> Browse the file system -> /MyOutput -> part-r-00000.

👁 hdfs-view-1
👁 hdfs-view-2

Download the result file.

Step 10: Interpret Output

Each line in the output shows:

  • Label: Hot Day or Cold Day
  • Date: yyyyMMdd format (e.g., 20200101 = Jan 1, 2020)
  • Temperature reading
👁 top-10-result-obtained
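
To make the three parts of an output line concrete, the sketch below parses one line into a label, a date, and a temperature. The exact line layout is an assumption: a key made of the label followed by the 8-digit date, then a tab (the default key/value separator for Hadoop's text output), then the temperature. The OutputLine class and the sample line are hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for one line of the downloaded result file.
final class OutputLine {
    // Assumed layout: <label text><yyyyMMdd> TAB <temperature>
    private static final Pattern LINE =
            Pattern.compile("^(.*?)\\s*(\\d{8})\\t(-?\\d+(?:\\.\\d+)?)$");

    final String label;
    final LocalDate date;
    final double temperature;

    private OutputLine(String label, LocalDate date, double temperature) {
        this.label = label;
        this.date = date;
        this.temperature = temperature;
    }

    static OutputLine parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected line: " + line);
        }
        // BASIC_ISO_DATE reads dates written as yyyyMMdd, e.g. 20200101.
        LocalDate date = LocalDate.parse(m.group(2), DateTimeFormatter.BASIC_ISO_DATE);
        return new OutputLine(m.group(1).trim(), date, Double.parseDouble(m.group(3)));
    }

    public static void main(String[] args) {
        // Made-up sample line in the assumed layout.
        OutputLine o = parse("Hot Day 20200710\t31.2");
        System.out.println(o.label + " on " + o.date + ": " + o.temperature);
    }
}
```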