Natural disasters are major events that can cause serious harm to people and property. Thanks to modern technology, we can now predict these events more accurately. This article explains how to use the R programming language to analyze data on natural disasters.
Prediction models are tools or methods that forecast future events or outcomes from historical data. By analyzing patterns and trends in past records, they support educated guesses about what might happen next, whether that is the weather, stock prices, or the likelihood of a natural disaster.
To build a prediction model in R, several key steps ensure that the data is ready for analysis and that the resulting model is effective: loading the data, preprocessing it, exploring it, training a model, and evaluating the result.
The natural disaster dataset provides detailed information about different types of natural disasters, such as earthquakes, hurricanes, and floods. Each record in the dataset has a unique ID and includes the type of disaster, the date it happened, and the exact location where it occurred, with details like latitude and longitude.
For events like earthquakes, the dataset includes the magnitude and depth, while for hurricanes, it provides wind speed and rainfall data. The dataset also records environmental conditions like temperature and humidity at the time of the disaster. Additionally, it includes data on how often similar disasters have occurred in the past, helping to understand patterns and predict future events.
Dataset link: NaturalDisaster
We now build the prediction model step by step in the R programming language.
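A minimal sketch of the loading step, assuming the dataset has been downloaded as a CSV file. The file name natural_disaster.csv and the helper load_disasters() are hypothetical; the article does not state them.

```r
# Hypothetical file name: adjust to wherever the dataset was saved.
csv_path <- "natural_disaster.csv"

# Read the CSV into a data frame, keeping text columns as character
# so they can be converted explicitly later.
load_disasters <- function(path) {
  read.csv(path, stringsAsFactors = FALSE)
}

# Only attempt the read when the file is actually present.
if (file.exists(csv_path)) {
  disasters <- load_disasters(csv_path)
  print(head(disasters))  # first six rows
}
```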
Output:
ID Disaster_Type Date Location Latitude Longitude Magnitude
1 1 Wildfire 2021-09-05 Joyceshire 15.50625 158.23697 NA
2 2 Wildfire 2021-08-20 West Glenn 39.61520 -113.39889 NA
3 3 Earthquake 2024-02-08 Lake Taylor 32.63846 -117.50888 6.1
4 4 Flood 2022-06-24 West Samantha -26.39599 167.67467 NA
5 5 Wildfire 2023-09-05 Whiteville 74.94508 -50.26314 NA
6 6 Wildfire 2020-11-17 Lorettaland 71.90164 112.23947 NA
Depth Wind_Speed Rainfall Temperature Humidity Historical_Frequency
1 NA NA NA -11.8 24.7 17
2 NA NA NA -19.9 97.8 7
3 677.1 NA NA -28.6 94.2 5
4 NA NA 321.6 42.4 73.9 14
5 NA NA NA 33.1 98.2 6
6 NA NA NA -4.9 99.3 18
- Convert the Date column to the Date format.
- Convert the Disaster_Type column to a factor (categorical variable).

Exploratory data analysis (EDA) helps in understanding patterns and insights from the data. Visualize different aspects to gain a better understanding.
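The two column conversions described above can be sketched as follows. The small data frame is a stand-in built from four of the rows printed earlier, not the full dataset.

```r
# Stand-in sample taken from the first rows of the printed output.
disasters <- data.frame(
  Disaster_Type = c("Wildfire", "Wildfire", "Earthquake", "Flood"),
  Date = c("2021-09-05", "2021-08-20", "2024-02-08", "2022-06-24"),
  stringsAsFactors = FALSE
)

# Parse the ISO-formatted strings into Date objects.
disasters$Date <- as.Date(disasters$Date, format = "%Y-%m-%d")

# Treat the disaster type as a categorical variable.
disasters$Disaster_Type <- as.factor(disasters$Disaster_Type)

str(disasters)
```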
Display the distribution of different types of natural disasters.
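One way to draw this distribution is a bar chart of counts per type. The sketch below uses base R graphics, which is an assumption since the article does not say which plotting package it used, and it only counts the six rows printed earlier.

```r
# Disaster types of the six rows shown in the printed output.
disaster_types <- factor(c("Wildfire", "Wildfire", "Earthquake",
                           "Flood", "Wildfire", "Wildfire"))

# Count how many events fall into each type.
type_counts <- table(disaster_types)

# Bar chart of the counts.
barplot(type_counts,
        main = "Distribution of Disaster Types",
        xlab = "Disaster type",
        ylab = "Count",
        col = "steelblue")
```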
Output:
Analyze Magnitude Over Time
Show how the magnitude of disasters changes over time.
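Because Magnitude is NA for most disaster types, a time plot needs the missing rows dropped first. The sketch below assumes base R graphics and a hypothetical helper magnitude_series(); apart from the 2024-02-08 earthquake (6.1), the sample values are made up for illustration.

```r
# Keep only rows with a recorded magnitude and sort them by date.
magnitude_series <- function(df) {
  with_magnitude <- df[!is.na(df$Magnitude), c("Date", "Magnitude")]
  with_magnitude[order(with_magnitude$Date), ]
}

# Illustrative rows: only the 2024-02-08 event comes from the printed
# output; the remaining values are invented for this sketch.
events <- data.frame(
  Date = as.Date(c("2024-02-08", "2021-09-05", "2022-11-30", "2023-04-17")),
  Magnitude = c(6.1, NA, 5.4, 7.2)
)

series <- magnitude_series(events)

# Points joined by lines, ordered in time.
plot(series$Date, series$Magnitude, type = "b",
     xlab = "Date", ylab = "Magnitude",
     main = "Disaster Magnitude Over Time")
```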
Output:
Location-Based Analysis
Display the geographical distribution of disasters.
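A minimal sketch of such a map using the leaflet package, which the article names later in its Shiny section. The three locations come from the printed output; the marker styling is an arbitrary choice.

```r
library(leaflet)

# Three events taken from the printed rows.
locations <- data.frame(
  Location = c("Joyceshire", "West Glenn", "Lake Taylor"),
  Disaster_Type = c("Wildfire", "Wildfire", "Earthquake"),
  Latitude = c(15.50625, 39.61520, 32.63846),
  Longitude = c(158.23697, -113.39889, -117.50888)
)

# Base map tiles plus one circle marker per event; clicking a marker
# shows the location name and disaster type.
disaster_map <- leaflet(locations) %>%
  addTiles() %>%
  addCircleMarkers(lng = ~Longitude, lat = ~Latitude,
                   popup = ~paste(Location, Disaster_Type, sep = ": "),
                   radius = 5)

disaster_map
```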
Output:
- Use createDataPartition from caret to split the data into training (70%) and testing (30%) sets.
- Create a trainControl object for 10-fold cross-validation.
- Specify the independent variables (Latitude, Longitude, etc.) and the dependent variable (Disaster_Type).
- Set tuneLength = 5 to try 5 different values of mtry (number of variables randomly sampled as candidates at each split).

Output:
Random Forest
702 samples
9 predictor
4 classes: 'Earthquake', 'Flood', 'Hurricane', 'Wildfire'
No pre-processing
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 630, 631, 633, 632, 632, 631, ...
Resampling results across tuning parameters:
mtry Accuracy Kappa
2 0.9899983 0.9866495
3 0.9928761 0.9904918
5 0.9928566 0.9904663
7 0.9928566 0.9904663
9 0.9928566 0.9904663
Accuracy was used to select the optimal model using the largest value.
The final value used for the model was mtry = 3.
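A summary of this shape comes from a caret training call along the following lines. This is a sketch under stated assumptions: the data frame is randomly generated (so its accuracy will be near chance, unlike the figures above), only five of the predictors are included, and the randomForest package must be installed for method = "rf".

```r
library(caret)  # method = "rf" also requires the randomForest package

set.seed(123)
n <- 200

# Synthetic stand-in for the real dataset (random, hypothetical values).
disasters <- data.frame(
  Disaster_Type = factor(sample(c("Earthquake", "Flood", "Hurricane", "Wildfire"),
                                n, replace = TRUE)),
  Latitude = runif(n, -90, 90),
  Longitude = runif(n, -180, 180),
  Temperature = runif(n, -30, 45),
  Humidity = runif(n, 0, 100),
  Historical_Frequency = sample(1:20, n, replace = TRUE)
)

# Stratified 70/30 split on the outcome.
train_index <- createDataPartition(disasters$Disaster_Type, p = 0.7, list = FALSE)
train_data <- disasters[train_index, ]
test_data <- disasters[-train_index, ]

# 10-fold cross-validation.
ctrl <- trainControl(method = "cv", number = 10)

# Random forest; tuneLength asks caret for up to 5 candidate mtry values.
rf_model <- train(Disaster_Type ~ ., data = train_data,
                  method = "rf", trControl = ctrl, tuneLength = 5)

print(rf_model)
```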
Next, print the confusion matrix to evaluate model performance.
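In the full workflow the matrix is computed from predict() on the test set. So that the example runs without the dataset, the sketch below instead rebuilds the table from the counts reported in the output that follows and passes it to caret's confusionMatrix().

```r
library(caret)

classes <- c("Earthquake", "Flood", "Hurricane", "Wildfire")

# Counts copied from the reported matrix: rows are predictions,
# columns are the true (reference) classes.
counts <- matrix(c(80,  0,  0,  0,
                    0, 73,  0,  0,
                    0,  0, 67,  0,
                    0,  1,  3, 74),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(Prediction = classes, Reference = classes))

cm <- confusionMatrix(as.table(counts))
print(cm)

# Overall accuracy = correct predictions / all predictions = 294 / 298.
accuracy <- cm$overall[["Accuracy"]]
cat("Accuracy:", round(accuracy * 100, 2), "%\n")  # Accuracy: 98.66 %
```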
Output:
Confusion Matrix:
Confusion Matrix and Statistics
Reference
Prediction Earthquake Flood Hurricane Wildfire
Earthquake 80 0 0 0
Flood 0 73 0 0
Hurricane 0 0 67 0
Wildfire 0 1 3 74
Overall Statistics
Accuracy : 0.9866
95% CI : (0.966, 0.9963)
No Information Rate : 0.2685
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.9821
Mcnemar's Test P-Value : NA
Statistics by Class:
Class: Earthquake Class: Flood Class: Hurricane
Sensitivity 1.0000 0.9865 0.9571
Specificity 1.0000 1.0000 1.0000
Pos Pred Value 1.0000 1.0000 1.0000
Neg Pred Value 1.0000 0.9956 0.9870
Prevalence 0.2685 0.2483 0.2349
Detection Rate 0.2685 0.2450 0.2248
Detection Prevalence 0.2685 0.2450 0.2248
Balanced Accuracy 1.0000 0.9932 0.9786
Class: Wildfire
Sensitivity 1.0000
Specificity 0.9821
Pos Pred Value 0.9487
Neg Pred Value 1.0000
Prevalence 0.2483
Detection Rate 0.2483
Detection Prevalence 0.2617
Balanced Accuracy 0.9911
Accuracy: 98.66 %
- Print the best mtry value and the cross-validation accuracy.

Output:
Out-of-Bag (OOB) Error Rate:
OOB Earthquake Flood Hurricane Wildfire
[1,] 0.05200000 0.000000000 0.00000000 0.12698413 0.076923077
[2,] 0.04941176 0.026315789 0.01904762 0.06796117 0.087378641
[3,] 0.06156716 0.041095890 0.01515152 0.10156250 0.092307692
[4,] 0.05272109 0.025316456 0.02097902 0.08633094 0.081081081
[5,] 0.05537975 0.023668639 0.01910828 0.10273973 0.081250000
[6,] 0.03963415 0.022857143 0.01851852 0.06535948 0.054216867
[7,] 0.04154303 0.016759777 0.04191617 0.03750000 0.071428571
[8,] 0.05102041 0.027322404 0.04705882 0.05555556 0.076023392
[9,] 0.04310345 0.021505376 0.04624277 0.04268293 0.063583815
[10,] 0.04005722 0.021505376 0.04022989 0.04848485 0.051724138
...
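Values like these can be read from a caret model object: bestTune holds the selected mtry, results holds the cross-validated accuracy, and finalModel$err.rate holds the per-tree OOB error matrix. The sketch below trains on a small random data frame (hypothetical values), so its numbers will not match the ones above.

```r
library(caret)  # method = "rf" also requires the randomForest package

set.seed(42)
n <- 120

# Small synthetic stand-in with three predictors (random values).
toy <- data.frame(
  Disaster_Type = factor(rep(c("Earthquake", "Flood", "Hurricane", "Wildfire"),
                             each = n / 4)),
  Temperature = runif(n, -30, 45),
  Humidity = runif(n, 0, 100),
  Historical_Frequency = sample(1:20, n, replace = TRUE)
)

fit <- train(Disaster_Type ~ ., data = toy, method = "rf",
             trControl = trainControl(method = "cv", number = 10),
             tuneLength = 2)

# Selected mtry and its cross-validated accuracy.
best_mtry <- fit$bestTune$mtry
cv_accuracy <- fit$results$Accuracy[fit$results$mtry == best_mtry]
cat("Best mtry:", best_mtry, "\n")
cat("Cross-validation accuracy:", cv_accuracy, "\n")

# OOB error after each tree: overall rate plus one column per class.
oob <- fit$finalModel$err.rate
cat("Out-of-Bag (OOB) Error Rate:\n")
print(head(oob, 10))
```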
Next, predict values using the model.
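A minimal sketch of prediction with a caret model, assuming a model trained on a small random data frame (hypothetical values) rather than the real dataset. The new observation reuses the temperature, humidity, and historical frequency of the earthquake row printed earlier.

```r
library(caret)  # method = "rf" also requires the randomForest package

set.seed(7)
n <- 120

# Synthetic training data (random values, for illustration only).
toy <- data.frame(
  Disaster_Type = factor(rep(c("Earthquake", "Flood", "Hurricane", "Wildfire"),
                             each = n / 4)),
  Temperature = runif(n, -30, 45),
  Humidity = runif(n, 0, 100),
  Historical_Frequency = sample(1:20, n, replace = TRUE)
)

# Fit a single random forest without resampling to keep the sketch fast.
fit <- train(Disaster_Type ~ ., data = toy, method = "rf",
             trControl = trainControl(method = "none"),
             tuneGrid = data.frame(mtry = 2))

# New observation with the same predictor columns as the training data.
new_event <- data.frame(Temperature = -28.6, Humidity = 94.2,
                        Historical_Frequency = 5)

predicted_type <- predict(fit, newdata = new_event)           # class label
probs <- predict(fit, newdata = new_event, type = "prob")     # class probabilities

print(predicted_type)
print(probs)
```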
Output:
- titlePanel("Interactive Disaster Data Analysis & Prediction") sets the app title.
- selectInput() for choosing disaster types.
- dateRangeInput() for selecting the date range.
- numericInput() for entering numeric values related to disaster parameters (latitude, longitude, magnitude, etc.).
- actionButton() to trigger data updates and predictions.
- filtered_data() reacts to the "Update" button to filter data based on the selected disaster type and date range.
- renderPlot() creates a bar plot showing the distribution of disaster types.
- renderPlot() generates a line plot depicting disaster magnitude over time.
- renderLeaflet() creates an interactive map to visualize disaster locations.
- observeEvent() listens for the "Predict" button click.
- The shinyApp() function combines the UI and server logic to run the Shiny application.

Predicting natural disasters using data analysis in R helps us prepare better and respond more effectively. This article showed how to analyze disaster data and build a prediction model, helping us understand and manage natural disasters more efficiently.
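For reference, the Shiny components listed above can be wired together roughly as follows. This is a sketch, not the article's app: the input and output IDs are hypothetical, the data is a four-row sample from the printed output, base graphics stand in for whatever plotting package was used, and the "Predict" handler is a placeholder where a real app would call predict() on the trained model.

```r
library(shiny)
library(leaflet)

# Four rows from the printed output; a real app would load the full dataset.
disasters <- data.frame(
  Disaster_Type = c("Wildfire", "Wildfire", "Earthquake", "Flood"),
  Date = as.Date(c("2021-09-05", "2021-08-20", "2024-02-08", "2022-06-24")),
  Latitude = c(15.50625, 39.61520, 32.63846, -26.39599),
  Longitude = c(158.23697, -113.39889, -117.50888, 167.67467),
  Magnitude = c(NA, NA, 6.1, NA)
)

ui <- fluidPage(
  titlePanel("Interactive Disaster Data Analysis & Prediction"),
  sidebarLayout(
    sidebarPanel(
      selectInput("type", "Disaster type", choices = unique(disasters$Disaster_Type)),
      dateRangeInput("dates", "Date range",
                     start = min(disasters$Date), end = max(disasters$Date)),
      actionButton("update", "Update"),
      numericInput("lat", "Latitude", value = 0),
      numericInput("lon", "Longitude", value = 0),
      actionButton("predict", "Predict"),
      textOutput("prediction")
    ),
    mainPanel(
      plotOutput("type_plot"),
      plotOutput("magnitude_plot"),
      leafletOutput("map")
    )
  )
)

server <- function(input, output, session) {
  # Re-filter only when "Update" is clicked (and once at startup).
  filtered_data <- eventReactive(input$update, {
    keep <- disasters$Disaster_Type == input$type &
      disasters$Date >= input$dates[1] & disasters$Date <= input$dates[2]
    disasters[keep, ]
  }, ignoreNULL = FALSE)

  output$type_plot <- renderPlot({
    d <- filtered_data()
    req(nrow(d) > 0)
    barplot(table(d$Disaster_Type), main = "Disaster types")
  })

  output$magnitude_plot <- renderPlot({
    d <- filtered_data()
    d <- d[!is.na(d$Magnitude), ]
    req(nrow(d) > 0)
    plot(d$Date, d$Magnitude, type = "b", xlab = "Date", ylab = "Magnitude")
  })

  output$map <- renderLeaflet({
    d <- filtered_data()
    req(nrow(d) > 0)
    leaflet(d) %>% addTiles() %>%
      addCircleMarkers(lng = ~Longitude, lat = ~Latitude)
  })

  observeEvent(input$predict, {
    # Placeholder: call predict() on the trained model here.
    output$prediction <- renderText(
      sprintf("Prediction requested for (%.2f, %.2f)", input$lat, input$lon)
    )
  })
}

app <- shinyApp(ui, server)
if (interactive()) runApp(app)  # launch only in an interactive session
```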