
Capstone Project: A Deep Learning Tool to Bridge the Situational Awareness Gap in Disaster Management

Project Background

The early stages of a disaster are characterized by a dynamic, rapidly evolving environment. This makes it extremely difficult to assess a disaster's initial conditions, especially when the information available for decision-making is limited. Understanding these initial conditions is an important step in identifying the possible causes of current events and the direction in which they are evolving. Establishing situational awareness as early as possible during a disaster is therefore a crucial step in planning for mitigation, response and recovery [1]. Without adequate situational awareness, search and rescue operations and damage assessments become very difficult for any teams sent on-site, as they lack up-to-date information on the disaster [2]. To understand the situation on the ground and appropriately distribute resources to affected areas, disaster management teams need sufficient information on the disaster, affected areas, damages and losses, affected people, areas to be evacuated and available resources [3].

Social media and microblogging platforms, particularly Twitter, are a suitable alternative source of real-time information on victims, their needs and their situation. Social media streams are considered important because of their fast, frequent and localised updates on disaster events [4]. By the nature of social media, people can broadcast their views in real time to a multitude of people. This real-time quality makes it an important tool for disaster management teams, who can use it to relay crucial information to the public, acquire status information, gather sensitive information about a disaster from the public, facilitate early-warning detection and help coordinate relief efforts [5], [6].

Problem Statement

A slow flow of information to humanitarian organizations during the early stages of a disaster can stall their response. Given the dynamic nature of events at this stage, this can have negative effects on the entire disaster response operation. Despite this, affected individuals share a lot of information about their status or their environment on social media platforms, mostly with the intention of updating loved ones or informing others about current events. This makes social media a potential alternative source of disaster-related information that humanitarian organizations can turn to for real-time, localized updates to gain situational awareness.


Fig 1. Disaster Tracker Logo

Disaster Tracker is a tool that filters real-time disaster-related tweets, uses machine learning models to process the tweets and displays tweet insights on a crowdmap to serve as a common operational picture (COP) for disaster response teams.

Project Objectives

  1. Collect data from multiple sources, i.e. the Twitter API, CrisisLex and CrisisNLP, and preprocess it.
  2. Build a machine learning pipeline with two models: the first filters tweets based on their informativeness with regard to a disaster, and the second classifies each tweet into one of 5 categories depending on its message and theme.
  3. Build a dashboard/crowdmap that reports on insights gathered through the machine learning pipeline.

Data Collection

To develop the machine learning models and evaluate their performance, supervised learning approaches are employed. The training data therefore needs to be annotated with the correct labels. Annotated tweets are collected from the CrisisLex and CrisisNLP repositories, which contain large collections of data gathered through the Twitter API and labelled for supervised machine learning. This project makes use of several datasets from these repositories:

  1. CrisisLexT26

This is a collection of tweets from 26 disasters that occurred between 2012 and 2013. The tweets are labelled based on their informativeness, source and information type.

  2. CrisisNLP

This is a collection of tweets from different disasters that occurred between 2013 and 2015, labelled using different categories depending on the nature of the disaster. The CrisisNLP datasets used are:

     1. SWDM2013_dataset
     2. ISCRAM2013_dataset
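For the filtering model, the informativeness annotations from these repositories have to be collapsed into a binary relevant/irrelevant target. A minimal sketch of that mapping, with illustrative label strings (check the actual dataset files for the exact values):

```python
# Map CrisisLexT26-style informativeness labels to the binary
# relevant/irrelevant target used by the filtering model.
# The label strings below are illustrative, not guaranteed to match
# the dataset files exactly.
RELEVANT_LABELS = {"Related and informative", "Related - but not informative"}

def to_binary_label(informativeness: str) -> int:
    """Return 1 for disaster-related (relevant) tweets, 0 otherwise."""
    return 1 if informativeness in RELEVANT_LABELS else 0

rows = [
    ("Bridge collapsed on Main St, people trapped", "Related and informative"),
    ("Check out my new playlist!", "Not related"),
]
labeled = [(text, to_binary_label(tag)) for text, tag in rows]
```

Whether "related but not informative" tweets count as relevant is a design choice for the filtering stage; the set above is one plausible option.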

Machine Learning Models

  1. Filtering Model

The first model we developed and trained was the filtering model. Its purpose is to identify whether a tweet is disaster-related or not. If a tweet is classified as irrelevant, it is discarded. If it is disaster-related, i.e. relevant, it is passed on to the next stage of the machine learning pipeline.

Fig 2. Data collection and preprocessing pipeline for the filtering model

We trained and tuned 4 different model architectures for the filtering model. This allowed us to evaluate and compare the results and pick the best-performing architecture as our filtering model. The best-performing architecture comprises a distilBERT layer, followed by BiLSTM (Bidirectional Long Short-Term Memory) layers, an Attention layer and fully connected dense layers. In this model, we fine-tune the distilBERT layer by allowing its parameters to be trainable.
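The Attention layer in this architecture pools the BiLSTM's per-token outputs into a single vector before the dense layers. A minimal additive-attention sketch in NumPy (dimensions and weights are illustrative, not the trained model's):

```python
import numpy as np

def attention_pool(h, w, b, u):
    """Additive attention over a sequence of hidden states.

    h: (seq_len, hidden) BiLSTM outputs for one tweet
    w, b, u: learned projection parameters (random here, for illustration)
    Returns one (hidden,) vector: a weighted average of the states.
    """
    scores = np.tanh(h @ w + b) @ u       # one score per token, shape (seq_len,)
    alpha = np.exp(scores - scores.max())
    alpha = alpha / alpha.sum()           # softmax -> weights sum to 1
    return alpha @ h                      # weighted sum of hidden states

rng = np.random.default_rng(0)
h = rng.normal(size=(12, 8))              # 12 tokens, hidden size 8
w = rng.normal(size=(8, 8))
b = np.zeros(8)
u = rng.normal(size=8)
context = attention_pool(h, w, b, u)      # (8,) vector fed to dense layers
```

Because the weights form a convex combination, the pooled vector always lies within the range of the per-token hidden states, which is what lets the model focus on the most informative tokens.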

Fig 3. Best performing model architecture for the filtering model

This model achieved an accuracy of 84%. Accuracy is the ratio of correct predictions to the total number of predictions; a higher accuracy score is better. Below is a plot of the metrics (accuracy, precision, recall and f1-score) used to evaluate the different model architectures.
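All four metrics follow from the binary confusion counts. A small worked example with made-up counts (not the project's actual results):

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from binary confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)            # of predicted positives, how many were right
    recall = tp / (tp + fn)               # of actual positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Illustrative counts chosen so accuracy lands at 0.84, like the reported model.
acc, prec, rec, f1 = classification_metrics(tp=420, fp=80, fn=80, tn=420)
```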

Fig 4. Plot of model accuracy, precision, recall and f1-scores of the different model architectures for the filtering model

From the evaluation results, the BERT (Bidirectional Encoder Representations from Transformers) models outperform all the other model architectures. This is likely because the BERT architectures use transfer learning from pre-trained BERT models, which have been pre-trained on very large corpora and are good at identifying context within a text. For our models, we used distilBERT, a lightweight BERT variant that is easier to deploy and faster to train. The BERT_LSTM_ATTENTION_Tuning model improved performance by 7% over the first CNN model we trained. Despite using transfer learning, this model tunes its weights to the current classification task and is hence able to achieve higher accuracy than the other, untuned BERT models. It also registers an increase in recall, precision and f1-score, indicating less bias in identifying whether a tweet is relevant or irrelevant compared to the other architectures.

  2. Categorizing Model

The categorizing model was developed and trained after the filtering model was completed. It aims to categorize tweets into 5 different humanitarian response activity categories based on the messages of the tweets.
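At prediction time, the categorizing model outputs one score per category and the tweet is assigned the highest-scoring one. A sketch of that final step, with hypothetical category names (the exact five labels come from the training data and may differ):

```python
import numpy as np

# Hypothetical category names -- placeholders, not the project's actual labels.
CATEGORIES = ["affected_people", "infrastructure_damage",
              "donations_and_volunteering", "caution_and_advice",
              "other_useful_information"]

def predict_category(logits):
    """Softmax the model's raw output scores and return the top category."""
    p = np.exp(logits - np.max(logits))   # subtract max for numerical stability
    p = p / p.sum()                       # probabilities sum to 1
    return CATEGORIES[int(np.argmax(p))], p

label, probs = predict_category(np.array([0.2, 2.1, -0.5, 0.4, 0.0]))
```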

Fig 5. Data collection and preprocessing pipeline for the categorizing model

We created, trained and evaluated 5 different model architectures for the categorizing model. The best-performing architecture comprises a distilBERT layer, followed by BiLSTM layers, an Attention layer and fully connected dense layers. In this model, we fine-tune the distilBERT layer by allowing its parameters to be trainable.

Fig 6. Best performing model architecture for the categorizing model

This model also achieved an accuracy of 84%.

Fig 7. Plot of model accuracy, precision, recall and f1-scores of the different model architectures for the categorizing model

From the evaluation results on the categorizing model, the BERT models outperformed the Base Model and BiLSTM model on all the metrics. As discussed for the filtering model, BERT models are better at identifying context in text and hence perform better at categorizing the tweets. As with the filtering model, we used a distilBERT model in all the BERT-based architectures to allow faster training and easier deployment. The best-performing architecture, BERT_LSTM_Attention_Tuning, features a fine-tuned BERT model with BiLSTM layers, an Attention head and dense layers. The BERT model is fine-tuned, meaning its layers are unfrozen and its weights are updated during training, allowing it to better understand the context of our dataset and accurately categorize the tweets. The BERT_LSTM_Attention_Tuning model registered an 11% increase in accuracy and a 10% increase in precision, recall and f1-score compared to the Base model, indicating that it is much better at understanding the message of the tweets and correctly categorizing them.

System Design

Fig 8: UML Sequence Diagram

The sequence diagram illustrates how actions are performed over time, describing the sequential execution of events and processes in the system. The first process is initiated by the user, who enters into the Disaster Tracker web app the hashtags and country for which they wish to stream tweets. The app submits this information to the Disaster Tracker API, which sends a success message in response to the request. The API then streams tweets from the Twitter API. This is an asynchronous process, so it does not wait for results to be retrieved before responding; it continues in a loop and does not automatically end its execution. The web app requests tweets from the API and receives preprocessed tweets as a response. This step executes every 10 seconds in a loop.

Web App Design

  1. Front-End

The home page features a form that collects two pieces of information from the user that we use to stream tweets: hashtag(s) or topic(s) and a country. On submission, the front-end application makes a POST request to our API, which returns a success message if the request completes successfully. On receipt of a success message, the app redirects to the dashboard page and sends a GET request to the API to retrieve any streamed tweets. The app sends a GET request to the API every 10 seconds to collect newly streamed tweets and ensure the data on the app is real-time.

Fig 9. The homepage of the application features a form to collect information to stream the tweets.

The dashboard is an interactive tool that features a crowdmap, distribution plots of several properties of the tweets, and a tweet list. The crowdmap plots each tweet on a map using its coordinates and identifies the category of each tweet with a different color. The crowdmap, which is built on Google Maps, allows you to drill down to the exact location of a tweet and identify landmarks, buildings and the surrounding area from which the tweet was shared. The dashboard also features a bar plot showing the distribution of the tweets across the 6 categories into which they were classified. A list of the tweets is also available to provide more information about the tweets as they stream in.
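Each point on the crowdmap comes from a classified tweet that the back-end has converted to GeoJSON. A minimal sketch of that conversion, with illustrative property names (not the project's exact schema):

```python
def tweet_to_feature(text, category, lon, lat, author=None):
    """Convert one classified tweet into a GeoJSON Feature.

    GeoJSON orders coordinates as [longitude, latitude].
    The property names here are illustrative placeholders.
    """
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {"text": text, "category": category, "author": author},
    }

# Example: a tweet geotagged in Nairobi (hypothetical data).
feature = tweet_to_feature("Road flooded near the market", "caution_and_advice",
                           lon=36.8219, lat=-1.2921)
```

Keeping the category in the Feature's properties is what lets the map color each point by its predicted humanitarian response activity.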

Fig 10. The dashboard page showing the crowdmap and a section of the tweets list
Fig 12. Zoomed in image of a tweet location showing the surrounding area, landmarks and roads
Fig 13. Tweets list section. Tweets are displayed with the author, date, tweet text and predicted category.
Fig 14. Image showing the category distribution of tweets.

  2. Back-End

The API has a single endpoint that can be accessed using two HTTP methods: GET and POST. The first call from the front-end application uses the POST method, when the user submits their hashtag(s) or topic(s) and country information. This information is retrieved from the body of the request object. A new thread is created and initialized with it; this thread is responsible for streaming tweets from the Twitter streaming API and passing them through a preprocessing pipeline and a machine learning pipeline before converting them to GeoJSON and storing them temporarily. The new thread is then started, and a success message is sent to the front-end application to indicate successful submission of the information and initialization of the streaming process. The main thread then completes its execution.

The next API calls come after the successful submission and initialization of the streaming thread. After the front-end receives the success message, it sends a GET request to the same API endpoint every 10 seconds. Each GET request returns the data that has been temporarily stored in the API by the streaming thread.
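Independent of the web framework, the streaming-thread-plus-buffer pattern described above can be sketched as follows (the fake stream function stands in for the real Twitter streaming and ML pipeline):

```python
import queue
import threading

class TweetStream:
    """A background thread pushes processed tweets into a buffer,
    and each GET request drains whatever has accumulated."""

    def __init__(self, stream_fn):
        self._buffer = queue.Queue()
        # daemon=True so the streaming thread never blocks app shutdown
        self._thread = threading.Thread(target=stream_fn,
                                        args=(self._buffer,), daemon=True)

    def start(self):
        self._thread.start()

    def drain(self):
        """Return everything buffered since the last call (the GET handler)."""
        items = []
        while True:
            try:
                items.append(self._buffer.get_nowait())
            except queue.Empty:
                return items

# Usage: a stand-in stream that emits two tweets and then finishes.
def fake_stream(buf):
    for text in ["tweet one", "tweet two"]:
        buf.put(text)

stream = TweetStream(fake_stream)
stream.start()
stream._thread.join()   # only for this demo; the real thread loops indefinitely
collected = stream.drain()
```

Using a thread-safe queue is one way to let the POST handler return immediately while the long-running stream keeps producing data for later GET requests.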

Analysis and Discussion

The evaluation results from the filtering model strongly suggest that machine learning models can, within some degree of error, accurately identify the relevance of tweets to disasters. This can be applied when collecting data from tweet streams, e.g. the Twitter streaming API, to reduce the amount of noise in the data and allocate resources only to relevant data, greatly improving the accuracy and speed of processing and analysis. Our filtering model was biased towards irrelevant tweets: it was much better at identifying irrelevant tweets than relevant ones. One possible reason is the use of an imbalanced dataset. The training data had relatively more samples of irrelevant tweets than relevant ones, so the model may have optimized its weights in favour of the irrelevant class, which it had more samples to train on.

Results from the categorizing model suggest that it is possible to use machine learning models to correctly and automatically categorize disaster-related tweets into humanitarian response activities based on their messages. This allows us to build a robust system that streams live disaster tweets, filters out the noise and categorizes them accordingly. This process, despite its error rate, is much more time- and resource-efficient than using human annotators for these tasks.


We recommend this system to:

  1. Humanitarian Organizations and Disaster Response Teams

Our system, Disaster Tracker, will allow humanitarian organizations and disaster response teams to stay up-to-date with events during the early stages of a disaster. The system gives these teams an overview of prevailing events as they happen on a crowdmap, tagged with the appropriate response activities that the messages should trigger. The crowdmap allows the teams to quickly pinpoint the locations of tweets down to exact streets, roads and buildings. Gaining situational awareness is important for these teams, as it allows them to better prepare and prioritize their resources based on the situation.


Conclusion

The aim of this project was to build a tool that helps disaster response teams gain better situational awareness of a disaster using tweets as an alternative data source. To do so, we built two machine learning models that process the tweets. The first identifies the relevance of a tweet, just as a human annotator would, and discards irrelevant tweets, i.e. tweets not related to a disaster. The second categorizes each tweet into one of five humanitarian response activities. The models were deployed and integrated into a system that streams live tweets, processes them and maps them on a crowdmap using their geographic location. Basic insights about the tweets and the information they contain are analysed and displayed on a dashboard alongside the crowdmap. The results support the viability of Twitter and microblogs as alternative sources of information for disaster response teams during a disaster. The biggest challenge was the lack of geographical information on the majority of the streamed tweets, which calls for further research into how this can be improved. This project has paved the way for more work to improve the efficiency of disaster response teams using Twitter as a data source.

GitHub Repo

All the code from this project has been uploaded to GitHub. You can access the repository at the following link:


[1] D. Johnson, A. Zagorecki, J. M. Gelman, and L. K. Comfort, “Improved Situational Awareness in Emergency Management through Automated Data Analysis and Modeling,” vol. 8, no. 1, 2011, doi: 10.2202/1547-7355.1873.

[2] D. Yang et al., “Providing real-time assistance in disaster relief by leveraging crowdsourcing power,” Pers. Ubiquitous Comput., vol. 18, no. 8, pp. 2025–2034, 2014, doi: 10.1007/s00779-014-0758-3.

[3] L. Cheng, J. Li, K. S. Candan, and H. Liu, “Tracking Disaster Footprints with Social Streaming Data,” Proc. AAAI Conf. Artif. Intell., vol. 34, no. 01, pp. 370–377, 2020, doi: 10.1609/aaai.v34i01.5372.

[4] M. S. Dao, P. Quang Nhat Minh, A. Kasem, and M. S. Haja Nazmudeen, “A context-aware late-fusion approach for disaster image retrieval from social media,” ICMR 2018 – Proc. 2018 ACM Int. Conf. Multimed. Retr., no. April, pp. 266–273, 2018, doi: 10.1145/3206025.3206047.

[5] Y. Kryvasheyeu et al., “Rapid assessment of disaster damage using social media activity,” Sci. Adv., vol. 2, no. 3, 2016, doi: 10.1126/sciadv.1500779.

[6] J. P. Singh, Y. K. Dwivedi, N. P. Rana, A. Kumar, and K. K. Kapoor, “Event classification and location prediction from tweets during disasters,” Ann. Oper. Res., vol. 283, no. 1–2, pp. 737–757, 2019, doi: 10.1007/s10479-017-2522-3.

[7] A. Olteanu, S. Vieweg, and C. Castillo, “What to Expect When the Unexpected Happens: Social Media Communications Across Crises.”

[8] M. Imran, P. Mitra, and C. Castillo, “Twitter as a lifeline: Human-annotated Twitter corpora for NLP of crisis-related messages,” Proc. 10th Int. Conf. Lang. Resour. Eval. Lr. 2016, no. May, pp. 1638–1643, 2016.

[9] M. Imran, C. Castillo, F. Diaz, and P. Meier, “Practical Extraction of Disaster-Relevant Information from Social Media,” pp. 1–4.

[10] M. Imran, S. Elbassuoni, C. Castillo, F. Diaz, and P. Meier, “Extracting information nuggets from disaster- Related messages in social media,” ISCRAM 2013 Conf. Proc. – 10th Int. Conf. Inf. Syst. Cris. Response Manag., no. May, pp. 791–801, 2013.
