3. Model Deployment w/ Streamlit | Twitter (X.com) Sentiment Analysis
End-to-end machine learning project on sentiment analysis. In this post, we will walk through the steps of creating a dashboard and deploying our model as a web app with Streamlit.
Feb 25, 2025 · Timotius Marselo
In the previous post, we built an LSTM model that can predict the sentiment of a tweet. Now we will create a dashboard and deploy our model using Streamlit. Our Streamlit app will automatically scrape the latest tweets based on the entered search term, perform inference in real time, and visualize the results.
Creating Dashboard with Streamlit
First we will make a folder with the following structure:
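```
.
├── app.py
├── helper_functions.py
├── requirements.txt
└── static/
    ├── lstm_model.h5
    ├── tokenizer.pickle
    ├── en_stopwords.txt
    ├── en_stopwords_viz.txt
    ├── twitter_mask.png
    └── quartzo.ttf
```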
The app.py file will contain the code for the dashboard design, while helper_functions.py will contain the code for retrieving the tweets, preprocessing them, performing inference, and visualizing the results. The requirements.txt file will list the libraries that need to be installed. The static folder contains the supporting files: the trained model (lstm_model.h5), the tokenizer (tokenizer.pickle), the stopwords for the sentiment prediction pipeline (en_stopwords.txt), the stopwords for visualization (en_stopwords_viz.txt), the Twitter image for the wordcloud (twitter_mask.png), and the font for the wordcloud (quartzo.ttf). All of the supporting files are available in the GitHub repository.
helper_functions.py
We will populate helper_functions.py with the necessary functions. First we will import all the necessary libraries.
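A minimal set of imports such a helper module might need is sketched below; the exact list depends on the functions that follow.

```python
# helper_functions.py -- illustrative imports; adjust to match your own code
import re
import pickle

import numpy as np
import pandas as pd
import plotly.express as px
import matplotlib.pyplot as plt
from matplotlib.colors import LinearSegmentedColormap
from wordcloud import WordCloud
from PIL import Image
import snscrape.modules.twitter as sntwitter
from sklearn.feature_extraction.text import CountVectorizer
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.sequence import pad_sequences
```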
Then we will create the get_latest_tweet_df function, which retrieves the latest tweets based on the entered search term and the number of tweets requested. We will use the snscrape library to retrieve the tweets, which does not require the Twitter API. The get_latest_tweet_df function returns a dataframe containing the username of the poster, the date, the number of likes, and the tweet itself.
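A sketch of what this function might look like is shown below; the tweet attribute names (such as rawContent) vary between snscrape versions, so treat this as illustrative rather than the exact implementation.

```python
# Illustrative sketch of get_latest_tweet_df; column names and tweet attributes are assumptions
def get_latest_tweet_df(search_term, num_tweets):
    tweets = []
    scraper = sntwitter.TwitterSearchScraper(search_term)
    for i, tweet in enumerate(scraper.get_items()):
        if i >= num_tweets:
            break
        tweets.append([tweet.user.username, tweet.date, tweet.likeCount, tweet.rawContent])
    return pd.DataFrame(tweets, columns=["Username", "Date", "Likes", "Tweet"])
```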
Next we will create the text_preprocessing function, which preprocesses the tweets. The preprocessing steps were explained in the previous post.
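As a reminder, the pipeline roughly follows the sketch below; the exact cleaning steps and regular expressions are those described in the previous post, so this is only an approximation.

```python
# Rough sketch of text_preprocessing; the exact steps follow the previous post
with open("static/en_stopwords.txt") as f:
    stop_words = set(f.read().splitlines())

def text_preprocessing(text):
    text = text.lower()
    text = re.sub(r"http\S+|www\.\S+", "", text)  # remove URLs
    text = re.sub(r"@\w+", "", text)              # remove mentions
    text = re.sub(r"[^a-z\s]", "", text)          # keep letters only
    return " ".join(w for w in text.split() if w not in stop_words)
```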
Next we will create the predict_sentiment function, which acts as the inference pipeline and takes the tweet dataframe (output of the get_latest_tweet_df function) as input. The function first preprocesses the tweets by applying the text_preprocessing function we created earlier, then converts the preprocessed tweets into sequences of integers using the pre-fitted tokenizer. The sequences are padded to the same length as the training data and passed to the pre-trained model for inference. The predict_sentiment function returns the original dataframe with the predicted score and sentiment added.
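A condensed version of this pipeline might look as follows; MAX_LENGTH is a placeholder for the padding length used during training, and the 0.5 decision threshold is an assumption.

```python
# Sketch of the inference pipeline; MAX_LENGTH and the 0.5 threshold are assumptions
model = load_model("static/lstm_model.h5")
with open("static/tokenizer.pickle", "rb") as f:
    tokenizer = pickle.load(f)
MAX_LENGTH = 60  # placeholder: use the sequence length from training

def predict_sentiment(tweet_df):
    tweet_df["Cleaned"] = tweet_df["Tweet"].apply(text_preprocessing)
    sequences = tokenizer.texts_to_sequences(tweet_df["Cleaned"])
    padded = pad_sequences(sequences, maxlen=MAX_LENGTH)
    scores = model.predict(padded).flatten()
    tweet_df["Score"] = scores
    tweet_df["Sentiment"] = np.where(scores >= 0.5, "Positive", "Negative")
    return tweet_df
```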
Now we will create some functions related to visualization. First we will create the plot_sentiment function, which plots the number of positive and negative tweets in a pie chart. The function takes the tweet dataframe (output of the predict_sentiment function) as input and returns a Plotly figure.
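For example (the donut hole and the colors are illustrative choices):

```python
# Sketch of plot_sentiment: a Plotly pie chart of the sentiment counts
def plot_sentiment(tweet_df):
    sentiment_count = tweet_df["Sentiment"].value_counts()
    fig = px.pie(
        values=sentiment_count.values,
        names=sentiment_count.index,
        hole=0.3,
        color=sentiment_count.index,
        color_discrete_map={"Positive": "#1F77B4", "Negative": "#FF7F0E"},  # illustrative colors
    )
    return fig
```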
Next we will create the plot_wordcloud function, which shows the top words in the form of a wordcloud. The function takes the tweet dataframe (output of the predict_sentiment function) and a colormap as inputs and returns a matplotlib figure. First we load the stopwords from en_stopwords_viz.txt, a list of words that we want to exclude from the wordcloud. We use a different stopword list for visualization than for text preprocessing because we want to remove more words for visualization purposes. We also load the image from twitter_mask.png to use as the mask for the wordcloud, and the font from quartzo.ttf. Next we create a custom colormap using LinearSegmentedColormap from the matplotlib library; we do not use the default colormap because we want to control the color intensity of the wordcloud. Finally, we generate a wordcloud from the word frequencies of the combined preprocessed tweets and return it as a matplotlib figure.
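Put together, and using the imports from the top of the module, the function might look roughly like this; the 0.4–1.0 slice of the colormap and the WordCloud settings are assumptions.

```python
# Sketch of plot_wordcloud; the colormap slice and WordCloud settings are assumptions
def plot_wordcloud(tweet_df, colormap="Greens"):
    with open("static/en_stopwords_viz.txt") as f:
        stopwords_viz = set(f.read().splitlines())
    mask = np.array(Image.open("static/twitter_mask.png"))
    # cut off the lightest part of the colormap to control color intensity
    colors = plt.get_cmap(colormap)(np.linspace(0.4, 1.0, 256))
    custom_cmap = LinearSegmentedColormap.from_list("custom", colors)
    text = " ".join(tweet_df["Cleaned"])
    wc = WordCloud(
        background_color="white",
        font_path="static/quartzo.ttf",
        stopwords=stopwords_viz,
        mask=mask,
        colormap=custom_cmap,
        max_words=100,
    ).generate(text)
    fig, ax = plt.subplots()
    ax.imshow(wc, interpolation="bilinear")
    ax.axis("off")
    return fig
```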
We also want to visualize the top words and bigrams in the form of bar charts. First we create the get_top_n_gram function, which takes the tweet dataframe (output of the predict_sentiment function), an n-gram range (e.g. (1, 1) for words and (2, 2) for bigrams), and the number of top n-grams as inputs, and returns a dataframe containing the top n-grams and their frequencies. We then create the plot_top_n_gram function, which takes the n-gram dataframe (output of the get_top_n_gram function), a title, and a color as inputs, and plots the top n-grams as a bar chart.
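One way to implement this pair of functions is with scikit-learn's CountVectorizer to count the n-grams; the column names and styling below are illustrative.

```python
# Sketch of the n-gram helpers; column names and styling are illustrative
def get_top_n_gram(tweet_df, ngram_range, n=10):
    vectorizer = CountVectorizer(ngram_range=ngram_range, stop_words="english")
    counts = vectorizer.fit_transform(tweet_df["Cleaned"])
    freqs = counts.sum(axis=0).A1
    words = vectorizer.get_feature_names_out()
    df = pd.DataFrame({"Words": words, "Counts": freqs})
    return df.sort_values("Counts", ascending=False).head(n)

def plot_top_n_gram(n_gram_df, title, color):
    fig = px.bar(n_gram_df, x="Counts", y="Words", title=title, orientation="h")
    fig.update_traces(marker_color=color)
    fig.update_layout(yaxis={"categoryorder": "total ascending"})
    return fig
```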
app.py
We will design our dashboard by populating the app.py file. First we will import the necessary libraries, including the helper_functions module that we created earlier.
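For example:

```python
# app.py -- illustrative imports
import streamlit as st
import helper_functions as hf
```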
Next we will set the page configuration (title, icon and layout) of the app.
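For instance (the title, icon, and layout values here are just examples):

```python
# Page configuration; title, icon, and layout values are illustrative
st.set_page_config(
    page_title="Twitter Sentiment Analyzer",
    page_icon="🐦",
    layout="wide",
)
```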
We also want to adjust the layout of the app to make the interface more visually appealing. We will use HTML and CSS to adjust the top padding of the block container.
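One way to do this is to inject a small style block with st.markdown; the padding value below is an arbitrary choice.

```python
# Reduce the default top padding of the main block container; the value is arbitrary
adjust_top_pad = """
    <style>
        div.block-container { padding-top: 1rem; }
    </style>
"""
st.markdown(adjust_top_pad, unsafe_allow_html=True)
```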
Now we will define the callback function for the 'Search' button (which we will discuss next). Whenever the 'Search' button is clicked, the search_callback function is called. It retrieves the latest tweets (as a dataframe) by passing the entered search term and number of tweets to the get_latest_tweet_df function, and then performs inference by calling the predict_sentiment function.
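A sketch of the callback, assuming the widget keys search_term and num_tweets that the sidebar form below defines:

```python
# Callback for the 'Search' button; the widget keys are set by the sidebar form below
def search_callback():
    st.session_state.df = hf.get_latest_tweet_df(
        st.session_state.search_term, st.session_state.num_tweets
    )
    st.session_state.df = hf.predict_sentiment(st.session_state.df)
```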
Next we will use st.sidebar to create a sidebar for the app. First we will include some information about the app at the top of the sidebar. Then we will create a form (using st.form) for the user to enter the search term (using st.text_input) and the number of tweets (using st.slider). By using st.form, the input widgets are grouped and submitted as a batch once the user clicks the 'Search' button (st.form_submit_button). Clicking the 'Search' button updates the session state (associated with each widget's key) and triggers the search_callback function that we created earlier. Lastly, we will include the GitHub link of the project and the author's name at the bottom of the sidebar.
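A condensed version of the sidebar; the slider range and texts are illustrative, and the repository URL is a placeholder.

```python
# Sidebar with a form that batches the inputs until 'Search' is clicked
with st.sidebar:
    st.title("Twitter Sentiment Analyzer")
    st.markdown("Scrapes the latest tweets for a search term and predicts their sentiment.")
    with st.form(key="search_form"):
        st.text_input("Search term", key="search_term")
        st.slider("Number of tweets", min_value=100, max_value=2000, step=100, key="num_tweets")
        st.form_submit_button(label="Search", on_click=search_callback)
    st.markdown("[GitHub repository](https://github.com/...)")  # placeholder link
```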
Now we will design the main part of the app. First we check whether st.session_state.df exists. If it does, the user has already clicked the 'Search' button and the search_callback function has been called, so we display the dashboard. The dashboard is created with the make_dashboard function, which takes the tweet dataframe, a bar chart color, and a wordcloud colormap as inputs. The make_dashboard function creates a container with two rows: the first row contains the pie chart of the sentiment distribution, a bar chart of the top 10 words, and a bar chart of the top 10 bigrams; the second row contains the dataframe (with the tweets and their sentiments) and the wordcloud. We rely on the functions from helper_functions to create the visualizations. We use st.tabs to create three tabs in the dashboard: 'All', 'Positive', and 'Negative'. Each tab displays the dashboard for the tweets in that sentiment category.
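A sketch of this layout; the column width ratios, colors, and colormap names are illustrative choices.

```python
# Sketch of make_dashboard and the three sentiment tabs; widths and colors are illustrative
def make_dashboard(tweet_df, bar_color, wc_color):
    # first row: sentiment pie chart, top 10 words, top 10 bigrams
    col1, col2, col3 = st.columns([28, 34, 38])
    with col1:
        st.plotly_chart(hf.plot_sentiment(tweet_df), use_container_width=True)
    with col2:
        top_unigram = hf.get_top_n_gram(tweet_df, ngram_range=(1, 1), n=10)
        st.plotly_chart(hf.plot_top_n_gram(top_unigram, "Top 10 Words", bar_color),
                        use_container_width=True)
    with col3:
        top_bigram = hf.get_top_n_gram(tweet_df, ngram_range=(2, 2), n=10)
        st.plotly_chart(hf.plot_top_n_gram(top_bigram, "Top 10 Bigrams", bar_color),
                        use_container_width=True)
    # second row: tweets table and wordcloud
    col1, col2 = st.columns([60, 40])
    with col1:
        st.dataframe(tweet_df[["Sentiment", "Tweet"]], height=350)
    with col2:
        st.pyplot(hf.plot_wordcloud(tweet_df, colormap=wc_color))

if "df" in st.session_state:
    tab_all, tab_pos, tab_neg = st.tabs(["All", "Positive", "Negative"])
    with tab_all:
        make_dashboard(st.session_state.df, bar_color="#54A24B", wc_color="Greens")
    with tab_pos:
        make_dashboard(st.session_state.df.query("Sentiment == 'Positive'"),
                       bar_color="#1F77B4", wc_color="Blues")
    with tab_neg:
        make_dashboard(st.session_state.df.query("Sentiment == 'Negative'"),
                       bar_color="#FF7F0E", wc_color="Oranges")
```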
requirements.txt
We will populate the requirements.txt file with the libraries required for the Streamlit app. The versions of the libraries are pinned to ensure that the app runs as expected.
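The list will look roughly like the following; pin each library to the version installed in your local environment (for example, taken from pip freeze).

```
streamlit
pandas
numpy
plotly
matplotlib
wordcloud
scikit-learn
snscrape
tensorflow
Pillow
```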
Deploying the Streamlit App
Testing the App Locally
To test our Streamlit app locally, we change the directory to the folder containing app.py and run the following command in the terminal:
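```
streamlit run app.py
```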
If the app runs successfully, the browser will automatically open the app.
We can test the app by entering a search term and number of tweets to retrieve. After we click the 'Search' button, the app should show the dashboard.
Deploying the App to Streamlit Community Cloud
First we need to upload the project folder to GitHub. Then we go to Streamlit Community Cloud and click the 'Get Started' button. After signing in and connecting our GitHub account, we click the 'New App' button and enter the repository that we created earlier, along with the branch and main file path.
After clicking the 'Deploy' button, the app will be deployed to Streamlit Community Cloud (the initial deployment might take a few minutes). We can open the app by clicking the app name.
In this post, we have successfully built our Streamlit app and deployed it on Streamlit Community Cloud. If you have been following this series from part 1, we have now gone through an end-to-end machine learning project: data collection and preprocessing, model building, creating a dashboard, and finally deploying the model and dashboard as an online application. Hopefully you learned something from the series!
Further reading
1. Data Collection | Twitter (X.com) Sentiment Analysis
End-to-end machine learning project on sentiment analysis. In this post, we will walk through the data collection process with distant supervision method.
2. NLP, Sentiment Model Building | Twitter (X.com) Sentiment Analysis
End-to-end deep learning with LSTM on tweets from X.com. In this post, we will walk through the steps of data preprocessing and model building.
4. Use Cases | Twitter (X.com) Sentiment Analysis
In this final installment, we will explore some use cases of Twitter sentiment analysis in the field of business and social science.
Tags: lstm, streamlit, deep learning