This article takes you through the creation of a COVID vaccination rate notifier. This project works by scraping data from the web and then displaying system notifications with Python.
It hasn’t been the greatest year – but there’s light at the end of the tunnel. We’ve waited for the vaccine, and now we wait for people to get the vaccine.
To keep sane through the winter, we’re all watching the vaccination rate slowly climb – hopefully to the point where we can go to the pub again.
The Project Idea
Hitting the refresh button constantly on Wikipedia is no fun, so I figured I’d build a Python script to scrape the data of interest (number of people who have been vaccinated) from
https://en.wikipedia.org/wiki/COVID-19_vaccination_programme_in_the_United_Kingdom
and notify me whenever the number increases with a celebratory message.
The Failed Attempt
Initially, I wanted the number to appear on a 2×16 LCD display I had kicking about. It’d be driven by a Raspberry Pi. I wired it up, drew some schematics ready for publishing… but when I powered it on:
…nothing. Nothing I did would make it work. Either I had it wired wrong, or the display is a dud, so I had to abandon the idea.
Reworking the Concept
I’m not letting a broken display get in the way of Saturday tinkering, so I re-worked the idea.
The script would instead pop up a notification on my computer rather than showing it on an LCD display. Just as useful, just slightly less cool. Oh well.
Requirements
This project assumes you are running Ubuntu with the default GNOME desktop interface.
The script targets the GNOME notification system specifically, but check the comments in the code for details on adapting it to other Linux distributions.
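If you're not on GNOME, one possible route is the notify-send command line tool, which many Linux desktops ship alongside libnotify. The sketch below is an assumption on my part rather than part of the script: the helper names build_notify_command and send_notification are made up for illustration, and nothing is sent unless notify-send is actually installed.

```python
import shutil
import subprocess


def build_notify_command(title, text, timeout_ms=60 * 1000):
    """Build the argument list for notify-send (hypothetical helper)."""
    # --expire-time is given in milliseconds
    return ['notify-send', '--expire-time=' + str(timeout_ms), title, text]


def send_notification(title, text):
    """Send a desktop notification if notify-send is available (hypothetical helper)."""
    if shutil.which('notify-send') is None:
        print('notify-send not found, skipping notification')
        return False
    # check=False: a failed notification shouldn't crash the scraper
    subprocess.run(build_notify_command(title, text), check=False)
    return True
```

Calling send_notification('UK COVID Vaccination Update!', 'Some text') would then stand in for the GNOME-specific notification code.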
Installing Dependencies
If you’re not on Ubuntu, you can follow our installation guide to install it.
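Install Python and the notification bindings using apt – these should already be installed on a default Ubuntu Desktop 20.04 installation:
sudo apt install python3 python3-gi
The scraping libraries are installed with pip3, as noted in the comments at the top of the script:
pip3 install beautifulsoup4 lxml requests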
The Code
As with the previous photo resizer Python project, I’ll let the code do the talking. This code example is well commented and written to be as straightforward as possible (I hope).
# SCREWSCRAPER
# A Python script to scrape data from a table on Wikipedia and present a notification in Ubuntu when the data is updated
# In this case, the current count of people vaccinated for COVID-19 in the United Kingdom is monitored, so you can count down when you're gettin back in the pub
# This script is intended for use with Python 3
# To run this script, run 'python3 scraper.py' in your terminal

# Run the following in terminal to download the required Python Dependencies
# pip3 install beautifulsoup4 lxml requests

# Import required libraries
import requests
import time
from bs4 import BeautifulSoup

# These libraries are used for notifications and may be distribution-specific
# You may need to run
# sudo apt-get install python3-gi
# to install dependencies
import gi
gi.require_version('Notify', '0.7')  # Pin the libnotify binding version so PyGObject doesn't warn about it
from gi.repository import Notify

# Define the URL for the page containing the data we want to extract
webUrl = 'https://en.wikipedia.org/wiki/COVID-19_vaccination_programme_in_the_United_Kingdom#Progress_to_date'

# Define the table caption for the table containing the data we will be looking for
captionSearchText = 'Cumulative totals of first doses'

# Global Variable to hold the value of the number of people vaccinated when pulled from Wikipedia
vaccinated = 0

# The following population data can be hard-coded as it won't be changing in any meaningful way, so won't need to be updated from online sources
# Global Variable to hold the population of the UK as of mid 2019, according to Wikipedia
ukPopulation = 66796807

# According to Wikipedia, in 2011 23.9% of the UK population were between 0-19 years of age.
# As under 18's aren't being vaccinated, we'll exclude them from the calculations and round down to 23%
# which should be a good enough estimate of the percentage of the population which does not get the vaccine due to being too young
ukPopulationUnder18Percent = 23

# Variable to hold how often we want to check for changes to the page, in seconds - I've set it to 6 hours
checkInterval = 60 * 60 * 6

# Initialise notifications for the script
Notify.init('SCREWSCRAPER')


# Function to scrape the data from Wikipedia using requests and BeautifulSoup
def scrapeData():
    print('Scraping Wikipedia data...')

    # Download the page and store it as a text variable using the requests library
    dataSourceHtml = requests.get(webUrl).text

    # Initialise BeautifulSoup with the web data source, and lxml library to parse the HTML webpage we will receive
    # Why lxml? Check out https://www.crummy.com/software/BeautifulSoup/bs4/doc/
    soup = BeautifulSoup(dataSourceHtml, 'lxml')

    # Variable to hold the table containing the data we are looking for (if it is found)
    # Initialised as an empty variable so we can check if it has been populated later
    table = None

    # To find the table containing the data we are looking for, we will search for the table with the matching caption
    # Find all 'caption' tags in our BeautifulSoup instance which contains the HTML from the Wikipedia page
    # For each caption, check the caption text contains the value we are looking for
    # Then, check if parent table is the right HTML class ('wikitable')
    for caption in soup.find_all('caption'):
        if captionSearchText in caption.get_text():  # checking the table caption for a match - if there is one we can do stuff
            table = caption.find_parent('table', {'class': 'wikitable'})

    # If a matching table has been found, the table variable will now be populated
    if table:
        # The value we are looking for should be in the last column, in the last row of the table
        # We can get these by getting all rows/columns and then getting the last one by its index (-1)
        lastRow = table('tr')[-1]
        lastCol = lastRow('td')[-1]

        # Wikipedia formats numbers with commas to separate the thousands, hundreds - remove them from the last Column text and assign it to the latestVaccinated variable
        latestVaccinated = lastCol.getText().replace(',', '')

        # The value pulled from the HTML will still be a string - convert the string to an integer variable
        latestVaccinated = int(latestVaccinated)

        # Cool! Now, to see if the value has increased and pop up a notification if it has
        global vaccinated  # Tell python we'll be reading/updating the global variable rather than trying to redefine it locally
        if latestVaccinated > vaccinated:
            print('Hooray! More people got jabbed!')
            # Update the global vaccinated variable to the new value
            vaccinated = latestVaccinated
            # and fire off an alert
            doAlert()


# Function to create a pop up notification in Ubuntu
# If you're using arch, you can check out https://wiki.archlinux.org/index.php/Desktop_notifications for how to create notifications for your distribution
# If you're using a different distribution, you should be able to find similar documentation in your distribution's documentation
def doAlert():
    # Calculate the percentage of the population that have been vaccinated
    pop = vaccinated / (ukPopulation * ((100 - ukPopulationUnder18Percent) / 100)) * 100  # Lots of brackets to ensure order of operations!

    # Round the calculated value to 2 decimals
    pop = round(pop, 2)

    # Give the notification a title and text
    title = 'UK COVID Vaccination Update!'
    text = str(vaccinated) + ' have been vaccinated! That\'s ' + str(pop) + ' percent of folks!'  # Integer variables cast as a string using str()
    notification = Notify.Notification.new(title, text)

    # Make the notification occur for a full 60 seconds before being automatically dismissed
    notification.set_timeout(60 * 1000)  # 60 * 1000 Milliseconds = 60 seconds
    # More options can be found at https://developer.gnome.org/libnotify/0.7/NotifyNotification.html

    # Add a button to the notification so we can tell it that it is awesome (and easily dismiss it without having to hunt for the little X in the corner of the notification)
    notification.add_action(
        'action_click',
        'Awesome!',
        awesome,  # This references the function below
        None  # Arguments for the action - we have none
    )
    notification.show()


# This function doesn't really do anything, but notification actions need a function to call, so this one just prints 'awesome' to the console
# libnotify passes the notification, the action name and the user data to the callback, so the function has to accept them
def awesome(notification, action, user_data=None):
    print('Awesome!')


# The main function which will run when this script is executed
def main():
    print('Hello! Let\'s begin!')  # Notice the escaped single quote!
    # Loop forever rather than having main() call itself, so the call stack doesn't keep growing
    while True:
        # Run scrapeData() to scrape the data - scrapeData() will call doAlert() when it's found new data
        scrapeData()
        # Wait for the specified interval before checking again
        time.sleep(checkInterval)


# Launch the main() function when this script is executed
# Catch KeyboardInterrupt so that if the application is quitting, notifications can be uninitialised
try:
    main()
except KeyboardInterrupt:
    pass
finally:
    # Done using notifications for now - uninit will clean things up.
    print('Done, cleaning up notifications')
    Notify.uninit()

# End of file, that's all folks!
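To make the percentage sum in doAlert() easier to follow, here is the same calculation as a standalone function with a made-up input. The function name eligible_percent and the figure of 10,000,000 vaccinated are purely illustrative; the population and under-18 figures are the ones hard-coded in the script.

```python
def eligible_percent(vaccinated, population=66796807, under18_percent=23):
    """Percentage of the over-18 population that has been vaccinated."""
    # People old enough to be vaccinated: everyone minus the under-18 share
    eligible = population * (100 - under18_percent) / 100
    return round(vaccinated / eligible * 100, 2)


# Illustrative input only, not real data
print(eligible_percent(10_000_000))
```

Note that the result is a share of the estimated adult population (about 51.4 million people), not of the whole UK population.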
Further Information on Notifications in Ubuntu
If you want to customize your notifications further, you can view the documentation at the link below. The Python library we are using provides a bridge to these notification features:
https://developer.gnome.org/libnotify/0.7/NotifyNotification.html
What if they Change the URL or Table Name on Wikipedia?
Easy! Change the URL and the table caption to search for in the code, and it’ll pick up where it left off.
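If editing the script each time feels fiddly, one option is to read the URL and caption from the command line instead. This is a sketch of that idea, not something the script does: the option names --url and --caption and the parse_settings helper are my own invention, with the current values used as defaults.

```python
import argparse

# Defaults copied from the values hard-coded in the script
DEFAULT_URL = 'https://en.wikipedia.org/wiki/COVID-19_vaccination_programme_in_the_United_Kingdom#Progress_to_date'
DEFAULT_CAPTION = 'Cumulative totals of first doses'


def parse_settings(argv=None):
    """Read the page URL and table caption from command line options (hypothetical helper)."""
    parser = argparse.ArgumentParser(description='Vaccination count notifier')
    parser.add_argument('--url', default=DEFAULT_URL,
                        help='page containing the table to scrape')
    parser.add_argument('--caption', default=DEFAULT_CAPTION,
                        help='text to look for in the table caption')
    return parser.parse_args(argv)


# Example: override only the caption and keep the default URL
settings = parse_settings(['--caption', 'Some new caption'])
print(settings.caption)
print(settings.url)
```

Wired into the script, webUrl and captionSearchText would then be taken from settings.url and settings.caption.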
Running the Script in the Background
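Append the & operator to the end of a Linux shell command to run it in the background. On its own that is not enough to survive closing the terminal, as the shell will usually terminate its background jobs when it exits. Prefix the command with nohup as well and you can close the terminal and still receive alerts from the script when the count is updated.
nohup python3 scraper.py &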
In Action
Here’s how it looks in action!
Conclusion
No more page refreshing!
The script will keep me up to date with vaccination progress. Boris should put a big screen up on Big Ben running this script.
I like your blog! FYI Wikipedia data does not update daily. Right now the data for the UK is 4 days old. I think a much better solution is to just parse this massive JSON endpoint here: https://covid.ourworldindata.org/data/owid-covid-data.json. This is easy to do in just a few lines of Python and is not as brittle as screen scraping. I’ve checked the data for the US at least and it is quite accurate. Cheers!
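As a rough sketch of what the comment above suggests, the number could be pulled from that JSON file instead of the Wikipedia table. I'm assuming the file is a dictionary keyed by ISO country code (GBR for the UK), each entry holding a 'data' list of daily records with an optional 'people_vaccinated' field, so check the live file before relying on that. The function names are hypothetical, and the sample figures are invented so the example runs without a network connection.

```python
import json
import urllib.request

OWID_URL = 'https://covid.ourworldindata.org/data/owid-covid-data.json'


def fetch_owid_data(url=OWID_URL):
    """Download and parse the JSON file (large download, needs network access)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def latest_people_vaccinated(owid_data, iso_code='GBR'):
    """Return the most recent reported 'people_vaccinated' value, or None."""
    records = owid_data.get(iso_code, {}).get('data', [])
    # Walk backwards because not every daily record includes vaccination figures
    for record in reversed(records):
        value = record.get('people_vaccinated')
        if value is not None:
            return int(value)
    return None


# Invented sample in the assumed layout, so the example runs offline
sample = {
    'GBR': {
        'data': [
            {'date': '2021-01-10', 'people_vaccinated': 100.0},
            {'date': '2021-01-11', 'people_vaccinated': 150.0},
            {'date': '2021-01-12'},
        ]
    }
}
print(latest_people_vaccinated(sample))
```

With real data, latest_people_vaccinated(fetch_owid_data()) would take the place of the table scraping in scrapeData().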