How To Scrape Worldwide Coronavirus Data In 5 Minutes With Node.js & Puppeteer
Knowing how to scrape data from websites is a powerful skill to have in today’s data-driven economy. Not only is scraping data useful, but it can also be kind of fun. My goal is to show you how it’s done with as little pain as possible.
To do so, we will use a website that provides worldwide coronavirus data to the public. The information we are interested in lives in a client-side rendered table, which we will traverse row by row to extract the data.
The site can be visited at https://www.worldometers.info/coronavirus/. You will need to scroll down the page a little bit to get to the table.
Breaking Things Down Into Steps
Now that we understand our goal, let’s break it down into the steps we need to take to achieve it.
- Set up the code
- Visit the website
- Wait for the table to load
- Select all of the rows
- Iterate through the rows and extract the data
- Celebrate!
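As a rough preview, here is how those steps might map onto Puppeteer calls. This is a minimal sketch, not the final code: it assumes Puppeteer has been installed with `npm install puppeteer`, and the names `ROW_SELECTOR`, `cleanRows`, and `scrapeRows` are hypothetical. In particular, the selector `table tbody tr` is a placeholder guess, so you would need to inspect the page and swap in the real table's selector.

```javascript
// Sketch only: assumes `npm install puppeteer` has already been run.
// ROW_SELECTOR is a hypothetical placeholder, not the site's confirmed markup.
const ROW_SELECTOR = 'table tbody tr';

// Pure helper: trim every cell and drop rows that contain no text at all.
function cleanRows(rows) {
  return rows
    .map((cells) => cells.map((cell) => cell.trim()))
    .filter((cells) => cells.some((cell) => cell.length > 0));
}

async function scrapeRows(url, rowSelector = ROW_SELECTOR) {
  // Required lazily so cleanRows can be used without Puppeteer installed.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();

    // Visit the website.
    await page.goto(url);

    // Wait for the client-side rendered table rows to appear.
    await page.waitForSelector(rowSelector);

    // Select all of the rows and pull the text out of each cell.
    const rows = await page.$$eval(rowSelector, (trs) =>
      trs.map((tr) =>
        Array.from(tr.querySelectorAll('td'), (td) => td.innerText)
      )
    );

    return cleanRows(rows);
  } finally {
    // Always close the browser, even if a step above throws.
    await browser.close();
  }
}
```

Calling `scrapeRows('https://www.worldometers.info/coronavirus/')` would resolve to an array of rows, each an array of cell strings, provided the selector matches the table on the page.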
If that doesn’t sound easy, then I don’t know what does. Let’s get started by setting up the code first.