

Web Scraping With Express And Puppeteer

Josh Hicks


Using Headless Browsers To Surf The Web

Today we’re going to build an app that scrapes two websites, Medium.com and YouTube.com, for anything that matches a certain keyword. We’ll be looking for articles or videos that contain the words “Headless Browser”. The hope is that the resulting list of resources will help us expand our knowledge of this topic.
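To make the matching step concrete, here is a minimal sketch of how scraped results might be filtered by keyword. The `filterByKeyword` helper, the shape of each result object, and the sample data are all assumptions made for illustration; the real app may match differently.

```javascript
// Hypothetical helper (not from the original project): keep only the results
// whose title contains the keyword, ignoring letter case.
function filterByKeyword(results, keyword) {
  const needle = keyword.trim().toLowerCase();
  return results.filter((result) => result.title.toLowerCase().includes(needle));
}

// Made-up sample data standing in for whatever the scraper returns.
const sample = [
  { source: 'Medium', title: 'Getting Started With a Headless Browser' },
  { source: 'YouTube', title: 'Cooking Pasta at Home' },
  { source: 'YouTube', title: 'HEADLESS BROWSER testing in 10 minutes' },
];

// Logs the two entries whose titles mention the keyword.
console.log(filterByKeyword(sample, 'Headless Browser'));
```

Matching on a lowercased copy of the title keeps the comparison case-insensitive without changing the text that ends up in the rendered list.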

In this project we’ll be using the following technologies:

Node is the environment that runs our code. Express runs the server that serves the template. We will create the HTML content with the Pug template engine. Last but not least is the star of the show, Puppeteer! This is the headless browser tool that makes this whole thing possible.
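As a rough preview of the Puppeteer part, the sketch below opens a page in a headless browser and collects link text and URLs. The `collectLinks` helper, the default `'a'` selector, and the search URL in `main` are assumptions for illustration only; the selectors that actually work on Medium or YouTube would need to be found by inspecting those pages. The Express route and Pug template that would render the list are left out here.

```javascript
// Hypothetical helper: open `url` in the given browser and return the text
// and href of every element matching `selector`.
async function collectLinks(browser, url, selector = 'a') {
  const page = await browser.newPage();
  try {
    // Wait until network activity has mostly settled before reading the DOM.
    await page.goto(url, { waitUntil: 'networkidle2' });
    // $$eval runs the callback inside the page against all matching elements.
    return await page.$$eval(selector, (elements) =>
      elements.map((el) => ({ title: el.textContent.trim(), href: el.href }))
    );
  } finally {
    await page.close();
  }
}

// Example entry point. It is not called automatically and needs
// `npm install puppeteer` before it can run.
async function main() {
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch(); // headless by default
  try {
    // Assumed search URL, used only as a placeholder target.
    const links = await collectLinks(browser, 'https://medium.com/search?q=headless%20browser');
    console.log(links.slice(0, 5));
  } finally {
    await browser.close();
  }
}
```

Passing the browser in as an argument, rather than launching it inside `collectLinks`, lets one browser instance be reused for both sites and makes the helper easy to exercise with a stand-in object. Calling `main()` runs the example against the live site.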

What Is A Headless Browser, And Why Does It Exist?

It’s important to know a little bit about why headless browsers are so interesting and useful. The concept of headless browsers can be a little hard to wrap your head around (no pun intended!) at first. The simplest way to think about it is as a regular web browser without a GUI (Graphical User…
