Web Scraping in Node with Cheerio - 10 Apr 2016
If you’re looking to write a simple bot or script that does web scraping, then Node might be a great option. The cheerio library makes it easy to work with HTML through a jQuery-style API. To get started, install request (to fetch pages) and cheerio (to parse them):
npm install request --save
npm install cheerio --save
Here’s a sample script that pulls the title and link of the first article from a list:
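const request = require('request')
const cheerio = require('cheerio')

// articleListUrl is assumed to hold the URL of the page with the article list
request(articleListUrl, function (err, resp, body) {
  if (err) throw err
  // Parse the response body so it can be queried with jQuery-style selectors
  const $ = cheerio.load(body)
  const article = $('ul#articles-list li.article:first-of-type')
  const articleLink = article.find('.media-body a:first-of-type')
  const articleTitle = articleLink.text()
  const articlePath = articleLink.attr('href')
  console.log(articleTitle, articlePath)
})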
It’s surprising how well the jQuery API lends itself to web scraping.