Tutorial: Parallel web scraping with CasperJS and GNU Parallel
A few weeks ago, I had to write a web scraper that processed a long list of similar web pages and pulled data from each page into a single text file. I used CasperJS as the web scraping engine, which is fairly simple to set up. My first approach was to launch a single instance of Casper, then visit each web …