Web Scraping with Scala [closed]

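First, there is a plethora of HTML scraping libraries on the JVM; all you need to do is pimp one of them (the "pimp my library" pattern, i.e. adding Scala-friendly methods to the existing Java types through an implicit conversion).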

The four I have used are:

  • HtmlUnit – emulates a browser and can even run JavaScript
  • Jericho – preserves formatting, which makes it ideal if you want to edit the scraped HTML
  • NekoHtml
  • JSoup – I had trouble getting it to work from Scala, though as a plain Java library it should work

I have used Selenium, but never for scraping. There is a Scala wrapper around Selenium.

I would recommend pimping an existing Java library over using some half-baked Scala library.
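To make the pattern concrete, here is a minimal sketch of what "pimping" a Java DOM API could look like. It enriches the standard org.w3c.dom types (the DOM interfaces that NekoHtml's DOM parser also produces) so they can be used like Scala collections. The names DomEnrichment, RichNodeList, RichDocument, tags, links and parseXhtml are hypothetical and chosen only for illustration. To stay self-contained it parses with the JDK's own XML parser, which only accepts well-formed markup; for real-world HTML you would obtain the Document from one of the libraries above instead.

```scala
import java.io.ByteArrayInputStream
import java.nio.charset.StandardCharsets
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.{Document, Element, Node, NodeList}

object DomEnrichment {

  // Hypothetical enrichment: NodeList has no iterator, so expose it as a Scala Seq.
  implicit class RichNodeList(private val nodes: NodeList) extends AnyVal {
    // Copies the (live) NodeList into an immutable sequence.
    def toSeq: Seq[Node] = (0 until nodes.getLength).map(i => nodes.item(i))

    // Keeps only the element nodes, dropping text nodes, comments, etc.
    def elements: Seq[Element] = toSeq.collect { case e: Element => e }
  }

  // Hypothetical enrichment: scraping-oriented helpers on Document.
  implicit class RichDocument(private val doc: Document) extends AnyVal {
    // All elements with the given tag name, in document order.
    def tags(name: String): Seq[Element] =
      doc.getElementsByTagName(name).elements

    // (link text, href) pairs for every <a> element; href is "" when absent.
    def links: Seq[(String, String)] =
      tags("a").map(a => a.getTextContent.trim -> a.getAttribute("href"))
  }

  // Assumption: the input is well-formed XHTML. The JDK parser rejects tag soup,
  // so real pages need NekoHtml, HtmlUnit or similar to build the Document.
  def parseXhtml(markup: String): Document = {
    val builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    builder.parse(new ByteArrayInputStream(markup.getBytes(StandardCharsets.UTF_8)))
  }
}
```

With import DomEnrichment._ in scope, calling doc.links or doc.tags("p") reads as if those methods were part of the Java API, while the underlying library stays untouched. The same approach should carry over to HtmlUnit or Jericho types, although the methods worth adding would differ.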
