Some websites make heavy use of JavaScript and AJAX calls.

Typically, when you visit one of those websites with your web browser, the page returned by the server is an empty HTML skeleton.

Once the HTML reaches your browser, the JavaScript framework runs its internal code and fires one or more HTTP requests to backend APIs to populate the page with useful data.

This is basically what most Single Page Application frameworks do. The process can take a few seconds when loading the first page.

This is usually what is going on when you visit a web page that displays a loader before showing useful information.



And this is exactly why trying to scrape those websites without a headless browser will not work well.
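
To make this concrete, here is a minimal Python sketch of what a scraper without a browser receives from such a site. It is an illustration only: the sample HTML, the names `VisibleTextParser` and `looks_like_empty_skeleton`, and the 50-character threshold are all assumptions made for this example, not part of our API. The check is a rough heuristic: a page that contains scripts but almost no visible text is probably waiting for JavaScript to fill it in.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text a reader would see, ignoring <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0      # > 0 while inside <script> or <style>
        self.text_parts = []      # visible text fragments
        self.script_count = 0     # number of <script> tags seen

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.script_count += 1
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def looks_like_empty_skeleton(html, min_visible_chars=50):
    """Heuristic: scripts are present but there is almost no visible text."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    visible = " ".join(parser.text_parts)
    return parser.script_count > 0 and len(visible) < min_visible_chars


# Hypothetical response from a Single Page Application, before any JavaScript runs.
SKELETON_HTML = """<!DOCTYPE html>
<html>
<head><title>My App</title></head>
<body>
<div id="root"></div>
<script src="/static/bundle.js"></script>
</body>
</html>"""

# Hypothetical version of the same page after a browser has executed the JavaScript.
RENDERED_HTML = """<!DOCTYPE html>
<html>
<head><title>My App</title></head>
<body>
<div id="root">
<h1>Product catalogue</h1>
<ul>
<li>Blue widget, 19.99 USD, in stock</li>
<li>Red widget, 24.99 USD, out of stock</li>
</ul>
</div>
<script src="/static/bundle.js"></script>
</body>
</html>"""

print(looks_like_empty_skeleton(SKELETON_HTML))
print(looks_like_empty_skeleton(RENDERED_HTML))
```

A plain HTTP client only ever sees the first document, so there is nothing useful in it to extract. The data appears only after the scripts have run, which is the job of a browser.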

By default, our API scrapes pages through a real web browser (documentation).