Everything about web scraping and API usage.
ScrapingBee is a really simple API that allows you to extract HTML from any website in a single API call. If you need to use our custom options, such as JS rendering or our Premium Proxy, take a look at our full documentation. To get your API key, you just need to create an account here. Of course, don't forget to replace "YOUR_URL" with the URL of the page you want to scrape.
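As a rough illustration, a single API call could look like the Python sketch below. The endpoint path `/api/v1/` and the query parameter names `api_key` and `url` are assumptions here, so check the full documentation for the exact values. `build_request_url` and `fetch_html` are illustrative helper names, not part of any official client.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; confirm it against the full documentation.
API_ENDPOINT = "https://app.scrapingbee.com/api/v1/"


def build_request_url(api_key: str, target_url: str) -> str:
    """Return the full API URL. urlencode() percent-encodes the target URL."""
    # "api_key" and "url" are assumed parameter names.
    query = urlencode({"api_key": api_key, "url": target_url})
    return f"{API_ENDPOINT}?{query}"


def fetch_html(api_key: str, target_url: str, timeout: float = 60.0) -> str:
    """Send one API call and return the response body as text."""
    request_url = build_request_url(api_key, target_url)
    with urlopen(request_url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Usage (performs a real network call, so it is left commented out):
# html = fetch_html("YOUR_API_KEY", "YOUR_URL")
```

Building the query string with `urlencode` means the target URL is encoded automatically, which avoids the most common cause of failed calls.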
What to do if my request fails?
Please find below the most common ways to fix your API call. Check that your URL is correctly encoded. Most of the errors you encounter will be because you haven't correctly encoded your URL. To do this quickly, you can go to this website and click on the encode button. If you need to do this programmatically, learn here how to do it. Disable block_resources. To speed up your r
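For the encoding step, one possible way to do it programmatically in Python is with the standard library's `urllib.parse.quote`. This is a minimal sketch; the helper name `encode_target_url` is made up for illustration.

```python
from urllib.parse import quote


def encode_target_url(target_url: str) -> str:
    """Percent-encode a URL so it can be passed safely as a query parameter value."""
    # safe="" also encodes "/", so every reserved character
    # (":", "/", "?", "&", "=") in the target URL is escaped.
    return quote(target_url, safe="")


encoded = encode_target_url("https://example.com/search?q=web scraping&page=2")
print(encoded)
```

Without this step, the `&page=2` part of the target URL would be read as a separate parameter of the API call rather than as part of the URL to scrape.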
When you are making an API call to ScrapingBee, you can pass different parameters as query strings. Here is an example: Now what happens when the URL you want to scrape also contains a query parameter named param1? In this case, the API
How to intercept XHR / Ajax requests?
Some web pages will load data dynamically using Ajax requests and not load them on the main webpage. So if you are requesting those web pages using our API, you might not be able to find the data you're interested in in the returned HTML. But you can use the response_json=True parameter (documentation). If you use this parameter, your API requests will return a JSON response. In that JSON response, you will be able to access all t
How to use concurrency?
Depending on the plan you chose, you will have access to a specific number of concurrent requests. This means that you can only have that many requests running at the same time. For example, if you need to make 100 requests and have an allowed concurrency of 5, you can send 5 requests at the same time. The simplest way for you to take advantage of this concurrency is to set up 5 workers / threads and have each of them send 20 requests. Below you'll find some resources
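The worker idea can be sketched in Python with a thread pool. This version is a slight variation on the description above: instead of giving each worker a fixed batch of 20 URLs, 5 workers pull from a shared list, which still never exceeds 5 requests at once. `run_with_concurrency` and the `fetch` callable are hypothetical names; `fetch` stands for whatever function performs one API call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def run_with_concurrency(
    fetch: Callable[[str], str],
    urls: Iterable[str],
    max_concurrency: int = 5,
) -> List[str]:
    """Call fetch(url) for every URL with at most max_concurrency calls in flight."""
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # map() yields results in the same order as the input URLs.
        # An exception raised by fetch() is re-raised here.
        return list(pool.map(fetch, urls))


# Usage, assuming fetch_page(url) performs one API call:
# pages = run_with_concurrency(fetch_page, list_of_100_urls, max_concurrency=5)
```

Set `max_concurrency` to the limit of your plan; a higher value would only cause the extra requests to be rejected.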
Why am I not seeing images when using the API in my browser?
The webpage I'm trying to scrape hasn't rendered completely
By default, ScrapingBee waits for 2000 milliseconds before returning the HTML. You can increase this value by using the wait parameter (documentation).
How to make screenshots?
You can get back a screenshot of the webpage you want to scrape by using the screenshot=True parameter.
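A minimal Python sketch of a screenshot request is shown below. It assumes the endpoint `https://app.scrapingbee.com/api/v1/`, the parameter names `api_key` and `url`, and that the response body contains the raw image bytes; none of these are confirmed by the text above, and the helper names are illustrative.

```python
from pathlib import Path
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; confirm it against the full documentation.
API_ENDPOINT = "https://app.scrapingbee.com/api/v1/"


def build_screenshot_url(api_key: str, target_url: str) -> str:
    """Return the API URL with the screenshot=True parameter set."""
    params = {"api_key": api_key, "url": target_url, "screenshot": "True"}
    return f"{API_ENDPOINT}?{urlencode(params)}"


def save_screenshot(api_key: str, target_url: str, output_path: str,
                    timeout: float = 60.0) -> Path:
    """Request a screenshot and write the response body to output_path."""
    with urlopen(build_screenshot_url(api_key, target_url), timeout=timeout) as response:
        image_bytes = response.read()  # assumed to be raw image data
    path = Path(output_path)
    path.write_bytes(image_bytes)
    return path


# Usage (performs a real network call, so it is left commented out):
# save_screenshot("YOUR_API_KEY", "YOUR_URL", "screenshot.png")
```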
Scraping an E-commerce product page
Don't hesitate to take a look at our Python web scraping 101 to read a detailed introduction to these libraries. We also have a lot of tutorials in different languages in our web scraping blog. We are going to extract the price, image URL, and product name for this product: https://clever-lichterman-044f16.netlify.app/products/taba-cream.1/ ![E-commerce product page with Chom
SSL certificate errors
If you're getting SSL errors from ScrapingBee, here are the most common ways to fix them: Update your OpenSSL version (see https://letsencrypt.org/docs/dst-root-ca-x3-expiration-september-2021/). We're using a Let's Encrypt certificate on the app.scrapingbee.com subdomain. Since September 30th 2021, Let's Encrypt uses ISRG Root X1 certificates. Those certificates are only accepted by OpenSSL version 1.1.0 or later. Disable SSL verification with your HTTP client. In Python, you can d
How to forward headers?
You might need to forward specific headers to the website that you want to scrape. In order to forward headers, you must set forward_headers to true and then pass your custom headers. You must then prefix the headers to forward to the website with "Spb-" (for ScraPingBee). This prefix will be trimmed by ScrapingBee and the headers will be forwarded to the target web page. Example: If you want to send the header Accept-Language: En-US, add the header: Spb-Accept-Language: En-US an
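The prefixing rule can be sketched in Python as follows. The `forward_headers` parameter and the `Spb-` prefix come from the text above; the endpoint, the `api_key` and `url` parameter names, and the helper names `prefix_headers` and `build_forwarding_request` are assumptions made for illustration.

```python
from typing import Dict, Tuple
from urllib.parse import urlencode

# Assumed endpoint; confirm it against the full documentation.
API_ENDPOINT = "https://app.scrapingbee.com/api/v1/"
FORWARD_PREFIX = "Spb-"


def prefix_headers(custom_headers: Dict[str, str]) -> Dict[str, str]:
    """Add the Spb- prefix to every header that should reach the target site."""
    return {f"{FORWARD_PREFIX}{name}": value for name, value in custom_headers.items()}


def build_forwarding_request(api_key: str, target_url: str,
                             custom_headers: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """Return the API URL (with forward_headers enabled) and the prefixed headers."""
    params = {"api_key": api_key, "url": target_url, "forward_headers": "true"}
    return f"{API_ENDPOINT}?{urlencode(params)}", prefix_headers(custom_headers)


url, headers = build_forwarding_request(
    "YOUR_API_KEY", "https://example.com", {"Accept-Language": "En-US"}
)
# headers can now be passed to any HTTP client, for example:
# urllib.request.Request(url, headers=headers)
```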
Do you support LinkedIn scraping?
Currently ScrapingBee is not able to scrape LinkedIn.com.
What kind of headers are returned by the API?
The API returns two types of headers: those coming from our web app server, and those coming from the scraped website. In normal mode, we prefix headers coming from the scraped website with Spb- in order to differentiate them from the ones from our web app server. On top of that, we also add 3 headers: Spb-cost: "Request cost in credits." Spb-initial-status-code: "The initial status code returned by the scraped page. Useful when the page redirects" Spb-resolved-url: "The resolv
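When processing a response, it can help to separate these groups. The sketch below is one way to do it in Python, relying only on the Spb- prefix and the three header names listed above; `split_response_headers` is a hypothetical helper, and matching header names case-insensitively is a choice made here, not documented behavior.

```python
from typing import Dict, Mapping, Tuple

PREFIX = "spb-"
# The three headers added by ScrapingBee, lowercased for comparison.
METADATA_HEADERS = {"spb-cost", "spb-initial-status-code", "spb-resolved-url"}


def split_response_headers(
    headers: Mapping[str, str],
) -> Tuple[Dict[str, str], Dict[str, str], Dict[str, str]]:
    """Split headers into (web app server, scraped website, ScrapingBee metadata)."""
    server: Dict[str, str] = {}
    website: Dict[str, str] = {}
    metadata: Dict[str, str] = {}
    for name, value in headers.items():
        lowered = name.lower()
        if lowered in METADATA_HEADERS:
            metadata[lowered[len(PREFIX):]] = value  # e.g. "cost"
        elif lowered.startswith(PREFIX):
            website[name[len(PREFIX):]] = value  # prefix trimmed, original case kept
        else:
            server[name] = value
    return server, website, metadata


# Made-up header values, for illustration only.
example = {
    "Content-Type": "text/html",
    "Spb-Content-Type": "application/json",
    "Spb-cost": "5",
    "Spb-initial-status-code": "301",
    "Spb-resolved-url": "https://example.com/final",
}
server, website, metadata = split_response_headers(example)
```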