API | ScrapingBee Knowledge Base

Getting started
ScrapingBee is a simple API that lets you extract the HTML of any website in a single API call. If you need our custom options, such as JS rendering or Premium Proxy, take a look at our full documentation. To get your API key, you just need to create an account here. Don't forget to replace "YOUR_URL" with the URL of the page you want to scrape. | Curl curl "ht…
What to do if my request fails?
Check which error code you received. Our API should always return the reason why your request failed. It will be one of these status codes. If you receive a 500 status code, check the response body to see the exact error code returned by the target website. Below are the most common ways to fix your API call. Chec…
URL encoding
When you make an API call to ScrapingBee, you can pass different parameters as query strings. Here is an example: https://app.scrapingbee.com/api/v1/?url=https://example.com/&premium_proxy=True&api_key=YOUR-API-KEY Now what happens when the URL you want to scrape also contains its own query parameters, such as param1? curl 'https://app.scrapingbee.com/api/v1/?url=https://example.com?param1=value1&param2=value2&premium_proxy=True&api_key=YOUR-API-KEY' In this case, the API w…
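One way to avoid this ambiguity is to percent-encode the target URL before placing it in the API call. The following is a minimal sketch using only the Python standard library. The helper name `build_api_url` is hypothetical; the endpoint and parameter names are taken from the example above.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://app.scrapingbee.com/api/v1/"


def build_api_url(api_key: str, target_url: str, **options: str) -> str:
    """Return an API call URL with every parameter value percent-encoded.

    build_api_url is a hypothetical helper, not part of any ScrapingBee library.
    """
    params = {"api_key": api_key, "url": target_url, **options}
    # urlencode escapes the "?", "=" and "&" inside target_url, so they
    # can no longer be read as separators of the API's own parameters.
    return f"{API_ENDPOINT}?{urlencode(params)}"


api_url = build_api_url(
    "YOUR-API-KEY",
    "https://example.com?param1=value1&param2=value2",
    premium_proxy="True",
)
print(api_url)
```

After encoding, `param1` and `param2` travel inside the value of `url` instead of appearing as separate parameters of the API call.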
The webpage I'm trying to scrape hasn't rendered completely
If you don't use JavaScript rendering: If you're not using JavaScript rendering (render_js=False) and your page doesn't render completely, chances are that you are scraping a webpage that needs JavaScript rendering to work correctly. Some websites need to be rendered inside a real browser to work and load all information correctly. To enable this with our API, use render_js=True on your API call [(documentation)](https://www.scrapingbee.com/documentation/#javascript-renderi…
How can I bypass a cookie consent page?
Most sites have cookie consent pop-ups, which can be closed via our API. We will use Google as an example, but you can apply the same logic to any website. Keep in mind that every website is built differently, so cookie consent buttons are not standardized across sites. Each page may use its own layout, elements, or wording for cookie banners, which means the buttons must often be identified and handled on a case-by-case basis. So to fix this issue and bypass it, we need to accept Go…
How to use concurrency?
Depending on the plan you chose, you have access to a specific number of concurrent requests. This means that you can only run that many requests at the same time. For example, if you need to make 100 requests and have an allowed concurrency of 5, you can send 5 requests at the same time. The simplest way to take advantage of this concurrency is to set up 5 workers / threads and have each of them send 20 requests. Below you'll find some resources…
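The worker setup described above can be sketched with a thread pool from the Python standard library. This is an illustration under stated assumptions: `fetch` is a hypothetical placeholder that performs no network call, and `MAX_CONCURRENCY` stands for whatever concurrency your plan allows.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENCY = 5  # assumption: the concurrency allowed by your plan


def fetch(url: str) -> str:
    """Hypothetical placeholder for one API call; swap in a real HTTP request."""
    return f"fetched {url}"


def fetch_all(urls, fetch_one=fetch, max_workers=MAX_CONCURRENCY):
    """Run fetch_one over urls with at most max_workers calls in flight."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns the results in the same order as the input URLs.
        return list(pool.map(fetch_one, urls))


urls = [f"https://example.com/page/{i}" for i in range(100)]
results = fetch_all(urls)
print(len(results))
```

Note that the pool hands the next URL to whichever worker becomes free, rather than assigning exactly 20 URLs to each worker up front. The limit of 5 simultaneous requests is respected either way.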
How to intercept XHR / Ajax requests?
Some web pages load data dynamically using Ajax requests instead of including it in the main page. So if you request those pages through our API, you might not find the data you're interested in within the returned HTML. But you can use the response_json=True parameter (documentation). If you use this parameter, your API requests will return a JSON response. In that JSON response, you will be able to access all t…
Why am I not seeing images when using the API in my browser?
If you try to put an API URL (`https://app.scrapingbee.com/api/v1/...`) in your browser, the website might look a bit weird: DuckDuckGo.com rendered in a browser through ScrapingBee API The website renders weirdly in your browser because the ScrapingBee API only returns the HTML of the page; it doesn't return the other assets loaded by the page. And while those assets (images, JavaScript, CSS, f…
Credit system explained
The number of credits a request costs depends on a few parameters. Here is the full list of costs. Costs for the HTML API: Classic proxies without JS rendering: 1 credit. Classic proxies with JS rendering: 5 credits. Premium proxies without JS rendering: 10 credits. Premium proxies with JS rendering: 25 credits. Stealth proxies (no option to disable JS rendering at the moment): 75 credits. AI query: Adding an AI query to your request adds 5 API credits. Classic would cost 10, Premium…
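The cost rules listed above can be expressed as a small lookup. This is a sketch for estimating spend, not an official calculator: the function name `request_cost` and the proxy labels are hypothetical, and only the credit values come from the list above.

```python
# Credits per request, keyed by (proxy type, JS rendering enabled).
BASE_COSTS = {
    ("classic", False): 1,
    ("classic", True): 5,
    ("premium", False): 10,
    ("premium", True): 25,
    ("stealth", True): 75,  # JS rendering cannot be disabled for stealth
}
AI_QUERY_COST = 5  # added on top of the base cost


def request_cost(proxy: str = "classic", render_js: bool = True,
                 ai_query: bool = False) -> int:
    """Return the credit cost of one HTML API request (hypothetical helper)."""
    key = (proxy, render_js)
    if key not in BASE_COSTS:
        raise ValueError(f"unsupported combination: {key}")
    return BASE_COSTS[key] + (AI_QUERY_COST if ai_query else 0)


print(request_cost("classic", render_js=True, ai_query=True))
```

With these values, a classic request with JS rendering and an AI query comes to 5 + 5 = 10 credits, which matches the figure quoted above.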
Understanding JavaScript rendering.
Some websites make heavy use of JavaScript and AJAX calls. Typically, when you visit those websites with your web browser, the page returned by the server is an empty HTML skeleton. Once the HTML hits your browser, the JavaScript framework runs its internal code and fires one or more HTTP requests to call other backend APIs and populate the page with useful data. That is basically what most Single Page Application frameworks do. This process can take a few seconds on when l…
What kind of headers are returned by the API?
The API returns two types of headers: those coming from our web app server, and those coming from the scraped website. In normal mode, we prefix headers coming from the scraped website with Spb- in order to differentiate them from the ones from our web app server. On top of that, we also add 3 headers: Spb-cost: "Request cost in credits." Spb-initial-status-code: "The initial status code returned by the scraped page. Useful when the page redirects" Spb-resolved-url: "The resolv…
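To make the prefix convention concrete, here is a minimal sketch that sorts a response's headers into three groups. The function name `split_headers` and the sample header values are made up for illustration; only the Spb- prefix and the three added header names come from the text above.

```python
PREFIX = "spb-"
# The three headers the API adds itself, as named above (lowercased).
ADDED_HEADERS = {"spb-cost", "spb-initial-status-code", "spb-resolved-url"}


def split_headers(headers: dict) -> tuple:
    """Split headers into (web app server, added by the API, scraped website)."""
    server, added, scraped = {}, {}, {}
    for name, value in headers.items():
        lowered = name.lower()  # HTTP header names are case-insensitive
        if lowered in ADDED_HEADERS:
            added[name] = value
        elif lowered.startswith(PREFIX):
            # Strip the prefix to recover the scraped website's header name.
            scraped[name[len(PREFIX):]] = value
        else:
            server[name] = value
    return server, added, scraped


# Illustrative values only, not a real API response.
sample = {
    "Content-Type": "text/html",
    "Spb-Content-Type": "text/html; charset=utf-8",
    "Spb-cost": "5",
    "Spb-initial-status-code": "301",
}
server, added, scraped = split_headers(sample)
print(scraped)
```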
Common proxy questions
Do you provide Mobile/Residential/Data center proxies? We currently do not provide direct access to ScrapingBee's proxies. However, you can use our premium proxies to perform web scraping tasks through our API. To do that, add the parameter premium_proxy=true to your request. You can also choose a geolocation for the proxy using the parameter country_code. For example, country_code=de uses German IP addresses. This option only works with…
How to make screenshots?
You can get a screenshot of the webpage you want to scrape by using the screenshot=True parameter. Here is how to do it: Curl curl "https://app.scrapingbee.com/api/v1/?api_key=YOUR-API-KEY&url=YOUR-URL&screenshot=True" > ./screenshot.png Python # Install the Python ScrapingBee library: # pip install scrapingbee from scrapingbee import ScrapingBeeClient client = ScrapingBeeClient(api_key='YOUR-API-KEY') response = client.get( 'YOUR-URL', params={ 'screenshot': True, } ) if…
Scraping an E-commerce product page
Take a look at our Python web scraping 101 for a detailed introduction to these libraries. We also have many tutorials in different languages on our web scraping blog. We are going to extract the price and product name for this product: [https://clever-lichterman-044f16.netlify.app/products/taba-cream.1/](https://clever-lichterman-044f16.netlify.app/products/taba-cre…
How to forward headers?
You might need to forward specific headers to the website that you want to scrape. To forward headers, you must set forward_headers to true and then pass your custom headers, prefixing each header you want forwarded with "Spb-" (for ScraPingBee). ScrapingBee trims this prefix and forwards the headers to the target web page. Example: if you want to send the header Accept-Language: En-US, add the header: Spb-Accept-Language: En-US a…
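The prefixing step can be sketched as a small helper. This only prepares the headers and parameters and sends nothing over the network. The name `to_forwarded_headers` is hypothetical; `forward_headers` and the "Spb-" prefix come from the text above.

```python
def to_forwarded_headers(headers: dict) -> dict:
    """Prefix each header name with "Spb-" so it is forwarded to the target site."""
    return {f"Spb-{name}": value for name, value in headers.items()}


# Header to send to the target website, as in the example above.
forwarded = to_forwarded_headers({"Accept-Language": "En-US"})

# Query parameters for the API call; forward_headers must be enabled.
params = {
    "api_key": "YOUR-API-KEY",
    "url": "YOUR-URL",
    "forward_headers": "true",
}
print(forwarded)
```

You would then pass `forwarded` as the request headers and `params` as the query string of your API call, using the HTTP client of your choice.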
Waiting for some Javascript code to execute.
By default, ScrapingBee waits for 2000 milliseconds before returning the HTML. You can increase this value by using the wait parameter (documentation).
SSL certificate errors
If you're getting SSL errors from ScrapingBee, here are the most common ways to fix them: Update your OpenSSL version https://letsencrypt.org/docs/dst-root-ca-x3-expiration-september-2021/ We're using a Let's Encrypt certificate on the app.scrapingbee.com subdomain. Since September 30th 2021, Let's Encrypt uses ISRG Root X1 certificates. Those certificates are only accepted by OpenSSL version 1.1.0 or later. Disable SSL verification with your HTTP client In Python, you can d…
Do you cache requests?
We do not cache the requests you make using ScrapingBee's API. You will always get the latest version of the website or page you're scraping.
How to use Proxy Mode in Postman
If you need to use our proxies in the Postman app, follow the steps below: In the Postman app, open your workspace and click on the agent selector on the bottom right side of your screen. Click on 'Desktop Agent' and then click on 'Download Desktop Agent'. Install and run the 'Postman Agent' app. Click on the gear icon on the top right side of your screen and t…
Preconfigured request settings
We are excited to announce a new time-saving feature: Preconfigured Request Settings! Many of our users have asked for an easier way to manage request settings, and we've listened. With this new feature, you can now preconfigure settings for any website and reuse them whenever needed, so you no longer have to enter them manually every time. What are Preconfigured Request Settings? Preconfigured Request Settings allow you to save commonly used settings for specific websites. Once sav…
ChatGPT citations
Currently, the ScrapingBee ChatGPT API is not able to return citations 100% of the time. Our team is aware of this issue and is looking into possibilities to either always return them or increase the return rate to a reasonable level. Citations with the ChatGPT scraper also depend on the prompt used; some prompts may have a higher success rate than others. They are visible in our API at the bottom of the response, in results_markdown. If you are checking the full HTML, it would be under this…
Can I scrape/download PDF with ScrapingBee?
Yes! You can scrape PDFs with the ScrapingBee API. To do that, simply disable JavaScript rendering and send a standard request to the PDF URL. Our API returns the raw binary data of the PDF file you targeted, which you can then save with the .pdf extension. This approach works for publicly available PDF files that do not require JavaScript execution or browser rendering. Here's a simple Python example that would help you to achieve that:
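As a rough illustration of the steps described above, here is a sketch using only the Python standard library. It is not the original example from this article: the function names are hypothetical, and the endpoint and the `render_js` parameter are taken from other entries on this page.

```python
from pathlib import Path
from urllib.parse import urlencode
from urllib.request import urlopen

API_ENDPOINT = "https://app.scrapingbee.com/api/v1/"


def pdf_request_url(api_key: str, pdf_url: str) -> str:
    """Build an API call for a PDF with JavaScript rendering disabled."""
    params = {"api_key": api_key, "url": pdf_url, "render_js": "False"}
    return f"{API_ENDPOINT}?{urlencode(params)}"


def download_pdf(api_key: str, pdf_url: str, path: str) -> int:
    """Fetch the PDF through the API and write the raw bytes to path."""
    with urlopen(pdf_request_url(api_key, pdf_url)) as response:
        # The response body is the binary content of the PDF file.
        return Path(path).write_bytes(response.read())


# Building the URL makes no network call; download_pdf does.
print(pdf_request_url("YOUR-API-KEY", "https://example.com/file.pdf"))
```

Calling `download_pdf("YOUR-API-KEY", "https://example.com/file.pdf", "file.pdf")` would perform the request and save the result; the URL shown is a placeholder.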
Do you have *This* API
Currently, this is our roster of dedicated APIs and their available targets: Google API supports Google search, news, maps, images, lens, shopping, and AI mode. You can read more about this API here. Amazon API supports Amazon product search and specific product lookup. You can read more about this API here. YouTube API supports YouTube search, trainability, metadata, and trans…
Can I scrape *This* page?
With our HTML API, you should be able to scrape almost any page online. One thing to note is that our API only allows pre-login scraping; any post-login scraping is not allowed: “Scraping under login credentials is strictly prohibited. If any violation of this policy is detected, we reserve the right to suspend or otherwise limit your account access.” The difference is that pre-login refers to publicly available information that does not require a login, while post-login refers…
Practical uses of screenshot responses
You can choose to have your request returned as a screenshot. The most common use case is when you need to see particular information on the page and do not necessarily need to parse the full HTML. In practice, there are multiple ways you can use this function. If you want to execute a particular JS scenario and it seems like nothing is working, try enabling the screenshot response to see how far…
How to download video with ScrapingBee?
With ScrapingBee, you can use our HTML API with JS rendering disabled to download the raw binary data of a video. This way, our API simply proxies the request and returns the binary data of your selected video. You can then save that binary data to an .mp4 file and your video is ready! This works for publicly accessible media formats, for example .mp4 files. Here is a quick Python request example that you can directly save in…
Website's internal requests
First, you need to identify the internal requests. On the page you want to scrape, right-click and select Inspect (or Inspect Element). Then, open the Network tab and refresh the page. You should see a large number of requests appear. Next, review these requests to determine where the data you need is being loaded from. For example, Best Buy loads most of its product information through multiple GraphQL requests: ![](https://storage.crisp.chat/users/helpdesk/website/-/6/6/a/2/66a2…
Beehive is broken error
You may encounter this specific error in the response: It would look like this in the browser preview, in a screenshot, or when trying to enter your dashboard: This specific error means that there is an ongoing outage with our API or dashboard. Changing the settings would usually not help in resolving thi…