Getting started
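A minimal sketch of a first API call in Python, using only the standard library. The endpoint is the one quoted elsewhere in these articles (https://app.scrapingbee.com/api/v1/). The api_key and url parameter names, the helper names and the timeout value are assumptions made for illustration and should be checked against the full documentation.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

API_ENDPOINT = "https://app.scrapingbee.com/api/v1/"


def build_api_url(api_key: str, target_url: str) -> str:
    """Return the full API URL for scraping target_url."""
    # urlencode percent-encodes the target URL so it travels as one value.
    query = urlencode({"api_key": api_key, "url": target_url})
    return f"{API_ENDPOINT}?{query}"


def fetch_html(api_key: str, target_url: str, timeout: float = 60.0) -> str:
    """Perform the API call and return the response body as text."""
    with urlopen(build_api_url(api_key, target_url), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling fetch_html("YOUR-API-KEY", "https://example.com/") would then return the page's HTML, provided the key is valid.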
ScrapingBee is a simple API that lets you extract the HTML of any website in a single API call. If you need our custom options, such as JS rendering or our Premium Proxy, take a look at our full documentation. To get your API key, you just need to create an account. Don't forget to replace "YOURURL" with the URL of the page you want to scrape.

URL encoding
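To illustrate why the target URL has to be encoded, the sketch below builds the same API call twice, once raw and once percent-encoded, and parses each query string the way a server would. The name param1 is the placeholder used in the article text; the api_key parameter, the sample values and the helper names are assumptions for illustration.

```python
from urllib.parse import parse_qs, quote, urlsplit

API_ENDPOINT = "https://app.scrapingbee.com/api/v1/"


def api_url(api_key: str, target_url: str, encode: bool = True) -> str:
    """Build an API URL, optionally percent-encoding the target URL."""
    # safe="" also encodes "/", ":", "?", "&" and "=" inside the target URL.
    value = quote(target_url, safe="") if encode else target_url
    return f"{API_ENDPOINT}?api_key={api_key}&param1=api-value&url={value}"


def received_params(url: str) -> dict:
    """Parse the query string the way a server would see it."""
    return parse_qs(urlsplit(url).query)


target = "https://example.com/?page=2&param1=target-value"

# Raw: the target's own "&param1=..." leaks out and collides with the API's.
raw = received_params(api_url("KEY", target, encode=False))

# Encoded: the whole target URL arrives intact as the single "url" value.
encoded = received_params(api_url("KEY", target))
```

In the raw case, param1 is received twice and url is cut off at the first "&". In the encoded case, url decodes back to the full target address.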
When you make an API call to ScrapingBee, you can pass different parameters in the query string. But what happens when the URL you want to scrape also contains a query parameter named param1? In this case, the API w...

What to do if my request fails?
Below are the most common ways to fix your API call. Check that your URL is correctly encoded. Most of the errors you will encounter occur because the URL has not been encoded correctly. To do this quickly, you can use an online URL encoder and click its encode button. If you need to do this programmatically, learn how to do it here. Disable block_resources. To speed up your...

The webpage I'm trying to scrape hasn't rendered completely
If you don't use JavaScript rendering: If you're not using JavaScript rendering (render_js=False) and your page doesn't render completely, chances are that the webpage needs JavaScript rendering to work correctly. Some websites need to be rendered inside a real browser to load all their information correctly. To enable this with our API, use render_js=True in your API call (documentation) (https://www.scrapingbee.com/documentation/javascript-renderi...

How to use concurrency?
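One way to stay within a plan's concurrency limit is a thread pool whose size equals the allowed concurrency. This is a sketch, not an official client: fetch is a hypothetical callable that performs one API request, and the default of 5 matches the example figure used in this article. Rather than assigning a fixed 20 requests to each worker, the pool lets its workers pull from one shared queue, which keeps all of them busy even when some requests are slow.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def scrape_all(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    concurrency: int = 5,
) -> List[str]:
    """Fetch every URL with at most `concurrency` calls in flight."""
    # The pool never runs more than max_workers tasks at once, so the
    # limit is respected without any manual bookkeeping.
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # map() yields results in the same order as the input URLs.
        return list(pool.map(fetch, urls))
```

With 100 URLs and concurrency=5, no more than five requests are ever running at the same moment.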
Depending on the plan you chose, you will have access to a specific number of concurrent requests. This means you can only run that many requests at the same time. For example, if you need to make 100 requests and have an allowed concurrency of 5, you can send 5 requests at the same time. The simplest way to take advantage of this concurrency is to set up 5 workers / threads and have each of them send 20 requests. Below you'll find some resources...

How to intercept XHR / Ajax requests?
Some web pages load data dynamically using Ajax requests instead of including it in the main page. So if you request those pages using our API, you might not find the data you're interested in in the returned HTML. In that case you can use the responsejson=True parameter (documentation). If you use this parameter, your API requests will return a JSON response. In that JSON response, you will be able to access all t...

How can I bypass Google’s cookie consent page?
According to Google's EU User Consent Policy, the company must make certain disclosures to users in the European Economic Area (EEA) and the UK, and obtain their consent for the use of cookies or other local storage where legally required. This is why you get a consent page whenever you try to scrape Google. To bypass it, we need to accept Google's terms, and there are two approaches to do that: 1. Using a CONSENT Cookie: Whenever you accept Google's term...

Why am I not seeing images when using the API in my browser?
If you put an API URL (https://app.scrapingbee.com/api/v1/...) in your browser, the website might look a bit weird. (Image: DuckDuckGo.com rendered in a browser through the ScrapingBee API.) The website renders weirdly in your browser because the ScrapingBee API only returns the HTML of the page; it does not return the other assets loaded by the page. And while those assets (images, JavaScript, CSS, f...

Understanding JavaScript rendering.
Some websites make heavy use of JavaScript and AJAX calls. Typically for those websites, when you visit them with your web browser, the page returned by the server is an empty HTML skeleton. Once the HTML hits your browser, the JavaScript framework runs its internal code and fires one or many HTTP requests to call other backend APIs and populate the page with useful data. It's basically what most Single Page Application frameworks do. This process can take a few seconds on when l...

Do you support LinkedIn scraping?
Currently ScrapingBee is not able to scrape LinkedIn.com.

Scraping an E-commerce product page
Don't hesitate to take a look at our Python web scraping 101 to read a detailed introduction to these libraries. We also have a lot of tutorials in different languages in our web scraping blog. We are going to extract the price, image URL, and product name for this product: https://clever-lichterman-044f16.netlify.app/products/taba-cream.1/ ...

How to make screenshots?
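As a rough illustration of the screenshot=True parameter, the sketch below requests a screenshot and writes the response to a file. It assumes the response body is the image itself and that api_key and url are the other parameter names; the helper names, the timeout and the output path are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

API_ENDPOINT = "https://app.scrapingbee.com/api/v1/"


def screenshot_url(api_key: str, target_url: str) -> str:
    """Build an API URL that asks for a screenshot of target_url."""
    params = {"api_key": api_key, "url": target_url, "screenshot": "True"}
    return f"{API_ENDPOINT}?{urlencode(params)}"


def save_screenshot(api_key: str, target_url: str, path: str) -> int:
    """Download the screenshot to `path` and return its size in bytes."""
    # Assumption: the response body contains the raw image bytes.
    with urlopen(screenshot_url(api_key, target_url), timeout=120) as resp:
        data = resp.read()
    with open(path, "wb") as handle:
        handle.write(data)
    return len(data)
```

For example, save_screenshot("YOUR-API-KEY", "https://example.com/", "page.png") would store the image next to the script.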
You can get back a screenshot of the webpage you want to scrape by using the screenshot=True parameter.

Waiting for some JavaScript code to execute.
By default, ScrapingBee waits for 2000 milliseconds before returning the HTML. You can increase this value by using the wait parameter (documentation).

What kind of headers are returned by the API?
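The API marks headers coming from the scraped website with an Spb- prefix and adds Spb-cost, Spb-initial-status-code and Spb-resolved-url on top. A small sketch of how a client could separate the three groups is shown below; the function name and the choice to strip the prefix from the returned keys are assumptions for illustration.

```python
from typing import Dict, Mapping, Tuple

PREFIX = "spb-"
# The three headers the API adds itself, lower-cased for comparison.
API_METADATA = {"spb-cost", "spb-initial-status-code", "spb-resolved-url"}


def split_headers(
    headers: Mapping[str, str],
) -> Tuple[Dict[str, str], Dict[str, str], Dict[str, str]]:
    """Split response headers into (API metadata, site headers, server headers)."""
    metadata: Dict[str, str] = {}
    site: Dict[str, str] = {}
    server: Dict[str, str] = {}
    for name, value in headers.items():
        lowered = name.lower()  # header names are case-insensitive
        if lowered in API_METADATA:
            metadata[lowered[len(PREFIX):]] = value
        elif lowered.startswith(PREFIX):
            # Header set by the scraped website: drop the Spb- prefix.
            site[name[len(PREFIX):]] = value
        else:
            # Header set by the API's own web app server.
            server[name] = value
    return metadata, site, server
```

The first dictionary then holds cost, initial-status-code and resolved-url, while the second holds the target site's headers under their original names.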
The API returns two types of headers: those coming from our web app server, and those coming from the scraped website. In normal mode, we prefix headers coming from the scraped website with Spb- to differentiate them from the ones set by our web app server. On top of that, we also add 3 headers: Spb-cost: "Request cost in credits." Spb-initial-status-code: "The initial status code returned by the scraped page. Useful when the page redirects" Spb-resolved-url: "The resolv...

Common proxy questions
Do you provide Mobile/Residential/Data center proxies? We currently do not provide direct access to ScrapingBee's proxies. However, you can use our premium proxies to perform web scraping tasks through our API. To do that, add the parameter premium_proxy=true to your request. You can also choose a geolocation for the proxy with the country_code parameter; for example, country_code=de will use German IP addresses. This option only works with...

SSL certificate errors
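Two quick checks that can help when diagnosing SSL errors from Python: which OpenSSL version the interpreter is linked against, and, as a last resort, an SSL context with certificate verification switched off. This is a standard-library sketch with hypothetical helper names. Disabling verification removes protection against intercepted connections, so updating OpenSSL is the safer fix.

```python
import ssl


def openssl_version() -> str:
    """Return the OpenSSL version string this Python build is linked against."""
    return ssl.OPENSSL_VERSION


def unverified_context() -> ssl.SSLContext:
    """Return an SSL context that skips certificate verification."""
    ctx = ssl.create_default_context()
    # check_hostname must be disabled before verify_mode can be CERT_NONE.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx
```

The context can be passed to urllib.request.urlopen(url, context=unverified_context()).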
If you're getting SSL errors from ScrapingBee, here are the most common ways to fix them: Update your OpenSSL version https://letsencrypt.org/docs/dst-root-ca-x3-expiration-september-2021/ We're using a Let's Encrypt certificate on the app.scrapingbee.com subdomain. Since September 30th 2021, Let's Encrypt uses ISRG Root X1 certificates. Those certificates are only accepted by OpenSSL version 1.1.0 or later. Disable SSL verification with your HTTP client In Python, you can d...

How to forward headers?
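Forwarded headers are sent with an "Spb-" prefix, which ScrapingBee trims before passing them on to the target page. The sketch below adds that prefix and builds the matching API URL. The parameter is written here with its underscore, forward_headers, on the assumption that this is the API's spelling; api_key, url and the helper names are likewise assumptions for illustration.

```python
from typing import Dict, Mapping
from urllib.parse import urlencode

API_ENDPOINT = "https://app.scrapingbee.com/api/v1/"


def forwarded_headers(headers: Mapping[str, str]) -> Dict[str, str]:
    """Prefix each header name with "Spb-" so the API forwards it."""
    return {f"Spb-{name}": value for name, value in headers.items()}


def forward_url(api_key: str, target_url: str) -> str:
    """Build an API URL with header forwarding switched on."""
    params = {"api_key": api_key, "url": target_url, "forward_headers": "true"}
    return f"{API_ENDPOINT}?{urlencode(params)}"
```

Sending the request to forward_url(...) with the headers from forwarded_headers({"Accept-Language": "En-US"}) should deliver Accept-Language: En-US to the target site.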
You might need to forward specific headers to the website that you want to scrape. To forward headers, set forward_headers to true and pass your custom headers, prefixing each header you want forwarded with "Spb-" (for ScraPingBee). ScrapingBee trims this prefix and forwards the headers to the target web page. Example: If you want to send the header Accept-Language: En-US, add the header: Spb-Accept-Language: En-US an...

Do you cache requests?
We do not cache the requests you make using ScrapingBee's API. You will always get the latest version of the website or page you're scraping.

How to use Proxy Mode in Postman
If you need to use our proxies in the Postman app, please follow the steps below: In the Postman app, open your workspace and click on the agent selector on the bottom right side of your screen. Click on 'Desktop Agent' and then click on 'Download Desktop Agent'. Install and run the 'Postman Agent' app. Click on the gear icon on the top right side of your screen and t...