Web scraping recipes
These examples apply to both the /scrape and /scrape-js endpoints. We use /scrape in the code below for brevity.
Specify proxy geo
curl --request POST \
  --url https://scrapeninja.p.rapidapi.com/scrape \
  --header 'X-RapidAPI-Host: scrapeninja.p.rapidapi.com' \
  --header 'X-RapidAPI-Key: YOUR KEY' \
  --header 'content-type: application/json' \
  --data '{
    "url": "https://website.com/post.php",
    "geo": "fr"
  }'
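The same request can be issued programmatically. Here is a minimal TypeScript sketch (Node.js 18+, built-in fetch) mirroring the curl call above; reading the key from a RAPIDAPI_KEY environment variable is an assumption, substitute however you store your key.

async function main() {
  const res = await fetch('https://scrapeninja.p.rapidapi.com/scrape', {
    method: 'POST',
    headers: {
      'X-RapidAPI-Host': 'scrapeninja.p.rapidapi.com',
      'X-RapidAPI-Key': process.env.RAPIDAPI_KEY!, // assumption: key kept in an env var
      'content-type': 'application/json',
    },
    body: JSON.stringify({
      url: 'https://website.com/post.php',
      geo: 'fr', // same geo targeting as the curl example above
    }),
  });
  const data = await res.json();
  console.log(data);
}

main().catch(console.error);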
Some anti-scraping protections tie cookie calculations to the client IP address; in that case sticky proxies are required (a sticky proxy means that several HTTP requests can be performed from the same source IP address).
ScrapeNinja's default proxy pools provide shared rotating proxies only (each HTTP request is performed from a new IP address), with basic IP reputation.
If you need a wider selection of countries to send your requests from, higher proxy quality, and sticky proxies, ScrapeNinja premium proxies are recommended. Refer to the ScrapeNinja proxy setup page for more information.
Building 2-step scrapers
It is possible to build a 2-step scraper with ScrapeNinja: the first step makes an expensive (slow) call to /scrape-js to run the target website's JS calculations and dump the resulting cookies to a file, and subsequent requests then re-use these cookies via the cheaper /scrape endpoint.
Here is an example of such an implementation:
https://github.com/restyler/airlines-scraper-example
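For orientation, a rough TypeScript sketch of the flow is below. The field names are assumptions: where /scrape-js exposes the response Set-Cookie headers, and whether /scrape accepts extra request headers as raw "Name: value" strings, should be verified against the API reference and the linked repository.

import { writeFile, readFile } from 'node:fs/promises';

const API = 'https://scrapeninja.p.rapidapi.com';
const COMMON_HEADERS = {
  'X-RapidAPI-Host': 'scrapeninja.p.rapidapi.com',
  'X-RapidAPI-Key': process.env.RAPIDAPI_KEY!, // assumption: key kept in an env var
  'content-type': 'application/json',
};

// Step 1: expensive /scrape-js call. Let the target site run its JS and set
// its cookies, then persist them to a file for later re-use.
async function warmUp(url: string): Promise<void> {
  const res = await fetch(`${API}/scrape-js`, {
    method: 'POST',
    headers: COMMON_HEADERS,
    body: JSON.stringify({ url }),
  });
  const data = await res.json();
  // ASSUMPTION: response headers (incl. Set-Cookie) are exposed under
  // data.info.headers — verify the exact response shape in the API reference.
  const raw = data.info?.headers?.['set-cookie'] ?? [];
  const setCookie: string[] = Array.isArray(raw) ? raw : [raw];
  // Keep only the "name=value" part of each Set-Cookie line.
  const cookieHeader = setCookie.map((c) => c.split(';')[0]).join('; ');
  await writeFile('cookies.txt', cookieHeader, 'utf8');
}

// Step 2: cheap /scrape calls re-using the saved cookies.
async function scrapeWithCookies(url: string): Promise<string> {
  const cookieHeader = await readFile('cookies.txt', 'utf8');
  const res = await fetch(`${API}/scrape`, {
    method: 'POST',
    headers: COMMON_HEADERS,
    // ASSUMPTION: /scrape accepts extra request headers as an array of
    // raw "Name: value" strings.
    body: JSON.stringify({ url, headers: [`Cookie: ${cookieHeader}`] }),
  });
  const data = await res.json();
  return data.body; // raw HTML of the target page
}

async function main() {
  await warmUp('https://website.com/post.php');
  console.log(await scrapeWithCookies('https://website.com/post.php'));
}

main().catch(console.error);

The point of the split is that the slow, browser-backed /scrape-js call runs once per cookie lifetime, while the bulk of the scraping happens over the much faster /scrape endpoint.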