/v2/scrape-js: scraping with JS rendering, new engine
POST /v2/scrape-js
Launches a real Chrome browser engine, which gives a better success rate for websites protected by Cloudflare, DataDome, Kasada, and PerimeterX. ATTENTION: Only available via APIRoad!
Request
- application/json
Body
required
URL to scrape
CSS selector that must appear in the DOM tree before the page is considered loaded.
Wait for the specified number of seconds after page load (from 1 to 12). Use this only if ScrapeNinja fails to wait for the required page elements automatically.
Custom headers to send with the request. By default, regular Chrome browser headers are sent to the target URL.
Number of attempts.
Default value: us
Geo location for basic proxy pools (you can purchase premium ScrapeNinja proxies for wider country selection and higher proxy quality). Read more about ScrapeNinja proxy setup
Premium or your own proxy URL (overrides the geo field). Read more about ScrapeNinja proxy setup
Default value: 16
Timeout per attempt, in seconds. Each retry can take up to [timeout] seconds.
Text which will trigger a retry from another proxy address.
Default value: 403,502
HTTP response statuses which will trigger a retry from another proxy address.
Block images from loading. This will speed up page loading and reduce bandwidth usage.
Block CSS and fonts from loading. This will speed up page loading and reduce bandwidth usage.
Take a screenshot of the page. Pass "false" to increase the speed of the request.
Custom JS function to extract JSON values from scraped HTML. Write and test your own extractor at https://scrapeninja.net/cheerio-sandbox/
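The body parameters above could be assembled into a request roughly as follows. This is a minimal Python sketch, not an official client. Only the geo field name appears in this reference; the other JSON field names (url, waitForSelector, postWaitTime, timeout, blockImages, screenshot), the APIRoad host, and the X-Apiroad-Key header are assumptions made for illustration and should be checked against your APIRoad dashboard.

```python
import json
import urllib.request

# Assumption: APIRoad-hosted endpoint and auth header name (not stated in this reference).
API_URL = "https://scrapeninja.apiroad.net/v2/scrape-js"
API_KEY_HEADER = "X-Apiroad-Key"


def build_scrape_js_request(api_key, url, *, wait_for_selector=None,
                            post_wait_time=None, geo="us", timeout=16,
                            block_images=False, screenshot=False):
    """Return an unsent urllib Request for POST /v2/scrape-js."""
    # The reference limits the post-load wait to 1..12 seconds.
    if post_wait_time is not None and not 1 <= post_wait_time <= 12:
        raise ValueError("post-load wait must be between 1 and 12 seconds")

    # Field names are assumed; only "geo" is named in the reference text.
    payload = {
        "url": url,                  # required: URL to scrape
        "geo": geo,                  # documented default: us
        "timeout": timeout,          # documented default: 16 seconds per attempt
        "blockImages": block_images,
        "screenshot": screenshot,    # False skips the screenshot for speed
    }
    if wait_for_selector is not None:
        payload["waitForSelector"] = wait_for_selector
    if post_wait_time is not None:
        payload["postWaitTime"] = post_wait_time

    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", API_KEY_HEADER: api_key},
        method="POST",
    )
```

The returned object can be passed to urllib.request.urlopen to send it. Because each attempt may run for the full per-attempt timeout, a client-side timeout longer than that value multiplied by the number of attempts is a reasonable starting point.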
Responses
- 200
OK
- application/json
- Schema
- Example (from schema)
Schema
info object
catchedAjax object
Array of request headers
If dumpIframe is activated, this property contains iframe HTML.
Object with response headers
HTML body of the rendered page.
{
  "info": {
    "statusCode": 200,
    "finalUrl": "https://example.com/url",
    "catchedAjax": {
      "url": "https://example.com/api/data.json",
      "method": "GET",
      "headers": [
        "content-type: xxx",
        "header2: val2"
      ],
      "body": "<html><body><h1>Hello World!</h1></body></html>",
      "bodyIframe": "<html><body><h1>Iframe content</h1></body></html>",
      "status": 200,
      "responseHeaders": {
        "content-type": "application/json"
      }
    },
    "headers": [
      "content-type: xxx",
      "header2: val2"
    ]
  },
  "body": "<html><body><h1>Hello World!</h1></body></html>"
}
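A response shaped like the example above could be unpacked as in the following Python sketch. It assumes the nesting shown in the example (catchedAjax inside info, header entries as "name: value" strings); the helper names parse_header_lines and summarize_response are hypothetical and not part of ScrapeNinja.

```python
import json


def parse_header_lines(lines):
    """Turn ["name: value", ...] into a dict keyed by lowercased header name."""
    headers = {}
    for line in lines:
        name, sep, value = line.partition(":")
        if sep:  # ignore entries that contain no colon
            headers[name.strip().lower()] = value.strip()
    return headers


def summarize_response(payload):
    """Pick the commonly used fields out of a /v2/scrape-js response dict."""
    info = payload.get("info", {})
    ajax = info.get("catchedAjax") or {}  # may be absent in a real response
    return {
        "status": info.get("statusCode"),
        "final_url": info.get("finalUrl"),
        "headers": parse_header_lines(info.get("headers", [])),
        "html": payload.get("body", ""),
        "ajax_url": ajax.get("url"),
        "iframe_html": ajax.get("bodyIframe"),
    }


# Trimmed copy of the schema example above.
sample = json.loads("""
{
  "info": {
    "statusCode": 200,
    "finalUrl": "https://example.com/url",
    "catchedAjax": {
      "url": "https://example.com/api/data.json",
      "method": "GET",
      "bodyIframe": "<html><body><h1>Iframe content</h1></body></html>",
      "status": 200
    },
    "headers": ["content-type: xxx", "header2: val2"]
  },
  "body": "<html><body><h1>Hello World!</h1></body></html>"
}
""")

summary = summarize_response(sample)
print(summary["status"], summary["final_url"])
```

Reading fields with .get keeps the sketch tolerant of responses where optional parts such as catchedAjax are missing.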