/scrape-js: scraping with JS rendering

POST /scrape-js

Launches a real Chrome browser engine to render the page. Use it only when the features of the /scrape endpoint are not enough.

Request

application/json

Body (required)

url string required

URL to scrape
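
As a minimal sketch, the request could be built in Python with the standard library only. The base URL `https://api.example.com` is a placeholder, not the real host, and authentication is left out because this section does not describe it. The helper name `build_scrape_js_request` is hypothetical.

```python
import json
import urllib.request

# Assumption: placeholder base URL; substitute the real API host.
API_BASE = "https://api.example.com"


def build_scrape_js_request(payload: dict, api_base: str = API_BASE) -> urllib.request.Request:
    """Build (but do not send) a POST /scrape-js request with a JSON body."""
    if not payload.get("url"):
        raise ValueError("'url' is the only required body field")
    return urllib.request.Request(
        url=f"{api_base}/scrape-js",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_scrape_js_request({"url": "https://example.com"})
print(request.get_method(), request.full_url)
```

To actually send it, add whatever authentication headers your account requires and pass the object to `urllib.request.urlopen`.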

waitForSelector string

CSS selector that must appear in the DOM tree before the page is considered loaded.

dumpIframe string

If a particular iframe needs to be dumped, specify the value of its HTML name attribute in this argument. The ScrapeNinja JS renderer will wait for the iframe to appear in the DOM, then use the waitForSelectorIframe CSS selector to wait for DOM elements to appear inside it.

waitForSelectorIframe string

If dumpIframe is activated, this property specifies a CSS selector to wait for inside the iframe.

extractorTargetIframe boolean

If dumpIframe is activated, this property runs the JS extractor function against the iframe HTML instead of the base page body. It has no effect unless dumpIframe is activated.
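
The three iframe options work together, so it may help to see them in one request body. The sketch below is illustrative only: the helper `iframe_payload`, the iframe name `payment-frame`, and the selector `form#card` are made-up values.

```python
import json


def iframe_payload(url, iframe_name, inner_selector=None, extract_from_iframe=False):
    """Assemble a /scrape-js body that dumps one named iframe."""
    payload = {"url": url, "dumpIframe": iframe_name}
    if inner_selector:
        # Only meaningful together with dumpIframe.
        payload["waitForSelectorIframe"] = inner_selector
    if extract_from_iframe:
        # Run the extractor against the iframe HTML instead of the page body.
        payload["extractorTargetIframe"] = True
    return payload


payload = iframe_payload(
    "https://example.com/checkout",
    "payment-frame",            # hypothetical value of the iframe's name attribute
    inner_selector="form#card",  # hypothetical selector inside the iframe
)
print(json.dumps(payload, indent=2))
```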

headers string[]

Custom headers to send with the request. By default, regular Chrome browser headers are sent to the target URL.
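
Because headers is typed as string[], each header is presumably passed as one string. The sketch below assumes the `Name: value` form that appears in the response example of this page. That form is an inference, not something this section states for requests, and `to_header_lines` is a hypothetical helper.

```python
def to_header_lines(headers: dict) -> list:
    """Convert a name/value mapping into a list of 'Name: value' strings."""
    return [f"{name}: {value}" for name, value in headers.items()]


payload = {
    "url": "https://example.com",
    # Assumed format: one "Name: value" string per header.
    "headers": to_header_lines({"Accept-Language": "en-US", "X-Custom": "1"}),
}
print(payload["headers"])
```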

retryNum integer

Number of attempts.

geo string

Default value: us

Geolocation of the proxy used for the request.

proxy string

Custom proxy URL (overrides geo field). Available starting from ULTRA plan.

timeout integer

Default value: 16

Timeout per attempt, in seconds. Each retry can take up to [timeout] seconds.

textNotExpected string[]

Text fragments which, when found in the response, trigger a retry from another proxy address.

statusNotExpected integer[]

Default value: 403,502

HTTP response statuses which will trigger a retry from another proxy address.
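
Since the timeout applies per attempt, the retry settings together bound how long one call may take. The sketch below assumes that retryNum counts total attempts and that attempts run one after another; neither is spelled out above. Both function names are hypothetical.

```python
def retry_settings(retry_num, timeout=16, text_not_expected=(), status_not_expected=(403, 502)):
    """Collect the retry-related body fields, using the documented defaults."""
    return {
        "retryNum": retry_num,
        "timeout": timeout,
        "textNotExpected": list(text_not_expected),
        "statusNotExpected": list(status_not_expected),
    }


def worst_case_seconds(settings):
    """Upper bound on total time, assuming sequential attempts."""
    return settings["retryNum"] * settings["timeout"]


settings = retry_settings(3, text_not_expected=["Access denied"])
payload = {"url": "https://example.com", **settings}
print(worst_case_seconds(settings))  # 3 attempts * 16 s = 48
```

Under these assumptions, a client-side HTTP timeout shorter than this bound could cut the call off before the last attempt finishes.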

blockImages boolean

Block images from loading. This will speed up page loading and reduce bandwidth usage.

blockMedia boolean

Block media resources (CSS, fonts) from loading. This will speed up page loading and reduce bandwidth usage.

screenshot boolean

Take a screenshot of the page. Pass "false" to increase the speed of the request.

catchAjaxHeadersUrlMask string

Captures the response of an XHR request. Pass a URL mask here: for example, to catch all requests to https://example.com/api/data.json, pass "api/data.json". The response will then contain a new property, .info.catchedAjax, with the XHR data: { url, method, headers[], body, status, responseHeaders{} }
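
A short sketch of reading the captured XHR from a response follows. The `response` dictionary is hand-written sample data shaped like the property list above, not real output, and `caught_ajax_json` is a hypothetical helper that assumes the captured body is JSON text.

```python
import json


def caught_ajax_json(response: dict):
    """Return the captured XHR body decoded as JSON, or None if unavailable."""
    caught = response.get("info", {}).get("catchedAjax")
    if not caught:
        return None  # the mask matched nothing, or the option was not set
    try:
        return json.loads(caught["body"])
    except (KeyError, TypeError, json.JSONDecodeError):
        return None  # body missing or not valid JSON


# Sample data only: mirrors the documented shape of .info.catchedAjax.
response = {
    "info": {
        "statusCode": 200,
        "finalUrl": "https://example.com/url",
        "catchedAjax": {
            "url": "https://example.com/api/data.json",
            "method": "GET",
            "status": 200,
            "body": '{"items": [1, 2, 3]}',
        },
    }
}
print(caught_ajax_json(response))
```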

viewport object

Advanced. Set custom viewport size. By default, viewport size is 1920x1080.

width integer

height integer

deviceScaleFactor integer

hasTouch boolean

isMobile boolean

isLandscape boolean
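
For illustration, a viewport object that emulates a phone-sized screen might look like the sketch below. The dimensions and scale factor are arbitrary example values, and `mobile_viewport` is a hypothetical helper.

```python
def mobile_viewport(width=390, height=844, scale=3):
    """Build a viewport object for a portrait, touch-enabled mobile screen."""
    if width <= 0 or height <= 0:
        raise ValueError("viewport dimensions must be positive")
    return {
        "width": width,
        "height": height,
        "deviceScaleFactor": scale,  # typed as integer in the schema
        "hasTouch": True,
        "isMobile": True,
        "isLandscape": False,
    }


payload = {"url": "https://example.com", "viewport": mobile_viewport()}
print(payload["viewport"])
```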

extractor string

Custom JS function to extract JSON values from the scraped HTML. Write and test your own extractor at https://scrapeninja.net/cheerio-sandbox/
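
The extractor is sent as a string containing JavaScript source. The sketch below assumes a function of the form `extract(input, cheerio)`, where `input` is the scraped HTML and `cheerio` is the parser. This section does not define that signature, so check it in the sandbox linked above before relying on it.

```python
import json

# Assumed extractor contract: extract(input, cheerio) returns a JSON-serializable object.
EXTRACTOR_JS = """
function extract(input, cheerio) {
    const $ = cheerio.load(input);
    return { title: $("h1").first().text().trim() };
}
""".strip()

payload = {"url": "https://example.com", "extractor": EXTRACTOR_JS}
body = json.dumps(payload)  # newlines and quotes in the JS are escaped for us
print(body[:60])
```

Keeping the JavaScript in a triple-quoted string and letting `json.dumps` handle the escaping avoids hand-escaping quotes and newlines.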

Responses

application/json

Schema

info object

statusCode integer required

finalUrl string required

catchedAjax object

url string

method string

headers string[]

Array of request headers

body string

bodyIframe string

If dumpIframe is activated, this property contains iframe HTML.

status integer

responseHeaders object

Object with response headers

headers string[] required

body string

HTML body of the rendered page.

{
  "info": {
    "statusCode": 200,
    "finalUrl": "https://example.com/url",
    "catchedAjax": {
      "url": "https://example.com/api/data.json",
      "method": "GET",
      "headers": [
        "content-type: xxx",
        "header2: val2"
      ],
      "body": "<html><body><h1>Hello World!</h1></body></html>",
      "bodyIframe": "<html><body><h1>Iframe content</h1></body></html>",
      "status": 200,
      "responseHeaders": {
        "content-type": "application/json"
      }
    },
    "headers": [
      "content-type: xxx",
      "header2: val2"
    ]
  },
  "body": "<html><body><h1>Hello World!</h1></body></html>"
}
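
As a rough sketch of consuming this response in Python, the snippet below reads the top-level fields and turns the `name: value` header strings into a dictionary. The `response` literal is a trimmed copy of the example above, and `parse_header_lines` is a hypothetical helper.

```python
def parse_header_lines(lines):
    """Turn ['name: value', ...] into a dict with lower-cased header names."""
    parsed = {}
    for line in lines:
        # Split on the first colon only, since values may contain colons.
        name, _, value = line.partition(":")
        parsed[name.strip().lower()] = value.strip()
    return parsed


# Trimmed copy of the example response shown above.
response = {
    "info": {
        "statusCode": 200,
        "finalUrl": "https://example.com/url",
        "headers": ["content-type: xxx", "header2: val2"],
    },
    "body": "<html><body><h1>Hello World!</h1></body></html>",
}

info = response["info"]
headers = parse_header_lines(info["headers"])
print(info["statusCode"], info["finalUrl"], headers["content-type"])
```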
