/scrape-js: scraping with JS rendering

POST /scrape-js

Launches a real Chrome browser engine. Use it only when the features of the /scrape endpoint are not enough.

Request

Body

required
    url string required

    URL to scrape

    waitForSelector string

    CSS selector that must appear in the DOM tree before the page is considered loaded.

    dumpIframe string

    If a particular iframe needs to be dumped, specify the value of its name HTML attribute in this argument. The ScrapeNinja JS renderer will wait for DOM elements to appear inside the iframe.

    waitForSelectorIframe string

    If dumpIframe is activated, this property specifies a CSS selector to wait for inside that iframe.

    extractorTargetIframe boolean

    Runs the JS extractor function against the iframe HTML instead of the base page body. Has an effect only if dumpIframe is activated.

    headers string[]

    Custom headers to send with the request. By default, regular Chrome browser headers are sent to the target URL.

    retryNum integer

    Number of attempts.

    geo string

    Default value: us

    Geo location for basic proxy pools (you can purchase premium ScrapeNinja proxies for wider country selection and higher proxy quality). Read more about ScrapeNinja proxy setup

    proxy string

    Premium or your own proxy URL (overrides geo field). Read more about ScrapeNinja proxy setup

    timeout integer

    Default value: 16

    Timeout per attempt, in seconds. Each retry gets its own [timeout] seconds.

    textNotExpected string[]

    Text which, if found in the response, will trigger a retry from another proxy address.

    statusNotExpected integer[]

    Default value: 403,502

    HTTP response statuses which will trigger a retry from another proxy address.

    blockImages boolean

    Block images from loading. This will speed up page loading and reduce bandwidth usage.

    blockMedia boolean

    Block media resources (CSS, fonts) from loading. This will speed up page loading and reduce bandwidth usage.

    screenshot boolean

    Take a screenshot of the page. Pass false to skip the screenshot and speed up the request.

    catchAjaxHeadersUrlMask string

    Useful for dumping an XHR response. Pass a URL mask here. For example, to catch all requests to https://example.com/api/data.json, pass "api/data.json". The response will then contain a new property, .info.catchedAjax, with the XHR response data: { url, method, headers[], body, status, responseHeaders{} }

    viewport object

    Advanced. Set custom viewport size. By default, viewport size is 1920x1080.

        width integer
        height integer
        deviceScaleFactor integer
        hasTouch boolean
        isMobile boolean
        isLandscape boolean
    extractor string

    Custom JS function to extract JSON values from the scraped HTML. Write and test your own extractor at https://scrapeninja.net/cheerio-sandbox/
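The request body described above can be assembled as in the following sketch. It is only an illustration: the helper names build_scrape_js_body and post_scrape_js are hypothetical, and the API base URL and authentication headers are not given in this reference, so they are passed in as placeholders that you must fill in from your own account settings. Only the field names (url, waitForSelector, retryNum and so on) come from the parameter list above.

```python
import json
import urllib.request


def build_scrape_js_body(url, **options):
    """Build the JSON body for POST /scrape-js. Only `url` is required."""
    if not url:
        raise ValueError("url is required")
    body = {"url": url}
    # Drop unset options so that the service applies its own defaults
    # (for example geo "us" and timeout 16, as listed above).
    body.update({k: v for k, v in options.items() if v is not None})
    return body


def post_scrape_js(api_base, auth_headers, body, client_timeout=120):
    """Send the body and return the decoded JSON response.

    api_base and auth_headers are placeholders (assumptions): this
    reference does not state the host name or the authentication scheme.
    """
    request = urllib.request.Request(
        api_base.rstrip("/") + "/scrape-js",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", **auth_headers},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=client_timeout) as response:
        return json.loads(response.read().decode("utf-8"))


body = build_scrape_js_body(
    "https://example.com/",
    waitForSelector="#content",      # hypothetical selector
    retryNum=2,
    timeout=16,
    statusNotExpected=[403, 502],
    blockImages=True,
    screenshot=False,
    viewport={"width": 1280, "height": 720},
    proxy=None,                      # unset, so it is left out of the body
)
print(json.dumps(body, indent=2))
```

Leaving unset options out of the body, rather than sending null values, keeps the documented defaults in effect. post_scrape_js is defined but not called here, because it needs real credentials.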

Responses

OK

Schema
    info object
        statusCode integer required
        finalUrl string required
        catchedAjax object
            url string
            method string
            headers string[]

            Array of request headers

            body string
            bodyIframe string

            If dumpIframe is activated, this property contains iframe HTML.

            status integer
            responseHeaders object

            Object with response headers

        headers string[] required
    body string

    HTML body of the rendered page.
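A response with this schema could be read as in the sketch below. The function name summarize_response and the sample payload are made up for illustration and are not real ScrapeNinja output. The sketch assumes that statusCode, finalUrl and catchedAjax are nested under info (as the .info.catchedAjax notation in the request section suggests) and that catchedAjax is present only when catchAjaxHeadersUrlMask was passed and a matching request was seen.

```python
def summarize_response(response):
    """Pick the commonly used fields out of a decoded /scrape-js response."""
    info = response.get("info") or {}
    ajax = info.get("catchedAjax")  # assumed absent unless an XHR was caught
    return {
        "status": info.get("statusCode"),
        "final_url": info.get("finalUrl"),
        "html_length": len(response.get("body") or ""),
        "ajax_url": ajax.get("url") if ajax else None,
        "ajax_status": ajax.get("status") if ajax else None,
    }


# Hypothetical sample, shaped after the schema above.
sample = {
    "info": {
        "statusCode": 200,
        "finalUrl": "https://example.com/",
        "headers": ["content-type: text/html"],
        "catchedAjax": {
            "url": "https://example.com/api/data.json",
            "method": "GET",
            "headers": [],
            "body": "{\"items\": []}",
            "status": 200,
            "responseHeaders": {"content-type": "application/json"},
        },
    },
    "body": "<html><body><h1>Example</h1></body></html>",
}

summary = summarize_response(sample)
print(summary)
```

Using .get with fallbacks keeps the reader tolerant of optional fields: body and catchedAjax are not marked as required in the schema.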