Scraping as an Authenticated User
In many scenarios, extracting data from behind an authentication wall is necessary. To effectively perform this task with ScrapeNinja, it’s essential to understand how web authentication works in general.
An active, logged-in session in Chrome, inspected with Chrome DevTools, is the most practical way to debug how a website authenticates you.
Cookie-Based Authentication
Cookie-based authentication remains the standard on the web. When you submit a login form, the server's response includes a set of headers, of which set-cookie: <name>=<val> is the most significant. Your browser then sends this value back to the server with each subsequent request in the cookie: <name>=<val> format.
[Figures: login request; login cookie response; subsequent requests with the cookie attached]
As long as you provide this cookie, you remain authenticated.
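The round trip described above can be sketched with Python's standard library. This is a minimal illustration, not part of ScrapeNinja: the function name cookie_header_from_set_cookie and the sample cookie names and values are hypothetical, and a real login response may set more cookies than shown here.

```python
from http.cookies import SimpleCookie


def cookie_header_from_set_cookie(set_cookie_values):
    """Turn set-cookie response header values into a cookie request header value."""
    jar = SimpleCookie()
    for value in set_cookie_values:
        # Each set-cookie header carries one name=value pair plus attributes
        # (Path, HttpOnly, ...). Only the name=value pair is sent back.
        jar.load(value)
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())


# Hypothetical set-cookie values taken from a login response.
login_response_set_cookie = [
    "sessionid=abc123; Path=/; HttpOnly",
    "csrftoken=xyz; Path=/",
]

cookie_header = cookie_header_from_set_cookie(login_response_set_cookie)
print(f"cookie: {cookie_header}")
```

Attributes such as Path and HttpOnly tell the browser how to store the cookie. They are not echoed back to the server, which is why only the name=value pairs appear in the resulting header.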
Typically, you can use the Copy as cURL command in Chrome DevTools on an authenticated request you want to scrape, make sure the cookie: header is present, and then use the ScrapeNinja cURL to Scraper tool to convert it into a ScrapeNinja request.
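Before converting a copied command, it can help to confirm that it really carries the session cookie. The sketch below is a hypothetical helper, not part of ScrapeNinja. It assumes the command was copied in the bash flavor and that the cookie appears either as a -H 'cookie: ...' header or as a -b / --cookie argument. The example URL and cookie values are made up.

```python
import shlex


def find_cookie_header(curl_command):
    """Return the cookie value carried by a copied cURL command, or None."""
    # Join backslash-newline continuations so the command parses as one line.
    tokens = shlex.split(curl_command.replace("\\\n", " "))
    for flag, value in zip(tokens, tokens[1:]):
        if flag in ("-H", "--header") and value.lower().startswith("cookie:"):
            return value.split(":", 1)[1].strip()
        if flag in ("-b", "--cookie"):
            return value
    return None


# Hypothetical command, shaped like the output of Copy as cURL.
curl_command = (
    "curl 'https://example.com/account' "
    "-H 'accept: text/html' "
    "-H 'cookie: sessionid=abc123; theme=dark' "
    "--compressed"
)

cookie = find_cookie_header(curl_command)
if cookie is None:
    print("No cookie found: the scraped request would not be authenticated.")
else:
    print(f"Found {len(cookie.split('; '))} cookie(s) in the command.")
```

If the helper returns None, the request was probably copied from a logged-out tab or from a request that the browser sent without credentials.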
Security Considerations
Be aware that sharing your "cookie" values with someone essentially gives them access to your user session. Protect your cookies as you would any sensitive information.
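One way to follow this advice in a scraping script is to keep the cookie out of source code and logs. The following is a minimal sketch under assumptions: the environment variable name SCRAPE_SESSION_COOKIE and both helper functions are hypothetical and not part of ScrapeNinja.

```python
import os


def load_session_cookie(env_var="SCRAPE_SESSION_COOKIE"):
    """Read the cookie header value from the environment instead of hardcoding it."""
    value = os.environ.get(env_var)
    if not value:
        raise RuntimeError(f"Set {env_var} to your cookie header value")
    return value


def mask(secret, visible=4):
    """Hide all but the first few characters so the value is safe to log."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return secret[:visible] + "*" * (len(secret) - visible)


# Example with a made-up value; in practice the variable is set outside the script.
os.environ.setdefault("SCRAPE_SESSION_COOKIE", "sessionid=abc123")
print("Using cookie:", mask(load_session_cookie()))
```

Keeping the value in the environment means the script can be shared or committed to version control without exposing the session.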