https://github.com/hasdata/login-examples
This repository provides practical examples of handling authentication in web scraping using both Python and Node.js. It covers multiple login techniques - from simple form submissions to handling cookies, sessions and CSRF tokens.
https://github.com/hasdata/login-examples
browser-automation cookie-handling csrf login-automation nodejs playwright puppeteer python selenium seleniumbase session-persistence
Last synced: 5 days ago
JSON representation
This repository provides practical examples of handling authentication in web scraping using both Python and Node.js. It covers multiple login techniques - from simple form submissions to handling cookies, sessions and CSRF tokens.
- Host: GitHub
- URL: https://github.com/hasdata/login-examples
- Owner: HasData
- Created: 2025-05-07T14:42:42.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2025-05-19T13:35:27.000Z (11 months ago)
- Last Synced: 2025-06-19T02:37:50.212Z (10 months ago)
- Topics: browser-automation, cookie-handling, csrf, login-automation, nodejs, playwright, puppeteer, python, selenium, seleniumbase, session-persistence
- Language: JavaScript
- Homepage: https://hasdata.com/blog/scrape-website-with-login-python
- Size: 325 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README


# Login Automation Examples (Python & Node.js)
[](https://hasdata.com/)
This repository contains practical login automation examples using **Python** and **Node.js**. It demonstrates how to handle different authentication flows: from basic form submissions to CSRF-protected logins and browser automation with Selenium, Puppeteer, SeleniumBase and Playwright.
## Table of Contents
1. [Requirements](#requirements)
2. [Project Structure](#project-structure)
3. [Login Examples](#login-examples)
- [Simple Login](#simple-login)
- [CSRF Token Login](#csrf-token-login)
- [Selenium/Puppeteer Login](#seleniumpuppeteer-login)
- [SeleniumBase/Playwright Login](#seleniumbaseplaywright-login)
- [Session Persistence](#session-persistence)
- [Cookie Handling](#cookie-handling)
4. [Disclaimer](#disclaimer)
## Requirements
**Python 3.10+** or **Node.js 18+**
### Python Setup
Required packages include:
* `requests`
* `beautifulsoup4`
* `selenium`
* `seleniumbase`
Command:
```bash
pip install requests beautifulsoup4 selenium seleniumbase
```
### Node.js Setup
Includes:
* `axios`
* `puppeteer`
* `playwright`
Command:
```bash
npm install axios puppeteer playwright
```
## Project Structure
```
login-examples/
│
├── python/
│ ├── simple_login_get.py
│ ├── simple_login_post.py
│ ├── csrf_token_login.py
│ ├── selenium_login.py
│ ├── seleniumbase_login.py
│ ├── session_persistence.py
│ ├── cookie_handling.py
│
├── nodejs/
│ ├── simple_login.js
│ ├── simple_login_post.js
│ ├── csrf_token_login.js
│ ├── puppeteer_login.js
│ ├── playwright_login.js
│ ├── session_persistence.js
│ ├── cookie_handling.js
│
└── README.md
```
Each script is self-contained and targets a different type of login logic.
## Login Examples
Replace `URL`, `username`, and `password` with your actual values before running.
### Simple Login
Demonstrates a simple POST/GET login using email and password. This method works by sending the credentials directly via a POST/GET request.
| Description | Python | Node.js |
| ----------------------- | -------------------------------------------- | ----------------------------------------------------------------- |
| **Authenticate (POST)** | `requests.post(login_url, data=payload)` | `axios.post(loginUrl, payload)` |
| **Authenticate (GET)** | `requests.get(login_url, auth=('user', 'pass'))` | `axios.get(loginUrl, { auth: { username: 'user', password: 'pass' } })` |
### CSRF Token Login
Shows how to handle CSRF tokens during login. It extracts the CSRF token from the page and includes it in the login request for protection against CSRF attacks.
| Description | Python | Node.js |
| ---------------------------- | ----------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------- |
| **GET login page** | `resp = session.get(login_url)`
`resp.raise_for_status()` | `const getResp = await axios.get(loginUrl, { withCredentials: true });` |
| **Extract CSRF token** | `csrf_token = soup.select_one("input[name='_token']")['value']` | `const $ = cheerio.load(getResp.data);`
`const csrf = $('input[name="_token"]').attr('value');` |
| **POST credentials + token** | `session.post(login_url, data=payload)`
`post_resp.raise_for_status()` | `await axios.post(loginUrl, payload.toString(), { headers: { 'Content-Type': 'application/x-www-form-urlencoded' }, withCredentials: true });` |
| **GET protected page** | `protected = session.get(protected_url)`
`protected.raise_for_status()` | `const protectedResp = await axios.get(protectedUrl, { withCredentials: true });` |
### Selenium/Puppeteer Login
Uses **Selenium** to automate login by interacting with the page and filling in the login form (useful for sites that require interaction with JavaScript elements or CAPTCHAs).
| Description | Python | Node.js |
| ------------------ | ----------------------------------------------------------------- | ------------------------------------------------------------- |
| **Init browser** | `driver = webdriver.Chrome(...)` | `const browser = await puppeteer.launch({ headless: true });` |
| **Go to page** | `driver.get("URL")` | `await page.goto("URL")` |
| **Enter login** | `driver.find_element(...,"email").send_keys("admin@example.com")` | `await page.type('input[name="email"]','admin@example.com')` |
| **Enter password** | `driver.find_element(...,"password").send_keys("password")` | `await page.type('input[name="password"]','password')` |
| **Submit form** | `driver.find_element(...,'button[type=\"submit\"]').click()` | `await page.click('button[type=\"submit\"]')` |
### SeleniumBase/Playwright Login
An advanced version of Selenium, **SeleniumBase**, is used for browser automation. It simplifies Selenium operations, making it easier to write tests and automation scripts.
| Description | Python | Node.js |
| ------------------------ | --------------------------------------------- | ---------------------------------------------------------- |
| **Open login page** | `sb.open(login_url)` | `await page.goto(login_url, { waitUntil: 'networkidle' })` |
| **Fill username** | `sb.type('input[name="username"]', username)` | `await page.fill('input[name="username"]', username)` |
| **Fill password** | `sb.type('input[name="password"]', password)` | `await page.fill('input[name="password"]', password)` |
| **Submit form** | `sb.click('button[type="submit"]')` | `await page.click('button[type="submit"]')` |
### Session Persistence
This example shows how to maintain session state across multiple requests, preserving cookies and authentication tokens after login.
| Description | Python | Node.js |
| ----------------------- | ------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------------------------------------- |
| **Create session** | `session = requests.Session()` | `const jar = new tough.CookieJar();`
`const client = wrapper(axios.create({ jar, withCredentials: true }));` |
| **Authenticate (POST)** | `login_resp = session.post("URL", data=payload)` | `client.post("URL", payload)` |
| **Access protected** | `dash_resp = session.get("protected_URL")` | `client.get("protected_URL")` |
### Cookie Handling
Demonstrates how to handle cookies during login. This is useful for keeping the session alive across requests or saving cookies for future use.
| Description | Python | Node.js |
| ---------------------- | --------------------------------------------------------- | ------------------------------------------------------------------------------------------------------- |
| **Save Cookies** | `import pickle`
`pickle.dump(session.cookies, f)` | `const fs = require('fs');`
`fs.writeFileSync(COOKIE_FILE, JSON.stringify(await jar.serialize()))` |
| **Load Cookies** | `pickle.load(f)` | `const data = fs.readFileSync(COOKIE_FILE);`
`await jar.deserialize(data)` |
| **Save Cookies File** | `pickle.dump(session.cookies, open('cookies.pkl', 'wb'))` | `fs.writeFileSync('cookies.json', JSON.stringify(await jar.serialize()))` |
| **Reuse Cookies File** | `load_cookies(session, 'cookies.pkl')` | `await loadCookies()` |
## Disclaimer
These examples are for **educational purposes** only. Learn more about [the legality of web scraping]("https://hasdata.com/blog/is-web-scraping-legal").