https://github.com/henrique-coder/perplexity-webui-scraper
Python scraper to extract AI responses from Perplexity's web interface.
ai perplexity scraper webui
- Host: GitHub
- URL: https://github.com/henrique-coder/perplexity-webui-scraper
- Owner: henrique-coder
- License: mit
- Created: 2025-03-31T13:55:33.000Z (about 1 year ago)
- Default Branch: prod
- Last Pushed: 2026-02-09T19:47:22.000Z (5 months ago)
- Last Synced: 2026-02-09T23:12:58.698Z (5 months ago)
- Topics: ai, perplexity, scraper, webui
- Language: Python
- Homepage: https://pypi.org/project/perplexity-webui-scraper
- Size: 985 KB
- Stars: 44
- Watchers: 2
- Forks: 12
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Perplexity WebUI Scraper
Python scraper to extract AI responses from [Perplexity's](https://www.perplexity.ai) web interface.
---
## Installation
### As a Library
```bash
# From PyPI (stable)
uv add perplexity-webui-scraper
# From GitHub prod branch (latest fixes)
uv add git+https://github.com/henrique-coder/perplexity-webui-scraper.git@prod
```
### As MCP Server
No installation required - `uvx` handles everything automatically:
```bash
# From PyPI (stable)
uvx --from "perplexity-webui-scraper[mcp]@latest" perplexity-webui-scraper-mcp
# From GitHub prod branch (latest fixes)
uvx --from "perplexity-webui-scraper[mcp]@git+https://github.com/henrique-coder/perplexity-webui-scraper.git@prod" perplexity-webui-scraper-mcp
# From local directory (for development)
uv --directory /path/to/perplexity-webui-scraper run perplexity-webui-scraper-mcp
```
## Requirements
- **Perplexity Pro/Max account**
- **Session token** (`__Secure-next-auth.session-token` cookie)
### Getting Your Session Token
#### Option 1: Automatic (CLI Tool)
```bash
uv run get-perplexity-session-token
```
This interactive tool will:
1. Ask for your Perplexity email
2. Send a verification code to your email
3. Accept either a 6-digit code or a magic link
4. Extract and display your session token
5. Optionally save it to your `.env` file
#### Option 2: Manual (Browser)
1. Log in at [perplexity.ai](https://www.perplexity.ai)
2. Open DevTools (`F12`) → Application/Storage → Cookies
3. Copy the value of `__Secure-next-auth.session-token`
4. Store it in `.env`: `PERPLEXITY_SESSION_TOKEN="your_token"`
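
Once the token is stored, a script can read it from the environment instead of hard-coding it. The following is a minimal sketch, assuming `PERPLEXITY_SESSION_TOKEN` has already been exported into the process environment (it does not parse `.env` itself; a loader such as python-dotenv could do that). The helper name `load_session_token` is hypothetical and not part of this library.

```python
import os

ENV_VAR = "PERPLEXITY_SESSION_TOKEN"


def load_session_token(env=None):
    """Return the session token from the environment, or raise if it is missing.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    token = env.get(ENV_VAR, "").strip()
    if not token:
        raise RuntimeError(f"{ENV_VAR} is not set; export it or load your .env file first")
    return token
```

The result can then be passed to the client, for example `Perplexity(session_token=load_session_token())`.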
## Quick Start
```python
from perplexity_webui_scraper import Perplexity
client = Perplexity(session_token="YOUR_TOKEN")
conversation = client.create_conversation()
conversation.ask("What is quantum computing?")
print(conversation.answer)
# Follow-up (context is preserved)
conversation.ask("Explain it simpler")
print(conversation.answer)
```
### Streaming
```python
for chunk in conversation.ask("Explain AI", stream=True):
    print(chunk.answer)
```
### With Options
```python
from perplexity_webui_scraper import (
    ConversationConfig,
    Coordinates,
    Models,
    SourceFocus,
)

config = ConversationConfig(
    model=Models.DEEP_RESEARCH,
    source_focus=[SourceFocus.WEB, SourceFocus.ACADEMIC],
    language="en-US",
    coordinates=Coordinates(latitude=12.3456, longitude=-98.7654),
)
conversation = client.create_conversation(config)
conversation.ask("Latest AI research", files=["paper.pdf"])
```
## API Reference
### `Perplexity(session_token, config?)`
| Parameter | Type | Description |
| --------------- | -------------- | ------------------ |
| `session_token` | `str` | Browser cookie |
| `config` | `ClientConfig` | Timeout, TLS, etc. |
### `Conversation.ask(query, model?, files?, citation_mode?, stream?)`
| Parameter | Type | Default | Description |
| --------------- | ----------------------- | ------------- | ------------------- |
| `query` | `str` | - | Question (required) |
| `model` | `Model` | `Models.BEST` | AI model |
| `files` | `list[str \| PathLike]` | `None` | File paths |
| `citation_mode` | `CitationMode` | `CLEAN` | Citation format |
| `stream` | `bool` | `False` | Enable streaming |
### Models
| Model | Description | Tier |
| ---------------------------------- | ------------------------------------------------------------------------- | ---- |
| `Models.BEST` | Pro - Automatically selects the most responsive model based on the query | pro |
| `Models.DEEP_RESEARCH` | Deep research - Fast and thorough for routine research | pro |
| `Models.SONAR` | Sonar - Perplexity's latest model | pro |
| `Models.GEMINI_3_FLASH` | Gemini 3 Flash - Google's fast model | pro |
| `Models.GEMINI_3_FLASH_THINKING` | Gemini 3 Flash Thinking - Google's fast model | pro |
| `Models.GEMINI_31_PRO` | Gemini 3.1 Pro - Google's latest model | pro |
| `Models.GEMINI_31_PRO_THINKING` | Gemini 3.1 Pro Thinking - Google's latest model with thinking | pro |
| `Models.GPT_52` | GPT-5.2 - OpenAI's latest model | pro |
| `Models.GPT_52_THINKING` | GPT-5.2 Thinking - OpenAI's latest model with thinking | pro |
| `Models.CLAUDE_46_SONNET` | Claude Sonnet 4.6 - Anthropic's fast model | pro |
| `Models.CLAUDE_46_SONNET_THINKING` | Claude Sonnet 4.6 Thinking - Anthropic's newest reasoning model | pro |
| `Models.CLAUDE_46_OPUS` | Claude Opus 4.6 - Anthropic's most advanced model | max |
| `Models.CLAUDE_46_OPUS_THINKING` | Claude Opus 4.6 Thinking - Anthropic's Opus reasoning model with thinking | max |
| `Models.GROK_41` | Grok 4.1 - xAI's latest model | pro |
| `Models.GROK_41_THINKING` | Grok 4.1 Thinking - xAI's latest model | pro |
| `Models.KIMI_K25_THINKING` | Kimi K2.5 - Moonshot AI's latest model | pro |
### CitationMode
| Mode | Output |
| ---------- | --------------------- |
| `DEFAULT` | `text[1]` |
| `MARKDOWN` | `text[1](url)` |
| `CLEAN` | `text` (no citations) |
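
To make the three output shapes concrete, here is an illustrative sketch that reproduces them on a sample string. It is not the library's implementation: the function `format_citations`, the string mode names, and the `urls` mapping are all assumptions made for the example.

```python
import re

# Matches inline citation markers such as [1] or [12].
CITATION = re.compile(r"\[(\d+)\]")


def format_citations(text, mode, urls):
    """Mimic the DEFAULT, MARKDOWN and CLEAN output shapes from the table above."""
    if mode == "DEFAULT":
        # Markers are left untouched: text[1]
        return text
    if mode == "MARKDOWN":
        # Each marker becomes a link when its URL is known: text[1](url)
        def link(match):
            index = int(match.group(1))
            url = urls.get(index)
            return f"[{index}]({url})" if url else match.group(0)

        return CITATION.sub(link, text)
    if mode == "CLEAN":
        # Markers are removed entirely: text
        return CITATION.sub("", text)
    raise ValueError(f"unknown citation mode: {mode}")


raw = "Qubits can be in superposition[1]."
sources = {1: "https://example.com/a"}
print(format_citations(raw, "MARKDOWN", sources))
```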
### ConversationConfig
| Parameter | Default | Description |
| ----------------- | ------------- | ------------------ |
| `model` | `Models.BEST` | Default model |
| `citation_mode` | `CLEAN` | Citation format |
| `save_to_library` | `False` | Save to library |
| `search_focus` | `WEB` | Search type |
| `source_focus` | `WEB` | Source types |
| `time_range` | `ALL` | Time filter |
| `language` | `"en-US"` | Response language |
| `timezone` | `None` | Timezone |
| `coordinates` | `None` | Location (lat/lng) |
## Exceptions
| Exception | Description |
| ---------------------------------- | -------------------------------------------------- |
| `PerplexityError` | Base exception for all library errors |
| `HTTPError` | HTTP error with status code and response body |
| `AuthenticationError` | Session token is invalid or expired (HTTP 401/403) |
| `RateLimitError` | Rate limit exceeded (HTTP 429) |
| `FileUploadError` | File upload failed |
| `FileValidationError` | File validation failed (size, type, etc.) |
| `ResearchClarifyingQuestionsError` | Research mode asking clarifying questions |
| `ResponseParsingError` | API response could not be parsed |
| `StreamingError` | Error during streaming response |
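
A common way to use these exceptions is to retry on `RateLimitError` and let everything else propagate. The sketch below is a generic exponential-backoff helper, not something shipped with the library: `call_with_retry` and its parameters are hypothetical, and the exception class is passed in as an argument so the helper itself has no dependency on the package.

```python
import time


def call_with_retry(func, retry_on, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); if it raises `retry_on`, wait and try again.

    The delay doubles after each failed attempt (base_delay, 2x, 4x, ...).
    The last failure is re-raised so the caller can still handle it.
    """
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Assuming the exception classes can be imported from `perplexity_webui_scraper` (the table above does not state the import path), usage might look like `call_with_retry(lambda: conversation.ask("..."), retry_on=RateLimitError)`. Retrying on `AuthenticationError` is unlikely to help, since an invalid or expired token needs to be replaced.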
## MCP Server (Model Context Protocol)
The library includes an MCP server for AI assistants like Claude Desktop and Antigravity.
Each AI model is exposed as a separate tool - enable only the ones you need to reduce agent context size.
### Configuration
Add to your MCP config file (no installation required):
**Claude Desktop** (`~/.config/claude/claude_desktop_config.json`):
```json
{
  "mcpServers": {
    "perplexity-webui-scraper": {
      "command": "uvx",
      "args": [
        "--from",
        "perplexity-webui-scraper[mcp]@latest",
        "perplexity-webui-scraper-mcp"
      ],
      "env": {
        "PERPLEXITY_SESSION_TOKEN": "your_token_here"
      }
    }
  }
}
```
**From GitHub prod branch:**
```json
{
  "mcpServers": {
    "perplexity-webui-scraper": {
      "command": "uvx",
      "args": [
        "--from",
        "perplexity-webui-scraper[mcp]@git+https://github.com/henrique-coder/perplexity-webui-scraper.git@prod",
        "perplexity-webui-scraper-mcp"
      ],
      "env": {
        "PERPLEXITY_SESSION_TOKEN": "your_token_here"
      }
    }
  }
}
```
**From local directory (for development):**
```json
{
  "mcpServers": {
    "perplexity-webui-scraper": {
      "command": "uv",
      "args": [
        "--directory",
        "/absolute/path/to/perplexity-webui-scraper",
        "run",
        "perplexity-webui-scraper-mcp"
      ],
      "env": {
        "PERPLEXITY_SESSION_TOKEN": "your_token_here"
      }
    }
  }
}
```
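
If the config file already lists other MCP servers, the new entry has to be merged into the existing `mcpServers` object rather than replacing it. The following sketch shows one way to do that on an in-memory dict; `add_server` is a hypothetical helper, and reading or writing the actual file is left to the caller.

```python
import json


def add_server(config, token, name="perplexity-webui-scraper"):
    """Return a copy of `config` with the PyPI-based server entry added or replaced.

    Existing entries under "mcpServers" are preserved; the input is not mutated.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {
        "command": "uvx",
        "args": [
            "--from",
            "perplexity-webui-scraper[mcp]@latest",
            "perplexity-webui-scraper-mcp",
        ],
        "env": {"PERPLEXITY_SESSION_TOKEN": token},
    }
    merged["mcpServers"] = servers
    return merged


existing = {"mcpServers": {"other-server": {"command": "example"}}}
print(json.dumps(add_server(existing, "your_token_here"), indent=2))
```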
### Available Tools
Each tool uses a specific AI model. Enable only the ones you need:
| Tool | Model | Description | Tier |
| ------------------------- | -------------------------- | ------------------------------------------------------------------ | ---- |
| `pplx_ask` | Pro | Automatically selects the most responsive model based on the query | pro |
| `pplx_deep_research` | Deep research | Fast and thorough for routine research | pro |
| `pplx_sonar` | Sonar | Perplexity's latest model | pro |
| `pplx_gemini_flash` | Gemini 3 Flash | Google's fast model | pro |
| `pplx_gemini_flash_think` | Gemini 3 Flash Thinking | Google's fast model | pro |
| `pplx_gemini31_pro` | Gemini 3.1 Pro | Google's latest model | pro |
| `pplx_gemini31_pro_think` | Gemini 3.1 Pro Thinking | Google's latest model with thinking | pro |
| `pplx_gpt54` | GPT-5.4 | OpenAI's latest model | pro |
| `pplx_gpt54_thinking` | GPT-5.4 Thinking | OpenAI's latest model with thinking | pro |
| `pplx_claude_s46` | Claude Sonnet 4.6 | Anthropic's fast model | pro |
| `pplx_claude_s46_think` | Claude Sonnet 4.6 Thinking | Anthropic's newest reasoning model | pro |
| `pplx_claude_o46` | Claude Opus 4.6 | Anthropic's most advanced model | max |
| `pplx_claude_o46_think` | Claude Opus 4.6 Thinking | Anthropic's Opus reasoning model with thinking | max |
| `pplx_grok41` | Grok 4.1 | xAI's latest model | pro |
| `pplx_grok41_think` | Grok 4.1 Thinking | xAI's latest model | pro |
| `pplx_kimi_k25_think` | Kimi K2.5 | Moonshot AI's latest model | pro |
**All tools support `source_focus`:** `web`, `academic`, `social`, `finance`, `all`
## Disclaimer
This is an **unofficial** library. It uses internal APIs that may change without notice. Use at your own risk.
By using this library, you agree to Perplexity AI's Terms of Service.