https://github.com/smyja/blackmaria

Python package for web scraping in natural language

gpt-3 nlp openai python webscraping

## Black Maria

### Getting Started
#### Prerequisites
- [Python 3.6+](https://www.python.org/downloads/)

#### Installation
- Export `OPEN_AI_KEY` (your OpenAI API key) as an environment variable
- `pip install blackmaria`
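
Because the key is read from the environment, a quick check before calling the library can fail early with a clear message. This is a minimal sketch that assumes only the variable name `OPEN_AI_KEY` from the step above; the helper `require_env` is hypothetical and not part of blackmaria.

```python
import os


def require_env(name, env=None):
    """Return the value of environment variable `name`, or raise if unset or empty."""
    # `env` is injectable so the check can be exercised without touching os.environ.
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before using blackmaria")
    return value


# Report the state of the key without printing its value.
try:
    require_env("OPEN_AI_KEY")
    print("OPEN_AI_KEY is set")
except RuntimeError as err:
    print(err)
```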

### What is Black Maria?
Black Maria is a Python library for web scraping any webpage using natural language.

### How to use Black Maria?
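Black Maria uses [guardrails](https://github.com/ShreyaR/guardrails) to constrain the LLM's response. A guardrails spec is a set of instructions that tells the LLM what the output should look like; it is passed to `maria.night_crawler` together with the page URL and a natural-language query.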

```python
from blackmaria import maria

# Page to scrape
url = "https://yellowjackets.fandom.com/wiki/F_Sharp"

# Guardrails spec describing the expected output and the prompt
spec = """
Query string here.

@xml_prefix_prompt

{output_schema}

@json_suffix_prompt_v2_wo_none
"""

# Natural-language query listing the fields to extract
query = "provide details about the movie,summary,cast,cast.starring,cast.guest_starring,cast.co-starring"
query_response = maria.night_crawler(url=url, spec=spec, query=query)
print(query_response)
```
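
The `spec` argument is a guardrails RAIL document: XML whose `<output>` element describes the expected structure and whose `<prompt>` element holds the prompt text. The following is a sketch of what such a spec could look like for the fields named in the query. The element layout follows the RAIL format, but the nesting and the `description` attribute are assumptions, not the author's actual spec. The code only checks that the XML is well-formed and lists the declared fields; it does not call blackmaria or an LLM.

```python
import xml.etree.ElementTree as ET

# Hypothetical RAIL spec: field names come from the query, everything else is assumed.
SPEC = """
<rail version="0.1">
<output>
    <object name="movie">
        <string name="summary" description="Summary of the movie" />
        <object name="cast">
            <list name="starring"><string /></list>
            <list name="guest_starring"><string /></list>
            <list name="co-starring"><string /></list>
        </object>
    </object>
</output>
<prompt>
Query string here.

@xml_prefix_prompt

{output_schema}

@json_suffix_prompt_v2_wo_none
</prompt>
</rail>
"""


def declared_fields(spec):
    """Return the dotted path of every named element under <output>."""
    root = ET.fromstring(spec.strip())  # raises ParseError if the XML is malformed
    paths = []

    def walk(node, prefix):
        for child in node:
            name = child.get("name")
            # Unnamed elements (list items) inherit their parent's path.
            path = f"{prefix}.{name}" if prefix and name else (name or prefix)
            if name:
                paths.append(path)
            walk(child, path)

    walk(root.find("output"), "")
    return paths


print(declared_fields(SPEC))
```

Listing the declared paths this way makes it easy to compare the spec against the dotted field names used in the query.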
### Output
```json
{
  "movie": {
    "summary": "As the teens get their bearings among the wreckage, Misty finds hell on earth quite becoming. In the present: revenge, sex homework and the policeman formerly known as Goth.",
    "cast": {
      "starring": [
        "Lottie Matthews",
        "Vanessa Palmer",
        "Misty Quigley",
        "Shauna Sadecki",
        "Natalie Scatorccio",
        "Taissa Turner"
      ],
      "guest_starring": [
        "Akilah",
        "Laura Lee",
        "Mari",
        "Adam Martin",
        "Javi Martinez",
        "Travis Martinez",
        "Jessica Roberts",
        "Jeff Sadecki",
        "Ben Scott",
        "Jackie Taylor"
      ],
      "co-starring": ["Kevyn Tan", "Simone"]
    }
  }
}
```
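
The query addresses nested fields with dotted names such as `cast.starring`, and the same convention can be used to read values back out of the result. This is a minimal sketch, assuming the response has been parsed into a plain `dict` shaped like the JSON above (the return type of `night_crawler` is not stated here). `get_path` is a hypothetical helper, not part of blackmaria.

```python
import json
from functools import reduce

# Abbreviated copy of the output above, used as sample data.
response = json.loads("""
{
  "movie": {
    "cast": {
      "starring": ["Lottie Matthews", "Vanessa Palmer"],
      "co-starring": ["Kevyn Tan", "Simone"]
    }
  }
}
""")


def get_path(data, path, default=None):
    """Follow a dotted path through nested dicts; return `default` if a key is missing."""
    def step(node, key):
        # Stop descending as soon as the current node is not a dict.
        return node.get(key, default) if isinstance(node, dict) else default
    return reduce(step, path.split("."), data)


print(get_path(response, "movie.cast.co-starring"))
print(get_path(response, "movie.cast.directed_by", default=[]))
```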