# The Great GPT Firewall 📛

This collection is a curated list of websites that employ the `robots.txt` file to restrict access to AI Agents, AI crawlers and GPTs.

It will be updated monthly.

We need a plan!

## User agents & robots.txt

The `robots.txt` file lets website owners limit crawler access to all or part of their site by declaring rules for specific user agents. The directives below name the user agents of known AI crawlers and disallow them site-wide.

```txt
# OpenAI's web crawler: GPT3.5, GPT4, ChatGPT
# https://platform.openai.com/docs/bots
User-agent: GPTBot

# ChatGPT plugins
# https://platform.openai.com/docs/bots
User-agent: ChatGPT-User

# OpenAI Search bot
# https://platform.openai.com/docs/bots
User-agent: OAI-SearchBot

# Google's web crawler: Bard, VertexAI, Gemini
# https://blog.google/technology/ai/an-update-on-web-publisher-controls/
User-agent: Google-Extended

# Apple's web crawler, dedicated to GenAI projects
# https://support.apple.com/en-us/119829
User-agent: Applebot-Extended

# Claude
User-agent: anthropic-ai

# Claude Bot
User-agent: ClaudeBot

# Claude web
User-agent: Claude-Web

# Cohere
User-agent: Cohere-ai

# Perplexity
User-agent: PerplexityBot

# Common Crawl
# https://commoncrawl.org/ccbot
User-agent: CCBot

# Omgilibot: webz.io
# https://webz.io/blog/web-data/what-is-the-omgili-bot-and-why-is-it-crawling-your-website/
User-agent: Omgilibot
User-agent: Omgili
User-agent: Webzio-Extended

# Facebook: Llama
# https://developers.facebook.com/docs/sharing/bot/
User-agent: FacebookBot

# ByteDance: Doubao
User-agent: Bytespider

# Censorship area
Disallow: /
```
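
To see how such rules are evaluated, here is a minimal sketch using Python's standard `urllib.robotparser`. It is not the project's `scrape.py`: the `blocked_agents` helper, the `AI_AGENTS` subset and the `SAMPLE` file are hypothetical names introduced for illustration. Note that this particular parser treats a blank line as the end of a record, so the sample lists its user agents on consecutive lines.

```python
from urllib.robotparser import RobotFileParser

# Illustrative subset of the user agents listed above (assumption: a real
# scan would check the full list).
AI_AGENTS = ("GPTBot", "ClaudeBot", "CCBot", "PerplexityBot")


def blocked_agents(robots_txt: str, agents=AI_AGENTS, path: str = "/") -> list[str]:
    """Return the agents that robots_txt forbids from fetching path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, path)]


# Hypothetical robots.txt: two AI crawlers are disallowed site-wide,
# every other user agent is only kept out of /private/.
SAMPLE = """\
User-agent: GPTBot
User-agent: CCBot
Disallow: /

User-agent: *
Disallow: /private/
"""

print(blocked_agents(SAMPLE))  # ['GPTBot', 'CCBot']
```

A site could then be counted as blocked when this list is non-empty, though the exact criterion is a design choice and may differ from the one used to build the tables in this document.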

## Disclaimer

Please note that this blocklist is intended for informational purposes only. Despite the provocative project name, it is perfectly fine to disallow web crawling and protect content ownership.

## 2024-05 update

### Category: Press

- Scanned: 66
- ✅ Passing: 38 %
- 🔐 Blocked: 62 %
- ❓ Unknown: 0 %

| Name | Country | Status |
| --------------------------------------------------------------- | ------- | ------ |
| [The Times](https://www.thetimes.co.uk) | 🇬🇧 | 🔐 |
| [BBC](https://www.bbc.com) | 🇬🇧 | 🔐 |
| [The Guardian](https://www.theguardian.com) | 🇬🇧 | 🔐 |
| [The Economist](https://www.economist.com) | 🇬🇧 | 🔐 |
| [Financial Times](https://www.ft.com) | 🇬🇧 | 🔐 |
| [The Independent](https://www.independent.co.uk) | 🇬🇧 | ✅ |
| [The Telegraph](https://www.telegraph.co.uk) | 🇬🇧 | 🔐 |
| [Daily Mail](https://www.dailymail.co.uk) | 🇬🇧 | 🔐 |
| [The Sun](https://www.thesun.co.uk) | 🇬🇧 | 🔐 |
| [Daily Mirror](https://www.mirror.co.uk) | 🇬🇧 | 🔐 |
| [Daily Express](https://www.express.co.uk) | 🇬🇧 | 🔐 |
| [Washington Post](https://www.washingtonpost.com) | 🇺🇸 | 🔐 |
| [USA Today](https://www.usatoday.com) | 🇺🇸 | ✅ |
| [Fox News](https://www.foxnews.com) | 🇺🇸 | ✅ |
| [ABC News](https://abcnews.go.com) | 🇺🇸 | 🔐 |
| [NBC News](https://www.nbcnews.com) | 🇺🇸 | 🔐 |
| [CBS News](https://www.cbsnews.com) | 🇺🇸 | 🔐 |
| [Los Angeles Times](https://www.latimes.com) | 🇺🇸 | 🔐 |
| [Chicago Tribune](https://www.chicagotribune.com) | 🇺🇸 | ✅ |
| [New York Post](https://nypost.com) | 🇺🇸 | 🔐 |
| [New York Daily News](https://www.nydailynews.com) | 🇺🇸 | ✅ |
| [The New Yorker](https://www.newyorker.com) | 🇺🇸 | 🔐 |
| [Vice](https://www.vice.com) | 🇺🇸 | ✅ |
| [New York Times](https://www.nytimes.com) | 🇺🇸 | 🔐 |
| [Wall Street Journal](https://www.wsj.com) | 🇺🇸 | 🔐 |
| [CNN](https://cnn.com) | 🇺🇸 | 🔐 |
| [El País](https://elpais.com) | 🇪🇸 | ✅ |
| [Süddeutsche Zeitung](https://www.sueddeutsche.de) | 🇩🇪 | 🔐 |
| [Der Spiegel](https://www.spiegel.de) | 🇩🇪 | 🔐 |
| [Corriere della Sera](https://www.corriere.it) | 🇮🇹 | 🔐 |
| [La Repubblica](https://www.repubblica.it) | 🇮🇹 | 🔐 |
| [Le Monde](https://www.lemonde.fr) | 🇫🇷 | 🔐 |
| [Libération](https://www.liberation.fr) | 🇫🇷 | 🔐 |
| [Le Figaro](https://www.lefigaro.fr) | 🇫🇷 | 🔐 |
| [20 Minutes](https://www.20minutes.fr) | 🇫🇷 | 🔐 |
| [Ouest France](https://www.ouest-france.fr) | 🇫🇷 | 🔐 |
| [Le Parisien](https://www.leparisien.fr) | 🇫🇷 | 🔐 |
| [L'Equipe](https://www.lequipe.fr) | 🇫🇷 | 🔐 |
| [Le Point](https://www.lepoint.fr) | 🇫🇷 | 🔐 |
| [Marianne](https://www.marianne.net) | 🇫🇷 | 🔐 |
| [Le Nouvel Observateur](https://www.nouvelobs.com) | 🇫🇷 | 🔐 |
| [L'Express](https://www.lexpress.fr) | 🇫🇷 | 🔐 |
| [France 24](https://www.france24.com) | 🇫🇷 | 🔐 |
| [BFMTV](https://www.bfmtv.com) | 🇫🇷 | 🔐 |
| [CNews](https://www.cnews.fr) | 🇫🇷 | ✅ |
| [Le Monde Diplomatique](https://www.monde-diplomatique.fr) | 🇫🇷 | ✅ |
| [Mediapart](https://www.mediapart.fr) | 🇫🇷 | 🔐 |
| [Courrier International](https://www.courrierinternational.com) | 🇫🇷 | 🔐 |
| [Brut](https://www.brut.media) | 🇫🇷 | ✅ |
| [IMDB](https://www.imdb.com) | 🌍 | ✅ |
| [Allocine](https://www.allocine.fr) | 🇫🇷 | ✅ |
| [Fakt](https://fakt.pl) | 🇵🇱 | ✅ |
| [Super Express](https://www.se.pl) | 🇵🇱 | ✅ |
| [Gazeta Wyborcza](https://wyborcza.pl) | 🇵🇱 | 🔐 |
| [Rzeczpospolita](https://www.rp.pl) | 🇵🇱 | ✅ |
| [Dziennik Gazeta Prawna](https://www.gazetaprawna.pl) | 🇵🇱 | ✅ |
| [Polityka](https://www.polityka.pl) | 🇵🇱 | ✅ |
| [Newsweek Polska](https://www.newsweek.pl) | 🇵🇱 | ✅ |
| [Gość Niedzielny](https://www.gosc.pl) | 🇵🇱 | ✅ |
| [Sieci](https://www.sieciprawdy.pl) | 🇵🇱 | ✅ |
| [Do Rzeczy](https://dorzeczy.pl) | 🇵🇱 | ✅ |
| [Twój Styl](https://twojstyl.pl) | 🇵🇱 | ✅ |
| [Zwierciadło](https://zwierciadlo.pl) | 🇵🇱 | ✅ |
| [Wysokie Obcasy Extra](https://www.wysokieobcasy.pl) | 🇵🇱 | 🔐 |
| [Pani](https://pani.pl) | 🇵🇱 | ✅ |
| [Elle](https://www.elle.pl) | 🇵🇱 | ✅ |

### Category: Video on demand

- Scanned: 9
- ✅ Passing: 56 %
- 🔐 Blocked: 44 %
- ❓ Unknown: 0 %

| Name | Country | Status |
| ----------------------------------------- | ------- | ------ |
| [Prime Video](https://www.primevideo.com) | 🌍 | ✅ |
| [Netflix](https://www.netflix.com) | 🌍 | ✅ |
| [Disney+](https://www.disneyplus.com) | 🌍 | 🔐 |
| [Hulu](https://www.hulu.com) | 🇺🇸 | 🔐 |
| [HBO Max](https://www.max.com) | 🇺🇸 | ✅ |
| [Canal+](https://www.canalplus.com) | 🇫🇷 | 🔐 |
| [FranceTV](https://www.france.tv) | 🇫🇷 | ✅ |
| [TF1](https://www.tf1.fr) | 🇫🇷 | 🔐 |
| [6Play](https://www.6play.fr) | 🇫🇷 | ✅ |

### Category: Music

- Scanned: 6
- ✅ Passing: 67 %
- 🔐 Blocked: 33 %
- ❓ Unknown: 0 %

| Name | Country | Status |
| -------------------------------------- | ------- | ------ |
| [SoundCloud](https://soundcloud.com) | 🌍 | 🔐 |
| [YouTube](https://www.youtube.com) | 🌍 | ✅ |
| [Apple Music](https://music.apple.com) | 🌍 | ✅ |
| [Spotify](https://open.spotify.com) | 🌍 | 🔐 |
| [Deezer](https://www.deezer.com) | 🇫🇷 | ✅ |
| [LastFM](https://www.last.fm) | 🇬🇧 | ✅ |

### Category: Podcast

- Scanned: 8
- ✅ Passing: 75 %
- 🔐 Blocked: 25 %
- ❓ Unknown: 0 %

| Name | Country | Status |
| --------------------------------------------------- | ------- | ------ |
| [Google Podcasts](https://play.google.com) | 🌍 | ✅ |
| [Apple Podcast](https://podcasts.apple.com) | 🌍 | ✅ |
| [Spotify Podcaster](https://podcasters.spotify.com) | 🌍 | 🔐 |
| [Buzzsprout](https://www.buzzsprout.com) | 🌍 | ✅ |
| [Podbean](https://watchifyoudare.podbean.com) | 🌍 | ✅ |
| [Acast](https://www.acast.com) | 🇬🇧 | ✅ |
| [AudioMeans](https://podcasts.audiomeans.fr) | 🇫🇷 | ✅ |
| [Radio France](https://www.radiofrance.fr) | 🇫🇷 | 🔐 |

### Category: X

- Scanned: 6
- ✅ Passing: 67 %
- 🔐 Blocked: 33 %
- ❓ Unknown: 0 %

| Name | Country | Status |
| ---------------------------------- | ------- | ------ |
| [PornHub](https://www.pornhub.com) | 🌍 | 🔐 |
| [YouPorn](https://www.youporn.com) | 🌍 | 🔐 |
| [Xnxx](https://www.xnxx.com) | 🌍 | ✅ |
| [Xvideos](https://www.xvideos.com) | 🌍 | ✅ |
| [Xhamster](https://xhamster.com) | 🌍 | ✅ |
| [OnlyFans](https://onlyfans.com) | 🌍 | ✅ |

### Category: Religion

- Scanned: 5
- ✅ Passing: 100 %
- 🔐 Blocked: 0 %
- ❓ Unknown: 0 %

| Name | Country | Status |
| --------------------------------------------- | ------- | ------ |
| [Bible](https://www.bible.com) | 🇺🇸 | ✅ |
| [Bible Gateway](https://www.biblegateway.com) | 🇺🇸 | ✅ |
| [Jehovah's Witnesses](https://jw.org) | 🇺🇸 | ✅ |
| [Vatican](https://www.vatican.va) | 🇻🇦 | ✅ |
| [Islamweb](https://www.islamweb.net) | 🌍 | ✅ |

### Category: Social media

- Scanned: 13
- ✅ Passing: 31 %
- 🔐 Blocked: 62 %
- ❓ Unknown: 8 %

| Name | Country | Status |
| ---------------------------------------------------- | ------- | ------ |
| [Facebook](https://www.facebook.com) | 🌍 | 🔐 |
| [Instagram](https://www.instagram.com) | 🌍 | 🔐 |
| [Reddit](https://www.reddit.com) | 🌍 | ✅ |
| [Hacker News](https://news.ycombinator.com/) | 🌍 | ❓ |
| [Lobsters](https://lobste.rs) | 🌍 | 🔐 |
| [Pinterest](https://www.pinterest.com) | 🌍 | 🔐 |
| [TikTok](https://www.tiktok.com) | 🌍 | ✅ |
| [Twitter](https://twitter.com) | 🌍 | 🔐 |
| [LinkedIn](https://linkedin.com) | 🌍 | ✅ |
| [Quora](https://quora.com) | 🌍 | 🔐 |
| [VK](https://vk.com) | 🇷🇺 | ✅ |
| [TripAdvisor](https://www.tripadvisor.com) | 🌍 | 🔐 |
| [Yelp](https://www.yelp.com) | 🌍 | 🔐 |

### Category: Artist

- Scanned: 42
- ✅ Passing: 76 %
- 🔐 Blocked: 19 %
- ❓ Unknown: 5 %

| Name | Country | Status |
| -------------------------------------------------------------------- | ------- | ------ |
| [Michael Jackson](https://www.michaeljackson.com) | 🇺🇸 | ✅ |
| [Madonna](https://www.madonna.com) | 🇺🇸 | ✅ |
| [Taylor Swift](https://www.taylorswift.com) | 🇺🇸 | 🔐 |
| [Rihanna](https://www.rihanna.com) | 🇺🇸 | ✅ |
| [Bruno Mars](https://www.brunomars.com) | 🇺🇸 | ✅ |
| [Justin Bieber](https://www.justinbiebermusic.com) | 🇺🇸 | 🔐 |
| [Beyoncé](https://www.beyonce.com) | 🇺🇸 | ✅ |
| [Katy Perry](https://www.katyperry.com) | 🇺🇸 | 🔐 |
| [Lady Gaga](https://www.ladygaga.com) | 🇺🇸 | 🔐 |
| [Hardwell](https://www.djhardwell.com) | 🇺🇸 | ✅ |
| [Dimitri Vegas & Like Mike](https://www.dimitrivegasandlikemike.com) | 🇺🇸 | ✅ |
| [Kanye West](https://www.kanyewest.com) | 🇺🇸 | ❓ |
| [Black Eyed Peas](https://www.blackeyedpeas.com) | 🇺🇸 | ✅ |
| [Imagine Dragons](https://www.imaginedragonsmusic.com) | 🇺🇸 | ✅ |
| [Twenty One Pilots](https://www.twentyonepilots.com) | 🇺🇸 | ✅ |
| [Maroon 5](https://www.maroon5.com) | 🇺🇸 | 🔐 |
| [Selena Gomez](https://www.selenagomez.com) | 🇺🇸 | 🔐 |
| [Usher](https://www.usherworld.com) | 🇺🇸 | 🔐 |
| [Stromae](https://www.stromae.com) | 🇧🇪 | ✅ |
| [Aya Nakamura](https://www.ayanakamura.fr) | 🇫🇷 | ❓ |
| [Soprano](https://www.soprano-lesite.fr) | 🇫🇷 | ✅ |
| [Johnny Hallyday](https://www.johnnyhallyday.com) | 🇫🇷 | ✅ |
| [Grand Corps Malade](https://www.grandcorpsmalade.fr) | 🇫🇷 | ✅ |
| [Zaho](https://zahomusic.com) | 🇫🇷 | ✅ |
| [Jean Louis Aubert](https://www.jeanlouisaubert.com) | 🇫🇷 | ✅ |
| [Camelia Jordana](https://www.cameliajordana.com) | 🇫🇷 | ✅ |
| [Indochine](https://indo.fr) | 🇫🇷 | ✅ |
| [Tryo](https://tryo.com) | 🇫🇷 | ✅ |
| [David Guetta](https://davidguetta.com) | 🇫🇷 | ✅ |
| [Mc Solaar](https://www.mcsolaar.com) | 🇫🇷 | ✅ |
| [Zaz](https://www.zazofficial.com) | 🇫🇷 | ✅ |
| [Christine and the Queens](https://www.christineandthequeens.com) | 🇫🇷 | ✅ |
| [Boulevard des Airs](https://bda-boulevarddesairs.com) | 🇫🇷 | ✅ |
| [Calogero](https://calogero.fr) | 🇫🇷 | ✅ |
| [Hoshi](https://www.hoshimusic-store.com) | 🇫🇷 | ✅ |
| [Avicii](https://avicii.com) | 🇸🇪 | ✅ |
| [Adele](https://www.adele.com) | 🇬🇧 | ✅ |
| [Calvin Harris](https://calvinharris.com) | 🇬🇧 | ✅ |
| [Ed Sheeran](https://www.edsheeran.com) | 🇬🇧 | ✅ |
| [Arctic Monkeys](https://arcticmonkeys.com) | 🇬🇧 | ✅ |
| [Coldplay](https://www.coldplay.com) | 🇬🇧 | ✅ |
| [The Weeknd](https://www.theweeknd.com) | 🇨🇦 | 🔐 |

### Category: Gov

- Scanned: 3
- ✅ Passing: 100 %
- 🔐 Blocked: 0 %
- ❓ Unknown: 0 %

| Name | Country | Status |
| ----------------------------------------- | ------- | ------ |
| [White House](https://www.whitehouse.gov) | 🇺🇸 | ✅ |
| [Elysée](https://www.elysee.fr) | 🇫🇷 | ✅ |
| [Europe](https://www.europa.eu) | 🇪🇺 | ✅ |

### Category: Science

- Scanned: 28
- ✅ Passing: 82 %
- 🔐 Blocked: 18 %
- ❓ Unknown: 0 %

| Name | Country | Status |
| ------------------------------------------------------- | ------- | ------ |
| [Google Scholar](https://scholar.google.com) | 🌍 | ✅ |
| [Sci-Hub](https://sci-hub.se) | 🌍 | ✅ |
| [PubPeer](https://pubpeer.com) | 🌍 | ✅ |
| [Scopus](https://www.scopus.com) | 🇳🇱 | 🔐 |
| [Elsevier](https://www.elsevier.com) | 🇳🇱 | 🔐 |
| [ScienceDirect](https://www.sciencedirect.com) | 🇳🇱 | 🔐 |
| [MDPI](https://www.mdpi.com) | 🇨🇭 | ✅ |
| [Springer](https://www.springer.com) | 🇩🇪 | ✅ |
| [Wiley](https://www.wiley.com) | 🇺🇸 | ✅ |
| [American Chemical Society](https://www.acs.org) | 🇺🇸 | ✅ |
| [PubMed](https://pubmed.ncbi.nlm.nih.gov) | 🇺🇸 | ✅ |
| [Academia](https://www.academia.edu) | 🇺🇸 | ✅ |
| [Science](https://www.science.org) | 🇺🇸 | 🔐 |
| [ArXiv](https://arxiv.org) | 🇺🇸 | ✅ |
| [American Physical Society](https://www.aps.org) | 🇺🇸 | ✅ |
| [Mendeley](https://www.mendeley.com) | 🇬🇧 | ✅ |
| [Nature](https://www.nature.com) | 🇬🇧 | 🔐 |
| [Taylor & Francis](https://www.taylorandfrancis.com) | 🇬🇧 | ✅ |
| [Oxford University Press](https://www.oup.com) | 🇬🇧 | ✅ |
| [Cambridge University Press](https://www.cambridge.org) | 🇬🇧 | ✅ |
| [Royal Society of Chemistry](https://www.rsc.org) | 🇬🇧 | ✅ |
| [ResearchGate](https://www.researchgate.net) | 🇩🇪 | ✅ |
| [BNF](https://www.bnf.fr) | 🇫🇷 | ✅ |
| [Cairn](https://www.cairn.info) | 🇫🇷 | ✅ |
| [Persee](https://www.persee.fr) | 🇫🇷 | ✅ |
| [Gallica](https://gallica.bnf.fr) | 🇫🇷 | ✅ |
| [HAL](https://hal.archives-ouvertes.fr) | 🇫🇷 | ✅ |
| [OpenEdition](https://www.openedition.org) | 🇫🇷 | ✅ |

### Category: Dev

- Scanned: 3
- ✅ Passing: 67 %
- 🔐 Blocked: 33 %
- ❓ Unknown: 0 %

| Name | Country | Status |
| ------------------------------------------- | ------- | ------ |
| [GitHub](https://github.com) | 🌍 | ✅ |
| [GitLab](https://gitlab.com) | 🌍 | ✅ |
| [Stack Overflow](https://stackoverflow.com) | 🌍 | 🔐 |

### Category: Other content

- Scanned: 19
- ✅ Passing: 74 %
- 🔐 Blocked: 26 %
- ❓ Unknown: 0 %

| Name | Country | Status |
| -------------------------------------------- | ------- | ------ |
| [Wikipedia](https://www.wikipedia.org) | 🌍 | ✅ |
| [Medium](https://medium.com) | 🌍 | 🔐 |
| [Substack](https://www.substack.com) | 🌍 | ✅ |
| [Common Crawl](https://commoncrawl.org) | 🌍 | ✅ |
| [Internet Archive](https://archive.org) | 🌍 | ✅ |
| [Wayback Machine](https://web.archive.org) | 🌍 | ✅ |
| [Notion](https://www.notion.so) | 🌍 | ✅ |
| [Weather](https://www.weather.com) | 🇺🇸 | 🔐 |
| [AccuWeather](https://www.accuweather.com) | 🇺🇸 | ✅ |
| [Météo France](https://www.meteofrance.com) | 🇫🇷 | ✅ |
| [Getty Images](https://www.gettyimages.com) | 🇺🇸 | ✅ |
| [Shutterstock](https://www.shutterstock.com) | 🇺🇸 | 🔐 |
| [Adobe Stock](https://stock.adobe.com) | 🇺🇸 | 🔐 |
| [Unsplash](https://unsplash.com) | 🇨🇦 | 🔐 |
| [Pexels](https://www.pexels.com) | 🇩🇪 | ✅ |
| [Pixabay](https://www.pixabay.com) | 🇩🇪 | ✅ |
| [Flickr](https://www.flickr.com) | 🇺🇸 | ✅ |
| [500px](https://500px.com) | 🇨🇦 | ✅ |
| [Giphy](https://giphy.com) | 🇺🇸 | ✅ |

### Category: Other

- Scanned: 1
- ✅ Passing: 100 %
- 🔐 Blocked: 0 %
- ❓ Unknown: 0 %

| Name | Country | Status |
| -------------------------------- | ------- | ------ |
| [Indeed](https://www.indeed.com) | 🇺🇸 | ✅ |
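
The per-category summaries above are consistent with each status count being divided by the number of scanned sites and rounded to the nearest whole percent. The sketch below reproduces the Social media figures under that assumption; the `summarize` helper and the status labels are hypothetical and not taken from `scrape.py`.

```python
from collections import Counter


def summarize(statuses: list[str]) -> dict[str, int]:
    """Return the rounded percentage of each status among the scanned sites.

    Assumes a non-empty list; an empty category would divide by zero.
    """
    counts = Counter(statuses)
    total = len(statuses)
    return {
        label: round(100 * counts[label] / total)
        for label in ("passing", "blocked", "unknown")
    }


# Social media category: 4 passing, 8 blocked, 1 unknown out of 13 scanned.
social = ["passing"] * 4 + ["blocked"] * 8 + ["unknown"]
print(summarize(social))  # {'passing': 31, 'blocked': 62, 'unknown': 8}
```

Because each share is rounded independently, the three figures can add up to 99 or 101 rather than exactly 100, as they do for Social media.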

## WTF list

A.k.a. do they understand their business model? 💸

| Name | Status |
| ------------------------------------------- | ------ |
| [Getty Images](https://www.gettyimages.com) | ✅ |
| [Pexels](https://www.pexels.com) | ✅ |
| [500px](https://500px.com) | ✅ |

## Shame list

A.k.a. this is public interest. 🖕

| Name | Status |
| ---------------------------------------------- | ------ |
| [Medium](https://medium.com) | 🔐 |
| [Quora](https://quora.com) | 🔐 |
| [Elsevier](https://www.elsevier.com) | 🔐 |
| [Scopus](https://www.scopus.com) | 🔐 |
| [Science](https://www.science.org) | 🔐 |
| [ScienceDirect](https://www.sciencedirect.com) | 🔐 |
| [Nature](https://www.nature.com) | 🔐 |

## 🤝 Contributing

Looking for contributions:
- Enrich website database
- Chinese websites
- New categories

Please open issues!

- Ping me on Twitter [@samuelberthe](https://twitter.com/samuelberthe) (DMs, mentions, whatever :))
- Fork the [project](https://github.com/samber/the-great-gpt-firewall)
- Fix [open issues](https://github.com/samber/the-great-gpt-firewall/issues) or request new features

Don't hesitate ;)

### Build

```bash
# create and activate an isolated environment
python3 -m venv venv
source ./venv/bin/activate
# install dependencies and run the scan
pip3 install -r requirements.txt
python3 scrape.py
# then copy the latest results into the README
```

## 👤 Contributors

![Contributors](https://contrib.rocks/image?repo=samber/the-great-gpt-firewall)

## 💫 Show your support

Give a ⭐️ if this project helped you!

[![GitHub Sponsors](https://img.shields.io/github/sponsors/samber?style=for-the-badge)](https://github.com/sponsors/samber)

## 📝 License

Copyright © 2024 [Samuel Berthe](https://github.com/samber).

This project is [MIT](./LICENSE) licensed.