Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/tgxn/lemmy-explorer

Instance and Community Explorer for Lemmy

fediverse lemmy nodejs reactjs social-crawler social-media-analysis

Host: GitHub
URL: https://github.com/tgxn/lemmy-explorer
Owner: tgxn
Created: 2023-06-12T05:06:26.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2023-12-31T03:49:11.000Z (6 months ago)
Last Synced: 2024-04-15T02:01:02.428Z (3 months ago)
Topics: fediverse, lemmy, nodejs, reactjs, social-crawler, social-media-analysis
Language: JavaScript
Homepage: https://lemmyverse.net/
Size: 29.8 MB
Stars: 100
Watchers: 2
Forks: 8
Open Issues: 13
Metadata Files:
- Readme: README.md

Lists

- awesome-lemmy: lemmy-explorer (Projects / Tools)

README

[![publish-pages](https://github.com/tgxn/lemmy-explorer/actions/workflows/publish-pages.yaml/badge.svg)](https://github.com/tgxn/lemmy-explorer/actions/workflows/publish-pages.yaml)

# Lemmy Explorer https://lemmyverse.net/
Data Dumps: https://data.lemmyverse.net/

This project provides a simple way to explore Lemmy Instances and Communities.

![List of Communities](./docs/images/communities.png)

The project consists of four modules:
1. Crawler (NodeJS, Redis) `/crawler`
2. Frontend (ReactJS, MUI Joy, TanStack) `/frontend`
3. Deploy (Amazon CDK v2) `/cdk`
4. Data Site (GitHub Pages) `/pages`

## FAQ

### Q: **How can I set a link to automatically set the home instance?**

You can append `home_url` and (optionally) `home_type` to the URL to set the home instance and type.

`?home_url=lemmy.example.com`
`?home_url=kbin.example.com&home_type=kbin`

> `home_type` supports "lemmy" and "kbin" (default is "lemmy")
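
As an illustration, a link with these parameters could be built like this. This is a minimal sketch: `buildHomeLink` is a hypothetical helper and not part of the project; only the `home_url` and `home_type` parameter names come from the text above.

```javascript
// Build a Lemmy Explorer link that preselects a home instance.
// `buildHomeLink` is a hypothetical helper written for this example.
function buildHomeLink(homeUrl, homeType) {
  const link = new URL("https://lemmyverse.net/");
  link.searchParams.set("home_url", homeUrl);
  // `home_type` is optional; the site treats a missing value as "lemmy".
  if (homeType) {
    link.searchParams.set("home_type", homeType);
  }
  return link.toString();
}

console.log(buildHomeLink("lemmy.example.com"));
console.log(buildHomeLink("kbin.example.com", "kbin"));
```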

### Q: **How does discovery work?**
It uses a [seed list of communities](https://github.com/tgxn/lemmy-explorer/blob/main/crawler/src/lib/const.js#L47) and scans the equivalent of the `/instances` federation lists, and then creates jobs to scan each of those servers.

Additionally, instance tags and trust data are fetched from [Fediseer](https://gui.fediseer.com/).
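
The general idea of seed-based discovery can be sketched as a breadth-first walk over federation lists. This is not the crawler's actual code: `discoverInstances` and `getFederatedInstances` are hypothetical names, and the federation data below is made up so the example runs offline.

```javascript
// Breadth-first discovery over federation lists, starting from a seed list.
// `getFederatedInstances` is a hypothetical stand-in for whatever request a
// real crawler would make to fetch one server's list of federated instances.
async function discoverInstances(seeds, getFederatedInstances) {
  const seen = new Set(seeds);
  const queue = [...seeds]; // each entry represents one pending scan job

  while (queue.length > 0) {
    const host = queue.shift();
    const linked = await getFederatedInstances(host);
    for (const peer of linked) {
      if (!seen.has(peer)) {
        seen.add(peer);
        queue.push(peer); // newly seen server: schedule a scan for it
      }
    }
  }
  return [...seen];
}

// Made-up federation data, for illustration only.
const fakeFederation = {
  "a.example": ["b.example", "c.example"],
  "b.example": ["a.example"],
  "c.example": ["d.example"],
  "d.example": [],
};

const lookup = async (host) => fakeFederation[host] ?? [];
const discoveryDone = discoverInstances(["a.example"], lookup).then((found) => {
  console.log(found);
  return found;
});
```

A server that no known instance federates with never enters the queue, so a walk like this cannot find it.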

### Q: **How does the NSFW filter work?**
The NSFW filter is a client-side filter that filters out NSFW communities and instances from results by default.
The "NSFW Toggle" checkbox has three states that you can toggle through:

| State | Filter | Value |
| --- | --- | --- |
| Default | Hide NSFW | false |
| One Click | Include NSFW | null |
| Two Clicks | NSFW Only | true |

When you try to switch to a non-SFW state, a popup will appear to confirm your choice. You can save your response in your browser's cache and it will be remembered.
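
A tri-state filter like the one in the table above could look roughly like this. It is a sketch, not the frontend's implementation: the `nsfw` boolean field on each item, the function names, and the wrap-around back to the default state after "NSFW Only" are all assumptions.

```javascript
// Filter values taken from the table above, in click order.
const NSFW_STATES = [false, null, true];

// Advance to the next state; wrapping back to the default is an assumption.
function nextNsfwState(current) {
  const index = NSFW_STATES.indexOf(current);
  return NSFW_STATES[(index + 1) % NSFW_STATES.length];
}

// `nsfw` is an assumed boolean field on each community or instance.
function applyNsfwFilter(items, state) {
  if (state === null) return items; // "Include NSFW": no filtering
  // false keeps only SFW items, true keeps only NSFW items
  return items.filter((item) => item.nsfw === state);
}

const sample = [
  { name: "cats", nsfw: false },
  { name: "adult", nsfw: true },
];
console.log(applyNsfwFilter(sample, false).map((item) => item.name));
console.log(applyNsfwFilter(sample, nextNsfwState(false)).map((item) => item.name));
```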

### Q: **How long till my instance shows up?**
How long it takes to discover a new instance varies. It depends on whether content you post is picked up by one of the servers the crawler already scans.

Since the crawler looks at lists of federated instances, we can't discover instances that aren't on those lists.

Additionally, the lists are cached for 24 hours, so it can take up to 24 hours for an instance to show up after it's been discovered.
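
To make the 24-hour delay concrete, a time-based cache check might look like the sketch below. `CACHE_TTL_MS` and `isStale` are hypothetical names; how the crawler actually tracks cache age is not described here.

```javascript
// 24 hours expressed in milliseconds.
const CACHE_TTL_MS = 24 * 60 * 60 * 1000;

// Returns true once a cached list is at least 24 hours old and should be refetched.
// `lastFetchedAt` and `now` are millisecond timestamps, as returned by Date.now().
function isStale(lastFetchedAt, now = Date.now()) {
  return now - lastFetchedAt >= CACHE_TTL_MS;
}

const fetchedAt = Date.now() - 2 * 60 * 60 * 1000; // fetched two hours ago
console.log(isStale(fetchedAt)); // the cached copy is still served for another 22 hours
```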

### Q: **Can I use your data in my app/website/project?**
I do not own any of the data retrieved by the crawler; it is available from public endpoints on the source instances.

You are free to pull data from the GitHub pages site:

[**Lemmyverse Data Site**](https://data.lemmyverse.net/)

**Please don't hotlink the files on the public website `https://lemmyverse.net/`**

### Q: **How often is the data updated?**

Currently, I upload a Redis dump generated by the crawler to S3 each night, and GitHub Actions builds the JSON dump from that.

Data is also available from the artifacts of [this action](https://github.com/tgxn/lemmy-explorer/actions/workflows/publish-pages.yaml).
You can also download the [latest ZIP](https://nightly.link/tgxn/lemmy-explorer/workflows/publish-pages.yaml/main/dist-json-bundle.zip) _(using nightly.link)_.

The `dist-json-bundle.zip` file contains the data in JSON format:

- `communities.full.json` - list of all communities
- `instances.full.json` - list of all instances
- `overview.json` - metadata and counts
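
Once the ZIP is extracted, the three files can be read with plain Node.js. This is a minimal sketch: `loadBundle` is a hypothetical helper, the extraction directory is up to you, and the schema of each file is not documented here, so the example only assumes that the two `.full.json` files are JSON arrays.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Read the three JSON files from a directory where the ZIP was extracted.
// `loadBundle` is a hypothetical helper written for this example.
function loadBundle(dir) {
  const read = (name) =>
    JSON.parse(fs.readFileSync(path.join(dir, name), "utf8"));
  return {
    communities: read("communities.full.json"), // assumed to be an array
    instances: read("instances.full.json"), // assumed to be an array
    overview: read("overview.json"),
  };
}

// Usage, assuming the ZIP was extracted to ./dist-json-bundle:
// const bundle = loadBundle("./dist-json-bundle");
// console.log(bundle.communities.length, bundle.instances.length);
```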

## Crawler
[Crawler README](./crawler/README.md)

## Frontend
[Frontend README](./frontend/README.md)

## Data Site
[Data Site README](./pages/README.md)

## Deploy

The deploy is an Amazon CDK v2 project that deploys the crawler and frontend to AWS.

`config.example.json` shows the configuration used for the deploy.

Once the configuration is in place, run `cdk deploy --all` to deploy the frontend to AWS.

## Similar Sites

- https://browse.feddit.de/
- https://join-lemmy.org/instances
- https://github.com/maltfield/awesome-lemmy-instances
- https://lemmymap.feddit.de/
- https://browse.toast.ooo/
- https://lemmyfind.quex.cc/

## Lemmy Stats Pages
- https://lemmy.fediverse.observer/dailystats
- https://the-federation.info/platform/73
- https://fedidb.org/software/lemmy
- https://fedidb.org/current-events/threadiverse

## Thanks / Related Lemmy Tools

- https://github.com/db0/fediseer
- https://github.com/LemmyNet/lemmy-stats-crawler

## Credits

Logo made by Andy Cuccaro (@andycuccaro) under the CC-BY-SA 4.0 license.