https://github.com/kxxoling/anime_spiders


anime django python scrapy vuejs2 web-crawler


======================
Anime Related Crawlers
======================

A collection of anime-related crawlers written for personal use.

Supported sites:

- Image crawler for danbooru.donmai.us, deviantart.com
- File crawler for sakugabooru.com
- Torrent crawler for nyaa.si, share.dmhy.org, acg.rip, bangumi.moe
- Anime information crawler for bangumi.tv

Development
===========

Structure
---------

.. code-block::

    .
    ├── Pipfile             # Python package management
    ├── README.rst
    ├── Pipfile.lock
    ├── scrapy.cfg          # scrapy config file
    ├── anime_spiders       # Spiders
    ├── manage.py           # Django manage.py
    ├── exhibition          # Django backend application
    ├── db.sqlite3
    ├── package.json        # Frontend package management
    ├── package-lock.json
    ├── node_modules        # Frontend dependencies
    ├── index.html          # index.html of frontend
    ├── src                 # Frontend application source
    ├── build               # Frontend build
    ├── config              # Frontend code build configs
    ├── dist                # Distribution code of frontend
    └── static              # Frontend related static files

Installation & Running
----------------------

* Run frontend: ``npm run dev``;
* Run backend: ``./manage.py runserver``;
* Run a spider:

  * Start an Elasticsearch server at ``192.168.2.10``;
  * Install requirements: ``pipenv install && pipenv install --dev && pipenv shell``;
  * Run the spider: ``scrapy crawl [spider_name]``.
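
The spider steps above assume an Elasticsearch server is already up before the crawl starts. The following is a minimal sketch of a pre-flight check, not part of this repository. It assumes the server listens on Elasticsearch's default HTTP port ``9200`` (the README gives only the address), and it only confirms that something accepts TCP connections there.

```python
import socket

ES_HOST = "192.168.2.10"  # address taken from the README
ES_PORT = 9200  # assumption: Elasticsearch's default HTTP port


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unroutable hosts.
        return False


if __name__ == "__main__":
    if is_reachable(ES_HOST, ES_PORT):
        print(f"Something is listening on {ES_HOST}:{ES_PORT}")
    else:
        print(f"Nothing is listening on {ES_HOST}:{ES_PORT}; start the server first")
```

Running this before ``scrapy crawl`` gives a quick answer about connectivity without waiting for a spider to fail partway through a crawl.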

Clean code
----------

Use ``yapf`` to format Python code::

    yapf -irp -e "./.venv/**" -e "**/migrations/**" **/**.py

Usage
=====

Terminal
--------

Only Scrapy commands are supported for now.
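
Since the command line is the only entry point, a script that drives several crawls has to shell out to ``scrapy``. The sketch below shows one way to do that, under these assumptions: it runs from the project root (where ``scrapy.cfg`` lives), ``crawl_command`` and ``run_crawl`` are hypothetical helper names, and the spider name and argument in the usage comment are placeholders. Use ``scrapy list`` to see the real spider names.

```python
import shlex
import subprocess
from typing import Dict, List, Optional


def crawl_command(spider_name: str,
                  spider_args: Optional[Dict[str, str]] = None) -> List[str]:
    """Build the argv list for ``scrapy crawl <spider_name>``."""
    argv = ["scrapy", "crawl", spider_name]
    # Scrapy passes each ``-a key=value`` pair to the spider's constructor.
    for key, value in (spider_args or {}).items():
        argv += ["-a", f"{key}={value}"]
    return argv


def run_crawl(spider_name: str,
              spider_args: Optional[Dict[str, str]] = None) -> int:
    """Run the crawl as a subprocess and return its exit code."""
    argv = crawl_command(spider_name, spider_args)
    print("running:", shlex.join(argv))
    return subprocess.run(argv, check=False).returncode


# Hypothetical usage; "some_spider" and "tags" are placeholders:
# run_crawl("some_spider", {"tags": "landscape"})
```

Passing the command as a list instead of a single string avoids shell quoting problems when an argument value contains spaces.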

Library
-------

Each crawler can be used like any normal ``scrapy.Spider``.
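
As an illustration, a spider class can be driven from Python with Scrapy's ``CrawlerProcess``. This is a sketch under assumptions, not the project's documented interface: ``merged_overrides`` and ``run_spider`` are hypothetical helpers, the import path in the usage comment is a placeholder, and the script is assumed to run from the project root so that ``get_project_settings()`` finds ``scrapy.cfg``.

```python
from typing import Any, Dict, Optional

# Assumed defaults for this sketch; LOG_LEVEL is a standard Scrapy setting.
DEFAULT_OVERRIDES: Dict[str, Any] = {"LOG_LEVEL": "INFO"}


def merged_overrides(extra: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Combine the sketch's default overrides with caller-supplied ones."""
    merged = dict(DEFAULT_OVERRIDES)
    merged.update(extra or {})
    return merged


def run_spider(spider_cls, overrides: Optional[Dict[str, Any]] = None,
               **spider_kwargs: Any) -> None:
    """Run one spider class in-process and block until it finishes."""
    # Imported lazily so this module can be loaded without Scrapy installed.
    from scrapy.crawler import CrawlerProcess
    from scrapy.utils.project import get_project_settings

    settings = get_project_settings()  # loads the project's own settings
    for name, value in merged_overrides(overrides).items():
        settings.set(name, value)

    process = CrawlerProcess(settings=settings)
    process.crawl(spider_cls, **spider_kwargs)
    process.start()  # starts the Twisted reactor; returns when the crawl ends


# Hypothetical usage; the module and class names are placeholders:
# from anime_spiders.spiders.some_module import SomeSpider
# run_spider(SomeSpider, {"DOWNLOAD_DELAY": 1})
```

Note that ``process.start()`` can be called only once per Python process, because the underlying Twisted reactor cannot be restarted.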

Scrapyd
-------

Not supported yet.