https://github.com/guoguang/flask_isomerism
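A basic scheduled crawler built on the Flask framework that supplies data to a website and integrates with Eureka to form a basic heterogeneous system.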
Last synced: 25 days ago
- Host: GitHub
- URL: https://github.com/guoguang/flask_isomerism
- Owner: GuoGuang
- License: mit
- Created: 2020-06-20T07:05:06.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2021-08-13T07:05:50.000Z (about 4 years ago)
- Last Synced: 2025-03-11T07:16:22.088Z (7 months ago)
- Language: Python
- Size: 21.5 KB
- Stars: 1
- Watchers: 1
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
## Flask isomerism
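Provides basic crawler endpoints and crawler scripts, integrated with Eureka, mainly for use in heterogeneous systems.
To add a new script, place it under `jobs\tasks`.
### 🏠 [Homepage](codeway.fun)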
## Prerequisites
- python3
- Flask
## Install
```sh
git clone https://github.com/GuoGuang/spider.git
```
## Table structure
```
CREATE TABLE `movie` (
`id` varchar(100) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`name` varchar(200) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL DEFAULT '' COMMENT 'Movie title',
`desc` text CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL COMMENT 'Movie description',
`classify` varchar(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL DEFAULT '' COMMENT 'Category',
`actor` varchar(500) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL DEFAULT '' COMMENT 'Lead actors',
`director` varchar(500) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL DEFAULT '' COMMENT 'Director',
`cover_pic` varchar(300) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL DEFAULT '' COMMENT 'Cover image',
`pics` varchar(1000) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL DEFAULT '' COMMENT 'Image URLs',
`magnet_url` varchar(5000) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL DEFAULT '' COMMENT 'Magnet download URL',
`online_url` varchar(5000) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL DEFAULT '' COMMENT 'Online streaming URL',
`pub_date` bigint(20) NOT NULL COMMENT 'Release date',
`rating` varchar(20) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL DEFAULT '' COMMENT 'Rating',
`source` varchar(20) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL DEFAULT '' COMMENT 'Source',
`visits` int(11) NOT NULL DEFAULT 0 COMMENT 'View count',
`is_recommend` int(11) NOT NULL DEFAULT 0 COMMENT 'Recommended flag: 0 = not recommended, 1 = recommended',
`update_at` bigint(20) NOT NULL,
`create_at` bigint(20) NOT NULL,
PRIMARY KEY (`id`) USING BTREE,
INDEX `idx_pu_date`(`pub_date`) USING BTREE
) ENGINE = InnoDB CHARACTER SET = utf8mb4 COLLATE = utf8mb4_0900_ai_ci ROW_FORMAT = Dynamic;
# If you create a new entity, generate its model automatically:
flask-sqlacodegen "mysql://root:123456@127.0.0.1/movie_cat" --tables user --outfile "common/models/user.py" --flask
```
## Usage
```bash
# Start the crawler with the following command
python manager.py runjob -m movie
# Start the Flask web server with the following command
python manager.py runserver
```
## Job task
Scheduled runs are implemented with Linux crontab.
```bash
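# Edit the crontab file
crontab -e
# Add an entry that runs the crawler automatically.
# Note: `* */1 * * *` fires every minute; use `0 * * * *` to run once per hour instead.
* */1 * * * { export ops_config=local && python3 /Yourdirectory/manager.py runjob -m movie; }
```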
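The `runjob -m movie` command and the `jobs\tasks` convention suggest that each job is a module selected by name. The repository's actual `manager.py` is not shown here, so the following is only a minimal sketch of how such a dispatcher might work. The `run_job` helper, the `JobTask` class name, and its `run()` method are hypothetical names chosen for illustration, and the stand-in module is registered in memory purely so the example runs on its own.

```python
import importlib
import sys
import types


def run_job(module_name: str, package: str = "jobs.tasks"):
    """Import `<package>.<module_name>` and call its JobTask().run().

    Assumption: every task module exposes a class named `JobTask`
    with a `run()` method. The real project may use other names.
    """
    module = importlib.import_module(f"{package}.{module_name}")
    job_cls = getattr(module, "JobTask", None)
    if job_cls is None:
        raise AttributeError(f"{module.__name__} does not define JobTask")
    return job_cls().run()


def _install_demo_module() -> None:
    """Register an in-memory stand-in for jobs/tasks/movie.py (demo only)."""

    class JobTask:
        def run(self):
            # A real task would fetch pages and write rows to the movie table.
            return "movie job finished"

    demo = types.ModuleType("jobs.tasks.movie")
    demo.JobTask = JobTask
    sys.modules["jobs.tasks.movie"] = demo


_install_demo_module()
print(run_job("movie"))
```

Under this assumed layout, adding a new script would mean adding a module with the same shape under `jobs\tasks` and passing its name to `-m`.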
## Author
👤 **GuoGuang**
* Twitter: [@GuoGuang0536](https://twitter.com/GuoGuang0536)
* GitHub: [@GuoGuang0536](https://github.com/GuoGuang)
## 🤝 Contributing
Contributions, issues and feature requests are welcome!
Feel free to check the [issues page](https://github.com/GuoGuang0536/python_spider/issues).
## Show your support
Give a ⭐️ if this project helped you!
## 📝 License
Copyright © 2019 [GuoGuang](https://github.com/GuoGuang).
This project is [MIT](LICENSE) licensed.