Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


https://github.com/hwywl/mzitu-crawler

Crawls pictures of girls from the mzitu website; mind your nutrition.

crawler mzitu python

Last synced: about 1 month ago


Host: GitHub
URL: https://github.com/hwywl/mzitu-crawler
Owner: HWYWL
License: apache-2.0
Created: 2018-11-13T03:56:41.000Z (about 6 years ago)
Default Branch: master
Last Pushed: 2018-11-13T07:04:25.000Z (about 6 years ago)
Last Synced: 2024-11-10T19:41:18.373Z (3 months ago)
Topics: crawler, mzitu, python
Language: Python
Size: 7.81 KB
Stars: 1
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE


README

# mzitu-crawler

Crawls pictures of girls from the mzitu website; mind your nutrition.

[![license](https://img.shields.io/github/license/ZYSzys/Mzitu_Spider.svg)](https://github.com/HWYWL/mzitu-crawler/blob/master/LICENSE)

### Environment

Python 2.7, 3.6

### Python libraries

HTTP requests: requests  

Image extraction: bs4  

Storage: os  
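
To illustrate the image-extraction step, here is a minimal sketch that collects the `src` of every `<img>` tag in a page. The project itself lists bs4 for this job; the sketch swaps in the standard library's `html.parser` (Python 3) so it runs without extra installs. `ImgSrcParser`, `extract_image_urls`, and the sample markup are hypothetical names and data, not taken from the repository.

```python
from html.parser import HTMLParser


class ImgSrcParser(HTMLParser):
    """Collect the src attribute of every <img> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def extract_image_urls(html):
    """Return the image URLs found in an HTML string, in document order."""
    parser = ImgSrcParser()
    parser.feed(html)
    parser.close()
    return parser.sources


# Made-up markup standing in for a downloaded page
sample = (
    '<div><img src="https://example.com/a.jpg">'
    '<img alt="no src">'
    '<img src="https://example.com/b.jpg"></div>'
)
image_urls = extract_image_urls(sample)
print(image_urls)
```

With bs4 installed, the same result can be obtained with `BeautifulSoup(html, "html.parser").find_all("img")` and reading each tag's `src` attribute.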

### Download and install

Enter the following command in a terminal:

```bash
git clone https://github.com/HWYWL/mzitu-crawler.git
```

### Usage

From the directory where you ran the clone, enter:

```bash
cd mzitu-crawler
pip install -r requirements.txt
python main.py
```

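### Changing the number of pages to crawl

```python
# Entry point of the crawler; edit `total` to change how many listing pages are crawled.
if __name__ == '__main__':
    # Current page
    current = 1
    # Total number of pages
    total = 100

    # Use <= so that the last page (total) is crawled as well
    while current <= total:
        mz = MeiZe("http://www.mzitu.com/page/", current)
        mz.domainHtml()
        mz.getMaxPage()
        mz.downloading()
        current += 1
```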
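
The loop above walks the listing pages one at a time. As a rough sketch of the same idea, the generator below builds one URL per page, assuming each listing URL is simply the base URL followed by the page number (the repository's `MeiZe` class may combine them differently). `page_urls` is a hypothetical helper, not part of the project.

```python
def page_urls(base, first, last):
    """Yield one listing URL per page, from first to last inclusive.

    Assumption: a listing URL is the base URL plus the page number.
    """
    for page in range(first, last + 1):
        yield "{}{}".format(base, page)


# Build the URLs for the first three listing pages
urls = list(page_urls("http://www.mzitu.com/page/", 1, 3))
for url in urls:
    print(url)
```

Keeping the page range in one place like this makes it easy to crawl only a slice of the site, for example pages 10 to 20, without touching the download logic.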

Run the crawler, as shown below:  

![](https://i.imgur.com/6508MeF.jpg)

After a few minutes, a Mzitu folder is created in the current directory, with every photo set from the front page stored inside it.  

![](https://i.imgur.com/6mbzr7u.jpg)
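
The storage step relies on `os`. The sketch below shows one way to create a per-set folder under `Mzitu` and derive a file name from an image URL. The layout `Mzitu/<set title>/<file name>`, the character filter, and the helper names `set_directory` and `image_path` are assumptions for illustration; the repository may organise its output differently.

```python
import os
import re
import tempfile


def set_directory(root, title):
    """Create and return root/Mzitu/<safe title> (layout is an assumption)."""
    # Replace characters that are not allowed in file names on common systems
    safe = re.sub(r'[\\/:*?"<>|]', "_", title).strip()
    path = os.path.join(root, "Mzitu", safe)
    if not os.path.isdir(path):
        os.makedirs(path)
    return path


def image_path(directory, url):
    """Return the local path for an image, named after the last URL segment."""
    return os.path.join(directory, url.rsplit("/", 1)[-1])


# Use a temporary directory here so the example leaves the working tree alone
root = tempfile.mkdtemp()
folder = set_directory(root, "set: 01/02")
target = image_path(folder, "https://example.com/img/01.jpg")
print(target)
```

In the real crawler the bytes returned by `requests.get(url).content` would then be written to `target` in binary mode.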

Boss, two more bottles of Nutri-Express, please.

![](https://i.imgur.com/vvjYCeP.jpg)

### Questions and suggestions

- Email me: [email protected]

- My blog: http://www.hwy.ac.cn

- GitHub: https://github.com/HWYWL