Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/softmarshmallow/inked-news-crawler
🕷 korean news source crawler (realtime & bulk)
crawler naver-news python3 scrapy
Last synced: 20 days ago
- Host: GitHub
- URL: https://github.com/softmarshmallow/inked-news-crawler
- Owner: softmarshmallow
- Created: 2018-07-28T15:09:04.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2022-12-08T02:22:19.000Z (about 2 years ago)
- Last Synced: 2024-11-23T13:12:43.556Z (about 1 month ago)
- Topics: crawler, naver-news, python3, scrapy
- Language: Python
- Homepage:
- Size: 223 KB
- Stars: 6
- Watchers: 2
- Forks: 0
- Open Issues: 15
Metadata Files:
- Readme: README.md
README
# How to install virtualenv:
### Install **pip** first
sudo apt-get install python3-pip
### Then install **virtualenv** using pip3
sudo pip3 install virtualenv
### Now create a virtual environment
virtualenv venv
### Activate your virtual environment:
source venv/bin/activate
### install pip packages
`pip install -r requirements.txt`
### install chromedriver
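> Download the latest version from https://sites.google.com/a/chromium.org/chromedriver/downloads. The commands below pin ChromeDriver 81.0.4044.69; substitute the release whose major version matches the Chrome you install.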
```
sudo apt-get update
sudo apt-get install -y unzip xvfb libxi6 libgconf-2-4
sudo apt-get install default-jdk
wget -N https://chromedriver.storage.googleapis.com/81.0.4044.69/chromedriver_linux64.zip
unzip chromedriver_linux64.zip
chmod +x chromedriver
sudo mv -f chromedriver /usr/local/share/chromedriver
sudo ln -s /usr/local/share/chromedriver /usr/local/bin/chromedriver
sudo ln -s /usr/local/share/chromedriver /usr/bin/chromedriver
wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | sudo apt-key add -
echo 'deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main' | sudo tee /etc/apt/sources.list.d/google-chrome.list
sudo apt-get update
sudo apt-get install google-chrome-stable
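# Optional sanity check (an addition, not part of the original steps):
# ChromeDriver generally needs the same major version as the installed Chrome.
# `major` is a hypothetical helper that extracts the first number from a version string.
major() { printf '%s\n' "$1" | grep -oE '[0-9]+' | head -n 1; }
if command -v chromedriver >/dev/null && command -v google-chrome >/dev/null; then
  [ "$(major "$(chromedriver --version)")" = "$(major "$(google-chrome --version)")" ] \
    || echo "ChromeDriver and Chrome major versions differ" >&2
fi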
```
### Using fish shell:
source venv/bin/activate.fish
### To deactivate:
deactivate
### Create virtualenv using Python3
virtualenv -p python3 myenv
### Instead of using virtualenv you can use this command in Python3
python3 -m venv myenv
### Add python module to path
`export PYTHONPATH="${PYTHONPATH}:inkedNewsCrawler"`
`chmod +x crawler.sh`
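
The `export` above only affects the current shell session, and running it a second time appends the same entry again. Below is a minimal sketch of a variant that skips the entry when it is already listed and avoids a leading colon when `PYTHONPATH` starts out empty. `add_to_pythonpath` is a hypothetical helper name, not something the repository provides. The path is relative, so this presumably only works when run from the repository root.

```shell
# Hypothetical helper: append a directory to PYTHONPATH unless it is already listed.
add_to_pythonpath() {
  case ":${PYTHONPATH}:" in
    *":$1:"*) ;;  # already present, nothing to do
    *) PYTHONPATH="${PYTHONPATH:+${PYTHONPATH}:}$1" ;;  # add a separator only when non-empty
  esac
  export PYTHONPATH
}

add_to_pythonpath "inkedNewsCrawler"
```

Putting the function and the call into a shell startup file such as `~/.bashrc` would make the setting survive new sessions. An absolute path is the safer choice there.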
## register service
```shell script
sudo cp crawler.service /etc/systemd/system/crawler.service
sudo chmod 664 /etc/systemd/system/crawler.service
sudo systemctl daemon-reload
sudo systemctl start crawler.service
sudo systemctl status crawler.service
sudo systemctl enable crawler.service
```
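
Once the unit is registered, it can help to confirm that it is actually running and to look at its recent log output if it is not. The following is a sketch only: `check_service` is a hypothetical helper, not part of the repository, and reading the journal may need `sudo` depending on how the system is configured.

```shell
# Hypothetical helper: report whether a systemd unit is running,
# and print its most recent journal lines to stderr if it is not.
check_service() {
  unit="$1"
  if systemctl is-active --quiet "$unit"; then
    echo "$unit is active"
  else
    echo "$unit is not active; recent log output:" >&2
    journalctl -u "$unit" -n 20 --no-pager >&2
    return 1
  fi
}
```

Calling `check_service crawler.service` after `systemctl start` returns a non-zero status when the crawler failed to come up, so it can also be used in a deployment script.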