{"id":15697118,"url":"https://github.com/peter279k/acg-crawler","last_synced_at":"2025-05-08T23:34:29.577Z","repository":{"id":79146334,"uuid":"83817342","full_name":"peter279k/acg-crawler","owner":"peter279k","description":"A ACG crawler for crawling the ACG news!","archived":false,"fork":false,"pushed_at":"2017-03-18T23:10:32.000Z","size":367,"stargazers_count":7,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-31T19:21:33.991Z","etag":null,"topics":["anime","java","java-8","newsfeed","newsletter"],"latest_commit_sha":null,"homepage":"http://peter279k.com.tw","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/peter279k.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-03-03T16:16:00.000Z","updated_at":"2025-01-16T10:37:08.000Z","dependencies_parsed_at":"2023-05-23T19:15:23.671Z","dependency_job_id":null,"html_url":"https://github.com/peter279k/acg-crawler","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/peter279k%2Facg-crawler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/peter279k%2Facg-crawler/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/peter279k%2Facg-crawler/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/peter279k%2Facg-crawler/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/peter279k","download_url
":"https://codeload.github.com/peter279k/acg-crawler/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253165696,"owners_count":21864460,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anime","java","java-8","newsfeed","newsletter"],"created_at":"2024-10-03T19:12:25.722Z","updated_at":"2025-05-08T23:34:29.552Z","avatar_url":"https://github.com/peter279k.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# acg-crawler\nA ACG crawler for crawling the ACG news!\n\n# To do lists\n~~- subscribe page~~\n\n~~- unscribe page~~\n\n~~- store subscribed email lists~~\n\n~~- get subscribed email lists~~\n\n- send email via GMAIL SMTP server (not yet...)\n\n~~- send email via MailGun API~~\n\n~~- send log email via MailGun API~~\n\n- (enhancement) crawl more resources\n\n- finish the Deployment section\n\n~~- (security) add CSRF-token for every pages\n(Including AnimeNews, AnimeHotNews, unscribe/subscribe email address)~~\n\n~~- (security) user input validation (email)~~\n\n# Requirement\n- Operating System: Ubuntu/16.04 (The Ubuntu 14.04 is not available for this project.)\n- Apache Tomcat version: 7\n- JAVA: 1.7+(recommendation version is 1.8)\n- Apache Tomcat 7 (version 8 is not sure to be worked well...)\n\n# Deployment (Manual approach)\nWe assume that we have installed the JSP environment in our target host.\n\n- target host: VPS (recommendation)\n- clone the repo\n- install the gradle (```sudo apt-get install gradle```) \n- using the command ```gradle tomcatRunWar``` to generate 
the ```acg-crawler.war``` (the WAR file is written to /path/to/acg-crawler/build/libs).\n- create the ```auth.ini``` file to set the Mailgun and Gmail credentials.\n- export the runnable ```acg-crawler.jar```.\n- Remember to copy the ```assets``` folder into the WEB-INF folder of the WAR file.\n- Remember to place the runnable jar file and ```auth.ini``` in the same directory.\n- add a crontab entry that runs ```java -jar /path/to/acg-crawler.jar``` to crawl data, send email, and send the error log email.\n- enjoy it!\n\n# auth.ini\n## The sample auth.ini file contents are as follows:\n\n```\n[MAILGUN]\napi-key=key-XXXXXXXXXXX\ndomain-name=peter279k.com.tw\napi-base-url=https://api.mailgun.net/v3/peter279k.com.tw/messages\nfrom-email-address=peter279k@gmail.com\nfrom-email-account=AnimeNews \u003cadmin@peter279k.com.tw\u003e\n[GMAIL]\naccount=your-gmail-address\npassword=your-gmail-password\n```\n# SETUP.sh\n```bash\n#!/bin/bash\n\necho \"This project was built on Ubuntu 16.04 LTS (development environment)\"\n\n# Install the build tools, the JDK/JRE, and Tomcat 7.\nsudo apt-get install gradle git-core\nsudo apt-get install default-jdk default-jre tomcat7\n\ngit clone https://github.com/peter279k/acg-crawler.git\ncd acg-crawler/\n\n# Copy the assets folder into WEB-INF so that it is packaged into the WAR file.\ncp -r ./assets src/main/webapp/WEB-INF\n\n# Remember to generate the runnable jar file from the Eclipse IDE.\n\ngradle clean\ngradle war\n\n# Create a home directory owned by the tomcat7 user (-p: no error if it already exists).\nsudo mkdir -p /home/tomcat7\nsudo chown -R tomcat7 /home/tomcat7\nsudo chmod -R u+rwx /home/tomcat7\n\n# Deploy the WAR file and restart Tomcat.\nsudo cp build/libs/acg-crawler.war /var/lib/tomcat7/webapps\nsudo service tomcat7 restart\n\necho \"Deployment finished. Visit the URL: domain-name:8080/acg-crawler\"\necho\necho \"Don't forget to upload your jar and war files with scp\"\n\n```\n\n# Security\nIf you find any vulnerabilities in this web application, please feel free to send an email to 
peter279k@gmail.com.\n\nThanks!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpeter279k%2Facg-crawler","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpeter279k%2Facg-crawler","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpeter279k%2Facg-crawler/lists"}