{"id":20363968,"url":"https://github.com/ceylonai/apps-article-reader","last_synced_at":"2025-09-23T16:31:01.928Z","repository":{"id":262463140,"uuid":"887317241","full_name":"ceylonai/apps-article-reader","owner":"ceylonai","description":"📚 A powerful desktop app that extracts and analyzes web content using LLaMA AI. Features real-time processing, keyword extraction, and smart summarization. Built with Python + Tkinter.","archived":false,"fork":false,"pushed_at":"2024-11-12T15:54:46.000Z","size":117,"stargazers_count":3,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-12-03T17:16:33.065Z","etag":null,"topics":["ai","crawler","gpt","ollama","openai"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ceylonai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-11-12T14:35:44.000Z","updated_at":"2024-12-02T03:47:57.000Z","dependencies_parsed_at":"2024-11-12T15:37:30.767Z","dependency_job_id":"06b00418-9c3c-40b8-86ca-35a8a889260f","html_url":"https://github.com/ceylonai/apps-article-reader","commit_stats":null,"previous_names":["syigen/article-reader"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ceylonai%2Fapps-article-reader","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ceylonai%2Fapps-article-reader/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ceylonai%2Fapps-article
-reader/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ceylonai%2Fapps-article-reader/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ceylonai","download_url":"https://codeload.github.com/ceylonai/apps-article-reader/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":233985838,"owners_count":18761531,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","crawler","gpt","ollama","openai"],"created_at":"2024-11-15T00:09:07.109Z","updated_at":"2025-09-23T16:30:56.589Z","avatar_url":"https://github.com/ceylonai.png","language":"Python","funding_links":["https://www.buymeacoffee.com/ceylon"],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"docs/cover-photo.png\" alt=\"Content Extractor Dashboard Banner\" width=\"100%\"\u003e\n\n# Content Extractor\n\n📰 A powerful desktop application for extracting and analyzing content from web URLs\n\n[![Python 3.7+](https://img.shields.io/badge/Python-3.7+-blue.svg)](https://www.python.org/downloads/)\n[![MIT License](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)\n[![Ollama](https://img.shields.io/badge/LLM-Ollama-orange.svg)](https://ollama.ai)\n \n\u003c/div\u003e\n\n\u003cdiv align=\"center\"\u003e \n  \u003ca href=\"https://www.buymeacoffee.com/ceylon\"\u003e\n    \u003cimg src=\"https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png\" alt=\"Buy Me A Coffee\" style=\"height: 41px !important;width: 174px 
!important;\"\u003e\n  \u003c/a\u003e\n\u003c/div\u003e\n\n## Introduction\n\nA powerful desktop application for extracting and analyzing content from web URLs. Built with Python and Tkinter, this\ntool provides a user-friendly interface for processing multiple URLs simultaneously, extracting key information, and\nsaving results locally.\n\n\n## Buy Me A Coffee\n\nIf you find this tool useful, you can consider supporting its development by buying me a coffee. This will help me\ncontinue to improve and maintain the tool. Your support is greatly appreciated!\n\n\n\n\n\n\n## ✨ Features\n\n- 🔗 **URL Processing**: Process multiple URLs simultaneously with a queue-based system\n- 🤖 **Content Analysis**: Extract titles, keywords, summaries, and generate relevant hashtags\n- 📊 **Progress Tracking**: Real-time status updates and progress monitoring for each task\n- 💾 **Auto-save**: Automatically save processed content to local files\n- ⚙️ **Task Management**: Pause, restart, or review completed tasks\n- 🎛️ **Configurable Settings**: Customize save directory and auto-save preferences\n\n## Screenshots\n\n\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"docs/screenshot.png\" alt=\"Content Extractor Dashboard\" width=\"25%\"\u003e\n  \u003cimg src=\"docs/tasks.png\" alt=\"Content Extractor Dashboard\" width=\"25%\"\u003e\n   \u003cimg src=\"docs/results.png\" alt=\"Content Extractor Dashboard\" width=\"25%\"\u003e\n\u003c/div\u003e\n\n## Prerequisites\n\n- Python 3.7 or higher\n- tkinter (usually comes with Python)\n- Ollama for running the LLaMA model\n\n## Installing Ollama\n\n1. Install Ollama based on your operating system:\n\n### Linux\n\n```bash\ncurl https://ollama.ai/install.sh | sh\n```\n\n### macOS\n\n```bash\nbrew install ollama\n```\n\n### Windows\n\nDownload and run the installer from [Ollama's official website](https://ollama.ai/download)\n\n2. Start the Ollama service:\n\n```bash\nollama serve\n```\n\n3. 
Pull the LLaMA model:\n\n```bash\nollama pull llama3.2\n```\n\n4. Verify the installation:\n\n```bash\nollama list\n```\n\nYou should see `llama3.2` in the list of available models.\n\n5. Configure the application to use Ollama:\n    - The application is pre-configured to use \"llama3.2\" as the model name\n    - Update the model name in the `ContentExtractor` initialization if you use a different model version\n\n## Installation\n\n1. Clone the repository:\n\n```bash\ngit clone \u003crepository-url\u003e\ncd content-extractor\n```\n\n2. Install required dependencies:\n\n```bash\npip install -r requirements.txt\n```\n\n3. Ensure Ollama is running:\n    - Start the Ollama service if it is not already running\n    - Verify the LLaMA model is available\n\n## Usage\n\n1. Start the application:\n\n```bash\npython content_extractor_gui.py\n```\n\n2. Configure Settings:\n    - Click \"Browse\" to set your preferred project directory\n    - Toggle the auto-save option as needed\n\n3. Process URLs:\n    - Enter a URL in the input field\n    - Click \"Add URL\" to start processing\n    - Monitor progress in the tasks list\n    - View results by clicking on completed tasks\n\n4. 
Managing Tasks:\n    - Click on any task to view its details\n    - Use the restart button (↻) to reprocess failed or completed tasks\n    - Save results manually using the \"Save Result\" button if auto-save is disabled\n\n## Task States\n\n- **Queued**: Task is waiting to be processed\n- **Processing**: Currently extracting content\n- **Completed**: Successfully processed\n- **Error**: Failed to process (can be restarted)\n\n## Output Format\n\nResults are saved as JSON files with the following structure:\n\n```json\n{\n  \"title\": \"Article Title\",\n  \"keywords\": [\n    \"keyword1\",\n    \"keyword2\",\n    ...\n  ],\n  \"content_summary\": \"Brief summary of the content\",\n  \"hashtags\": [\n    \"#hashtag1\",\n    \"#hashtag2\",\n    ...\n  ],\n  \"full_article\": \"Complete article text\"\n}\n```\n\n## Configuration\n\nThe application stores its configuration in `content_extractor_config.json`:\n\n```json\n{\n  \"project_dir\": \"/path/to/save/directory\",\n  \"auto_save\": true\n}\n```\n\n## Technical Details\n\n### Key Components\n\n1. **URLTask**: Manages individual URL processing tasks\n    - Tracks status, progress, and results\n    - Handles timing and error states\n\n2. **TaskPanel**: UI component for displaying task information\n    - Real-time status updates\n    - Progress bar\n    - Duration tracking\n    - Save path display\n\n3. 
**ContentExtractorGUI**: Main application interface\n    - Manages task queue and threading\n    - Handles file I/O and configuration\n    - Provides user interface controls\n\n### Threading Model\n\n- Uses a queue-based system for task management\n- Processes multiple URLs concurrently\n- Maintains UI responsiveness with proper thread management\n- Limits concurrent processing to prevent resource exhaustion\n\n## Error Handling\n\nThe application includes comprehensive error handling for:\n\n- Invalid URLs\n- Network issues\n- Processing failures\n- File system operations\n- Configuration management\n\n## Contributing\n\nContributions are welcome! Please follow these steps:\n\n1. Fork the repository\n2. Create a feature branch\n3. Commit your changes\n4. Push to the branch\n5. Create a Pull Request\n\n## License\n\nMIT\n\n## Support\n\nFor issues and feature requests, please:\n\n1. Check existing issues in the repository\n2. Create a new issue with detailed information\n3. Include steps to reproduce any bugs\n\n## Acknowledgments\n\n- Built using Python and Tkinter\n- Uses LLaMA model for content analysis through Ollama","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fceylonai%2Fapps-article-reader","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fceylonai%2Fapps-article-reader","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fceylonai%2Fapps-article-reader/lists"}