
https://github.com/bitartisan1/netdigger

A .NET 8.0 C# WPF desktop application for scraping web data into structured databases, with a modern UI, comprehensive logging, and optimized performance.

csharp data data-scraper data-scraping database desktop dotnet internet logging scraper ui url web-scraper web-scrapers web-scraping web-scrapping


# netDigger

netDigger is a web scraping application built with .NET 8.0 and C# WPF in Visual Studio. It collects many types of data and exports them into organized, structured databases. The application features a modern UI and detailed logging.



A Powerful In-Depth Web Scraping Application.


## Features

- **Asynchronous Web Scraping**: Scrapes web pages with asynchronous tasks to reduce latency, and uses multi-threading for parallel processing.
- **Data Collection**: Collects data such as PDFs, CSVs, DOCX, XLS, PPTX, TXT, Images, Videos, JSON, DBSQL, XML, HTML, PHP, JS, Archives, and Miscellaneous files.
- **Comprehensive JSON, XML, and HTML Parsing**: Uses advanced parsing techniques to extract information from JSON, XML, and HTML documents, including finding and processing hidden element data and metadata.
- **Database Integration**: Organizes scraped URLs into SQLite databases based on their file types.
- **Modern UI Design**: User-friendly WPF interface with rich text logging.
- **Detailed Logging**: Comprehensive log messages with timestamps, log levels, and thread IDs.
- **Export Options**: Export scraped data to database files, CSV, and TXT formats.
- **Multi-OS Support**: Compatible with Windows x64/x86/ARM, Linux, and macOS.
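
The "Database Integration" feature groups scraped URLs by file type. As a rough illustration of that idea only, the sketch below maps a URL to a category by its file extension. The `UrlClassifier` class, the `Classify` method, and the extension-to-category table are hypothetical names chosen for this example; netDigger's actual classification logic and category names may differ.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper (not part of netDigger): maps a URL to a coarse
// file-type category based on the extension of its path.
public static class UrlClassifier
{
    // Illustrative mapping only; the real application may use other categories.
    private static readonly Dictionary<string, string> Categories =
        new(StringComparer.OrdinalIgnoreCase)
        {
            [".pdf"] = "PDF",
            [".csv"] = "CSV",
            [".docx"] = "DOCX",
            [".json"] = "JSON",
            [".xml"] = "XML",
            [".html"] = "HTML",
            [".png"] = "Images",
            [".jpg"] = "Images",
            [".mp4"] = "Videos",
            [".zip"] = "Archives",
        };

    public static string Classify(string url)
    {
        // Only absolute URLs are handled; anything else is Miscellaneous.
        if (!Uri.TryCreate(url, UriKind.Absolute, out var uri))
            return "Miscellaneous";

        // AbsolutePath excludes the query string and fragment,
        // so "report.pdf?x=1" still yields ".pdf".
        string extension = Path.GetExtension(uri.AbsolutePath);

        return Categories.TryGetValue(extension, out var category)
            ? category
            : "Miscellaneous";
    }
}
```

A caller could then use `UrlClassifier.Classify(url)` to pick which table or database file a scraped URL is written to; that storage step is left out here because it depends on the SQLite library in use.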

## Technologies Used

- **.NET 8.0**
- **C#**
- **WPF (Windows Presentation Foundation)**
- **AngleSharp**: HTML parsing.
- **PuppeteerSharp**: A headless browser automation library for .NET.
- **Newtonsoft.Json (Json.NET)**: A popular library for working with JSON in .NET.
- **SQLite**: Database management.
- **Concurrent Collections**: Thread-safe operations.
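
To show how asynchronous tasks and concurrent collections can work together, here is a minimal sketch of bounded parallel fetching that uses only the .NET base class library. It is not taken from netDigger's source: `BoundedFetcher`, `FetchAllAsync`, and the `maxParallel` parameter are assumptions made for illustration, and the fetch step is passed in as a delegate so the sketch does not depend on any particular HTTP or browser library.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of netDigger): fetches many URLs in
// parallel while capping the number of requests in flight.
public static class BoundedFetcher
{
    public static async Task<ConcurrentDictionary<string, string>> FetchAllAsync(
        IEnumerable<string> urls,
        Func<string, Task<string>> fetch,
        int maxParallel = 4)
    {
        // Thread-safe store for results written from concurrent tasks.
        var results = new ConcurrentDictionary<string, string>();

        // The semaphore limits how many fetches run at the same time.
        using var gate = new SemaphoreSlim(maxParallel);

        var tasks = urls.Distinct().Select(async url =>
        {
            await gate.WaitAsync();
            try
            {
                results[url] = await fetch(url);
            }
            catch (Exception ex)
            {
                // Record the failure instead of aborting the whole batch.
                results[url] = "ERROR: " + ex.Message;
            }
            finally
            {
                gate.Release();
            }
        }).ToList(); // Materialize the query so every task is started.

        await Task.WhenAll(tasks);
        return results;
    }
}
```

With a shared `HttpClient` instance named `http`, a call might look like `await BoundedFetcher.FetchAllAsync(urls, url => http.GetStringAsync(url), maxParallel: 8)`. Passing the fetch step as a delegate also makes it easy to substitute a headless-browser call or a stub in tests.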

## Prerequisites

- .NET 8.0 Desktop Runtime or .NET 8.0 SDK.
- Visual Studio 2022 (only if you want to build it yourself).

## Installation

1. Clone the repository:
```sh
git clone https://github.com/your-username/netDigger.git
cd netDigger
```
2. Open the solution file (`netDigger.sln`) in Visual Studio.

3. Build the project:

   Select `Build > Build Solution`.

4. Run the application:

   Select `Debug > Start Debugging` or press `F5`.

## Contribution
1. Fork the repository.
2. Create a new branch (`git checkout -b feature-branch`).
3. Commit your changes (`git commit -m 'Add new feature'`).
4. Push to the branch (`git push origin feature-branch`).
5. Create a new Pull Request.

## License
This project is licensed under the __GNU Affero General Public License v3.0__. See the __LICENSE__ file for more details.

## Support Me
If you find netDigger useful, consider supporting me by:

- Starring the repository on GitHub
- Sharing the tool with others
- Providing feedback and suggestions
- Following me for more :)



---
For any issues or feature requests, please open an issue on GitHub. Happy coding!


