https://github.com/wteja/pdf-converter
PDF Converter Server made with Go & LibreOffice
https://github.com/wteja/pdf-converter
api converter converter-app converters excel go golang libreoffice pdf powerpoint webapp word
Last synced: 17 days ago
JSON representation
PDF Converter Server made with Go & LibreOffice
- Host: GitHub
- URL: https://github.com/wteja/pdf-converter
- Owner: wteja
- License: mit
- Created: 2025-01-11T15:39:33.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-01-11T16:07:17.000Z (over 1 year ago)
- Last Synced: 2025-01-11T17:24:07.560Z (over 1 year ago)
- Topics: api, converter, converter-app, converters, excel, go, golang, libreoffice, pdf, powerpoint, webapp, word
- Language: Go
- Homepage: https://hub.docker.com/r/wteja/pdf-converter
- Size: 1.13 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Office to PDF Converter
This is a simple HTTP server application written in Go that converts uploaded office files into PDF files using LibreOffice in headless mode. The server listens on port `5000` and exposes a single endpoint for file conversion. It also includes automatic cleanup of temporary files older than one hour.
## Features
- Converts various office files to PDF format using LibreOffice.
- Handles file uploads via HTTP POST requests.
- Automatic cleanup of old files from the temporary directory after one hour.
- Minimal and efficient implementation using Go.
## Requirements
- **Go**: Ensure Go is installed on your system ([Download Go](https://golang.org/dl/)).
- **LibreOffice**: LibreOffice must be installed and accessible via the `soffice` command.
## Supported File Formats
This application relies on LibreOffice for file conversion, so it supports any file format LibreOffice can handle. Below are the formats that can be converted to PDF:
### Document Formats
- `.doc`, `.docx` (Microsoft Word)
- `.odt`, `.ott` (LibreOffice/OpenDocument Text)
- `.rtf` (Rich Text Format)
- `.txt` (Plain Text)
### Spreadsheet Formats
- `.xls`, `.xlsx` (Microsoft Excel)
- `.ods`, `.ots` (LibreOffice/OpenDocument Spreadsheet)
- `.csv` (Comma-Separated Values)
### Presentation Formats
- `.ppt`, `.pptx` (Microsoft PowerPoint)
- `.odp`, `.otp` (LibreOffice/OpenDocument Presentation)
### Other Formats
- `.svg` (Scalable Vector Graphics)
- `.html`, `.htm` (HTML Files)
- `.xml` (XML Files)
- `.pdf` (for PDF editing and re-exporting)
## Installation
1. Clone the repository:
```bash
git clone
cd
```
2. Install dependencies:
This application does not have external dependencies, but ensure you have LibreOffice installed.
3. Build the application:
```bash
go build -o pdf-converter .
```
4. Run the application:
```bash
./pdf-converter
```
The server will start listening on `http://localhost:5000`.
## API Usage
### **Endpoint: `/convert`**
- **Method**: `POST`
- **Field Name**: `file`
- **File Type**: Any supported format (e.g., `.xlsx`, `.docx`).
#### Request Example (Using `curl`):
```bash
curl -X POST -F "file=@example.xlsx" http://localhost:5000/convert --output output.pdf
```
#### Response:
- **Success**: Returns the converted PDF file as a response with the `Content-Type` set to `application/pdf`.
- **Error**: Returns an appropriate HTTP status code and error message if the conversion fails.
## Public Docker Image
You can use the publicly available Docker image for this application:
```bash
docker pull wteja/pdf-converter
```
Run the container:
```bash
docker run -p 5000:5000 wteja/pdf-converter
```
## Configuration
- Temporary files are stored in the `./tmp` directory. Ensure the application has write access to this directory.
- The application automatically removes files older than one hour from the `tmp` directory.
## Code Overview
### **Main Components**
1. **File Upload and Conversion**:
- The `/convert` endpoint processes file uploads, saves them to the `tmp` directory, and invokes LibreOffice in headless mode to perform the conversion.
2. **Temporary Directory Management**:
- All uploaded and converted files are stored in the `tmp` directory.
- A background goroutine periodically checks and deletes files older than one hour.
3. **Error Handling**:
- Comprehensive error handling for file uploads, conversions, and temporary file management.
### **Key Functions**
- `handleConvert`: Handles the HTTP requests, manages file upload and conversion, and returns the resulting PDF.
- `cleanupOldFiles`: Periodically deletes old files from the `tmp` directory.
## Future Improvements
- Support additional file formats (e.g., `.pptx`, `.odg`).
- Add configurable cleanup duration and temporary directory path.
- Implement better logging and monitoring.
## License
This project is licensed under the MIT License. See the `LICENSE` file for details.
## Author
Developed by [Weerayut Teja](https://github.com/wteja). Contributions and feedback are welcome!