https://github.com/bmedici/rest-ftp-daemon
A pretty simple but configurable and efficient FTP-client daemon, driven through a RESTful API, used by France Télévisions in production
https://github.com/bmedici/rest-ftp-daemon
daemon francetelevision ftp-client microservice robust ruby video vod
Last synced: 7 months ago
JSON representation
A pretty simple but configurable and efficient FTP-client daemon, driven through a RESTful API, used by France Télévisions in production
- Host: GitHub
- URL: https://github.com/bmedici/rest-ftp-daemon
- Owner: bmedici
- License: mit
- Created: 2014-07-31T17:04:57.000Z (over 11 years ago)
- Default Branch: master
- Last Pushed: 2019-06-04T09:54:45.000Z (almost 7 years ago)
- Last Synced: 2025-08-17T08:24:36.349Z (7 months ago)
- Topics: daemon, francetelevision, ftp-client, microservice, robust, ruby, video, vod
- Language: JavaScript
- Homepage:
- Size: 2.67 MB
- Stars: 24
- Watchers: 5
- Forks: 9
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
- Code of conduct: CODE_OF_CONDUCT.md
Awesome Lists containing this project
README
rest-ftp-daemon
====================================================================================
[](http://badge.fury.io/rb/rest-ftp-daemon)
[](https://codeclimate.com/github/bmedici/rest-ftp-daemon)
[](https://codeclimate.com/github/bmedici/rest-ftp-daemon/coverage)
[ ](https://codeship.com/projects/153245)
A pretty simple but configurable and efficient FTP-client daemon, driven
through a RESTful API. Create transfer jobs by POSTing a simple JSON structure,
be notified of their completion, watch their status on a dedicated dashboard.

Features
------------------------------------------------------------------------------------
* System and process features
* environment-aware configuration in a YAML file
* daemon process is tagged with its name and environment in process lists
* global dashboard directly served within the daemon HTTP interface
* support pooling of worker to dedicate workers to groups of jobs
* File management ans transferts
* allow authentication in FTP target in a standard URI-format
* static path pointers in configuration to abstract local mounts or remote FTPs (endpoint tokens)
* local source path and local/remote target path can use patterns to match multiple files (`/dir/file*.jpg`)
* several file transfer protocols supported: FTPs, FTPes, sFTP
* display bitrate to any pool or any FTP destination currently transferring (API and dashboard)
* Job management
* highly parrallel job processing using dedicated worker threads with their own context
* jobs are taken into account as soon as they are submitted
* each job carry its own attributes: build subdirectories (mkdir), overwrite target file, priority weight
* dynamic evaluation of priorities, honoring any change on context until the job is picked
* automatically clean-up jobs after a configurable amount of time (failed, finished)
* Realtime status reporting
* realtime transfer status reporting, with progress and errors
* periodic update notifications sent along with transfer status and progress to an arbitrary URL (JSON resource POSTed)
* metrics about pools, throughtput, and queues output to NewRelic
Project status and quick installation
------------------------------------------------------------------------------------
#### Stability
Though it may need more robust tests, this gem has been used successfully in production for
a while without any glitches at France Télévisions.
#### API Documentation
API documentation is self-hosted on ```/swagger.html```
#### Expected features in a short-time range
* Provide swagger-style API documentation
* Authenticate API clients
* Allow more transfer protocols (HTTP POST etc)
* Expose JSON status of workers on `GET /jobs/` for automated monitoring
#### Installation
With Ruby (version 2.3 or higher) and rubygems properly installed, you only need :
```
gem install rest-ftp-daemon
```
If that is not the case yet, see section [Debian install preparation](#debian-install-preparation).
Subsystems
------------------------------------------------------------------------------------
#### Conchita: jobs queues cleanup
Job queue can be set to automatically cleanup after a certain delay. Entries are removed from the queue when they have been idle (updated_at) for more than X seconds, and in any of the following statuses:
- failed (conchita.clean_failed)
- finished (conchita.clean_finished)
- queued, (conchita.clean_queued)
Cleanup is done on a regular basis, every (conchita.timer) seconds.
#### Reporter: metrics collection
[TODO]
Usage and examples
------------------------------------------------------------------------------------
#### Launching rest-ftp-daemon
You must provide a configuration file for the daemon to start, either explicitly using
option `--config` or implicitly at `/etc/rest-ftp-daemon.yml`. A sample file is provided, issue
`--help` to get more info.
You then simply start the daemon on its standard port, or on a specific port using `-p`
```
$ rest-ftp-daemon -p 3000 start
```
Check that the daemon is running and exposes a JSON status structure at `http://localhost:3000/status`.
The dashboard will provide an overview at `http://localhost:3000/`
If the daemon appears to exit quickly when launched, it may be caused by logfiles that can't be written (check files permissions or owner).
#### Launcher options :
| Param | Short | Default | Description |
|------- |-------------- |------------- |----------------------------------------------------------- |
| -p | --port | (automatic) | Port to listen for API requests |
| -e | | production | Environment name |
| | --dev | | Equivalent to -e development |
| -d | --daemonize | false | Wether to send the daemon to background |
| -f | --foreground | false | Wether to keep the daemon running in the shell |
| -P | --pid | (automatic) | Path of the file containing the PID |
| -u | --user | (none) | User to run the daemon as |
| -g | --group | (none) | Group of the user to run the daemon as |
| -h | --help | | Show info about the current version and available options |
| -v | --version | | Show the current version |
#### Start a job to transfer a file named "file.iso" to a local FTP server
```
curl -H "Content-Type: application/json" -X POST -D /dev/stdout -d \
'{"source":"~/file.iso","target":"ftp://anonymous@localhost/incoming/dest2.iso"}' "http://localhost:3000/jobs"
```
#### Start a job using endpoint tokens
First define ``nas`` ans ``ftp1`` in the configuration file :
```
defaults: &defaults
development:
<<: *defaults
endpoints:
nas: "~/"
ftp1: "ftp://anonymous@localhost/incoming/"
```
Those tokens will be expanded when the job is run:
```
curl -H "Content-Type: application/json" -X POST -D /dev/stdout -d \
'{"source":"~/file.dmg","priority":"3","target":"ftp://anonymous@localhost/incoming/dest4.dmg","notify":"http://requestb.in/1321axg1"}' "http://localhost:3000/jobs"
```
#### Start a job with a specific pool name
The daemon spawns groups of workers (worker pools) to work on groups of jobs (job pools). Any ```pool``` attribute not declared in configuration will land into the ```"default"``` pool.
```
curl -H "Content-Type: application/json" -X POST -D /dev/stdout -d \
'{"pool": "maxxxxx",source":"~/file.iso",target":"ftp://anonymous@localhost/incoming/dest2.iso"}' "http://localhost:3000/jobs"
```
This job will be handled by the "maxxxxx" workers only, or by the ```"default"``` worker is this pool is not declared.
#### Get info about a job with ID="q89j.1"
Both parameters `q89j.1` and `1` will be accepted as ID in the API. Requests below are equivalent:
```
GET http://localhost:3000/jobs/q89j.1
GET http://localhost:3000/jobs/1
```
Configuration
------------------------------------------------------------------------------------
Most of the configuration options live in a YAML configuration file, containing two main sections:
* `defaults` section should be left as-is and will be used is no other environment-specific value is provided.
* `production` section can receive personalized settings according to your environment-specific setup and paths.
Configuration priority is defined as follows (from most important to last resort):
* command-line parameters
* config file defaults section
* config file environment section
* application internal defaults
As a starting point, `rest-ftp-daemon.yml.sample` is an example config file that can be copied into the expected location ``/etc/rest-ftp-daemon.yml``.
Default administrator credentials are `admin/admin`. Please change the password in this configuration file before starting any kind of production.
Here is the contents of the default configuration (oeverride by passing -c local.yml at startup)
```yaml
daemonize: true
port: 3000
user: rftpd
# group: rftpd
# host: "myhost"
allow_reload: false
pools: # number of workers decidated to each pool value
default: 2
urgent: 1
reporter: # the subsytem in charge of reporting metrics, mainly to NewRelic
debug: false
timer: 10 # report every X seconds
conchita:
debug: false
timer: 60 # do the cleaning up every X seconds
garbage_collector: true # force a garbage collector cleanup when cleaning things up
clean_failed: 3600 # after X seconds, clean jobs with status="failed"
clean_finished: 3600 # // // // finished
clean_queued: 86400 # // // // queued
transfer:
debug: false
mkdir: true # build directory tree if missing
tempfile: true # transfer to temporary file, rename after sucessful transfer
overwrite: false # overwrite any target file with the same name
timeout: 1800 # jobs running for longer than X seconds will be killed
notify_after: 5 # wait at least X seconds between HTTP notifications
debug_ftp: false
debug_ftps: false
debug_sftp: false
retry_on: # job error values that will allow a retry
- ftp_perm_error
- net_temp_error
- conn_reset_by_peer
- conn_timed_out
- conn_refused
- sftp_auth_failed
- conn_host_is_down
- conn_unreachable
- conn_failed
- conn_openssl_error
retry_max: 5 # maximum number of retries before giving up on that job
retry_for: 1800 # maximum time window to retry failed jobs
retry_after: 10 # delay to wait before tries
newrelic:
debug: false
# license: "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
# app_name: "rftpd-bigbusiness-dev" # app_name used for naming app (used as-is if provided)
prefix: "rftpd" # app prefix to build app_name
# platform: "bigbusiness" # app platform to build app_name
logs:
path: "/var/log/"
thin: "rftpd-environment-thin.log"
newrelic: "rftpd-environment-newrelic.log"
queue: "rftpd-environment-core.log"
api: "rftpd-environment-core.log"
workers: "rftpd-environment-core.log"
transfer: "rftpd-environment-workers.log"
conchita: "rftpd-environment-workers.log"
reporter: "rftpd-environment-workers.log"
notify: "rftpd-environment-workers.log"
```
TODO for this document
------------------------------------------------------------------------------------
* Document /status
* Document /routes
* Document mkdir and overwrite options
* Document stats
Debian install preparation
------------------------------------------------------------------------------------
This project is available as a rubygem, requires Ruby 2.3.0 and RubyGems installed.
#### Using rbenv and ruby-build
You may use `rbenv` and `ruby-build` to get the right Ruby version. If this is your case, ensure that ruby-build definitions are up-to-date and include the right Ruby version.
You may have to install some extra packages for the compilations to complete.
```
# apt-get install libffi-dev zlib1g-dev bison libreadline-dev
# git clone https://github.com/rbenv/rbenv.git ~/.rbenv
# git clone https://github.com/rbenv/ruby-build.git ~/.rbenv/plugins/ruby-build
# echo 'export PATH="$HOME/.rbenv/bin:$PATH"' >> ~/.bashrc
# echo 'eval "$(rbenv init -)"' >> ~/.bashrc
# rbenv install --list | grep '2.3'
```
```
# curl -fsSL https://gist.github.com/mislav/055441129184a1512bb5.txt | rbenv install --patch 2.2.3
```
Otherwise, you way have to update ruby-build to include Ruby 2.3 definitions.
On Debian, 2.3 is not included in Wheezy and appears in Jessie's version of the package.
#### Dedicated user
Use a dedicated user for the daemon, switch to this user and enable rbenv
```
# adduser --disabled-password --gecos "" rftpd
# su rftpd -l
```
#### Ruby version
Install the right ruby version and activate it
```
# rbenv install 2.3.0
# rbenv local 2.3.0
# rbenv rehash
```
#### Daemon installation
Update RubyGems and install the gem from rubygems.org
```
# gem update --system
# gem install rest-ftp-daemon --no-ri --no-rdoc
# rbenv rehash
# rest-ftp-daemon start
```
Known bugs
------------------------------------------------------------------------------------
* As this project is based on the Psyck YAML parser, configuration merge from "defaults" section and environment-specific section are broken. A sub-tree defined for a specific environment, will overwrite the corresponding subtree from "defaults". Please repeat whole sections from "defaults".
* As this project is based on Chamber, and it considers hyphens in filename as namespaces, the global /etc/rest-ftp-daemon.yml config file is not parsed (and thus, ignored). Until this is worked around, please specify a config filename on the commandline.
* If you get ```fatal error: 'openssl/ssl.h' file not found when installing eventmachine``` on OSX El Capitan, you can try with:
```
gem install eventmachine -v '1.0.8' -- --with-cppflags=-I/usr/local/opt/openssl/include
bundle install
```
* If you get ```uncommon.mk:189: recipe for target 'build-ext' failed``` on Debian, you can try with:
```
curl -fsSL https://gist.github.com/mislav/055441129184a1512bb5.txt | rbenv install --patch 2.2.3
```
Contributing
------------------------------------------------------------------------------------
Contributions are more than welcome, be it for documentation, features, tests,
refactoring, you name it. If you are unsure of where to start, the [Code
Climate](https://codeclimate.com/github/bmedici/rest-ftp-daemon) report will
provide you with improvement directions. And of course, if in doubt, do not
hesitate to open an issue. (Please note that this project has adopted a [code
of conduct](CODE_OF_CONDUCT.md).)
If you want your contribution to adopted in the smoothest and fastest way, don't
forget to:
* provide sufficient documentation in you commit and pull request
* add proper testing (we know full grown solid test coverage is still lacking and
need to up the game)
* use the [RuboCop](https://github.com/bbatsov/rubocop) guidelines provided
(there are all sorts of editor integration plugins available)
So,
1. Fork the project
2. Create your feature branch (`git checkout -b my-new-feature`)
3. Code
* add proper tests if adding a feature
* run the tests using `rake`
* check for RuboCop style guide violations
4. Commit your changes
5. Push to the branch (`git push origin my-new-feature`)
6. Create new Pull Request
About
------------------------------------------------------------------------------------
Thanks to https://github.com/berkshelf/berkshelf-api for parts and ideas used in this project
This project has been initiated and originally written by
Bruno MEDICI Consultant (http://bmconseil.com/)