https://github.com/conduitio/conduit-connector-file
Conduit connector for local files
- Host: GitHub
- URL: https://github.com/conduitio/conduit-connector-file
- Owner: ConduitIO
- License: apache-2.0
- Created: 2022-02-22T12:18:32.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2025-05-06T15:15:29.000Z (24 days ago)
- Last Synced: 2025-05-06T16:45:22.967Z (24 days ago)
- Topics: conduit, file, go, golang
- Language: Go
- Homepage:
- Size: 432 KB
- Stars: 3
- Watchers: 6
- Forks: 3
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE.md
- Codeowners: .github/CODEOWNERS
README
# Conduit Connector File
The File plugin is one of [Conduit](https://github.com/ConduitIO/conduit)'s built-in plugins.
It provides both a source and a destination File connector, allowing a file to serve as either
the source or the destination in a Conduit pipeline.

## How it works

The Source connector listens for changes appended to the source file and
sends records with the changes.
The Destination connector receives records and writes them to a file.

### Source

The Source connector only requires a valid path. Even if the file
doesn't exist, the connector will run and wait until a file with the
configured name appears, then start listening for changes and sending records.

### Destination

The Destination connector creates the file if it doesn't exist and appends
incoming records to it as they are received.
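As a quick illustration of how the two connectors fit together, here is a minimal file-to-file pipeline sketch. The pipeline and connector IDs and file paths are placeholders, and the connector `type` field follows the general Conduit pipeline configuration layout rather than the parameter listings below.

```yaml
version: 2.2
pipelines:
  - id: file-to-file
    status: running
    connectors:
      - id: in
        type: source
        plugin: "file"
        settings:
          # Hypothetical input file; the source waits for it to exist,
          # then emits a record for every change appended to it.
          path: "/tmp/file-source.txt"
      - id: out
        type: destination
        plugin: "file"
        settings:
          # Hypothetical output file; created if it doesn't exist,
          # records are appended as they arrive.
          path: "/tmp/file-destination.txt"
```

With this sketch running, appending a line to `/tmp/file-source.txt` should show up as a new line appended to `/tmp/file-destination.txt`.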
## Source Configuration Parameters

```yaml
version: 2.2
pipelines:
  - id: example
    status: running
    connectors:
      - id: example
        plugin: "file"
        settings:
          # Path is the file path used by the connector to read/write records.
          # Type: string
          # Required: yes
          path: ""
          # Maximum delay before an incomplete batch is read from the source.
          # Type: duration
          # Required: no
          sdk.batch.delay: "0"
          # Maximum size of batch before it gets read from the source.
          # Type: int
          # Required: no
          sdk.batch.size: "0"
          # Specifies whether to use a schema context name. If set to false, no
          # schema context name will be used, and schemas will be saved with the
          # subject name specified in the connector (not safe because of name
          # conflicts).
          # Type: bool
          # Required: no
          sdk.schema.context.enabled: "true"
          # Schema context name to be used. Used as a prefix for all schema
          # subject names. If empty, defaults to the connector ID.
          # Type: string
          # Required: no
          sdk.schema.context.name: ""
          # Whether to extract and encode the record key with a schema.
          # Type: bool
          # Required: no
          sdk.schema.extract.key.enabled: "false"
          # The subject of the key schema. If the record metadata contains the
          # field "opencdc.collection" it is prepended to the subject name and
          # separated with a dot.
          # Type: string
          # Required: no
          sdk.schema.extract.key.subject: "key"
          # Whether to extract and encode the record payload with a schema.
          # Type: bool
          # Required: no
          sdk.schema.extract.payload.enabled: "false"
          # The subject of the payload schema. If the record metadata contains
          # the field "opencdc.collection" it is prepended to the subject name
          # and separated with a dot.
          # Type: string
          # Required: no
          sdk.schema.extract.payload.subject: "payload"
          # The type of the payload schema.
          # Type: string
          # Required: no
          sdk.schema.extract.type: "avro"
```

## Destination Configuration Parameters

```yaml
version: 2.2
pipelines:
  - id: example
    status: running
    connectors:
      - id: example
        plugin: "file"
        settings:
          # Path is the file path used by the connector to read/write records.
          # Type: string
          # Required: yes
          path: ""
          # Maximum delay before an incomplete batch is written to the
          # destination.
          # Type: duration
          # Required: no
          sdk.batch.delay: "0"
          # Maximum size of batch before it gets written to the destination.
          # Type: int
          # Required: no
          sdk.batch.size: "0"
          # Allow bursts of at most X records (0 or less means that bursts are
          # not limited). Only takes effect if a rate limit per second is set.
          # Note that if `sdk.batch.size` is bigger than `sdk.rate.burst`, the
          # effective batch size will be equal to `sdk.rate.burst`.
          # Type: int
          # Required: no
          sdk.rate.burst: "0"
          # Maximum number of records written per second (0 means no rate
          # limit).
          # Type: float
          # Required: no
          sdk.rate.perSecond: "0"
          # The format of the output record. See the Conduit documentation for a
          # full list of supported formats
          # (https://conduit.io/docs/using/connectors/configuration-parameters/output-format).
          # Type: string
          # Required: no
          sdk.record.format: "opencdc/json"
          # Options to configure the chosen output record format. Options are
          # normally key=value pairs separated with a comma (e.g.
          # opt1=val1,opt2=val2), except for the `template` record format, where
          # options are a Go template.
          # Type: string
          # Required: no
          sdk.record.format.options: ""
          # Whether to extract and decode the record key with a schema.
          # Type: bool
          # Required: no
          sdk.schema.extract.key.enabled: "true"
          # Whether to extract and decode the record payload with a schema.
          # Type: bool
          # Required: no
          sdk.schema.extract.payload.enabled: "true"
```
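The `sdk.record.format.options` parameter above notes that, for the `template` output format, the options string is a Go template rather than key=value pairs. A sketch of a destination settings fragment using it might look like the following; the exact template fields available (here `.Payload.After`) come from Conduit's general output-format documentation and are an assumption, not something specified in this repository.

```yaml
settings:
  path: "/tmp/file-destination.txt"   # hypothetical output file
  # Render each record through a Go template instead of opencdc/json;
  # this assumes the template is executed against an OpenCDC record.
  sdk.record.format: "template"
  sdk.record.format.options: '{{ printf "%s" .Payload.After }}'
```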
## How to build it

Run `make`.
## Testing
Run `make test` to run all the tests.
## Limitations
* The Source connector only detects changes appended to the file; it
  doesn't detect deletions or edits.
* The connectors can only access local files on the machine where Conduit
  is running, so running Conduit on a server means it can't access a file
  on your local machine.
* The connectors currently work reliably only with text files (they may
  work with non-text files, but this isn't guaranteed).