https://github.com/posthog/posthog-redshift-import-plugin
Import data from Redshift into PostHog
https://github.com/posthog/posthog-redshift-import-plugin
Last synced: 6 months ago
JSON representation
Import data from Redshift into PostHog
- Host: GitHub
- URL: https://github.com/posthog/posthog-redshift-import-plugin
- Owner: PostHog
- License: mit
- Created: 2021-08-17T11:46:57.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2022-05-13T10:20:15.000Z (over 3 years ago)
- Last Synced: 2025-04-02T22:43:29.037Z (9 months ago)
- Language: TypeScript
- Size: 18.6 KB
- Stars: 1
- Watchers: 4
- Forks: 8
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Redshift Import Plugin (BETA)
> Looking to contribute a transformation? See [Contributing a transformation](#contributing-a-transformation).
Import data from a Redshift table in the form of PostHog events.
## ⚠️ Important Notice
This plugin is still in Beta! **Use it at your own risk**. Feel free to check out [its code](https://github.com/PostHog/posthog-redshift-import-plugin/blob/main/index.ts) and [submit feedback](https://github.com/PostHog/posthog-redshift-import-plugin/issues/new?title=Plugin+Feedback).
## Installation Instructions
### 1. Select a Redshift table to use for this plugin
### 2. Create a user with sufficient priviledges to read data from your table
We need to create a new table to store events and execute `INSERT` queries. You can and should block us from doing anything else on any other tables. Giving us table creation permissions should be enough to ensure this:
```sql
CREATE USER posthog WITH PASSWORD '123456yZ';
GRANT CREATE ON DATABASE your_database TO posthog;
```
### 3. Add the connection details at the plugin configuration step in PostHog
### 4. Determine what transformation to apply to your data
This plugin receives the data from your table and transforms it to create a PostHog-compatible event. To do this, you must select a transformation to apply to your data. If none of the transformations below suit your use case, feel free to contribute one via a PR to this repo.
> **Important:** Make sure your Redshift table has a [sort key](https://docs.aws.amazon.com/redshift/latest/dg/t_Sorting_data.html) and use the sort key column as the "Order by column" in the plugin config.
#### Contributing a transformation
If none of the transformations listed below suits your use case, you're more than welcome to contribute your own transformation!
To do so, just add your transformation to the [`transformations` object](https://github.com/PostHog/posthog-redshift-import-plugin/blob/33702ea2bc640aee0a45ddd40c6a4b8ca94012cf/index.ts#L226) in the `index.ts` file and list it in the `plugin.json` [choices list](https://github.com/PostHog/posthog-redshift-import-plugin/blob/33702ea2bc640aee0a45ddd40c6a4b8ca94012cf/plugin.json#L66) for the field `transformationName`.
A transformation entry looks like this:
```js
'': {
author: '',
transform: async (row, meta) => {
/*
Fill in your transformation here and
make sure to return an event according to
the TransformedPluginEvent interface:
interface TransformedPluginEvent {
event: string,
properties?: PluginEvent['properties']
}
*/
}
}
```
Your GitHub username is important so that we only allow changes to transformations by the authors themselves.
Once you've submitted your PR, feel free to tag @yakkomajuri for review!
#### Available Transformations
##### default
The default transformation looks for the following columns in your table: `event`, `timestamp`, `distinct_id`, and `properties`, and maps them to the equivalent PostHog event fields of the same name.
**Code**
```js
async function transform (row, _) {
const { timestamp, distinct_id, event, properties } = row
const eventToIngest = {
event,
properties: {
timestamp,
distinct_id,
...JSON.parse(properties),
source: 'redshift_import',
}
}
return eventToIngest
}
```
##### JSON Map
This transformation asks the user for a JSON file containing a map between their columns and fields of a PostHog event. For example:
```json
{
"event_name": "event",
"some_row": "timestamp",
"some_other_row": "distinct_id"
}
```
**Code (Simplified\*)**
*Simplified means error handling and type definitions were removed for the sake of brevity. See the full code in the [index.ts](/index.ts) file
```js
async function transform (row, { attachments }) {
let rowToEventMap = JSON.parse(attachments.rowToEventMap.contents.toString())
const eventToIngest = {
event: '',
properties: {}
}
for (const [colName, colValue] of Object.entries(row)) {
if (!rowToEventMap[colName]) {
continue
}
if (rowToEventMap[colName] === 'event') {
eventToIngest.event = colValue
} else {
eventToIngest.properties[rowToEventMap[colName]] = colValue
}
}
return eventToIngest
}
```