https://github.com/bbc/dropbox-paper-to-json
A node module to convert a dropbox paper document to json
- Host: GitHub
- URL: https://github.com/bbc/dropbox-paper-to-json
- Owner: bbc
- Created: 2018-06-26T17:40:38.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2023-07-11T13:07:30.000Z (over 1 year ago)
- Last Synced: 2024-04-08T21:02:36.473Z (7 months ago)
- Topics: bbc-news-labs-news-mixer, dropbox, dropbox-paper, dropbox-sdk, json
- Language: JavaScript
- Size: 546 KB
- Stars: 10
- Watchers: 22
- Forks: 3
- Open Issues: 9
Metadata Files:
- Readme: README.md
## Dropbox Paper to JSON
A Node module that imports a Dropbox Paper document and converts it into a JSON data structure.
## Setup
### 1. get dropbox access token
#### Create a dropbox App
- `create app` [from developer console](https://www.dropbox.com/developers/apps)
- choose `Dropbox API`
- choose `Full Dropbox`
- give it a name, eg `News Mixer`
- click on the newly created app
- and [Generate an access token for your own account](https://blogs.dropbox.com/developers/2014/05/generate-an-access-token-for-your-own-account/)
### 2. get dropbox paper `document id`
Eg if the URL of your Dropbox Paper document is something like:
```
https://paper.dropbox.com/doc/Main-Title-vJdrjMJAHdgfHz0rl83Z
```
Then the string after the last `-` in the URL is your document id.
In this fictitious example it would be: `vJdrjMJAHdgfHz0rl83Z`.
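The rule above can be expressed in a few lines of JavaScript. This is only a minimal sketch: `extractDocId` is a hypothetical helper that is not part of this module, and it assumes the document id itself never contains a `-`.

```js
// Hypothetical helper (not part of this module): take everything after
// the last `-` in the path of a Dropbox Paper URL as the document id.
function extractDocId(paperUrl) {
  // Work on the pathname only, so a query string or fragment is ignored.
  const { pathname } = new URL(paperUrl);
  return pathname.split('-').pop();
}

// prints "vJdrjMJAHdgfHz0rl83Z"
console.log(extractDocId('https://paper.dropbox.com/doc/Main-Title-vJdrjMJAHdgfHz0rl83Z'));
```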
### 3. add `DROPBOX_ACCESS_TOKEN` to `.env`
The project uses [dotenv](https://www.npmjs.com/package/dotenv) to deal with credentials and environment variables.
In the root of the repo, create a `.env` file. It is excluded from the GitHub repo by `.gitignore` to avoid leaking credentials.
Here's an example `.env` file, _with some fictitious credentials_:
```
# Dropbox credentials
DROPBOX_ACCESS_TOKEN=vJdrjMJAHdgfHz0rl83ZvJdrjMJAHdgfHz0rl83Z
DROPBOX_DOC_ID=vJdrjMJAHdgfHz0rl83Z
```
## Usage
### In development
Clone this repo:
```
git clone git@github.com:bbc/dropbox-paper-to-json.git
```
`cd` into the folder:
```
cd dropbox-paper-to-json
```
Then install the dependencies and run the module:
`npm install`
`npm start`
This will save a `data.json` file in the root of the project.
### In production
Install the module with npm:
```
npm install dropbox-paper-to-json@git+ssh://git@github.com/bbc/dropbox-paper-to-json.git#master --save
```
Add to your code base:
```js
// if using dotenv for environment variable credentials for dropbox paper
require('dotenv').config();
// optional if you want to write the resulting json
const fs = require('fs');
// require module
const dbpMdToJson = require('dropbox-paper-to-json');

dbpMdToJson({
  accessToken: process.env.DROPBOX_ACCESS_TOKEN,
  dbp_doc_id: process.env.DROPBOX_DOC_ID,
  // default for nested === true
  nested: true
}).then((data) => {
  console.log(`done Dropbox Paper to JSON conversion`);
  // optional: now do something with the data
  fs.writeFileSync('./data.json', JSON.stringify(data, null, 2));
});
```
## System Architecture
_High level overview of system architecture_
### Downloading a Dropbox paper
The module uses the [`dpb-download-md`](./dpb-download-md/index.js) node module to get a Dropbox Paper document as markdown, given a document id and an access token. This is because the official SDK didn't seem to have a straightforward way to get at the content of a Dropbox Paper document.
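To illustrate what such a download involves, here is a rough sketch of the HTTP request for Dropbox's legacy Paper export endpoint (`paper/docs/download`). Whether `dpb-download-md` issues exactly this request is an assumption, and `buildPaperDownloadRequest` is a hypothetical name. The function only builds the request description, so it can be passed to whichever HTTP client you prefer.

```js
// Sketch only: describes a request to the legacy Dropbox Paper export
// endpoint. That this module calls the API this way is an assumption.
function buildPaperDownloadRequest(accessToken, docId) {
  return {
    method: 'POST',
    url: 'https://api.dropboxapi.com/2/paper/docs/download',
    headers: {
      Authorization: `Bearer ${accessToken}`,
      // The endpoint takes its arguments as JSON in this header.
      'Dropbox-API-Arg': JSON.stringify({ doc_id: docId, export_format: 'markdown' })
    }
  };
}
```

The response body of such a request would be the document as markdown, which is the input the next step works on.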
### Converting markdown dropbox paper to "linear" json
The submodule [`md-to-json/linear.js`](./md-to-json/linear.js) takes the content of a markdown file as a string and converts it into an array of objects, representing markdown elements.
It's a flat data structure with no nesting, which is why it is sometimes referred to as linear.
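The idea can be sketched as follows. This is a simplified illustration, not the code in `md-to-json/linear.js`: `mdToLinear` is a hypothetical name, and it only distinguishes `h1`, `h2` and `p`, treating every other non-empty line as a paragraph.

```js
// Hypothetical sketch: turn a markdown string into a flat array of
// { text, type } objects, one per non-empty line.
function mdToLinear(markdown) {
  return markdown
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      // Check `## ` before `# ` so an h2 is never mistaken for an h1.
      if (line.startsWith('## ')) return { text: line.slice(3), type: 'h2' };
      if (line.startsWith('# ')) return { text: line.slice(2), type: 'h1' };
      return { text: line, type: 'p' };
    });
}
```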
#### Example "linear json"
```json
[
  {
    "text": "Chapter 1",
    "type": "h1"
  },
  {
    "text": "Text",
    "type": "h2"
  },
  {
    "text": "vitae elementum velit urna id mi. Sed sodales arcu mi, eu condimentum tellus ornare non. Aliquam non mauris purus. Cras a dignissim tellus. Cras pharetra, felis et convallis tristique, sapien augue interdum ipsum, aliquet rhoncus enim diam vitae eros. Cras ullamcorper, lectus id commodo volutpat, odio urna venenatis tellus, vitae vehicula sapien velit eu purus. Pellentesque a feugiat ex. Proin volutpat congue libero vitae malesuada.",
    "type": "p"
  },
  {
    "text": "Video ",
    "type": "h2"
  },
  ...
]
```
### Converting linear markdown json to nested json
For some use cases it might be helpful to nest all the elements between one h1 tag and the next h1 tag as children (`elements`) of that tag.
Eg an h1 tag could contain h2 tags, p tags, links etc.
Likewise an h2 tag could contain all other elements up to the next h2 or h1 tag.
_NOTE_ the Dropbox Paper flavour of markdown only properly represents `H1` and `H2` tags, which is why the nesting stops at two levels for this use case. It could be nested further should there be a use case for it.
This is done in [`md-to-json/index.js`](./md-to-json/index.js).
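A two-level nesting pass over the linear array could look roughly like this. It is a sketch under stated assumptions, not the code in `md-to-json/index.js`: `nestLinear` is a hypothetical name, and treating the first `h1` as the document title is an assumption based on the Dropbox markdown notes further down.

```js
// Hypothetical sketch: group a flat array of { text, type } objects
// under their preceding h1 / h2 headings.
function nestLinear(linear) {
  // Assumption: the first h1 in the export is the document title.
  const hasTitle = linear.length > 0 && linear[0].type === 'h1';
  const doc = { title: hasTitle ? linear[0].text : '', elements: [] };
  let currentH1 = null;
  let currentH2 = null;

  for (const item of hasTitle ? linear.slice(1) : linear) {
    if (item.type === 'h1') {
      currentH1 = { ...item, elements: [] };
      currentH2 = null; // a new h1 closes any open h2
      doc.elements.push(currentH1);
    } else if (item.type === 'h2') {
      currentH2 = { ...item, elements: [] };
      (currentH1 ? currentH1.elements : doc.elements).push(currentH2);
    } else {
      // Everything else goes under the closest open heading, if any.
      const parent = currentH2 || currentH1;
      (parent ? parent.elements : doc.elements).push({ ...item });
    }
  }
  return doc;
}
```

Keeping two "current heading" pointers is enough here because only two heading levels are nested; deeper nesting would be easier to express with a stack.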
#### Example "nested json"
```json
{
  "title": "TEST CMS",
  "elements": [
    {
      "text": "Chapter 1",
      "type": "h1",
      "elements": [
        {
          "text": "some text element between h1 and h2 tags",
          "type": "p"
        },
        {
          "text": "text",
          "type": "h2",
          "elements": [
            {
              "text": "vitae elementum velit urna id mi. Sed sodales arcu mi, eu condimentum tell.",
              "type": "p"
            }
          ]
        },
        ...
}
```
For a full example see [`md-to-json/examples/example_output.json`](./md-to-json/examples/example_output.json).
## Development env
_How to run the development environment_
_Coding style convention ref optional, eg which linter to use_
_Linting, github pre-push hook - optional_
- node
- npm
- eslint see [`.eslintrc.json`](./.eslintrc.json)
## Build
_How to run build_
NA `?`
## Tests
_How to carry out tests_
Minimal test coverage using [`jest`](https://facebook.github.io/jest/). To run the tests:
```
npm test
```
## Deployment
_How to deploy the code/app into test/staging/production_
NA, it's a node module.
## Contributing
- Pull requests are welcome.
- For questions, bugs, ideas feel free to raise a GitHub issue.

---
## Notes on Dropbox "flavoured" markdown
Unfortunately, Dropbox Paper has its own flavour of markdown. Some of the most relevant and notable differences are:
- The title of the doc and the first `heading 1` element are both marked as `h1` / `#`.
- `Heading 3` is represented as bold `**` instead of `h3`/`###`.
- There's no Heading 4, 5 or 6.
### Example of dropbox flavour markdown
See [`md-to-json/examples/test.md`](./md-to-json/examples/test.md) for an example of a Dropbox flavour markdown file.
### Markdown elements not included in module
- [ ] `H3` tag, since Dropbox Paper markdown represents it as bold `**`
- [ ] Parsing markdown github flavour tags `h3` to `h6`, as these are not generated by Dropbox Paper markdown.
### Markdown elements that could be included in module
- [X] Parsing markdown github flavour tags for images eg `![alt text](link url)`. These appear on their own line.
  - _NOTE_ luckily, even when displayed on the same line in Dropbox Paper, images are still represented on individual lines when exported as markdown, which makes them easier to identify as separate from other elements and to parse.
- [ ] Parsing markdown github flavour tags for links eg `[text](link url)`. These generally appear as part of a paragraph, but could also appear on their own line, or as part of a heading etc.
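As a rough illustration of the two cases above, images on their own line and links embedded in other text could be picked out with regular expressions along these lines. This is a sketch, not the module's parser: the function names and the `img` / `alt` / `url` keys are hypothetical, and URLs are assumed to contain no spaces or closing parentheses.

```js
// Hypothetical sketch. An image occupies a whole line: ![alt](url)
const IMAGE_LINE = /^!\[([^\]]*)\]\(([^)\s]+)\)$/;
// A link can sit anywhere in the text; the lookbehind skips images.
const INLINE_LINK = /(?<!!)\[([^\]]+)\]\(([^)\s]+)\)/g;

// Returns an image element, or null if the line is not an image.
function parseImageLine(line) {
  const match = IMAGE_LINE.exec(line.trim());
  return match ? { type: 'img', alt: match[1], url: match[2] } : null;
}

// Returns every [text](url) link found in a paragraph or heading.
function extractLinks(text) {
  return Array.from(text.matchAll(INLINE_LINK), (m) => ({ text: m[1], url: m[2] }));
}
```

Because images always end up on their own line in the export, they can be matched against the whole line, while links need a global search within the surrounding text.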