https://github.com/vtfk/vis-til-arkiv-v3
Archives Visma InSchool documents
- Host: GitHub
- URL: https://github.com/vtfk/vis-til-arkiv-v3
- Owner: vtfk
- License: mit
- Created: 2022-01-12T13:45:04.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2025-03-10T09:43:48.000Z (2 months ago)
- Last Synced: 2025-03-10T10:34:27.223Z (2 months ago)
- Language: JavaScript
- Size: 564 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 6
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# vis-til-arkiv
Node.js script for archiving Visma InSchool documents, based on pdf text recognition.

- Uses [@vtfk/pdf-text-reader](https://www.npmjs.com/package/@vtfk/pdf-text-reader) for extracting pdf text
- Uses [@vtfk/pdf-splitter](https://www.npmjs.com/package/@vtfk/pdf-splitter) for splitting pdfs
- Uses [archiveApi/SyncElevmappe](https://github.com/vtfk/azf-archive#post-syncelevmappe) for getting student data, as well as creating and updating *Elevmapper*

## Remarks
- Only tested and used on Windows as far as I know
- Not very suitable for other use cases than archiving pdfs when you do not have any other data than the pdf
- Good luck understanding the code... I am sorry, I take full responsibility

## Setup
### Clone repo
```bash
$ git clone https://github.com/vtfk/vis-til-arkiv-v3
```

### PDFtk
If you want to split pdfs (that is, handle one large pdf document consisting of several documents of the same type), make sure [PDFtk](https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/) is installed in the same environment you run the Node.js script from.

*Use case: archiving several 'karakterprotokoller' in one go. Send in 300 protocols in one document, and each protocol is handled as a separate document.*

See [somewhere](#somePlace)

### Install dependencies
```bash
$ npm i
```

### Set up .env
```bash
NODE_ENV="production"
ROOT_DIRECTORY="C:/PROD-VIStilArkiv" #This is the root folder for the jobs and documents, can be wherever
STAT_FILE="C:/PROD-VIStilArkiv/stat.json" #A local statistics file
P360_SYNCELEVMAPPE_URL="https:///SyncElevmappe"
P360_SYNCELEVMAPPE_KEY=""
P360_SYNCELEVMAPPE_HEADER_NAME=""
PDFTK_EXT="C:/Program Files (x86)/PDFtk/bin/pdftk"
P360_ARCHIVE_DOC_URL="https:///DocumentService/CreateDocument"
P360_DISPATCH_DOC_URL="https:///DocumentService/DispatchDocuments"
P360_ARCHIVE_KEY=""
P360_ARCHIVE_QUERY_STRING=">"
E18_URL="https:///e18"
E18_KEY=""
E18_HEADER_NAME=""
DELETE_FINISHED_JOBS=false # If true, archived pdfs are deleted, if false, they are moved to the imported folder
DISPATCH_DIRECTORY_NAME="dispatchInput" #Optional, defaults to "input". Name of folder where script looks for pdfs
DOCUMENT_DIRECTORY_NAME="document" #Optional, defaults to "documents"
DELETE_DIRECTORY_NAME="delete" #Optional, defaults to "delete"
TYPE_SEARCH_WORD="VIS MAL TYPE" #Optional, defaults to "VIS MAL TYPE", used for recognizing documentTypes
TEAMSWEBHOOK_URL="" #Optional, for alerts in Teams
SMTP_HOST="" #Optional, for sending emails to users when documents are not recognized as a valid type
SMTP_PORT= #Optional, for sending emails to users when documents are not recognized as a valid type
PAPERTRAIL_HOST="" #Optional, for logging
PAPERTRAIL_TOKEN="" #Optional, for logging
```

### Start the script
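Before the first run, it may be worth checking that the .env values above are filled in. A minimal sketch, assuming that every variable not marked optional in the example is required (the README does not say so explicitly) and that `missingEnvVars` is a hypothetical helper, not part of the script:

```js
// Variables the .env example does not mark as optional.
// Treating exactly these as required is an assumption, not the script's own rule.
const REQUIRED_VARS = [
  'ROOT_DIRECTORY',
  'STAT_FILE',
  'P360_SYNCELEVMAPPE_URL',
  'P360_SYNCELEVMAPPE_KEY',
  'P360_ARCHIVE_DOC_URL',
  'P360_ARCHIVE_KEY'
]

// Returns the names of required variables that are unset or blank.
function missingEnvVars (env = process.env, required = REQUIRED_VARS) {
  return required.filter((name) => !env[name] || String(env[name]).trim() === '')
}

const missing = missingEnvVars()
if (missing.length > 0) {
  console.warn(`Missing .env values: ${missing.join(', ')}`)
}
```
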
Run the script once to check that it works. The first run sets up the necessary directories within ROOT_DIRECTORY.
```bash
$ node ./index.js
```

## Usage
### Set up archive method
In *./archiveMethods.js* you can add, disable, or delete methods. Create a method for each documentType you want to archive.

**Example method without typeSearchWord**
```js
VISVarsel: {
  active: true, // set to false to disable the method
  id: 'VISVarsel', // set to the same as the property for the whole method. Don't ask why...
  name: 'Varsel om fare for regn',
  findDataMethod: 'visVarselDoc', // Use or create methods defined in "./lib/getData.js"
  identifierStrings: ['Varsel om fare', 'yr'], // Sentences or words that uniquely distinguish this document
  archiveTemplate: 'varsel-fare', // the template used to create archive metadata
  internalNoteTemplate: 'internt-notat-varsel',
  internalNote: './data/blockedAddress.pdf', // If svarUt is used, and the document could not be sent, send note to school
  svarUt: false, // If document should be sent on svarut as well as archived
  manualSvarUt: false, // If you need to manually control the document in P360 before svarut
  schoolOrgnr: '994309153', // optional, overrides school found in document
  accessGroup: 'Elev Kompetansebyggeren' // optional, overrides accessgroup found in document
}
```
**Example method with typeSearchWord**
```js
VIS001: {
  active: true,
  id: 'VIS001', // id is the value found in the document behind ":"
  name: 'Fritak for opplæring i vær',
  findDataMethod: 'soknad',
  archiveTemplate: 'fritak-oppl-kro',
  internalNoteTemplate: 'internt-notat-svarbrev',
  internalNote: './data/blockedAddress.pdf',
  svarUt: true,
  manualSvarUt: false
}
```
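The two examples above suggest two ways of recognizing a document: by the id that follows the type search word, or by identifier strings. A rough sketch of how that lookup could work, not the script's actual code. `findArchiveMethod` is a hypothetical helper, and requiring *every* identifier string to be present is an assumption:

```js
// Hypothetical helper: pick the archive method for a document's extracted text.
function findArchiveMethod (text, archiveMethods, typeSearchWord = 'VIS MAL TYPE') {
  const active = Object.values(archiveMethods).filter((method) => method.active)

  // 1. Look for "<typeSearchWord>: <id>" and use the value after ":" as the method id.
  const escaped = typeSearchWord.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
  const match = text.match(new RegExp(`${escaped}\\s*:\\s*(\\w+)`))
  if (match) {
    const byId = active.find((method) => method.id === match[1])
    if (byId) return byId
  }

  // 2. Fall back to identifierStrings; requiring every string to match is an assumption.
  return active.find((method) =>
    Array.isArray(method.identifierStrings) &&
    method.identifierStrings.length > 0 &&
    method.identifierStrings.every((s) => text.includes(s))
  ) || null
}

const archiveMethods = {
  VISVarsel: { active: true, id: 'VISVarsel', identifierStrings: ['Varsel om fare', 'yr'] },
  VIS001: { active: true, id: 'VIS001' }
}

console.log(findArchiveMethod('VIS MAL TYPE: VIS001 ...', archiveMethods).id) // VIS001
console.log(findArchiveMethod('Varsel om fare for regn, kilde: yr', archiveMethods).id) // VISVarsel
console.log(findArchiveMethod('Ukjent dokument', archiveMethods)) // null
```
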
**Example method with splitting enabled**
```js
VISVarsel: {
  active: true,
  id: 'VISVarsel',
  name: 'Varsel om fare for regn',
  findDataMethod: 'visVarselDoc', // note that the findDataMethod must check if documents need to be split
  identifierStrings: ['Varsel om fare', 'yr'],
  splitStrings: ['Varsel', 'om fare', 'for regn i dag'], // The split strings are words and sentences present on the page you want to split on
  archiveTemplate: 'varsel-fare',
  svarUt: false,
  manualSvarUt: false,
}
```

### Set up archive template
Create a JSON file inside the *./templates* directory and reference the template in the corresponding archive method. Use `<<>>` wherever you want the token replaced with `documentData.token` when createMetadata.js runs.

```json
{
  "Title": "Varsel om fare for regn på <<>>",
  "UnofficialTitle": "Varsel om fare for regn på dag <<>> - <<>> - <<>>",
  "DocumentDate": "<<>>",
  "Archive": "Elevdokument",
  "Category": "Dokument ut",
  "Paragraph": "Offl. § 13 jf. fvl. § 13 (1) nr.1",
  "AccessCodeDescription": "Offl §13 jf. fvl §13 første ledd pkt. 1 - taushetsplikt om værforhold",
  "Status": "J",
  "CaseNumber": "<<>>",
  "AccessGroup": "<<>>",
  "AccessCode": "13",
  "ResponsibleEnterpriseNumber": "<<>>",
  "Contacts": [
    {
      "ReferenceNumber": "<<>>",
      "Role": "Avsender",
      "IsUnofficial": false
    },
    {
      "ReferenceNumber": "<<>>",
      "Role": "Mottaker",
      "IsUnofficial": true
    }
  ],
  "Files": [
    {
      "Base64Data": "<<>>",
      "Category": "1",
      "Format": "pdf",
      "Status": "F",
      "Title": "Varsel om fare for regn på <<>>",
      "VersionFormat": "A"
    }
  ]
}
```

## Job-flow
### 1. Dispatch documents
- Get all pdfs in dispatch folder
- Extract text from pdfs, run recognition-methods
- If found an active document type defined in archive methods
  - Move pdf to next job "Get data"
- Else
  - Move to delete, and send email to user that sent document

*If you already know the document-type, you could just put it in the next job and skip this step*
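The routing in this step can be sketched roughly as follows. This is an illustration only: `dispatchDocuments`, the `classify` callback, and the `<rootDir>/<documentType>/get-data` folder layout are all assumptions, and the email to the sender is left out.

```js
const fs = require('node:fs')
const path = require('node:path')

// Moves every pdf in inputDir either to <rootDir>/<documentType>/get-data or to deleteDir.
// classify(filePath) is a hypothetical callback returning a document type id, or null.
function dispatchDocuments ({ inputDir, rootDir, deleteDir, classify }) {
  const results = []
  const pdfs = fs.readdirSync(inputDir).filter((name) => name.toLowerCase().endsWith('.pdf'))
  for (const name of pdfs) {
    const source = path.join(inputDir, name)
    const documentType = classify(source)
    const targetDir = documentType ? path.join(rootDir, documentType, 'get-data') : deleteDir
    fs.mkdirSync(targetDir, { recursive: true })
    fs.renameSync(source, path.join(targetDir, name))
    // Unrecognized documents would also trigger an email to the sender (not shown).
    results.push({ name, documentType, movedTo: targetDir })
  }
  return results
}
```

In a real run, `classify` would extract the pdf text and match it against the active archive methods. Here it is injected so the file handling can be tried on its own.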
### 2. Get data
- For each archive method
  - For each pdf in archive method get-data folder
    - Extract text and run findDataMethod for this document type
    - Save result and send to next job "sync student data"
### 3. Sync student data
- For each archive method
  - For each pdf in archive method sync-student-data folder
    - Send social security number or birthdate, firstname, lastname to [archiveApi/SyncElevmappe](https://github.com/vtfk/azf-archive#post-syncelevmappe), it handles elevmappe-stuff
    - Save result and send to next job "get archive metadata"
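A sketch of what the call in this step might look like, assuming Node.js 18+ (for the global `fetch`). The payload field names (`ssn`, `birthdate`, `firstName`, `lastName`) and the way the key is sent in the header named by `P360_SYNCELEVMAPPE_HEADER_NAME` are assumptions; check the azf-archive documentation for the real contract.

```js
// Builds the request body from data found in the pdf.
// Field names are hypothetical, see the note above.
function buildSyncPayload (student) {
  if (student.ssn) return { ssn: student.ssn }
  const { birthdate, firstName, lastName } = student
  if (birthdate && firstName && lastName) return { birthdate, firstName, lastName }
  throw new Error('Need either ssn, or birthdate + firstName + lastName')
}

// Posts the payload using the URL, key, and header name from .env.
async function syncElevmappe (student, env = process.env) {
  const response = await fetch(env.P360_SYNCELEVMAPPE_URL, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      [env.P360_SYNCELEVMAPPE_HEADER_NAME]: env.P360_SYNCELEVMAPPE_KEY
    },
    body: JSON.stringify(buildSyncPayload(student))
  })
  if (!response.ok) throw new Error(`SyncElevmappe failed with status ${response.status}`)
  return response.json()
}
```
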
### 4. Get archive metadata
- For each archive method
  - For each pdf in archive method get-archive-metadata folder
    - Send document and studentdata into create-metadata function, along with which archive template to use
    - Save result and send to next job "archive document"
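The create-metadata step boils down to replacing tokens in the archive template with values from the document data. A minimal sketch of that idea, not the actual createMetadata.js: `fillTemplate` is a hypothetical helper, and because the token names are not shown in the template example in this README, the `<<<name>>>` delimiter and the names `documentDate` and `ssn` are assumptions.

```js
// Recursively replaces <<<name>>> tokens in every string of a template.
// The <<<name>>> delimiter is an assumption made for this sketch.
function fillTemplate (template, documentData) {
  if (typeof template === 'string') {
    return template.replace(/<<<(\w+)>>>/g, (token, name) => {
      if (!(name in documentData)) throw new Error(`Missing value for token ${token}`)
      return String(documentData[name])
    })
  }
  if (Array.isArray(template)) return template.map((item) => fillTemplate(item, documentData))
  if (template && typeof template === 'object') {
    return Object.fromEntries(
      Object.entries(template).map(([key, value]) => [key, fillTemplate(value, documentData)])
    )
  }
  return template // booleans, numbers, and null are kept as they are
}

const template = {
  Title: 'Varsel om fare for regn på <<<documentDate>>>',
  Contacts: [{ ReferenceNumber: '<<<ssn>>>', Role: 'Mottaker', IsUnofficial: true }]
}
const metadata = fillTemplate(template, { documentDate: '2022-01-12', ssn: '01010101010' })
console.log(metadata.Title) // Varsel om fare for regn på 2022-01-12
```

Throwing on a missing value, rather than leaving the token in place, is a design choice for the sketch: it keeps half-filled metadata from reaching the archive.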
### 5. Archive document
- For each archive method
  - For each pdf in archive method get-archive-metadata folder
    - Send archive metadata along with base64 of pdf to P360
    - If svarut
      - Save result and send to next job "svarut"
    - Else
      - Save result and send to next job "stats and cleanup"
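The branching on svarut can be expressed as a small lookup over the job order. A sketch only: the job names are taken from the headings in this section, while the `JOBS` array and `nextJob` helper are hypothetical.

```js
// Job names taken from the steps in this README; the array form is an assumption.
const JOBS = [
  'dispatch documents',
  'get data',
  'sync student data',
  'get archive metadata',
  'archive document',
  'svarut',
  'stats and cleanup'
]

// Returns the job that follows currentJob for a given archive method, or null after the last job.
function nextJob (currentJob, method) {
  const index = JOBS.indexOf(currentJob)
  if (index === -1) throw new Error(`Unknown job: ${currentJob}`)
  let next = JOBS[index + 1] || null
  // Only methods with svarUt: true go through the svarut job.
  if (next === 'svarut' && !method.svarUt) next = 'stats and cleanup'
  return next
}

console.log(nextJob('archive document', { svarUt: true })) // svarut
console.log(nextJob('archive document', { svarUt: false })) // stats and cleanup
```
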
### 6. Svarut
- For each archive method
  - For each pdf in archive method get-archive-metadata folder
    - Send document on svarut to student
    - Save result and send to next job "stats and cleanup"

### 7. Stats and cleanup
- Save statistics and either delete or move pdfs and results to imported folder

# License
[MIT](LICENSE)