https://github.com/kieranjol/IFIscripts

Detailed documentation is available here: http://ifiscripts.readthedocs.io/en/latest/index.html

=================================================
This repo is archived and is no longer maintained
=================================================

The maintained, live repo is at https://github.com/Irish-Film-Institute/IFIscripts

Full documentation is at http://ifiscripts.readthedocs.io/en/latest/index.html

.. inclusion-marker-do-not-remove
.. image:: http://readthedocs.org/projects/ifiscripts/badge/?version=latest

Introduction
============

Summary
-------

These scripts facilitate collections management workflows within the IFI Irish Film Archive. They have been tested
on OSX, Windows 7 & 10, and Ubuntu 14.04, 16.04 & 18.04. They are located on GitHub: https://github.com/Irish-Film-Institute/IFIscripts

They are mostly Python 3.7 compatible but some are still Python 2.7 only.
Most scripts take either a file or a directory as their input, for
example ``makeffv1.py filename.mov`` or
``premis.py path/to/folder_of_stuff``. (It's best to just drag and drop
the folder or filename into the terminal as this provides the absolute path).
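
As a rough illustration of the file-or-directory convention described above, a script might resolve its input along the lines of the sketch below. The function name ``collect_inputs`` and the default extension filter are hypothetical and are not taken from IFIscripts.

```python
import os


def collect_inputs(source, extensions=('.mov', '.mkv', '.mxf')):
    """Return a sorted list of absolute file paths for a file or directory input.

    Hypothetical helper: a single file is returned as-is, while a directory
    is walked recursively and filtered by extension.
    """
    source = os.path.abspath(source)
    if os.path.isfile(source):
        return [source]
    found = []
    for root, _dirs, files in os.walk(source):
        for name in files:
            # str.endswith accepts a tuple of suffixes.
            if name.lower().endswith(extensions):
                found.append(os.path.join(root, name))
    return sorted(found)
```

Working with absolute paths from the start mirrors the drag-and-drop advice: the result does not depend on the terminal's current directory.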

We want the project to be as reusable as possible in different institutions and contexts. Some scripts, particularly anything to do with ``Object Entry`` or ``Accessioning``, will be quite IFI-specific, but other scripts such as ``makeffv1.py``, ``dcpaccess.py`` and many others have been used in a variety of contexts in several different countries.

The project uses the MIT license, and we encourage the reuse, modification and study of the scripts. It's always nice to hear when the scripts have been reused in some way, but it's not necessary to let us know.

Purpose
-------

These Python scripts facilitate many of our collections management procedures for digitised and born-digital objects in the Irish Film Institute. We use a lot of open source tools, so we wanted to make these scripts as open as possible, which is why the project uses the MIT License.

The Irish Film Institute has followed the SPECTRUM museum collections management standard for several years. These scripts attempt to follow SPECTRUM procedures while also utilising some of the concepts of the Open Archival Information System (OAIS). Initially the scripts only handled single video files, but they are now capable of handling:

* Digital Cinema Packages
* XDCAM cards
* DPX/TIFF image sequences
* Documents (.doc, .pdf etc)
* Images (.jpg, .TIFF etc)

An example workflow might be:

* A digital object is created or acquired by the IFI, and ``ingest`` begins.

* ``sipcreator.py`` is run on the object. This script:

  * generates an ``Object Entry`` identifier (e.g. OE-1234)
  * creates a folder structure for ``logs, metadata, objects``
  * generates a ``UUID``
  * extracts technical metadata
  * generates an MD5 checksum manifest

  See the usage section for more.

* All of these preservation events are logged in a log file located in the ``logs`` directory. This log file tries to use ``PREMIS (PREservation Metadata Implementation Strategies)`` terminology as much as possible.

* Even though the package has yet to be accessioned, temporary backups are required. ``copyit.py`` will generate backups, and it will use the checksum manifest generated by ``sipcreator.py`` to verify the integrity of the file transfer.

* If the package contains FFV1 or Matroska files, ``ffv1mkvvalidate.py`` can be run. It uses ``mediaconch`` to verify the compliance of the files and stores the results in the log file.

* If the package passes our Quality Control Procedures, then it will be accessioned. ``accession.py`` will generate an accession number, rename the OE number with the accession number, generate a SHA-512 manifest and update the log file to document these new preservation events.

* A large batch of items can be accessioned using ``batchaccession.py``. If the ``-pbcore`` command line argument is used with the accessioning scripts, technical metadata based on the PBCore standard will be generated in CSV format. This process can be run separately by using ``makepbcore.py``. CSV was chosen instead of XML as this allows us to immediately import the CSV into our database system so that we have item-level records.

* Access copies may be needed, so low-res watermarked proxies can be generated with ``bitc.py``, or high res mezzanines with ``prores.py``.

* The accessioned package can then be written to preservation storage, again using the ``copyit.py`` command.

This is just one way of using the scripts from acquisition to preservation storage. There are many other scripts for specific workflows, which are covered further on in the documentation.
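
To make the packaging and verification steps in the workflow above more concrete, here is a minimal sketch of the general idea: build a ``logs, metadata, objects`` structure under a UUID, write an MD5 manifest, and later re-check files against that manifest. It is not the actual ``sipcreator.py`` or ``copyit.py`` code. The function names, the folder layout and the manifest line format (``checksum``, two spaces, relative path) are all assumptions made for illustration.

```python
import hashlib
import os
import shutil
import uuid


def md5_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large video files are not read into memory."""
    digest = hashlib.md5()
    with open(path, 'rb') as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def build_package(source_file, output_dir, oe_number):
    """Create <output_dir>/<oe_number>/<uuid>/{logs,metadata,objects} plus a manifest.

    Hypothetical layout: the manifest sits beside the UUID folder and lists
    paths relative to the OE folder.
    """
    package_id = str(uuid.uuid4())
    package_root = os.path.join(output_dir, oe_number, package_id)
    for folder in ('logs', 'metadata', 'objects'):
        os.makedirs(os.path.join(package_root, folder))
    shutil.copy2(source_file, os.path.join(package_root, 'objects'))
    manifest = os.path.join(output_dir, oe_number, package_id + '_manifest.md5')
    with open(manifest, 'w') as out:
        for root, _dirs, files in os.walk(package_root):
            for name in sorted(files):
                full_path = os.path.join(root, name)
                relative = os.path.relpath(full_path, os.path.dirname(package_root))
                out.write('%s  %s\n' % (md5_of(full_path), relative.replace(os.sep, '/')))
    return package_root, manifest


def verify_manifest(manifest):
    """Recompute each checksum and return the relative paths that do not match."""
    base = os.path.dirname(manifest)
    mismatches = []
    with open(manifest) as lines:
        for line in lines:
            expected, relative = line.rstrip('\n').split('  ', 1)
            path = os.path.join(base, *relative.split('/'))
            if not os.path.isfile(path) or md5_of(path) != expected:
                mismatches.append(relative)
    return mismatches
```

In this sketch, an empty list from ``verify_manifest`` means every file listed in the manifest is present and unchanged, which is the kind of check a transfer would rely on after copying a package to backup or preservation storage.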

Table of Contents
-----------------

.. toctree::
   :maxdepth: 4
   :caption: Contents:

   installation
   contributing
   usage
   credits