https://github.com/daniel-km/omeka-s-module-bulkimportfiles

Module for Omeka S to import files in bulk with their internal metadata (exif, iptc and xmp for images, audio and video, pdf, etc.).

Bulk Import Files (module for Omeka S)
======================================

> __New versions of this module and support for Omeka S version 3.0 and above
> are available on [GitLab], which seems to respect users and privacy better
> than the previous repository.__

[Bulk Import Files] is a module for [Omeka S] that imports files in bulk
together with their internal metadata (for example exif, iptc and xmp for
images, audio and video, or pdf properties).

This module is backported as a [plugin] for [Omeka Classic].

Installation
------------

See general end user documentation for [installing a module].

The module [Common] must be installed first.

You may use the release zip to install it or clone the source via git.

* From the zip

Download the latest release [BulkImportFiles.zip] from the list of releases (the
master branch does not contain the dependencies), and uncompress it in the
`modules` directory.

* From the source and for development

If the module was installed from the source, rename the module folder to
`BulkImportFiles`.

```sh
cd modules
git clone https://gitlab.com/Daniel-KM/Omeka-S-module-BulkImportFiles BulkImportFiles
cd BulkImportFiles
composer install --no-dev
```

Then install it like any other Omeka module and follow the config instructions.

The module uses external libraries, [`getid3`] and [`php-pdftk`], so either use
the release zip, or clone the source and run `composer install` as shown above.

To update the dependencies later:

```sh
composer update
```

* Install pdftk

The command line tool [`pdftk`] is required to extract data from pdf files that
have no raw xmp data. It should be installed on the server and its path should
be set in the config of the module.
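
Before setting the path in the config, it can help to confirm that the server actually has an executable at that location. The following is a minimal Python sketch of such a check, not part of the module: `resolve_tool` and `build_dump_command` are hypothetical helper names, and `dump_data` is the pdftk operation that prints a pdf's metadata (how the module itself calls pdftk through `php-pdftk` is not described here).

```python
import os
import shutil


def resolve_tool(configured_path=None, name="pdftk"):
    """Return an executable path for `name`, or None if it cannot be found.

    `configured_path` stands in for the path set in the module config
    (hypothetical parameter, for illustration only).
    """
    if configured_path:
        if os.path.isfile(configured_path) and os.access(configured_path, os.X_OK):
            return configured_path
        return None
    # No explicit path given: fall back to a lookup on the PATH.
    return shutil.which(name)


def build_dump_command(tool_path, pdf_path):
    """Build the argument list for `pdftk <file> dump_data`."""
    return [tool_path, pdf_path, "dump_data"]


if __name__ == "__main__":
    path = resolve_tool()
    print(path or "pdftk not found: install it or set its path in the module config")
```

If the script prints a path, that value is a reasonable candidate for the module config.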

Usage
-----

### Configuration

The mapping of each media type (`image/jpeg`, `image/png`, `application/pdf`) is
managed via the files inside the folder `data/mapping`.

So the first thing to do is to create mappings with all the needed elements.

For example, for the `JPG` format, the values are the ones exposed via the
following xml paths (`xmp` is xml and provides all `iptc` and `exif` metadata):

```
media_type = image/jpeg
dcterms:title = /x:xmpmeta/rdf:RDF/rdf:Description/@xmp:Label
dcterms:description = /x:xmpmeta/rdf:RDF/rdf:Description/@xmp:Caption
dcterms:created = /x:xmpmeta/rdf:RDF/rdf:Description/@xmp:CreateDate
dcterms:modified = /x:xmpmeta/rdf:RDF/rdf:Description/@xmp:ModifyDate
dcterms:format = /x:xmpmeta/rdf:RDF/rdf:Description/@tiff:Model
dcterms:subject = /x:xmpmeta/rdf:RDF/rdf:Description/dc:subject//rdf:li
```

Note that the first title is used as media type to import files, and the second
as title, if any.
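
To see what the xml paths above select, here is a small Python sketch that evaluates two of them against a hand-written xmp packet. It is only an illustration under stated assumptions: the sample packet and the helper `extract` are invented for this example, and Python's `xml.etree.ElementTree` supports only a subset of XPath, so the final `/@attribute` step is handled by hand. The module itself is written in PHP and may resolve paths differently.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes used by the paths in the mapping.
NS = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "xmp": "http://ns.adobe.com/xap/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical xmp packet, written by hand for this example.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description xmlns:xmp="http://ns.adobe.com/xap/1.0/"
                     xmlns:dc="http://purl.org/dc/elements/1.1/"
                     xmp:Label="Harbour at dawn"
                     xmp:CreateDate="2019-05-04T06:12:00">
      <dc:subject>
        <rdf:Bag>
          <rdf:li>harbour</rdf:li>
          <rdf:li>boats</rdf:li>
        </rdf:Bag>
      </dc:subject>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""


def extract(root, xpath):
    """Return the list of string values matched by an absolute mapping path."""
    steps = xpath.lstrip("/")
    # The first step names the root element itself; ElementTree paths are
    # relative to the root, so drop it.
    _, _, rest = steps.partition("/")
    attribute = None
    if "/@" in rest:
        rest, _, attribute = rest.rpartition("/@")
    elements = root.findall(rest, NS) if rest else [root]
    if attribute is None:
        return [el.text.strip() for el in elements if el.text and el.text.strip()]
    # Attributes are stored under their expanded "{namespace}name" form.
    prefix, _, local = attribute.partition(":")
    key = "{%s}%s" % (NS[prefix], local)
    return [el.get(key) for el in elements if el.get(key) is not None]


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE_XMP)
    print(extract(root, "/x:xmpmeta/rdf:RDF/rdf:Description/@xmp:Label"))
    print(extract(root, "/x:xmpmeta/rdf:RDF/rdf:Description/dc:subject//rdf:li"))
```

A path ending in `//rdf:li` can return several values, which is how a single mapping line can fill a multi-valued property such as `dcterms:subject`.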

If you prefer to use `exif` or `iptc`, here is the equivalent config:

```
media_type = image/jpeg
dcterms:title = iptc.IPTCApplication.Headline
dcterms:description = iptc.IPTCApplication.Caption
dcterms:created = jpg.exif.EXIF.DateTimeOriginal
dcterms:modified = jpg.exif.EXIF.DateTimeDigitized
dcterms:format = jpg.exif.FILE.MimeType
dcterms:subject = iptc.IPTCApplication.Keywords.0
dcterms:subject = iptc.IPTCApplication.Keywords.1
dcterms:subject = iptc.IPTCApplication.Keywords.2
dcterms:subject = iptc.IPTCApplication.Keywords.3
dcterms:subject = iptc.IPTCApplication.Keywords.4
```
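
The dotted paths above can be read as a sequence of keys into a nested array, with numeric segments such as `.0` acting as list indexes. The Python sketch below illustrates that reading only; the helper `lookup` and the sample structure are hypothetical, and the array really returned by the extractor may be shaped differently.

```python
def lookup(data, dotted_path):
    """Follow a dotted path through nested dicts and lists; None if absent."""
    current = data
    for key in dotted_path.split("."):
        if isinstance(current, dict) and key in current:
            current = current[key]
        elif isinstance(current, list) and key.isdigit() and int(key) < len(current):
            current = current[int(key)]
        else:
            return None
    return current


# Hypothetical, hand-written structure; real extractor output will differ.
SAMPLE = {
    "iptc": {
        "IPTCApplication": {
            "Headline": "Harbour at dawn",
            "Keywords": ["harbour", "boats"],
        }
    },
    "jpg": {"exif": {"EXIF": {"DateTimeOriginal": "2019:05:04 06:12:00"}}},
}

if __name__ == "__main__":
    for path in (
        "iptc.IPTCApplication.Headline",
        "iptc.IPTCApplication.Keywords.1",
        "iptc.IPTCApplication.Keywords.4",
    ):
        print(path, "=>", lookup(SAMPLE, path))
```

Under this reading, a file with fewer than five keywords simply yields nothing for the higher indexes, which would explain why the mapping lists `Keywords.0` to `Keywords.4` on separate lines.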

Each line can also be written in the opposite direction (`iptc.IPTCApplication.Headline = dcterms:title`),
but all the lines of a mapping should use the same direction.
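
As an illustration of the two directions described above, the following Python sketch reads a mapping written as `key = value` lines and normalizes every pair to `(property, source path)`. It is an assumption-laden sketch, not the module's parser: `parse_mapping` is a hypothetical name, the ` = ` separator is taken from the examples in this README, and a property term is recognized by the heuristic pattern `prefix:name`.

```python
import re

# Heuristic: a property term looks like "dcterms:title" (no "/" and no ".").
TERM = re.compile(r"^[A-Za-z][\w-]*:[\w-]+$")


def parse_mapping(text):
    """Parse `key = value` lines into (media_type, [(term, source_path), ...])."""
    media_type = None
    pairs = []
    orientations = set()
    for raw in text.splitlines():
        left, sep, right = raw.strip().partition(" = ")
        if not sep:
            continue  # blank line or line without a separator
        left, right = left.strip(), right.strip()
        if left == "media_type":
            media_type = right
            continue
        term_first = bool(TERM.match(left))
        if not term_first and not TERM.match(right):
            raise ValueError(f"no property term in line: {raw!r}")
        orientations.add(term_first)
        pairs.append((left, right) if term_first else (right, left))
    if len(orientations) > 1:
        raise ValueError("mixed mapping directions in one file")
    return media_type, pairs


if __name__ == "__main__":
    example = """media_type = image/jpeg
iptc.IPTCApplication.Headline = dcterms:title
iptc.IPTCApplication.Caption = dcterms:description
"""
    print(parse_mapping(example))
```

Rejecting a file that mixes both directions mirrors the rule stated above, rather than trying to guess the intent line by line.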

Note that metadata can be slightly different between standards.

### Assistant to create or update a mapping

An assistant is available to create or update a mapping via the second
sub-menu. Simply choose a directory where the files you want to import are
located, create your mapping and save it.

The assistant works only with data extractable as an array (getid3 or pdf), not
with xml data, which requires editing the xpaths manually.
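
To get a feel for what "data extractable as an array" offers as mapping sources, the sketch below flattens a nested structure into the dotted paths used in the mapping files. This is a hypothetical Python illustration, not the assistant's code: `flatten` and the sample data are invented for the example.

```python
def flatten(data, prefix=""):
    """Yield (dotted_path, value) for every leaf of nested dicts and lists."""
    if isinstance(data, dict):
        items = data.items()
    elif isinstance(data, list):
        items = enumerate(data)
    else:
        yield prefix, data
        return
    for key, value in items:
        path = f"{prefix}.{key}" if prefix else str(key)
        yield from flatten(value, path)


if __name__ == "__main__":
    # Hypothetical, hand-written structure; real extractor output will differ.
    sample = {
        "iptc": {"IPTCApplication": {"Keywords": ["harbour", "boats"]}},
        "jpg": {"exif": {"FILE": {"MimeType": "image/jpeg"}}},
    }
    for path, value in flatten(sample):
        print(f"{path} = {value}")
```

Each printed path is a candidate for the right-hand side of a mapping line.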

### Upload

Once the mappings are ready, you can upload files via the third sub-menu
`Process import`. Just choose the folder containing the files to import, then
check and add the files.
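
Before running an import, it can be useful to know which files in a folder have a media type covered by a mapping. The Python sketch below shows one way to pre-check a folder outside Omeka; it is not the module's check step. `check_folder` is a hypothetical helper, and the media type is guessed from the file extension only, whereas the module may inspect the file content.

```python
import mimetypes
from pathlib import Path


def check_folder(folder, mapped_types):
    """Split the files in `folder` into (ready, skipped) by guessed media type."""
    ready, skipped = [], []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        # Guess from the extension; unknown extensions give None.
        media_type, _ = mimetypes.guess_type(path.name)
        target = ready if media_type in mapped_types else skipped
        target.append((path.name, media_type))
    return ready, skipped


if __name__ == "__main__":
    # Media types for which a mapping exists (example values).
    mapped = {"image/jpeg", "application/pdf"}
    ready, skipped = check_folder(".", mapped)
    print("ready:", ready)
    print("skipped:", skipped)
```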

Warning
-------

Use it at your own risk.

It’s always recommended to back up your files and your databases and to check
your archives regularly so you can roll back if needed.

Troubleshooting
---------------

See online issues on the [module issues] page on GitLab.

License
-------

* Module

This module is published under the [CeCILL v2.1] license, compatible with
[GNU/GPL] and approved by [FSF] and [OSI].

This software is governed by the CeCILL license under French law and abiding by
the rules of distribution of free software. You can use, modify and/or
redistribute the software under the terms of the CeCILL license as circulated by
CEA, CNRS and INRIA at the following URL "http://www.cecill.info".

As a counterpart to the access to the source code and rights to copy, modify and
redistribute granted by the license, users are provided only with a limited
warranty and the software’s author, the holder of the economic rights, and the
successive licensors have only limited liability.

In this respect, the user’s attention is drawn to the risks associated with
loading, using, modifying and/or developing or reproducing the software by the
user in light of its specific status of free software, that may mean that it is
complicated to manipulate, and that also therefore means that it is reserved for
developers and experienced professionals having in-depth computer knowledge.
Users are therefore encouraged to load and test the software’s suitability as
regards their requirements in conditions enabling the security of their systems
and/or data to be ensured and, more generally, to use and operate it in the same
conditions as regards security.

The fact that you are presently reading this means that you have had knowledge
of the CeCILL license and that you accept its terms.

* Dependencies

See the licenses of the dependencies.

Copyright
---------

* Copyright Daniel Berthereau, 2019-2024

[Bulk Import Files]: https://gitlab.com/Daniel-KM/Omeka-S-module-BulkImportFiles
[Omeka S]: https://omeka.org/s
[Omeka Classic]: https://omeka.org/classic
[plugin]: https://gitlab.com/Daniel-KM/Omeka-plugin-BulkImportFiles
[`getid3`]: https://getid3.org
[`php-pdftk`]: https://github.com/mikehaertl/php-pdftk
[`pdftk`]: https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit
[Common]: https://gitlab.com/Daniel-KM/Omeka-S-module-Common
[BulkImportFiles.zip]: https://gitlab.com/Daniel-KM/Omeka-S-module-BulkImportFiles/-/releases
[Bulk Import]: https://gitlab.com/Daniel-KM/Omeka-S-module-BulkImport
[installing a module]: https://omeka.org/s/docs/user-manual/modules/#installing-modules
[module issues]: https://gitlab.com/Daniel-KM/Omeka-S-module-BulkImportFiles/-/issues
[CeCILL v2.1]: https://www.cecill.info/licences/Licence_CeCILL_V2.1-en.html
[GNU/GPL]: https://www.gnu.org/licenses/gpl-3.0.html
[FSF]: https://www.fsf.org
[OSI]: http://opensource.org
[GitLab]: https://gitlab.com/Daniel-KM
[Daniel-KM]: https://gitlab.com/Daniel-KM "Daniel Berthereau"