{"id":21076095,"url":"https://github.com/altomator/image_retrieval","last_synced_at":"2025-12-27T22:28:13.757Z","repository":{"id":154891460,"uuid":"100796278","full_name":"altomator/Image_Retrieval","owner":"altomator","description":"Image Retrieval in Digital Libraries - A Multicollection Experimentation of Machine Learning techniques","archived":false,"fork":false,"pushed_at":"2023-04-28T08:31:26.000Z","size":66652,"stargazers_count":26,"open_issues_count":1,"forks_count":5,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-01-20T22:52:46.534Z","etag":null,"topics":["alto-xml","api","bnf","digital-libraries","digital-library","europeana","face-detection","gallica","iiif","image-classification","image-recognition","image-retrieval","inception-v3","mets-xml","ocr","watson-visual-recognition"],"latest_commit_sha":null,"homepage":"","language":"XQuery","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/altomator.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-08-19T13:32:07.000Z","updated_at":"2024-04-04T13:11:23.000Z","dependencies_parsed_at":null,"dependency_job_id":"8e81a4ec-8406-4c1c-8664-d7ca189f02d3","html_url":"https://github.com/altomator/Image_Retrieval","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/altomator%2FImage_Retrieval","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/altomator%2FImage_Retrieval/tags","releases_url":"https://repos.ecosyste.ms/api/v1/ho
sts/GitHub/repositories/altomator%2FImage_Retrieval/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/altomator%2FImage_Retrieval/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/altomator","download_url":"https://codeload.github.com/altomator/Image_Retrieval/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243521245,"owners_count":20304183,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["alto-xml","api","bnf","digital-libraries","digital-library","europeana","face-detection","gallica","iiif","image-classification","image-recognition","image-retrieval","inception-v3","mets-xml","ocr","watson-visual-recognition"],"created_at":"2024-11-19T19:26:36.160Z","updated_at":"2025-12-27T22:28:08.730Z","avatar_url":"https://github.com/altomator.png","language":"XQuery","funding_links":[],"categories":[],"sub_categories":[],"readme":"### Synopsis\nThis work uses an ETL (extract-transform-load) approach and deep learning techniques to implement image retrieval functionalities in digital libraries.\n\nSpecs are: \n1. Identify and extract iconography wherever it may be found, in the still images collection but also in printed materials (newspapers, magazines, books).\n2. Transform, harmonize and enrich the image descriptive metadata (in particular with deep learning classification tools: IBM Watson, Google Cloud Vision, YOLO for visual recognition, TensorFlow Inception-V3 for image type classification).\n3. 
Load all the metadata into a web app dedicated to hybrid image retrieval. \n\nA proof of concept, [GallicaPix](http://demo14-18.bnf.fr:8984/rest?run=findIllustrations-form.xq), has been implemented on the World War 1 theme. All the heritage materials (photos, drawings, engravings, maps, posters, etc.) have mainly been harvested from the BnF (Bibliothèque nationale de France) digital collections [Gallica](gallica.bnf.fr). The [Wellcome Collection](https://wellcomecollection.org/) has been leveraged too, through the Europeana aggregator. This PoC is referenced on [Gallica Studio](http://gallicastudio.bnf.fr/), the Gallica online participative platform dedicated to the creative uses that can be made of Gallica. \n\n\n![GallicaPix](https://github.com/altomator/Image_Retrieval/blob/master/Images/gpix.jpg)\n\n*Looking for [Georges Clemenceau](http://demo14-18.bnf.fr:8984/rest?run=findIllustrations-app.xq\u0026filter=1\u0026start=1\u0026action=first\u0026module=0.5\u0026similarity=\u0026corpus=1418-v2\u0026keyword=clemenceau\u0026kwTarget=\u0026kwMode=\u0026title=\u0026fromDate=\u0026toDate=\u0026iptc=00\u0026persType=00\u0026classif=\u0026operator=and\u0026colName=00\u0026illType=\u0026size=31\u0026density=26) iconography in GallicaPix*\n\n\n### Articles, blogs, conferences\n- [\"Image Retrieval in Digital Libraries\"](https://docs.google.com/document/d/1-wRlvuJG30oZw69Xp1jCe8p5MwagTM6WxGTPrELiwK0/edit) (EN article, [FR article](https://docs.google.com/document/d/1hKU-UOy5fg6AXVVYRrgM9SH-fMYYPfTR0RpaFU6QCb4)), IFLA News Media section 2017 (Dresden, August 2017). \n- [\"Hybrid Image Retrieval in Digital Libraries\"](https://fr.slideshare.net/Europeana/hybrid-image-retrieval-in-digital-libraries-by-jeanphilippe-moreux-europeanatech-conference-2018), EuropeanaTech 2018 (Rotterdam). 
Poster, TPDL 2018 (Porto)\n\n- [\"HYBRID IMAGE RETRIEVAL IN DIGITAL LIBRARIES: EXPERIMENTATION OF DEEP LEARNING TECHNIQUES\"](https://pro.europeana.eu/page/issue-10-innovation-agenda), EuropeanaTech Insight, Issue 10, 2018\n\n- [\"EXPLORER DES CORPUS D’IMAGES. L’IA AU SERVICE DU PATRIMOINE\"](https://bnf.hypotheses.org/2809), projet CORPUS, atelier BnF, 18 avril 2018\n\n- [\"Using IIIF for Image Retrieval in Digital Libraries: Experimentation of Deep Learning Techniques\"](https://drive.google.com/open?id=1jsxT5lW3bR-582UX7a9oeOc5nGQjoQnb), [presentation], 2019 IIIF Conference (Göttingen, June 2019). \n\n- [\"Intelligence artificielle et fouille de contenus iconographiques patrimoniaux\"](https://adbu.fr/retour-sur-la-journee-detude-du-congres-adbu2019-tous-bibl-ia-thecaires-lintelligence-artificielle-vers-un-nouveau-service-public/), [video \u0026 presentation], Congrès ADBU 2019 (Bordeaux, septembre 2019). \n\n- [\"Recherche d’images dans les bibliothèques numériques patrimoniales et expérimentation de techniques d’apprentissage profond\"](https://www.erudit.org/en/journals/documentation/1900-v1-n1-documentation04848/1063786ar/),  *Documentation et bibliothèques*, Volume 65, Number 2, April–June 2019, p. 5–27\n\n- [\"Plongez dans les images de 14-18 avec notre nouveau moteur de recherche iconographique GallicaPix\"](http://gallicastudio.bnf.fr/bo%C3%AEte-%C3%A0-outils/plongez-dans-les-images-de-14-18-en-testant-un-nouveau-moteur-de-recherche) (FR blog post)\n- [\"Towards new uses of cultural heritage online: Gallica Studio\"](http://pro.europeana.eu/post/towards-new-uses-of-cultural-heritage-online-gallica-studio) (EN blog post)\n- [Tutorial on images classification](https://github.com/CENL-Network-Group-AI/Recipes/wiki/Images-Classification-Recipe)\n \n### Datasets\nThe datasets are available as metadata files (one XML file/document) or JsonML dumps of the BaseX database. 
Images can be extracted from the metadata files thanks to the [IIIF Image API](http://iiif.io/api/image/2.0/): \n- Complete WW1 dataset (222k illustrations): ftp://ftp.bnf.fr/api/jeux_docs_num/Images/GallicaPix/1418-data.zip\n- Illustrated WW1 ads dataset (65k illustrations): ftp://ftp.bnf.fr/api/jeux_docs_num/Images/GallicaPix/1418ads-data.zip\n- Persons ground truth (4k illustrations): [XML](http://www.euklides.fr/blog/altomator/Image_Retrieval/GT-Persons_xml.zip), [JSON](http://www.euklides.fr/blog/altomator/Image_Retrieval/GT-Persons_json.zip)\n\nMore thematic datasets have been produced:\n- [Vogue magazine](https://gallica.bnf.fr/ark:/12148/cb343833568/date), French edition, 1920-1940 (35k illustrations): [XML](https://github.com/altomator/Image_Retrieval/blob/master/vogue-data.zip)\n- Zoology samples (8.7k illustrations): [XML](https://github.com/altomator/Image_Retrieval/blob/master/zoology-data.zip)\n- Wallpapers and fabric designs, BnF and [TNA](https://www.nationalarchives.gov.uk/) collections (3.7k illustrations): [XML](https://github.com/altomator/Image_Retrieval/blob/master/designs-data.zip)\n\n![GallicaPix](https://github.com/altomator/Image_Retrieval/blob/master/Images/vogue.jpg)\n*Vogue magazine (1920-1940)* \n\n***\n\n### Installation \n\n1. Install BaseX: download the complete package from [basex.org](https://basex.org/download/) and unzip the archive in your Applications folder\n\n2. Launch the BaseX GUI, e.g.:\n\n```\n\u003e /Applications/basex924/bin/basexgui\n```\n\n3. With the GUI, create the WW1 database from the dataset (1418-data.zip). The XML content should be displayed in BaseX.\n\n4. Set up the HTTP BaseX server: setting up the server is detailed [here](https://github.com/altomator/EN-data_mining).\n\n5. Copy the whole [WebApp](https://github.com/altomator/Image_Retrieval/tree/master/WebAppBaseX) folder (XQuery files and the other support files: .css, .jpg) into your `$RESTPATH/webapp` folder.\n\n6. 
Test the local web app: http://localhost:8984/rest?run=findIllustrations-form.xq\u0026locale=en\n\n\n***\n\n### Workflow\n\n*A. Extract*\n\n*B. Transform \u0026 Enrich*\n\n*C. Load*\n\n\n\n\u003cb\u003eNote\u003c/b\u003e: All the scripts have been written by an amateur developer. They have been designed for the Gallica digital documents and repositories but could be adapted to other contexts.\n\nSome Perl or Python packages may need to be installed first. Sample documents are generally stored in a \"DOCS\" folder and output samples in an \"OUT\" folder.\n\n### A. Extract\nThe global workflow is detailed below.\n\n![Workflow: extract](https://github.com/altomator/Image_Retrieval/blob/master/Images/worflow.jpg)\n\nThe extract step can be performed from the catalog metadata (using OAI-PMH and SRU protocols) or directly from the digital document files (and their OCR). \n\n#### OAI-PMH\nThe OAI-PMH Gallica repository ([endpoint](http://oai.bnf.fr/oai2/OAIHandler?verb=Identify)) can be used to extract still image documents (drawings, photos, posters...). The extractMD_OAI.pl script harvests sets of documents or individual documents. 
Note: this script needs a Web connection (for Gallica OAI-PMH and Gallica APIs).\n\nEuropeana OAI is also supported (see EDM.pl for the Europeana Data Model mapping).\n\nThe Perl script extractMD_OAI.pl can handle 3 methods, one of which must be chosen within the script:\n- harvesting a complete OAI Set, from its name:\n```perl\ngetOAI($set);\n```\n- harvesting a document from its ID:\n```perl\ngetRecordOAI(\"ark:/12148/bpt6k10732244\");\n```\n- harvesting a list of documents from a file of IDs:\n```perl\nrequire \"arks.pl\";\n```\n\nUsage: \n\u003eperl extractMD_OAI.pl oai_name oai_set out_folder format \n\n\nwhere: \n- oai_name: gallica/europeana\n- oai_set:  the OAI set title\n- out_folder: the output folder\n- format: the only output format supported is xml\n\nExample:\n\u003eperl extractMD_OAI.pl gallica gallica:corpus:1418 OUT xml \n\n\nThis script also performs (using the available metadata):\n- IPTC topic classification (considering the WW1 theme)\n- image genre classification (photo/drawing/map...)\n\nIt outputs one XML metadata file per document, describing the document (title, date...), each page of the document and the included illustrations. Some illustrations are \"filtered\" due to their nature (empty page, bindings) thanks to the Gallica Pagination API.\n\n```xml\n\u003c?xml version=\"1.0\" encoding=\"UTF-8\"?\u003e\n\u003canalyseAlto\u003e\n\u003cmetad\u003e\n\t\u003ctype\u003eI\u003c/type\u003e\n\t\u003cID\u003ebpt6k3850489\u003c/ID\u003e\n\t\u003ctitre\u003eNos alliés du ciel : cantique-offertoire : solo \u0026amp; choeur à l_unisson avec accompagnement d'orgue / paroles du chanoine S. Coubé   musique de F. Laurent-Rolandez\u003c/titre\u003e\n\t\u003cdateEdition\u003e1916\u003c/dateEdition\u003e\n\t\u003cnbPage\u003e10\u003c/nbPage\u003e\n\t\u003cdescr\u003eChants sacrés acc. 
d_orgue -- 20e siècle -- Partitions et parties\u003c/descr\u003e\n\u003c/metad\u003e\n\u003ccontenus  ocr=\"false\" toc=\"false\"\u003e\n\t\u003clargeur\u003e135\u003c/largeur\u003e\n\t\u003chauteur\u003e173\u003c/hauteur\u003e\n\t\u003cpages\u003e\n\t\t\u003cpage  ordre=\"1\"\u003e\u003cblocIllustration\u003e1\u003c/blocIllustration\u003e\n\t\t\t\u003cills\u003e\n\t\t\t\t\u003cill  h=\"4110\" taille=\"6\" couleur=\"gris\" y=\"1\" w=\"3204\" n=\"1-1\" x=\"1\"\u003e\u003cgenre  CS=\"1\"\u003ephoto\u003c/genre\u003e\n\t\t\t\t\u003cgenre  CS=\"0.95\"\u003epartition\u003c/genre\u003e\n\t\t\t\t\u003ctheme  CS=\"0.8\"\u003e01\u003c/theme\u003e\n\t\t\t\t\u003ctitraille\u003eNos alliés du ciel : cantique-offertoire : solo \u0026amp; choeur à l_unisson avec accompagnement d'orgue / paroles du chanoine S. Coubé   musique de F. Laurent-Rolandez\u003c/titraille\u003e\n\t\t\t\t\u003c/ill\u003e\n\t\t\t\u003c/ills\u003e\n\t\t\u003c/page\u003e\n\t\u003c/pages\u003e\n\u003c/contenus\u003e\n\u003c/analyseAlto\u003e\n```\n\n\n#### SRU (Search/Retrieve via URL)\nSRU requesting of Gallica digital library can be done with the extractARKs_SRU.pl script.\nThe SRU request can be tested within gallica.bnf.fr and then copy/paste directly in the script:\n\n```perl\n$req=\"%28gallica%20all%20%22tank%22%29\u0026lang=fr\u0026suggest=0\"\n```\n\nIt outputs a text file (one ark ID per line). This output can then be used as the input of the OAI-PMH script.\n\nUsage:\n\u003eperl extractARKs_SRU.pl OUT.txt\n\n\n#### OCR\nPrinted collections (with OCR) can be analysed using extractMD.pl script. \nIt can handle various types of digital documents (books, newspapers) produced by BnF digitization programs or during the Europeana Newspapers project. 
Samples are available (see readme.txt).\n\nRegarding the newspaper type, the script can handle raw ALTO OCR mode or OLR mode (article recognition described with a METS/ALTO format):\n- \u003cb\u003eocrbnf\u003c/b\u003e: to be used with BnF documents (monographs, serials) described with a METS manifest\n- \u003cb\u003eocrbnflegacy\u003c/b\u003e: to be used with BnF documents (monographs, serials) described with a refNum manifest\n- \u003cb\u003eolrbnf\u003c/b\u003e: to be used with BnF serials described with a METS manifest and an OLR mode (BnF METS profile) \n- \u003cb\u003eocren\u003c/b\u003e: to be used with Europeana Newspapers serials described with a METS manifest\n- \u003cb\u003eolren\u003c/b\u003e: to be used with Europeana Newspapers serials described with a METS manifest and an OLR mode (\u0026copy;CCS METS profile)\n\nThe script can handle various dialects of ALTO (ALTO BnF, ALTO LoC...) which may have different ways to mark up the illustrations and to express the block IDs. This script is the most BnF-centered one and it may be complex to adapt to other contexts. 
An attempt has been made with a sample delivered by the SB Berlin library.\n\nSome parameters must be set in the Perl script, the remaining ones via the command line options (see readme.txt in the OCR folder).\n\nUsage:\n\u003eperl extractMD.pl [-LI] mode title IN OUT format\n\nwhere:\n- -L : extraction of illustrations is performed: dimensions, caption...\n- -I : BnF ARK IDs are computed\n- mode : types of documents (ocren, olren, ocrbnf, olrbnf)\n- title: some newspaper titles need to be identified by their title\n- IN : digital documents input folder \n- OUT : output folder\n- format: XML only\n\nExample for the Europeana Newspapers subdataset *L'Humanité*, with ARK ID computation and illustration extraction:\n\u003eperl extractMD.pl -LI ocren Humanite OCR-EN-BnF OUT-OCR-EN-BnF xml\n\n\nNote: some single-line OCR documents may need to be reformatted before running the extraction script, as it does not parse the XML content (for efficiency reasons) but uses grep patterns at the line level.\nUsage:\n\u003eperl prettyprint.pl IN\n\nThe script exports the same image metadata as before, but also the texts and captions surrounding the illustrations: \n```xml\n\u003cill  w=\"4774\" n=\"4-5\" couleur=\"gris\" filtre=\"1\" y=\"3357\" taille=\"0\" x=\"114\" derniere=\"true\" h=\"157\"\u003e\u003cleg\u003e \u003c/leg\u003e\n\u003ctxt\u003eFans— Imprimerie des Arts et Manufactures» S.rue du Sentier. (M. Baunagaud, imp.) \u003c/txt\u003e\n```\n\nSome illustrations are filtered according to their form factor (size, location on the page). In such cases, the illustrations are exported but they are reported with a filtered attribute (\"filtre\") set to true.\n\nAfter this extraction step, the metadata can be enriched (see next section, B.) or directly be used as the input of BaseX XML databases (see 
section C.).\n\nFor newspaper and magazine collections, another kind of content should be identified (and possibly filtered): the illustrated ads (reported with a \"pub\" attribute set to true). \n\n\n#### External sources\nRaw image files or other digital catalogs can be used as sources for the GallicaPix database. For these use cases, the image files are stored locally (no use of IIIF).\n\n*Example: The National Archives* \n\nThis first Perl script takes a folder of TNA images as input and populates a GallicaPix metadata template XML file, based on the file names and some parameters:\n\n\u003eperl extractMD_TNA_noMD.pl IN_TNA/\n\nIf metadata can be extracted first from a catalog (https://discovery.nationalarchives.gov.uk, see the crawl_TNA.py script), the following script aggregates metadata from the file names and the catalog metadata:\n\n\u003eperl extractMD_TNA.pl IN_TNA/ MD-BT43.xml\n\n### B. Transform \u0026 Enrich\n\nThe toolbox.pl Perl script performs basic operations on the illustration XML metadata files and the enrichment processing itself. This script supports the enrichment workflow as detailed below.\n\n![Workflow: extract](https://github.com/altomator/Image_Retrieval/blob/master/Images/workflow2.jpg)\n\nAll the treatments described in the following sections enrich the illustration metadata and set some attributes on these new metadata: \n- `classif`: the treatment applied (CC: content classification, DF: face detection)\n- `source`: the source of the treatment (IBM Watson, Google Cloud Vision, OpenCV/dnn, Tensorflow/Inception-v3)\n- `CS`: the confidence score\n- `lang`: the tag's language \n...\n\n(See the XML schema for a detailed presentation of the data model.)\n\n```xml\n\u003cill classif=\"CCibm\" ... 
\u003e\n            \u003ccontenuImg CS=\"0.598\" lang=\"en\" source=\"ibm\"\u003eclothing\u003c/contenuImg\u003e\n            \u003ccontenuImg CS=\"0.5\" lang=\"en\" source=\"ibm\"\u003esister\u003c/contenuImg\u003e\n            \u003ccontenuImg CS=\"0.5\" lang=\"en\" source=\"ibm\"\u003enurse\u003c/contenuImg\u003e\n            \u003ccontenuImg CS=\"0.501\" lang=\"en\" source=\"ibm\"\u003emason\u003c/contenuImg\u003e\n            \u003ccontenuImg CS=\"0.502\" lang=\"en\" source=\"ibm\"\u003eindoors\u003c/contenuImg\u003e\n\t    \u003cgenre CS=\"0.88\" source=\"TensorFlow\"\u003ephoto\u003c/genre\u003e\n```\n\n\n#### Image genres classification\nThe [Inception-v3](https://www.tensorflow.org/tutorials/image_recognition) model (Google's convolutional neural network, CNN) has been retrained on a multiclass ground truth dataset (photos, drawings, maps, music scores, comics... 12k images). Three Python scripts (within the Tensorflow framework) are used to train (and evaluate) a model:\n- split.py: the GT dataset is split into a training set (e.g. 2/3) and an evaluation set (1/3). The GT dataset path and the training/evaluation ratio must be defined in the script.\n- retrain.py: the training set is used to train the last layer of the Inception-v3 model. The training dataset path and the generated model path must be defined.\n- label_image.py: the evaluation set is labeled by the model. The model path and the input images path must be defined.\n\n\u003epython3 split.py \n\n\u003epython3 retrain.py \n\n\u003epython3 label_image.py \n\nTo classify a set of images, the following steps must be chained:\n\n1. Extract the image files from a document metadata folder thanks to the IIIF protocol:\n\u003eperl toolbox.pl -extr IN_md\n\nRemember to set a reduction factor in the \"facteurIIIF\" parameter (e.g. `$factIIIF`=50) as the CNN resizes all images to a 299x299 matrix.\n\n2. Move the OUT_img folder to a place where it will be found by the next script.\n\n3. 
Classify the images with the trained CNN model:\n\u003epython3 label_image.py \u003e data.csv\n\nThis will output a line per classified image:\n\n```csv\nbd\tcarte\tdessin\tfiltrecouv\tfiltretxt\tgravure\tphoto\tfoundClass\trealClass\tsuccess\timgTest\n0.01\t0.00\t0.96\t0.00\t0.00\t0.03\t0.00\tdessin\tOUT_img\t0\t./imInput/OUT_img/btv1b10100491m-1-1.jpg\n0.09\t0.10\t0.34\t0.03\t0.01\t0.40\t0.03\tgravure\tOUT_img\t0\t./imInput/OUT_img/btv1b10100495d-1-1.jpg\n...\n```\n\nEach line gives the most probable class (`foundClass`) and also the probabilities of all the classes.\n\n4. The classification data must then be reinjected into the metadata files:\n- Copy the data.csv file to the same level as the toolbox.pl script (or set a path in the `$dataFile` var)\n- Set some parameters in toolbox.pl: \n  - `$TFthreshold`: minimum confidence score for a classification to be used\n  - `$lookForAds`: for newspapers, whether the ads class must be used \n  - `$dataFile`: the CSV file name \n\n- Use the toolbox.pl script to import the CNN classification data into the illustration metadata files:\n\n\u003eperl toolbox.pl -importTF IN_md \n\n\u003eperl toolbox.pl -importTF IN_md -p # for newspapers\n\nAfter running the script, a new `genre` metadata element is created:\n```xml\n\u003cgenre CS=\"0.52\" source=\"TensorFlow\"\u003egravure\u003c/genre\u003e\n```\nExample: [caricatures](http://demo14-18.bnf.fr:8984/rest?run=findIllustrations-app.xq\u0026filter=1\u0026start=1\u0026action=first\u0026module=0.5\u0026similarity=\u0026corpus=1418-v2\u0026keyword=clemenceau\u0026kwTarget=\u0026kwMode=\u0026title=\u0026fromDate=\u0026toDate=\u0026iptc=00\u0026persType=00\u0026classif=\u0026operator=and\u0026colName=00\u0026illType=dessin\u0026size=31\u0026density=26) of Georges Clemenceau can be found using the Genre facet.\n\nThe filtering classes (text, blank pages, cover...) 
are handled later (see section \"Wrapping up the metadata\").\n\n#### Wrapping up the metadata \nThe illustrations may have been processed by multiple enrichment techniques and/or described by catalog metadata. For some metadata like topic and image genre, a \"final\" value is computed from these different sources; it is the value queried by the web app.\n\nFor image genre, first, a parameter must be set:\n- `$forceTFgenre`: force TF classifications to supersede the metadata classifications\n\nUsage:\n\u003eperl toolbox.pl -unify IN \n\nThe same approach can be used on the topic metadata (`unifyTheme` option).\n\nAll the sources are preserved but a new \"final\" metadata is generated, via a rule-based system. In the following image genre example, the Inception CNN found a photo but this result has been superseded by a human correction:\n```xml\n\u003cgenre source=\"final\"\u003edrawing\u003c/genre\u003e\n\u003cgenre CS=\"0.88\" source=\"TensorFlow\"\u003ephoto\u003c/genre\u003e\n\u003cgenre CS=\"0.95\" source=\"hm\"\u003edrawing\u003c/genre\u003e\n```\n\nThe noise classes for genre classification are also handled during the unify processing. If an illustration is noise, the `filtre` attribute is set to true.\n\n#### Image recognition\n\nVarious APIs or open source frameworks can be tested and their results queried within the web app thanks to the CBIR criteria (see screen capture below).\n\n\n##### IBM Watson\nWe've used the IBM Watson [Visual Recognition API](https://www.ibm.com/watson/developercloud/doc/visual-recognition/index.html). The toolbox.pl script calls the API to perform visual recognition of content or human faces. 
\n\nSome parameters should be set before running the script:\n- `$ProcessIllThreshold`: max number of illustrations to be processed (Watson allows a certain number of free calls per day)\n- `$CSthreshold`: minimum confidence score for a classification to be used\n- `$genreClassif`: list of illustration genres to be processed (drawing, pictures... but not maps)\n- `$apiKeyWatson`: your API key\n\nUsage for content recognition:\n\u003eperl toolbox.pl -CC IN -ibm\n\nUsage for face detection:\n\u003eperl toolbox.pl -DF IN -ibm\n\nNote: the image content is sent to Watson as a IIIF URL.\n\nThe face detection Watson API (2020 update: this API has been discontinued by IBM) also outputs cropping and gender detection:\n```xml\n\u003ccontenuImg CS=\"0.96\" h=\"2055\" l=\"1232\" sexe=\"M\" source=\"ibm\" x=\"1900\" y=\"1785\"\u003eface\u003c/contenuImg\u003e\n```\nExample: looking for  [\"poilus\"](http://demo14-18.bnf.fr:8984/rest?run=findIllustrations-app.xq\u0026filter=1\u0026start=1\u0026action=first\u0026module=0.5\u0026similarity=\u0026corpus=1418-v2\u0026keyword=poilu\u0026kwTarget=\u0026kwMode=\u0026title=\u0026fromDate=\u0026toDate=\u0026iptc=00\u0026persType=face\u0026classif=\u0026operator=and\u0026colName=00\u0026size=31\u0026density=26) faces.\n\n##### Google Cloud Vision\nThe very same visual content indexing can be performed with the Google Cloud Vision API.\nJust remember to set your key in `$apiKeyGoogle`.\n\nUsage for content recognition:\n\u003eperl toolbox.pl -CC IN -google\n\nNote: The Google face detection API outputs cropping but doesn't support gender detection.\n\n\n###### OCR \nThe Google Vision OCR can be applied to illustrations for which no textual metadata are available.\n\n\u003eperl toolbox.pl -OCR -IN_md\n\n\n##### Face and object detection \nA couple of Python scripts are used to apply face and object detection to the illustrations. 
They output CSV data that must then be imported into the XML metadata files.\n\n\n\n###### OpenCV/dnn module\nThe [dnn](https://github.com/opencv/opencv/tree/master/modules/dnn) module can be used to try some pretrained neural network models imported from frameworks such as Caffe or Tensorflow.\n\nThe detect_faces.py script performs face detection based on a ResNet network (see this [post](https://www.pyimagesearch.com/2018/02/26/face-detection-with-opencv-and-deep-learning/) for details).\n\n1. Extract the illustration files from a collection:\n\u003eperl toolbox.pl -extr IN_md\n\nNote: remember to set the size factor for IIIF image export in $factIIIF\n\n2. Process the images:\n\u003epython detect_faces.py --prototxt deploy.prototxt.txt --model res10_300x300_ssd_iter_140000.caffemodel --dir IN_img\n\nNote: the minimum confidence probability for a classification to be exported can be set via the command line.\n\nIt outputs a CSV file per input image, which can be merged into one file:\n\u003ecat OUT_csv/*.csv \u003e ./data.csv\nor\n\u003efind . -name \"*.csv\" -maxdepth 1  -exec cat {} \u003edata.csv  \\;\n\n3. Finally, import the classifications into the metadata files. Remember to set the classification source as a parameter: \n\u003eperl toolbox.pl -importDF IN_md dnn\n\nNote: to have consistent bounding boxes, remember to keep the same size factor in $factIIIF.\n\n![faces](https://github.com/altomator/Image_Retrieval/blob/master/Images/faces.jpg)\n\nAn object_detection.py script works in a similar way to perform content classification, thanks to a GoogLeNet network (see this [post](https://www.pyimagesearch.com/2017/08/21/deep-learning-with-opencv/) for details). 
It can handle a dozen classes (person, boat, aeroplane...):\n\n\u003epython object_detection.py --prototxt MobileNetSSD_deploy.prototxt.txt --model MobileNetSSD_deploy.caffemodel --dir IN_img\n\n###### Yolo v3 and v4 models\nThe yolo.py Python script performs object detection with an 80-class model (see this [post](https://www.pyimagesearch.com/2018/11/12/yolo-object-detection-with-opencv/) for details).\n\n\u003epython3 yolo.py --dir images --yolo yolo-coco\n\n#### Color analysis\nColor names can be extracted from the color palette (RGB) produced by the Google Cloud Vision API (done with the -CC option).\nColors may also be extracted from images thanks to the RoyGBiv Python package (based on the Colorific package).\n\n\u003els IMG/*.jpg | python3 extract_colors.py\n\nThis script generates a .csv data file and a small image palette (one palette for each image) which may be displayed on top of the illustration. (The script can also detect the background color.)\n\nThe CSV color data can then be imported:\n\u003eperl toolbox.pl -importColors IN no_bckg/bckg\n\n```xml\n\u003ccontenuImg b=\"20\" coul=\"1\" g=\"26\" ordre=\"4\" r=\"37\" source=\"colorific\" type=\"bckg\"\u003e#251a14\u003c/contenuImg\u003e\n```\n![Colors](http://www.euklides.fr/blog/altomator/Image_Retrieval/colors.png)\n*Looking for wallpaper patterns with a specific color background*\n\n\n#### Languages\nThe GallicaPix Web app offers 2 languages (FR, EN). Classification tags from the IBM or Google APIs can be translated from English to any other language with the -translateCC option. Bilingual lexicons must be set in the $googleDict or $ibmDict variables.\n\n\u003eperl toolbox.pl -translateCC IN ibm\n\n```xml\n\u003ccontenuImg CS=\"0.67\" lang=\"en\" source=\"google\"\u003eCathedral\u003c/contenuImg\u003e\n\u003ccontenuImg CS=\"0.67\" lang=\"fr\" source=\"google\"\u003ecathédrale\u003c/contenuImg\u003e\n```\n\n\n### C. Load\nAn XML database (BaseX.org) is the back-end. 
Querying the metadata is done with XQuery. \nNote: the web app is minimalist and BaseX is not an effective choice for searching in very large databases.\n\nThe web app uses the [IIIF Image API](http://iiif.io/api/image/2.0/) and the [Masonry](https://masonry.desandro.com/) grid layout JavaScript library for image display. The web app is built around 2 files: an HTML form and a results list page. The business logic is implemented with JavaScript and XQuery FLWOR.\n\nThe form (`findIllustrations-form.xq`) exposes databases to users. It can be switched to DEBUG mode to access more databases and to add filtering features, which can be helpful when a complete database is implemented (mixing illustrations and illustrated ads).\n\n![gallicaPix](https://github.com/altomator/Image_Retrieval/blob/master/Images/formv3.jpg)\n\nThe results list (`findIllustrations-app.xq`) has a DEBUG mode which implements a filtering functionality (for ads and filtered illustrations) and more admin tools (display, edit, annotate). These functions call XQuery scripts which perform updates on the database (thanks to the XQuery Update Facility). These functionalities may be useful for crowdsourcing experiments.\n\n![gallicaPix](http://www.euklides.fr/blog/altomator/Image_Retrieval/boats.png)\n*Looking for [boats](http://demo14-18.bnf.fr:8984/rest?run=findIllustrations-app.xq\u0026filter=1\u0026start=1\u0026action=first\u0026module=0.5\u0026similarity=\u0026corpus=1418-v2\u0026keyword=\u0026kwTarget=\u0026kwMode=\u0026title=\u0026fromDate=\u0026toDate=\u0026iptc=00\u0026persType=00\u0026classif=boat\u0026operator=and\u0026colName=00\u0026size=31\u0026density=26)*\n\nFaceting and basic dataviz functionalities are also available.\n\n![faceting](https://github.com/altomator/Image_Retrieval/blob/master/Images/facettes.png) ![dataviz](https://github.com/altomator/Image_Retrieval/blob/master/Images/graph.jpg)\n\n##### Exporting data \nThe GallicaPix database also acts as a IIIF annotations server. 
Document metadata can be exported from GallicaPix as a IIIF list of annotations (JsonML), and then displayed in any IIIF viewer (like [Mirador](https://manuscrits-france-angleterre.org/view3if/?target=https://gallica.bnf.fr/iiif/ark:/12148/bpt6k9604123v/manifest.json\u0026page=64) in this example).\n\n![Mirador](https://github.com/altomator/Image_Retrieval/blob/master/Images/vogue-gallicapix-mirador.jpg)\n*A Vogue issue in GallicaPix (left) and in Mirador (right), with the GallicaPix annotations* \n\nJsonML exports can also be requested from the web app (in the results list or at the document level).\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faltomator%2Fimage_retrieval","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faltomator%2Fimage_retrieval","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faltomator%2Fimage_retrieval/lists"}