{"id":13961891,"url":"https://github.com/OpenBookPublishers/XML-last","last_synced_at":"2025-07-21T06:32:17.275Z","repository":{"id":87180794,"uuid":"105033264","full_name":"OpenBookPublishers/XML-last","owner":"OpenBookPublishers","description":null,"archived":false,"fork":false,"pushed_at":"2020-06-26T12:38:25.000Z","size":671,"stargazers_count":3,"open_issues_count":1,"forks_count":1,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-03-27T15:52:49.004Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"XSLT","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/OpenBookPublishers.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2017-09-27T15:07:00.000Z","updated_at":"2022-07-01T00:42:31.000Z","dependencies_parsed_at":null,"dependency_job_id":"c4609850-fabb-4100-816a-19a14d119d49","html_url":"https://github.com/OpenBookPublishers/XML-last","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/OpenBookPublishers/XML-last","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenBookPublishers%2FXML-last","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenBookPublishers%2FXML-last/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenBookPublishers%2FXML-last/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenBookPublishers%2FXML-last/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/OpenBookPublishers","download_url":"https://
codeload.github.com/OpenBookPublishers/XML-last/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenBookPublishers%2FXML-last/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266253614,"owners_count":23900053,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-08T17:01:34.091Z","updated_at":"2025-07-21T06:32:12.263Z","avatar_url":"https://github.com/OpenBookPublishers.png","language":"XSLT","funding_links":[],"categories":["XSLT"],"sub_categories":[],"readme":"# XML-last\n\nThis repository contains a set of tools to convert an epub created with Adobe InDesign into a series of XML files that follow the TEI simplePrint customisation. For the conversion to work, the InDesign documents need to be formatted following a specific set of instructions (see the repo's [wiki](https://github.com/OpenBookPublishers/XML-last/wiki)).\n\n## Files and directories in this repository\n* __documents and templates__: this folder contains an InDesign template. 
It also includes sample input files for book-, chapter- and object-level metadata\n* __schemas__: this folder contains the tei_simplePrint schema (also available at http://www.tei-c.org/Guidelines/Customization/index.xml) and the OBP customisation\n* __LICENSE__\n* __README.md__: this file\n* __Transform-to-XML-book.xsl__: this script creates a single book-length XML TEI file by combining the documents already converted\n* __Transform-to-XML-section.xsl__: this is the main conversion tool that transforms each XHTML file forming the input epub into an XML TEI file\n* __XML-after-transformation.py__: this Python script should be run after conversion to fix some small mistakes in the XML\n* __XML-before-transformation.py__: this Python script must be run before conversion to set up the input and output folders correctly\n\n## Running the conversion\n1. Copy your input files to the project folder:\n\t* the epub of the book you want to convert (see [Preparing the epub for conversion](https://github.com/OpenBookPublishers/XML-last/wiki/Preparing-the-epub-for-conversion))\n\t* the file containing book- and optionally chapter-level metadata (see [documents and templates/book-chapter-metadata-TEMPLATE.xml](https://github.com/OpenBookPublishers/XML-last/blob/master/documents%20and%20templates/book-chapter-metadata-TEMPLATE.xml) and [Book and chapter metadata](https://github.com/OpenBookPublishers/XML-last/wiki/Book-and-chapter-metadata))\n\t* (optional) the file containing object-level metadata (see [documents and templates/Object-metadata-TEMPLATE.csv](https://github.com/OpenBookPublishers/XML-last/blob/master/documents%20and%20templates/Object-metadata-TEMPLATE.csv) and [Object metadata](https://github.com/OpenBookPublishers/XML-last/wiki/Object-metadata)) \n2. Run 'XML-before-transformation.py' (you will need Python 3.6.2 or newer). 
This will:\n\t* unpack the epub\n\t* selectively copy the content of the epub to a newly created 'input' folder\n\t* rename the book metadata file\n\t* create the output folder 'XML-edition'\n\t* transfer images, audio and video files (if any) from the epub to the 'XML-edition' folder\n3. Run 'Transform-to-XML-section.xsl' to transform each input XHTML file into an XML TEI file. The output files will be saved to the 'XML-edition' folder. To run this transformation (XSLT 2.0), you will need a processor such as Saxon-HE (https://sourceforge.net/projects/saxon/files/Saxon-HE/9.8/ -- note that the open source edition of Saxon does not allow the validation of the result documents). Saxon can be run (1) from within a product that provides a graphical user interface (such as oXygen, https://www.oxygenxml.com/), (2) from the command line or (3) from within a Java or .NET application.\n\t* (1) select 'Transform-to-XML-section.xsl' as both the input and the XSL source of the transformation; the output field can be left blank\n\t* (2) type `java -jar _dir_/saxon9he.jar -s:_dir_/XML-last/Transform-to-XML-section.xsl -xsl:_dir_/XML-last/Transform-to-XML-section.xsl -o:_dir_/XML-last/Transform-to-XML-section.xsl`\n\t* (3) see e.g. http://www.oracle.com/technetwork/java/gazfm-138953.html\n4. Run 'Transform-to-XML-book.xsl'. This second transformation uses XInclude to merge the newly created XML TEI files into a single file. The output is saved to the 'XML-edition' folder as 'entire-book.xml'. (See above for more on how to run the transformation.)\n5. 
Run 'XML-after-transformation.py' to:\n\t* change cross-reference destinations throughout 'entire-book.xml'\n\t* modify relative URLs throughout\n\t* delete empty list items\n\t* delete empty `\u003cdiv\u003e`s\n\t* delete tabs\n\n## Further reading\nVisit the repo's [wiki](https://github.com/OpenBookPublishers/XML-last/wiki) to read about:\n* [Preparing the epub for conversion](https://github.com/OpenBookPublishers/XML-last/wiki/Preparing-the-epub-for-conversion)\n* [A quick description of content conversion](https://github.com/OpenBookPublishers/XML-last/wiki/A-quick-description-of-content-conversion)\n* [Book and chapter metadata](https://github.com/OpenBookPublishers/XML-last/wiki/Book-and-chapter-metadata)\n* [Object metadata](https://github.com/OpenBookPublishers/XML-last/wiki/Object-metadata)\n* [The TEI simplePrint schema](https://github.com/OpenBookPublishers/XML-last/wiki/TEI-simplePrint)\n\nIf you wish to extract bibliographic citations from your content after conversion, visit https://github.com/OpenBookPublishers/Extract-citations","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FOpenBookPublishers%2FXML-last","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FOpenBookPublishers%2FXML-last","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FOpenBookPublishers%2FXML-last/lists"}