{"id":16716678,"url":"https://github.com/dinosaure/docteur","last_synced_at":"2026-03-14T20:11:45.664Z","repository":{"id":42058894,"uuid":"353772497","full_name":"dinosaure/docteur","owner":"dinosaure","description":"An opiniated file-system for MirageOS","archived":false,"fork":false,"pushed_at":"2024-09-11T16:41:39.000Z","size":961,"stargazers_count":26,"open_issues_count":3,"forks_count":1,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-02-27T16:17:03.202Z","etag":null,"topics":["filesystem","mirageos","ocaml-git","unikernel"],"latest_commit_sha":null,"homepage":"","language":"OCaml","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dinosaure.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGES.md","contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-04-01T17:12:15.000Z","updated_at":"2024-09-11T16:24:34.000Z","dependencies_parsed_at":"2024-05-31T09:28:30.703Z","dependency_job_id":"68a30dbc-d620-4855-8ee8-c333171cb149","html_url":"https://github.com/dinosaure/docteur","commit_stats":null,"previous_names":[],"tags_count":7,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dinosaure%2Fdocteur","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dinosaure%2Fdocteur/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dinosaure%2Fdocteur/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dinosaure%2Fdocteur/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dinosaure","download_url":
"https://codeload.github.com/dinosaure/docteur/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243835959,"owners_count":20355613,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["filesystem","mirageos","ocaml-git","unikernel"],"created_at":"2024-10-12T21:27:11.450Z","updated_at":"2025-10-19T13:36:21.498Z","avatar_url":"https://github.com/dinosaure.png","language":"OCaml","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Docteur - the simple way to load your Git repository into your unikernel\n\n`docteur` is a little program which wants to provide an easy way to integrate\na \"file-system\" into an _unikernel_. `docteur` provides a simple binary which\nmake an image disk from a Git repository. Then, the user is able to \"plug\" this\nimage into an _unikernel_ as a read-only \"file-system\".\n\n## Example\n\nThe distribution comes with a simple unikernel which show the given file from\nthe given image disk. 
The example requires KVM.\n\n```sh\n$ git clone https://github.com/dinosaure/docteur\n$ cd docteur\n$ opam pin add -y .\n$ cd unikernel\n$ docteur.make https://github.com/dinosaure/docteur -b refs/heads/main disk.img\n$ mirage configure -t hvt --disk docteur\n$ make depends\n$ mirage build\n$ solo5-hvt --block:docteur=disk.img simple.hvt --filename /README.md\n...\n```\n\n**NOTE:** For `mirage -t unix`, the disk name is the filename:\n```sh\n$ mirage configure -t unix --disk disk.img\n$ make depends\n$ mirage build\n$ ./simple --filename /README.md\n```\n\nAn image can be checked with `docteur.verify`:\n```sh\n$ docteur.verify disk.img\ncommit\t: 57d227d8f4808076646de35acf26dee885f2555b\nauthor\t: \"Calascibetta Romain\" \u003cromain.calascibetta@gmail.com\u003e\nroot\t: 5886893922d57c1ff4871d9a6b7b2cfa48b9e9a6\n\nMerge pull request #22 from dinosaure/without-c\n\nRemove C code to be compatible with MirageOS\n```\n\nThis way, you can check the version of your snapshot and whether the given\n`disk.img` is well formed for MirageOS.\n\nDocteur is able to _save_ a remote Git repository, a local Git repository or a\nsimple directory:\n```sh\n$ docteur.make git@github.com:dinosaure/docteur disk.img\n$ docteur.make https://github.com/dinosaure/docteur disk.img\n$ docteur.make https://user:password@github.com/dinosaure/docteur disk.img\n$ docteur.make git://github.com/dinosaure/docteur disk.img\n$ docteur.make relativize://directory disk.img\n  ; can be a simple directory, which will be prefixed with $PWD\n$ docteur.make file://$(pwd)/ disk.img\n  ; assume that $(pwd) is a local Git repository\n  ; $(pwd)/.git exists\n$ docteur.make file://$(pwd)/ disk.img\n  ; or it's a simple directory\n```\n\n**NOTE:** The last example can compress less efficiently than the\nothers because we use our own method to generate the PACK file (which is\nless smart than `git`).\n\n## Docteur as a file-system\n\nMirageOS does not provide a file-system out of the box. 
So we must implement one\nto get the notion of files and directories. Multiple designs exist and none is\nperfect for every case.\n\n`docteur` is one possible \"file-system\" for MirageOS. It's not\nthe only one, but it addresses a specific case. You can also look into\n[irmin][irmin] and [ocaml-git][ocaml-git] for another one.\n\nDocteur provides only a read-only file-system, and the contents are not part of\nthe _unikernel_. Only the _meta-data_ is in the _unikernel_. Let me explain\nthe format a bit.\n\n## The PACK file\n\nIn your Git repositories, most of your Git objects (files, directories,\ncommits) are stored in a [PACK file][pack-file]. It's a highly compressed\nrepresentation of your Git repository (your history, your files, etc.). Indeed,\nthe PACK file has 2 levels of compression:\n1) `zlib` compression of each object\n2) compression between objects with a binary diff ([libXdiff][libXdiff])\n\nFor example, 14 GB of content (such as documentation) can fit into a PACK file\nof 280 MB! This is mostly because documentation, for example, has\nseveral files which are nearly the same. Thanks to the second level of\ncompression, we can store a few objects as bases and compress the rest of\nthe documentation against them.\n\nSo, `docteur` uses the same format for its disk image. It also re-uses the\nIDX file associated with the PACK file. This way, we allow fast access\nto the content.\n\nFinally, the contents of objects (files or directories) and the mapping from\ntheir hashes to their positions in the PACK file are statically produced by `docteur.make`:\n```sh\n$ docteur.make \u003crepository\u003e [-b \u003crefs\u003e] \u003cimage\u003e\n$ docteur.make https://github.com/dinosaure/docteur -b refs/heads/main disk.img\n```\n\nHowever, objects are indexed by their hashes, not by\ntheir locations in your system. That information is computed by the\n_unikernel_ itself. 
At startup, it analyzes the PACK file and the IDX\nfile to reconstruct the system's layout with filenames and directory names.\n\nSo, the more files there are, the longer this operation can take, and the more\nmemory you use. Indeed, the system's layout is stored in memory with the\n[`art`][art] data-structure. Even though this data-structure is faster and smaller\nthan a usual radix tree, if you take the example of huge documentation,\nthe _unikernel_ needs ~650 MB of memory.\n\n`docteur` wants to solve 2 issues:\n- How to access a huge file-system from a unikernel\n  We read it from a block-device (a resource external to the unikernel)\n- How to load a file quickly\n  We use a fast in-memory data-structure, [art][art], to get contents\n\nOf course, such a layout does not fit every case. If you have\nmany small files, it's probably not the best solution. At least,\nit's one solution in the MirageOS eco-system!\n\n[irmin]: https://github.com/mirage/irmin\n[ocaml-git]: https://github.com/mirage/ocaml-git\n[pack-file]: https://git-scm.com/docs/pack-format\n[libXdiff]: http://www.xmailserver.org/xdiff-lib.html\n[art]: https://github.com/dinosaure/art\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdinosaure%2Fdocteur","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdinosaure%2Fdocteur","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdinosaure%2Fdocteur/lists"}