{"id":29191328,"url":"https://github.com/vectorcmdr/lakota-dictionary-mdf","last_synced_at":"2026-02-04T14:02:52.235Z","repository":{"id":301654079,"uuid":"1009921426","full_name":"vectorcmdr/Lakota-Dictionary-MDF","owner":"vectorcmdr","description":"Lakota language dictionary MDF format extraction data","archived":false,"fork":false,"pushed_at":"2025-06-28T02:23:50.000Z","size":21898,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-06-28T02:39:55.712Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vectorcmdr.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-28T01:00:14.000Z","updated_at":"2025-06-28T02:23:53.000Z","dependencies_parsed_at":"2025-06-28T02:49:59.825Z","dependency_job_id":null,"html_url":"https://github.com/vectorcmdr/Lakota-Dictionary-MDF","commit_stats":null,"previous_names":["vectorcmdr/lakota-dictionary-mdf"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/vectorcmdr/Lakota-Dictionary-MDF","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vectorcmdr%2FLakota-Dictionary-MDF","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vectorcmdr%2FLakota-Dictionary-MDF/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vectorcmdr%2FLakota-Dictionary-MDF/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vectorcm
dr%2FLakota-Dictionary-MDF/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/vectorcmdr","download_url":"https://codeload.github.com/vectorcmdr/Lakota-Dictionary-MDF/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vectorcmdr%2FLakota-Dictionary-MDF/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263056121,"owners_count":23406807,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-07-02T00:32:17.254Z","updated_at":"2026-02-04T14:02:52.230Z","avatar_url":"https://github.com/vectorcmdr.png","language":"HTML","funding_links":["https://ko-fi.com/vector_cmdr"],"categories":[],"sub_categories":[],"readme":"# Lakota Language Dictionary (SFM/MDF \u0026 FLEx)\n\n### A [Lakota language](https://en.wikipedia.org/wiki/Lakota_language) dictionary in SFM/MDF \u0026 FLEx format for the [Lakota people](https://en.wikipedia.org/wiki/Lakota_people) of the [Sioux tribes](https://en.wikipedia.org/wiki/Sioux).\n#### Thank you to [u/Even-Morons-Dream](https://www.reddit.com/user/Even-Morons-Dream/) for the opportunity to help by reclaiming the data for them. 
I am honoured to be able to lend my skills.\n\n## Usage:\nFor usage of the data, see the files within the \\Lexicon folder.\nIt contains the following files for use:\n* An SFM/MDF dictionary file ([dict.sfm](Lexicon/dict.sfm))\n* A [FieldWorks Language Explorer (FLEx)](https://software.sil.org/fieldworks/) project backup ([Lakota Test 2025-06-28 1209 Lakota.fwbackup](Lexicon/Lakota%20Test%202025-06-28%201209%20Lakota.fwbackup))\n* A FLEx import map ([dict-import-settings.map](Lexicon/dict-import-settings.map))\n* An XHTML dictionary listing page ([Lexicon.xhtml](Lexicon/Lexicon.xhtml))\n\n\u003e [!IMPORTANT]  \n\u003eThe [`.fwbackup`](Lexicon/Lakota%20Test%202025-06-28%201209%20Lakota.fwbackup) file can be loaded as a backup restore and used, edited, added to and exported from within FLEx.\n\u003e \n\u003eThe [`.xhtml`](Lexicon/Lexicon.xhtml) file can be browsed, though it is rudimentary at best.\n\u003e \n\u003eThe [`.sfm`](Lexicon/dict.sfm) file itself can be imported into FLEx / Shoebox / Toolbox or any SFM/MDF compatible language tool for building a dictionary.\n\n## Methodology of Data Extraction:\n\u003e [!NOTE] \n\u003eThe data was retrieved from a Unity binary built for Android, packaged as an `.apk`.\n\nAnalysis steps were as follows:\n1. Unpack `.apk`.\n2. Decompile `.dex` and check code for Android side keys or other info.\n3. Identify Unity files within `assets` folder.\n4. Identify libraries within `UnityServicesProjectConfiguration.json`.\n5. Identify libraries within `RuntimeInitializeOnLoads.json`.\n6. Identify libraries within `ScriptingAssemblies.json` and identify use of SqlCipher4Unity3D (_SQLCipher_).\n7. Identify binaries as _mono_ and not IL2CPP.\n8. Identify an obfuscated database outside of asset packs via header check (0-\u003e16:32 SQLCipher print) and disassembly of monobehaviour scripts as likely encrypted with SQLCipher.\n9. Database likely contains additional text records and audio files, based on the output of heuristic analysis. 
_No key found_.\n10. Merge `sharedassets0.assets.split[n]` into complete `sharedassets0.assets` file.\n11. Run custom header lookup scripts in hex editor for Unity disassembly/unpacking and identify asset chunks (shaders, images, fonts, text, etc.)\n12. Dump each data section to file.\n13. Identify two sections are _SFM/MDF_ databases/dictionaries and are a single dictionary split in reverse due to size.\n14. Merge SFM/MDF data and trim header + start/end padding.\n15. Dump to _ASCII_ string.\n16. Confirm output with native speaker.\n17. Confirm data meets [technical documentation](_refs) / SIL International specs.\n18. Import into _FLEx_.\n19. Export FLEx project backup and xhtml.\n\n## Raw Files:\nThe root of the repo contains more 'raw' extracted files:\n* Dumped SFM/MDF data block 1 ([raw_data_A_to_I.bin](raw_data_A_to_I.bin))\n* Dumped SFM/MDF data block 2 ([raw_data_I_to_Z.bin](raw_data_I_to_Z.bin))\n* Extracted ASCII dictionary merged from fragments ([dict.txt](dict.txt))\n* ASCII dictionary fragments ([dict_A_to_I.txt](dict_A_to_I.txt) + [dict_I_to_Z.txt](dict_I_to_Z.txt))\n\n## Future Possibilities:\nIf time permits, I would like to branch the build script, table/library code and frontend from [STL Bitz Box](https://github.com/vectorcmdr/STL-Bitz-Box) \u0026 [ACNH Pattern Dump Index](https://github.com/vectorcmdr/ACNH-Pattern-Dump-Index) to create a static webpage dictionary for the data with a row entry per word and searchable/filterable columns for each piece of information tied to that word (including audio support) that can be updated, managed and hosted by the community and will be completely open source.\n\n## Help Support My Tinkering\n\n\u003ca href=\"https://ko-fi.com/vector_cmdr\"\u003e\n\u003cimg src=\"https://custom-icon-badges.demolab.com/badge/-Donate-lightblue?style=for-the-badge\u0026logo=coffee\u0026logoColor=red\" 
height=\"64\"/\u003e\u003c/a\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvectorcmdr%2Flakota-dictionary-mdf","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvectorcmdr%2Flakota-dictionary-mdf","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvectorcmdr%2Flakota-dictionary-mdf/lists"}