{"id":17655140,"url":"https://github.com/ickc/papers3-export","last_synced_at":"2025-03-30T09:26:50.361Z","repository":{"id":90861337,"uuid":"190985103","full_name":"ickc/papers3-export","owner":"ickc","description":"Fix Papers 3 export library (XML, RIS, BibTeX) to contain annotation","archived":false,"fork":false,"pushed_at":"2021-03-26T21:37:36.000Z","size":6,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-02-05T11:36:41.380Z","etag":null,"topics":["export","papers","reference-manager"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ickc.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-06-09T09:02:57.000Z","updated_at":"2022-04-16T20:15:44.000Z","dependencies_parsed_at":"2024-06-13T20:17:23.555Z","dependency_job_id":null,"html_url":"https://github.com/ickc/papers3-export","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ickc%2Fpapers3-export","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ickc%2Fpapers3-export/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ickc%2Fpapers3-export/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ickc%2Fpapers3-export/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ickc","download_url":"https
://codeload.github.com/ickc/papers3-export/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246299328,"owners_count":20755145,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["export","papers","reference-manager"],"created_at":"2024-10-23T12:40:56.253Z","updated_at":"2025-03-30T09:26:50.315Z","avatar_url":"https://github.com/ickc.png","language":"Jupyter Notebook","readme":"# Fix Papers 3 export library (XML, RIS, BibTeX) to contain annotation\n\nThis notebook is used as a one-off script for me to leave Papers 3. The problem with the Papers 3 export is that you can export either the library (e.g. in XML) or the PDFs with annotations, but not both.\n\nThis notebook fixes the situation by \"merging\" the information from both of these export options from Papers 3. Moreover, it takes the lazy approach that whenever a PDF is not modified, the original (rather than the output from Papers 3) is used to reduce file size inflation. 
In my test, my library inflated ~2.7 times after a Papers 3 export with annotations (even for those PDFs without any annotations!)\n\nConfigure these in cell 3:\n\n```py\n# path to the XML export from Papers 3: \"EndNote XML Library\"\npath_xml = Path('~/Downloads/temp-papers-export-tidy.xml').expanduser()\n# export from Papers 3 \"PDF Files and Media\", without annotation\npath_original = Path('~/Downloads/temp-papers-export-original').expanduser()\n# export from Papers 3 \"PDF Files and Media\", with annotation\npath_annotated = Path('~/Downloads/temp-papers-export').expanduser()\n# input path of the library file\n# it can be the same as the path_xml, or another one such as RIS or BibTeX\nin_path = path_xml\n# output path of the modified library file, extension should be the same as in_path\nout_path = Path('~/Downloads/temp-papers-export-annotated.xml').expanduser()\n```\n\nThe comments should be self-explanatory.\n\n# Requirements\n\nThis notebook imports the following:\n\n```py\nimport xmltodict\nfrom glom import glom\nimport pandas as pd\nimport fitz # pip install PyMuPDF\n```\n\n# Potential Improvements and notes\n\nHopefully I don't need to run this script anymore, which also means I probably won't maintain it. A few things that could improve this:\n\nImprovements:\n\n- use `argparse` or something like that (e.g. `defopt`) to provide command line options (or chain with `gooey` for GUI)\n- use parallel map. A couple of `map` calls are used throughout and can be trivially parallelized.\n- a few assumptions are made (mainly shown by the `assert` statements); if they don't hold for your library, fix your library (or fix my code) and run the notebook again\n\nNotes:\n\n- note that if you use APFS, the export of the original PDFs should not take extra space and is very fast (by the nature of copying in APFS)\n- the 2nd last cell shows how many PDFs are modified. 
If you pre-tidy up your XML input files (or use RIS/BibTeX), then you can take a diff between the input file and output file to verify that the number of changed paths is the same.\n- note that the output library has file paths pointing inside both the original Papers 3 library and the `path_annotated` above. Only delete these directories after you use the `out_path` to import into another software. (And leave the Papers 3 library around for a while until you're certain the migrated library isn't broken.)\n\nWon't fix:\n\n- collections from Papers 3 are missing. Since the exported library (be it XML, RIS, BibTeX) doesn't include this information, it can't be done. Maybe AppleScript can help. See \u003chttps://github.com/extracts/mac-scripting/blob/master/Papers3/Papers_To_Bookends/Papers_To_Bookends.applescript\u003e.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fickc%2Fpapers3-export","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fickc%2Fpapers3-export","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fickc%2Fpapers3-export/lists"}