{"id":44875312,"url":"https://github.com/nightmachinery/get_the_nini","last_synced_at":"2026-02-17T14:01:05.935Z","repository":{"id":310771179,"uuid":"1041153702","full_name":"NightMachinery/get_the_nini","owner":"NightMachinery","description":"Ninisite Scraper: Fetches all pages of a Ninisite discussion and formats in org-mode, Markdown, or JSON","archived":false,"fork":false,"pushed_at":"2025-08-21T05:11:03.000Z","size":426,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2025-10-07T11:15:09.019Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/NightMachinery.png","metadata":{"files":{"readme":"README.org","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-08-20T04:15:58.000Z","updated_at":"2025-08-21T05:11:06.000Z","dependencies_parsed_at":"2025-08-20T06:20:03.902Z","dependency_job_id":"6fd11e11-7573-49f5-8c82-90aeb399900a","html_url":"https://github.com/NightMachinery/get_the_nini","commit_stats":null,"previous_names":["nightmachinery/get_the_nini"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/NightMachinery/get_the_nini","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NightMachinery%2Fget_the_nini","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NightMachinery%2Fget_the_nini/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NightMachinery%2Fget_the_nini/releases","manifests_url":"https://repos.
ecosyste.ms/api/v1/hosts/GitHub/repositories/NightMachinery%2Fget_the_nini/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/NightMachinery","download_url":"https://codeload.github.com/NightMachinery/get_the_nini/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NightMachinery%2Fget_the_nini/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29546746,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-17T13:00:00.370Z","status":"ssl_error","status_checked_at":"2026-02-17T12:57:14.072Z","response_time":100,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-02-17T14:01:05.088Z","updated_at":"2026-02-17T14:01:05.914Z","avatar_url":"https://github.com/NightMachinery.png","language":"HTML","funding_links":[],"categories":[],"sub_categories":[],"readme":"#+TITLE: get-the-nini: Ninisite Post Scraper\n\nA command-line tool for scraping discussion threads from the Ninisite website. 
It can take a topic ID or a full URL and save the entire conversation into a single, well-structured file.\n\n*   *Code*: [[file:get_the_nini/main.py]]\n\n*   *Purpose*: This tool is designed to archive and analyze discussion threads from ninisite.com, converting them into portable and easy-to-read formats.\n\n*   *Features*\n    - Scrape entire discussion threads by Topic ID or URL.\n    - Automatically handles pagination.\n    - Outputs in multiple formats: **Org-mode**, **Markdown**, and **JSON**.\n    - Extracts rich metadata including topic title, author, categories, views, and post dates.\n    - Preserves the structure of posts, including replies and quoted content.\n    - Streaming output for Org-mode, ideal for large topics or viewing progress live.\n    - Progress bar during page fetching.\n\n*   *Installation*\n    This tool can be installed from PyPI using pip.\n\n    **Prerequisites**\n    1.  **Python 3**: Ensure you have Python 3 installed.\n    2.  **Pandoc**: The `pypandoc` library is used for converting HTML to other formats. You must have Pandoc installed and available on your system's PATH. Please see the [[https://pandoc.org/installing.html][Pandoc installation instructions]].\n\n    **Install with pip**\n    To install the package, run the following command in your terminal:\n    #+begin_src sh\n    pip install get-the-nini\n    #+end_src\n\n    Or install the latest version from git:\n    #+begin_src sh :eval never\n    pip install 'git+https://github.com/NightMachinery/get_the_nini.git'\n    #+end_src\n\n*   *Usage*\n    Once installed, the script can be run from the command line, providing a topic ID or a full URL.\n\n**Syntax**\n#+begin_src sh\nget-the-nini [OPTIONS] \u003cTOPIC_ID_OR_URL\u003e\n#+end_src\n\n**Examples**\n\n1.  
**Scrape by Topic ID (Default Org-mode output)**\n    This command will scrape the discussion for topic ID `11473285` and save it to an automatically generated file named `ninisite_11473285.org`.\n    #+begin_src sh\n    get-the-nini 11473285\n    #+end_src\n\n2.  **Scrape using a full URL**\n    #+begin_src sh\n    get-the-nini \"https://www.ninisite.com/discussion/topic/11473285/\"\n    #+end_src\n\n3.  **Specify an output file and format (Markdown)**\n    The format can be inferred from the file extension, or specified explicitly with `--format`.\n    #+begin_src sh\n    get-the-nini 11473285 -o output.md\n    #+end_src\n\n4.  **Output as JSON to stdout**\n    Use `-o -` to direct output to standard output, which can be redirected to a file.\n    #+begin_src sh\n    get-the-nini 11473285 --format json -o - \u003e ninisite_11473285.json\n    #+end_src\n\n*   *Output Formats \u0026 Examples*\n    The scraper can produce output in three different formats. Below are links to examples generated from the same topic.\n\n**Org-mode (.org)**\nA highly structured and readable plain-text format, perfect for use in Emacs. This is the default format and supports streaming output directly to a file as pages are scraped.\n-   *Example*: [[file:examples/ninisite_11473285.org]]\n\n**Markdown (.md)**\nA popular lightweight markup language for easy conversion to HTML and other formats.\n-   *Example*: [[file:examples/ninisite_11473285.md]]\n\n**JSON (.json)**\nA structured data format that includes all metadata and post content, suitable for programmatic analysis or integration into other systems.\n-   *Example*: [[file:examples/ninisite_11473285.json]]\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnightmachinery%2Fget_the_nini","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnightmachinery%2Fget_the_nini","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnightmachinery%2Fget_the_nini/lists"}