{"id":15028713,"url":"https://github.com/rhizome-conifer/conifer","last_synced_at":"2025-04-08T02:41:54.664Z","repository":{"id":31998982,"uuid":"35569859","full_name":"Rhizome-Conifer/conifer","owner":"Rhizome-Conifer","description":"Collect and revisit web pages.","archived":false,"fork":false,"pushed_at":"2023-11-08T17:14:03.000Z","size":26712,"stargazers_count":1482,"open_issues_count":126,"forks_count":119,"subscribers_count":52,"default_branch":"master","last_synced_at":"2024-10-29T17:51:06.050Z","etag":null,"topics":["archives","docker","python","pywb","warc","wayback","web-archiving","webrecorder"],"latest_commit_sha":null,"homepage":"https://conifer.rhizome.org","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":"kubernetes-sigs/cluster-api","license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Rhizome-Conifer.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2015-05-13T19:27:29.000Z","updated_at":"2024-10-28T18:13:38.000Z","dependencies_parsed_at":"2023-02-17T12:01:01.461Z","dependency_job_id":"a861e266-a83e-4ef1-98ec-5e9b6b7d8ee9","html_url":"https://github.com/Rhizome-Conifer/conifer","commit_stats":null,"previous_names":["webrecorder/webrecorder"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Rhizome-Conifer%2Fconifer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Rhizome-Conifer%2Fconifer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Rhizome-Conifer%2Fconifer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories
/Rhizome-Conifer%2Fconifer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Rhizome-Conifer","download_url":"https://codeload.github.com/Rhizome-Conifer/conifer/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247767232,"owners_count":20992538,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["archives","docker","python","pywb","warc","wayback","web-archiving","webrecorder"],"created_at":"2024-09-24T20:08:56.340Z","updated_at":"2025-04-08T02:41:54.640Z","avatar_url":"https://github.com/Rhizome-Conifer.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# Conifer\n### Collect and revisit web pages.\n\nConifer provides an integrated platform for creating high-fidelity, ISO-compliant web archives in a user-friendly interface, providing access to archived content, and sharing collections.\n\nThis repository represents the hosted service running at https://conifer.rhizome.org/, which can also be [deployed locally using Docker](#running-locally)\n\nThis README refers to the 5.x version of Conifer, released in June, 2020. This release includes a new UI and the renaming of Webrecorder.io to Conifer. Other parts of the open source efforts remain at the [Webrecorder Project](https://webrecorder.net). 
For more info about this momentous change, read our announcement [blog post](https://blog.conifer.rhizome.org/2020/06/11/webrecorder-conifer.html).\n\nThe previous UI is available on the [legacy branch](https://github.com/Rhizome-Conifer/conifer/tree/legacy).\n\n\n## Frequently asked questions\n\n* If you have any questions about how to use Conifer, please see our [User Guide](https://guide.conifer.rhizome.org).\n\n* If you have a question about your account on the hosted service (conifer.rhizome.org), please contact us via email at [support@conifer.rhizome.org](mailto:support@conifer.rhizome.org).\n\n* If you have a previous Conifer installation (version 3.x), see [Migration Info](migrating-4.0.md) for instructions on how to migrate to the latest version.\n\n\n## Using the Conifer Platform\n\nConifer and related tools are designed to make web archiving more portable and decentralized, as well as to serve users and developers with a broad range of skill levels and requirements. Here are a few ways that Conifer can be used (starting with what probably requires the least technical expertise).\n\n### 1. Hosted Service\n\nUsing our hosted version of Conifer at https://conifer.rhizome.org/, users can sign up for a free account and create their own personal collections of web archives. Captured web content will be available online, either publicly or privately, under each user account, and can be downloaded by the account owner at any time. Downloaded web archives are available as WARC files. (WARC is the ISO standard file format for web archives.) The hosted service can also be used anonymously and the captured content can be downloaded at the end of a temporary session.\n\n### 2. 
Offline Capture and Browsing\n\nThe Webrecorder Project is a closely aligned effort that offers OSX/Windows/Linux Electron applications:\n\n* [Webrecorder Player](https://github.com/webrecorder/webrecorder-player): browse WARCs created by Webrecorder (and other web archiving tools) locally on the desktop.\n* [Webrecorder Desktop](https://github.com/webrecorder/webrecorder-desktop): a desktop version of the hosted Webrecorder service providing both capture and replay features.\n\n\n### 3. Preconfigured Deployment\n\nTo deploy the full version of Conifer with Ansible on a Linux machine, the [Conifer Deploy](https://github.com/rhizome-conifer/conifer-deploy) workbook can be used to install this repository and configure nginx and other dependencies, such as SSL (via Let's Encrypt). The workbook is used for the https://conifer.rhizome.org deployment.\n\n### 4. Full Conifer Local Deployment\n\nThe Conifer system in this repository can be deployed directly by [following the instructions below](#running-locally).\nConifer runs entirely in Docker and also requires Docker Compose.\n\n### 5. Standalone Python Wayback (pywb) Deployment\n\nFinally, for users interested in the core \"replay system\" and very basic recording capabilities, deploying [pywb](https://github.com/webrecorder/pywb) could also make sense. Conifer is built on top of pywb (Python Wayback/Python Web Archive Toolkit), and the core recording and replay functionality is provided by pywb as a standalone Python library. pywb comes with a Docker image as well.\n\npywb can be used to deploy your own web archive access service. See the [full pywb reference manual](http://pywb.readthedocs.org/) for further information on using and deploying pywb.\n\n## Running Locally\n\nConifer can be run on any system that has [Docker](https://docs.docker.com/install/) and [Docker Compose](https://docs.docker.com/compose/install/) installed. To install manually:\n\n1. `git clone https://github.com/rhizome-conifer/conifer`\n\n2. 
`cd conifer; bash init-default.sh`\n\n3. `docker-compose build`\n\n4. `docker-compose up -d`\n\n(`init-default.sh` is a convenience script that copies [wr_sample.env](webrecorder/webrecorder/config/wr_sample.env) → `wr.env` and creates keys for session encryption.)\n\nPoint your browser to `http://localhost:8089/` to access the locally running Conifer instance.\n\n(Note: you may see a maintenance message briefly while Conifer is starting up. Refresh the page after a few seconds to see the Conifer home page.)\n\n### Installing Remote Browsers\n\nRemote Browsers are standard browsers like Google Chrome and Mozilla Firefox, encapsulated in Docker containers. This feature allows Conifer to directly use fixed versions of browsers for capturing and accessing web archives, with a more direct connection to the live web and web archives. Remote browsers in many cases can improve the quality of web archives during capture and access. They can be \"remote controlled\" by users and are launched as needed, and use the same amount of computing and memory resources as they would when just running as regular desktop apps.\n\nRemote Browsers are optional, and can be installed as needed.\n\nRemote Browsers are just Docker images which start with `oldweb-today/`, and are part of the\n[oldweb-today](https://github.com/oldweb-today/) organization on GitHub.\nInstalling the browsers can be as simple as running `docker pull` on each browser image, as well as on the\nadditional Docker images for the Remote Desktop system.\n\nTo install the Remote Desktop System and all of the officially supported Remote Browsers, run [install-browsers.sh](install-browsers.sh).\n\n\n### Configuration\n\nConifer reads its configuration from two files: `wr.env`, and less-commonly changed system settings in `wr.yaml`.\n\nThe `wr.env` file contains numerous deployment-specific customization options. 
In particular, the following options may be useful:\n\n#### Host Names\n\nBy default, Conifer assumes it's running on localhost or a single domain, but on different ports for the application (the Conifer user interface) and for content (material rendered from web archives). This is a security feature preventing archived web sites from accessing and possibly changing Conifer's user interface, and other unwanted interactions.\n\nTo run Conifer on different domains, the `APP_HOST` and `CONTENT_HOST` environment variables should be set.\n\nFor best results, the two domains should be two subdomains, both with https enabled.\n\nThe `SCHEME` env var should also be set to `SCHEME=https` when deploying via https.\n\n#### Anonymous Mode\n\nBy default, Conifer disallows anonymous recording. To enable this feature, add `ANON_DISABLED=false` to the `wr.env` file and restart.\n\n*Note: Previously the default setting was anonymous recording enabled (`ANON_DISABLED=false`)*\n\n#### Storage\n\nConifer uses the `./data/` directory for local storage, or an external backend, currently supporting S3.\n\nThe `DEFAULT_STORAGE` option in `wr.env` configures storage options, which can be `DEFAULT_STORAGE=local` or `DEFAULT_STORAGE=s3`.\n\nConifer uses a temporary storage directory for data while it is actively being captured, and for temporary collections. Data is moved into the 'permanent' storage when the capturing process is completed or a temporary collection is imported into a user account.\n\nThe temporary storage directory is: `WARCS_DIR=./data/warcs`.\n\nWhen using local storage, the permanent storage directory is `STORAGE_DIR=./data/storage`.\n\nWhen using s3, the value of `STORAGE_DIR` is ignored and data gets placed into `S3_ROOT`, which is an `s3://` bucket URL.\n\nAdditional s3 auth environment settings must also be set in `wr.env` or externally.\n\nAll data related to Conifer that is not web archive data (WARC and CDXJ) is stored in the Redis instance, which persists data to `./data/dump.rdb`. 
(See [Conifer Architecture](#conifer-architecture) below.)\n\n#### Email\n\nConifer can send confirmation and password recovery emails. By default, a local SMTP server is run in Docker, but Conifer can be configured to use a remote server by changing the environment variables `EMAIL_SMTP_URL` and `EMAIL_SMTP_SENDER`.\n\n#### Frontend Options\n\nThe React frontend includes a number of additional options useful for debugging. Setting `NODE_ENV=development` will switch React to development mode with hot reloading on port 8096.\n\nAdditional frontend configuration can be found in [frontend/src/config.js](frontend/src/config.js).\n\n\n### Administration tool\n\nThe script `admin.py` provides easy low-level management of users. Adding, modifying, or removing users can be done via the command line.\n\nTo interactively create a user:\n\n```sh\ndocker exec -it app python -m webrecorder.admin -c\n```\n\nor programmatically add users by supplying the appropriate positional values:\n\n```sh\ndocker exec -it app  python -m webrecorder.admin \\\n                -c \u003cemail\u003e \u003cusername\u003e \u003cpasswd\u003e \u003crole\u003e '\u003cfull name\u003e'\n```\n\nOther arguments:\n\n* `-m` modify a user\n* `-d` delete a user\n* `-i` create and send a new invite\n* `-l` list invited users\n* `-b` send backlogged invites\n\nSee `docker exec -it app python -m webrecorder.admin --help` for full details.\n\n### Restarting Conifer\n\nWhen making changes to the Conifer backend app, running\n\n```sh\ndocker-compose kill app; docker-compose up -d app\n```\n\nwill stop and restart the container.\n\nTo integrate changes to the frontend app, either set `NODE_ENV=development` and use hot reloading or, if you're running production (`NODE_ENV=production`), run\n\n```sh\ndocker-compose kill frontend; docker-compose up -d frontend\n```\n\nTo fully recreate Conifer, deleting old containers (but not the data!), 
use the `./recreate.sh` script.\n\n## Conifer Architecture\n\nThis repository contains the Docker Compose setup for Conifer, and is the exact system deployed on https://conifer.rhizome.org. The full setup consists of the following components:\n\n- `/app` - The Conifer backend system includes the API, recording and WARC access layers, split into 3 containers:\n  - `app` -- The API and data model and rewriting system are found in this container.\n  - `recorder` -- The WARC writer is found in this container.\n  - `warcserver` -- The WARC loading and lookup is found in this container.\n\nThe backend containers run different tools from [pywb](https://github.com/webrecorder/pywb), the core web archive replay toolkit library.\n\n- `/frontend` - A React-based frontend application, running in Node.js. The frontend is a modern interface for Conifer and uses the backend API. All user access goes through the frontend (after nginx).\n\n- `/nginx` - A custom nginx deployment to provide routing and caching.\n\n- `redis` - A Redis instance that stores all of the Conifer state (other than WARC and CDXJ).\n\n- `dat-share` - An experimental component for sharing collections via the [Dat protocol](https://datproject.org/)\n\n- `shepherd` - An instance of [OldWebToday Browser Shepherd](https://github.com/oldweb-today/browsers) for managing remote browsers.\n\n- `mailserver` - A simple SMTP mail server for sending user account management mail\n\n- `behaviors` - Custom [automation behaviors](https://github.com/webrecorder/behaviors)\n\n- `browsertrix` - Automated [crawling system](https://github.com/webrecorder/browsertrix)\n\n\n### Dependencies\n\nConifer is built using both Python (for backend) and Node.js (for frontend) using a variety of Python and Node open source libraries.\n\nConifer relies on a few separate repositories in this organization:\n- [pywb](https://github.com/webrecorder/pywb)\n- [warcio](https://github.com/webrecorder/warcio)\n- 
[har2warc](https://github.com/webrecorder/har2warc)\n- [public-web-archives](https://github.com/webrecorder/public-web-archives)\n- [dat-share](https://github.com/webrecorder/dat-share)\n\nThe remote browser system uses https://github.com/oldweb-today/ repositories, including:\n- [browsers](https://github.com/oldweb-today/browsers)\n- [browser-chrome](https://github.com/oldweb-today/browser-chrome)\n- [browser-firefox](https://github.com/oldweb-today/browser-firefox)\n\n\n### Contact\n\nConifer is a project of [Rhizome](https://rhizome.org), made possible with generous past support from the Andrew W. Mellon Foundation.\n\nFor more info on using Conifer, you can consult our user guide at: https://guide.conifer.rhizome.org\n\nFor any general questions/concerns regarding the project or https://conifer.rhizome.org you can:\n\n* Open [issues](https://github.com/rhizome-conifer/conifer/issues) on GitHub\n\n* Tweet to us at https://twitter.com/rhizomeconifer\n\n* Contact us at support@conifer.rhizome.org\n\n### License\n\nConifer is licensed under the Apache 2.0 License. See [NOTICE](NOTICE) and [LICENSE](LICENSE) for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frhizome-conifer%2Fconifer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frhizome-conifer%2Fconifer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frhizome-conifer%2Fconifer/lists"}