https://github.com/openzim/phet
Scraper for PhET Science & Math Interactive Simulations
- Host: GitHub
- URL: https://github.com/openzim/phet
- Owner: openzim
- License: apache-2.0
- Created: 2016-06-19T14:02:46.000Z (almost 10 years ago)
- Default Branch: main
- Last Pushed: 2025-04-17T12:07:42.000Z (about 1 year ago)
- Last Synced: 2025-04-18T02:42:29.662Z (about 1 year ago)
- Topics: scraper, zim
- Language: JavaScript
- Homepage: https://download.kiwix.org/zim/phet
- Size: 18.7 MB
- Stars: 9
- Watchers: 5
- Forks: 4
- Open Issues: 14
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- Funding: .github/FUNDING.yml
- License: LICENSE
# PhET Simulations scraper
This scraper creates offline versions of [PhET interactive
simulations for science and math](https://phet.colorado.edu) in
[ZIM format](https://openzim.org).
## Requirements
It requires Node.js version 16 or higher.
## Quick Start
```bash
npm i && phet2zim
```
The above eventually writes a ZIM file to ```output/```.
## Command line arguments
See `phet2zim --help` for details.
`phet2zim --output` generates ZIM files in a specific folder.
```bash
phet2zim --output myFolder
```
`--withoutLanguageVariants` excludes languages that carry a country variant. For example, `en_CA` will not be present in the ZIM when this argument is set.
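The effect of this flag can be pictured with a small filter. This is a sketch only, not the scraper's actual code: `dropLanguageVariants` is a hypothetical helper, and it assumes that a country variant is any locale code containing an underscore, as in `en_CA`.

```javascript
// Hypothetical helper, not taken from the phet2zim source.
// Assumption: a "country variant" is any locale code with an underscore (e.g. en_CA).
function dropLanguageVariants(locales) {
  return locales.filter((code) => !code.includes('_'));
}

// Only the plain language codes remain.
console.log(dropLanguageVariants(['en', 'en_CA', 'fr', 'pt_BR']));
```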
`--subjects` restricts the download to specific subjects. Pass the values as a comma-separated list. A sample of valid subjects:
```
physics, biology, earth-science, motion, sound-and-waves, work-energy-and-power, heat-and-thermodynamics, quantum-phenomena
```
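A comma-separated value like this is typically split and trimmed before use. The following is a minimal sketch of that step; `parseCsvOption` is a hypothetical name and the scraper may well parse its arguments differently.

```javascript
// Hypothetical helper: turns a CSV option value into a clean list.
// Surrounding whitespace and empty entries are dropped.
function parseCsvOption(value) {
  return String(value ?? '')
    .split(',')
    .map((item) => item.trim())
    .filter((item) => item.length > 0);
}

console.log(parseCsvOption('physics, biology, earth-science'));
```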
Available on GET step only:
```bash
--withoutLanguageVariants ...
```
Available on GET and EXPORT steps only:
```bash
--includeLanguages 'lang_1,lang_2,lang_3' ...
--excludeLanguages 'lang_1,lang_2,lang_3' ...
--subjects 'math,physics' ...
```
Available on EXPORT step only:
```bash
# Skip ZIM files for individual languages
--mulOnly
# Create a ZIM file with all languages
--createMul
```
Example:
```bash
phet2zim --includeLanguages en ru fr
```
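How the include and exclude lists interact is not spelled out here. One plausible reading is sketched below, under two labelled assumptions: an empty include list means "all languages", and exclusions are applied after inclusions. `selectLanguages` is a hypothetical helper, not part of the scraper.

```javascript
// Hypothetical helper, not the scraper's actual code.
// Assumptions: an empty include list keeps every available language,
// and exclusions are applied after inclusions.
function selectLanguages(available, { include = [], exclude = [] } = {}) {
  const included = include.length > 0
    ? available.filter((code) => include.includes(code))
    : available;
  return included.filter((code) => !exclude.includes(code));
}

const available = ['en', 'fr', 'ru', 'de'];
console.log(selectLanguages(available, { include: ['en', 'ru', 'fr'] }));
console.log(selectLanguages(available, { exclude: ['de'] }));
```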
## Config
Another way to configure behaviour is through environment variables. Sample `.env` file (with default values):
```bash
# requests per second, affects GET step only
PHET_RPS=8
# async workers on TRANSFORM step (keep it equal to the number of CPU cores)
PHET_WORKERS=10
# number of retries on GET step (delay grows with exponential backoff)
PHET_RETRIES=5
# display verbose errors
PHET_VERBOSE_ERRORS=false
```
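As an illustration of how such settings could be consumed, here is a sketch that reads the four variables with the defaults shown above and computes an exponential backoff schedule. The variable names and defaults come from the sample `.env`; `readConfig`, `backoffDelays`, the 1000 ms base delay and the doubling factor are assumptions made for the example, not the scraper's actual values.

```javascript
// Hypothetical reader; variable names and defaults mirror the sample .env.
function readConfig(env = process.env) {
  const toInt = (value, fallback) => {
    const n = Number.parseInt(value, 10);
    return Number.isNaN(n) ? fallback : n;
  };
  return {
    rps: toInt(env.PHET_RPS, 8),
    workers: toInt(env.PHET_WORKERS, 10),
    retries: toInt(env.PHET_RETRIES, 5),
    verboseErrors: env.PHET_VERBOSE_ERRORS === 'true',
  };
}

// Assumed schedule: the delay doubles on every retry.
// The 1000 ms base is illustrative, not taken from the project.
function backoffDelays(retries, baseMs = 1000) {
  return Array.from({ length: retries }, (_, attempt) => baseMs * 2 ** attempt);
}

console.log(readConfig({ PHET_RPS: '4' }));
console.log(backoffDelays(5));
```

Passing the environment as a parameter keeps the reader easy to test without touching `process.env`.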
## About
This project achieves multiple things:
* Download PhET content
* Generate an Index for said content
* Generate ZIM file(s) containing content and index
Things this project does not yet do, but should:
* Generate Android APK
## Usage
The functionality is split into 5 ```npm scripts```:
* ```npm run setup``` - deletes state from previous runs
* ```npm run get``` - downloads PhET simulations in specified languages
* ```npm run transform``` - prepares the content and media files
* ```npm run export``` - generates ZIM file(s)
* ```npm start``` - runs all of the above in sequence
The steps get, transform and export have their own output directories:
* ```get``` outputs HTML and PNG files to ```state/get```
* ```transform``` outputs intermediate files to ```state/transform```
* ```export``` outputs HTML and PNG files to ```state/export``` and ZIM file(s) to ```output/``` (the default, unless overridden with `--output`)
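Running the four individual steps in order, stopping at the first failure, can be sketched as a small Node.js driver. This is not how `npm start` is implemented in the project; `runPipeline` and its injectable `run` parameter are hypothetical, and the default runner assumes `npm` is on the `PATH` (on Windows the executable is `npm.cmd`).

```javascript
const { spawnSync } = require('node:child_process');

// Step names are the npm scripts listed above.
const STEPS = ['setup', 'get', 'transform', 'export'];

// Default runner: executes `npm run <step>` and returns its exit status.
function npmRun(step) {
  return spawnSync('npm', ['run', step], { stdio: 'inherit' }).status;
}

// Hypothetical driver: runs each step in order and stops at the first failure.
function runPipeline(run = npmRun) {
  for (const step of STEPS) {
    const status = run(step);
    if (status !== 0) {
      return { failedAt: step, status };
    }
  }
  return { failedAt: null, status: 0 };
}

// Usage (from the repository root): const result = runPipeline();
```

Injecting the runner makes it possible to dry-run the sequence, for example `runPipeline((step) => { console.log(step); return 0; })`.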
## License
[Apache](https://www.apache.org/licenses/LICENSE-2.0) or later, see
[LICENSE](LICENSE) for more details.