{"id":13588565,"url":"https://github.com/webrecorder/browsertrix-behaviors","last_synced_at":"2025-04-13T08:24:19.987Z","repository":{"id":42055770,"uuid":"341434390","full_name":"webrecorder/browsertrix-behaviors","owner":"webrecorder","description":"Automated behaviors that run in browser to interact with complex sites automatically. Used by ArchiveWeb.page and Browsertrix Crawler.","archived":false,"fork":false,"pushed_at":"2025-04-08T13:23:18.000Z","size":458,"stargazers_count":39,"open_issues_count":18,"forks_count":20,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-04-10T05:29:31.504Z","etag":null,"topics":["automation","browsertrix-behaviors","puppeteer"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/webrecorder.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":"webrecorder"}},"created_at":"2021-02-23T05:02:15.000Z","updated_at":"2025-04-08T13:22:27.000Z","dependencies_parsed_at":"2024-12-16T18:20:56.193Z","dependency_job_id":"31e8a3c1-a6a8-48f5-ac30-b6b467234011","html_url":"https://github.com/webrecorder/browsertrix-behaviors","commit_stats":{"total_commits":77,"total_committers":9,"mean_commits":8.555555555555555,"dds":"0.19480519480519476","last_synced_commit":"30752409408ce86e53769753efde7b59ac39ef72"},"previous_names":[],"tags_count":13,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/webrecorder%2Fbrowsertrix-behaviors","tags_url":"https://repos.ecosyst
e.ms/api/v1/hosts/GitHub/repositories/webrecorder%2Fbrowsertrix-behaviors/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/webrecorder%2Fbrowsertrix-behaviors/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/webrecorder%2Fbrowsertrix-behaviors/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/webrecorder","download_url":"https://codeload.github.com/webrecorder/browsertrix-behaviors/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248682153,"owners_count":21144805,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["automation","browsertrix-behaviors","puppeteer"],"created_at":"2024-08-01T15:06:47.556Z","updated_at":"2025-04-13T08:24:19.981Z","avatar_url":"https://github.com/webrecorder.png","language":"TypeScript","funding_links":["https://github.com/sponsors/webrecorder"],"categories":["TypeScript"],"sub_categories":[],"readme":"# Browsertrix Behaviors\n\n\u003cdetails\u003e\n  \u003csummary\u003e\u003cb\u003eBehavior Testing Results\u003c/b\u003e\u003c/summary\u003e\n\n[![Autoscroll Behavior](https://github.com/webrecorder/browsertrix-behaviors/actions/workflows/autoscroll.yaml/badge.svg)](https://github.com/webrecorder/browsertrix-behaviors/actions/workflows/autoscroll.yaml)\n\n[![Autoplay Behavior: Youtube](https://github.com/webrecorder/browsertrix-behaviors/actions/workflows/autoplay-youtube.yaml/badge.svg)](https://github.com/webrecorder/browsertrix-behaviors/actions/workflows/autoplay-youtube.yaml)\n\n[![Autoplay 
Behavior: Vimeo](https://github.com/webrecorder/browsertrix-behaviors/actions/workflows/autoplay-vimeo.yaml/badge.svg)](https://github.com/webrecorder/browsertrix-behaviors/actions/workflows/autoplay-vimeo.yaml)\n\n[![Instagram Behavior (Logged In)](https://github.com/webrecorder/browsertrix-behaviors/actions/workflows/instagram.yaml/badge.svg)](https://github.com/webrecorder/browsertrix-behaviors/actions/workflows/instagram.yaml)\n\n[![Twitter Behavior](https://github.com/webrecorder/browsertrix-behaviors/actions/workflows/twitter.yaml/badge.svg)](https://github.com/webrecorder/browsertrix-behaviors/actions/workflows/twitter.yaml)\n\n[![Twitter Behavior (Logged In)](https://github.com/webrecorder/browsertrix-behaviors/actions/workflows/twitter-logged-in.yaml/badge.svg)](https://github.com/webrecorder/browsertrix-behaviors/actions/workflows/twitter-logged-in.yaml)\n\n[![Facebook Behavior: Page (Owner Logged In)](https://github.com/webrecorder/browsertrix-behaviors/actions/workflows/facebook-page.yaml/badge.svg)](https://github.com/webrecorder/browsertrix-behaviors/actions/workflows/facebook-page.yaml)\n\n[![Facebook Behavior: Page Photos (Owner Logged In)](https://github.com/webrecorder/browsertrix-behaviors/actions/workflows/facebook-photos.yaml/badge.svg)](https://github.com/webrecorder/browsertrix-behaviors/actions/workflows/facebook-photos.yaml)\n\n[![Facebook Behavior: Page Videos (Owner Logged In)](https://github.com/webrecorder/browsertrix-behaviors/actions/workflows/facebook-videos.yaml/badge.svg)](https://github.com/webrecorder/browsertrix-behaviors/actions/workflows/facebook-videos.yaml)\n\n\u003c/details\u003e\n\nA set of behaviors injected into the browser to perform certain operations on a page, such as scrolling, fetching additional URLs, or performing\ncustomized actions for social-media sites.\n\n## Usage\n\nThe behaviors are compiled into a single file, `dist/behaviors.js`, which can be injected into any modern browser to load the behavior 
system.\nNo additional dependencies are required, and the behaviors file can be pasted directly into your browser.\n\nThe file can be injected in a number of ways, using tools like puppeteer/playwright, a browser extension content script, a devtools Snippet, or even a regular\n`\u003cscript\u003e` tag. Injecting the behaviors into the browser is outside the scope of this repo, but here are a few ways you can try the behaviors:\n\nFor an extensive walkthrough of creating your own custom behaviors to use with Browsertrix and Browsertrix Crawler, [follow the Tutorial](https://crawler.docs.browsertrix.com/user-guide/behaviors/#creating-custom-behaviors).\n\n### Copy \u0026 Paste Behaviors (for testing)\n\nTo test out the behaviors in your current browser, you can:\n\n1. Go to [dist/behaviors.js](dist/behaviors.js).\n2. Copy the file (it is minified, so it will be on one line).\n3. Open a web page, such as one that has a custom behavior, like: [https://twitter.com/webrecorder_io](https://twitter.com/webrecorder_io)\n4. Open the devtools console and paste the script.\n5. Enter `self.__bx_behaviors.run();`\n6. 
You should see the Twitter page automatically scrolling and visiting tweets.\n\n### Use Puppeteer\n\nTo integrate behaviors into an automated workflow, here is a short example using puppeteer.\n\n```javascript\n// assumes browsertrix-behaviors is installed as a node module,\n// `page` is an existing puppeteer Page, and this code runs inside an async function\nconst fs = require(\"fs\");\n\nconst behaviors = fs.readFileSync(\"./node_modules/browsertrix-behaviors/dist/behaviors.js\", \"utf-8\");\n\n// inject and initialize the behaviors in every document loaded after this call\nawait page.evaluateOnNewDocument(behaviors + `\nself.__bx_behaviors.init({\n  autofetch: true,\n  autoplay: true,\n  autoscroll: true,\n  siteSpecific: true,\n});\n`);\n\n// navigate to the target page here, so that the injected script is loaded\n\n// call and await run on top frame and all child iframes\nawait Promise.allSettled(page.frames().map(frame =\u003e frame.evaluate(\"self.__bx_behaviors.run()\")));\n\n```\n\nSee [Browsertrix Crawler](https://github.com/webrecorder/browsertrix-crawler) for a complete working example of injection using puppeteer.\n\n## Initialization\n\nOnce the behavior script has been injected, run: `__bx_behaviors.init(opts)` to initialize which behaviors should be used. 
`opts` includes the following options:\n\n- `autofetch` - enable background autofetching of img srcsets, stylesheets (when possible) and any data-* attribute.\n- `autoplay` - attempt to automatically play any video/audio, or fetch the URLs for any video streams found on the page.\n- `autoscroll` - attempt to repeatedly scroll the page to the bottom as far as possible.\n- `timeout` - set a timeout (in ms) for all behaviors to finish.\n- `siteSpecific` - run a site-specific behavior if available.\n- `log` - a function, or the name of a global function given as a string, to receive log messages from behaviors.\n\n### Background Behaviors\n\nThe `autoplay` and `autofetch` options enable background behaviors, which run as soon as `init(...)` is called, or as soon as the page is loaded.\nBackground behaviors do not change the page, but attempt to do additional fetching to ensure more resources are loaded.\nBackground behaviors can be used with user-directed browsing, and can also be loaded in any iframes on the page.\n\n### Active Behaviors\n\nThe `autoscroll` and `siteSpecific` options enable 'active' behaviors, which modify the page and run until they finish or time out.\n\nIf both `siteSpecific` and `autoscroll` are specified, only one behavior is run. If a site-specific behavior exists, it takes precedence over auto-scroll; otherwise, auto-scroll is used.\n\nCurrently, site-specific behaviors are available for:\n\n- Twitter\n- Instagram\n\nAdditional site-specific behaviors can be added to the [site](./src/site) directory.\n\nTo run the active behavior, call: `await __bx_behaviors.run()` after init.\n\nAlternatively, calling `await __bx_behaviors.run(opts)` will also call `init(opts)` if init has not been called before.\n\nThe promise returned by run will wait for the active behavior to finish, or for the `timeout` time to be reached. It will also ensure any pending autoplay requests are started for the `autoplay` behavior.\n\n## Logging\n\nBy default, behaviors will log debug messages to `console.log`. 
To disable this logging, set `log: false` in the init options.\n\nThis param can also be set to the name of a custom logging function, given as a string. For example, to have behavior event messages be passed to `self.my_log`, set `log: \"my_log\"` in the options.\n\nAdditional logging options may be added soon.\n\n## Building\n\nBrowsertrix Behaviors uses webpack to build. Run `yarn run build` to build the latest `dist/behaviors.js`.\n\nShared utility functions can be added to `utils.js`, while site-specific behaviors can be added to `lib/site`.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwebrecorder%2Fbrowsertrix-behaviors","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwebrecorder%2Fbrowsertrix-behaviors","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwebrecorder%2Fbrowsertrix-behaviors/lists"}