{"id":20388350,"url":"https://github.com/probe-lab/tiros","last_synced_at":"2025-04-12T10:37:56.186Z","repository":{"id":105010727,"uuid":"602465073","full_name":"probe-lab/tiros","owner":"probe-lab","description":"🌐 An IPFS website measurement tool","archived":false,"fork":false,"pushed_at":"2025-01-23T11:44:19.000Z","size":8097,"stargazers_count":4,"open_issues_count":6,"forks_count":3,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-26T05:33:09.798Z","etag":null,"topics":["golang","ipfs","kademlia","libp2p"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/probe-lab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-02-16T09:12:41.000Z","updated_at":"2024-10-17T18:57:06.000Z","dependencies_parsed_at":"2024-06-21T19:07:48.083Z","dependency_job_id":"e87b4c3d-40de-443b-8a64-2dde6ecddd52","html_url":"https://github.com/probe-lab/tiros","commit_stats":null,"previous_names":["dennis-tra/tiros","probe-lab/tiros","plprobelab/tiros"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/probe-lab%2Ftiros","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/probe-lab%2Ftiros/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/probe-lab%2Ftiros/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/probe-lab%2Ftiros/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/probe
-lab","download_url":"https://codeload.github.com/probe-lab/tiros/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248555407,"owners_count":21123891,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["golang","ipfs","kademlia","libp2p"],"created_at":"2024-11-15T03:08:55.425Z","updated_at":"2025-04-12T10:37:56.115Z","avatar_url":"https://github.com/probe-lab.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Tiros\n\n[![standard-readme compliant](https://img.shields.io/badge/readme%20style-standard-brightgreen.svg?style=flat-square)](https://github.com/RichardLitt/standard-readme)\n\nTiros is an IPFS website measurement tool. It is intended to run on AWS ECS in multiple regions.\n\n## Table of Contents\n\n- [Tiros](#tiros)\n  - [Table of Contents](#table-of-contents)\n  - [Measurement Methodology](#measurement-methodology)\n  - [Measurement Metrics](#measurement-metrics)\n  - [Run](#run)\n  - [Development](#development)\n    - [Migrations](#migrations)\n  - [Alternative IPFS Implementation](#alternative-ipfs-implementation)\n  - [Maintainers](#maintainers)\n  - [Contributing](#contributing)\n  - [License](#license)\n\n## Measurement Methodology\n\nWe are running Tiros as a scheduled AWS ECS task in seven different AWS regions. These regions are:\n\n- `eu-central-1`\n- `ap-south-1`\n- `ap-southeast-2`\n- `sa-east-1`\n- `us-east-2`\n- `us-west-1`\n- `af-south-1`\n\nEach ECS task consists of three containers:\n\n1. `scheduler` (this repository)\n2. 
`chrome` - via [`browserless/chrome`](https://github.com/browserless/chrome)\n3. `ipfs` - an IPFS implementation like [ipfs/kubo](https://hub.docker.com/r/ipfs/kubo/) or [ipfs/helia-http-gateway](https://github.com/ipfs/helia-http-gateway)\n\nWhen the implementation is `kubo`, we run it with `LIBP2P_RCMGR=0`, which disables the [libp2p Network Resource Manager](https://github.com/libp2p/go-libp2p-resource-manager#readme).\n\nThe `scheduler` gets configured with a list of websites that will then be probed. A typical website config looks like this: `ipfs.io,docs.libp2p.io,ipld.io`.\nThe scheduler probes each website via the IPFS implementation by requesting `http://localhost:8080/ipns/\u003cwebsite\u003e` and via HTTP by requesting `https://\u003cwebsite\u003e`.\nPort `8080` is the default `kubo` HTTP-Gateway port. The `scheduler` uses [`go-rod`](https://github.com/go-rod/rod) to communicate with the `browserless/chrome` instance.\nThe following excerpt is a gist of what's happening when requesting a website:\n\n```go\nbrowser := rod.New().Context(ctx).ControlURL(\"ws://localhost:3000\") // default CDP chrome port\n\nbrowser.Connect()\ndefer browser.Close()\n\nvar metricsStr string\nrod.Try(func() {\n    browser = browser.Context(c.Context).MustIncognito() // first defense to prevent hitting the cache\n    browser.MustSetCookies()                             // second defense to prevent hitting the cache (empty args clears cookies)\n    \n    page := browser.MustPage() // Get a handle of a new page in our incognito browser\n    \n    page.MustEvalOnNewDocument(jsOnNewDocument) // third defense to prevent hitting the cache - clears the cache by running `localStorage.clear()`\n    \n    // disable caching in general\n    proto.NetworkSetCacheDisabled{CacheDisabled: true}.Call(page) // fourth defense to prevent hitting the cache\n\n\n    // finally navigate to url and fail out of rod.Try by panicking\n    page.Timeout(websiteRequestTimeout).Navigate(url)\n    
page.Timeout(websiteRequestTimeout).WaitLoad()\n    page.Timeout(websiteRequestTimeout).WaitIdle(time.Minute)\n\n    page.MustEval(wrapInFn(jsTTIPolyfill)) // add TTI polyfill\n    page.MustEval(wrapInFn(jsWebVitalsIIFE)) // add web-vitals\n\n    // finally actually measure the stuff\n    metricsStr = page.MustEval(jsMeasurement).Str()\n    \n    page.MustClose()\n})\n// parse metricsStr\n```\n\n`jsOnNewDocument` contains JavaScript that is executed on a new page before anything else happens. We subscribe to performance events, which is necessary for the TTI polyfill, and we clear the local storage. This is the code ([link to source](https://github.com/probe-lab/tiros/blob/main/js/onNewDocument.js)):\n\n```javascript\n// From https://github.com/GoogleChromeLabs/tti-polyfill#usage\n!function(){if('PerformanceLongTaskTiming' in window){var g=window.__tti={e:[]};\n    g.o=new PerformanceObserver(function(l){g.e=g.e.concat(l.getEntries())});\n    g.o.observe({entryTypes:['longtask']})}}();\n\nlocalStorage.clear();\n```\n\nThen, after the website has loaded, we add a [TTI polyfill](https://github.com/probe-lab/tiros/blob/main/js/tti-polyfill.js) and [web-vitals](https://github.com/probe-lab/tiros/blob/main/js/web-vitals.iife.js) to the page.\n\nWe got the tti-polyfill from [GoogleChromeLabs/tti-polyfill](https://github.com/GoogleChromeLabs/tti-polyfill/blob/master/tti-polyfill.js) (archived in favor of the [First Input Delay](https://web.dev/fid/) metric).\nWe got the web-vitals JavaScript from [GoogleChrome/web-vitals](https://github.com/GoogleChrome/web-vitals) by building it ourselves with `npm run build` and then copying the `web-vitals.iife.js` (`iife` = immediately invoked function expression).\n\nThen we execute the following JavaScript on that page ([link to source](https://github.com/probe-lab/tiros/blob/main/js/measurement.js)):\n\n```javascript\nasync () =\u003e {\n\n    const onTTI = async (callback) =\u003e {\n        const tti = await 
window.ttiPolyfill.getFirstConsistentlyInteractive({})\n\n        // https://developer.chrome.com/docs/lighthouse/performance/interactive/#how-lighthouse-determines-your-tti-score\n        let rating = \"good\";\n        if (tti \u003e 7300) {\n            rating = \"poor\";\n        } else if (tti \u003e 3800) {\n            rating = \"needs-improvement\";\n        }\n\n        callback({\n            name: \"TTI\",\n            value: tti,\n            rating: rating,\n            delta: tti,\n            entries: [],\n        });\n    };\n\n    const {onCLS, onFCP, onLCP, onTTFB} = window.webVitals;\n\n    const wrapMetric = (metricFn) =\u003e\n        new Promise((resolve, reject) =\u003e {\n            const timeout = setTimeout(() =\u003e resolve(null), 10000);\n            metricFn(\n                (metric) =\u003e {\n                    clearTimeout(timeout);\n                    resolve(metric);\n                },\n                {reportAllChanges: true}\n            );\n        });\n\n    const data = await Promise.all([\n        wrapMetric(onCLS),\n        wrapMetric(onFCP),\n        wrapMetric(onLCP),\n        wrapMetric(onTTFB),\n        wrapMetric(onTTI),\n    ]);\n\n    return JSON.stringify(data);\n}\n```\n\nThis function will return a JSON array of the following format:\n\n```json\n[\n  {\n    \"name\": \"CLS\",\n    \"value\": 1.3750143983783765e-05,\n    \"rating\": \"good\",\n    ...\n  },\n  {\n    \"name\": \"FCP\",\n    \"value\": 872,\n    \"rating\": \"good\",\n    ...\n  },\n  {\n    \"name\": \"LCP\",\n    \"value\": 872,\n    \"rating\": \"good\",\n    ...\n  },\n  {\n    \"name\": \"TTFB\",\n    \"value\": 717,\n    \"rating\": \"good\",\n    ...\n  },\n  {\n    \"name\": \"TTI\",\n    \"value\": 999,\n    \"rating\": \"good\",\n    ...\n  }\n]\n```\n\nIf the website request went through the IPFS gateway we're running one round of garbage collection by calling the [`/api/v0/repo/gc` 
endpoint](https://docs.ipfs.tech/reference/kubo/rpc/#api-v0-repo-gc). With this, we make sure that the next request to that website won't come from the local kubo node cache.\n\nTo also measure a \"warmed up\" kubo node, we configured a \"settle time\". This is just the time to wait before the first website requests are made. After the scheduler has looped through all websites, we configured another settle time of 10min before all websites are requested again. Each run after a settle time also has a \"times\" counter, which is set to `5` right now in our deployment. This means that we request each website 5 times per measurement type after each settle time. The loop looks like this:\n\n```go\nfor _, settle := range c.IntSlice(\"settle-times\") {\n    time.Sleep(time.Duration(settle) * time.Second)\n    for i := 0; i \u003c c.Int(\"times\"); i++ {\n        for _, mType := range []string{models.MeasurementTypeIPFS, models.MeasurementTypeHTTP} {\n            for _, website := range websites {\n\n                pr, _ := t.Probe(c, websiteURL(c, website, mType))\n                \n                t.Save(c, pr, website, mType, i)\n\n                if mType == models.MeasurementTypeIPFS {\n                    t.GarbageCollect(c.Context)\n                }\n            }\n        }\n    }\n}\n```\n\nSo in total, each run makes `settle-times * times * len([http, ipfs]) * len(websites)` website requests. In our case it's `2 * 5 * 2 * 14 = 280` requests. 
This takes around `1h` because some websites time out and the second settle time is configured to be `10m`.\n\n## Measurement Metrics\n\nI read up on how to measure website performance and came across this list:\n\nhttps://developer.mozilla.org/en-US/docs/Learn/Performance/Perceived_performance\n\nTo quote the website:\n\n\u003e ## [Performance metrics](https://developer.mozilla.org/en-US/docs/Learn/Performance/Perceived_performance#performance_metrics)\n\u003e \n\u003e There is no single metric or test that can be run on a site to evaluate how a user \"feels\". However, there are a number of metrics that can be \"helpful indicators\":\n\u003e \n\u003e [First paint](https://developer.mozilla.org/en-US/docs/Glossary/First_paint)\n\u003e The time to start of first paint operation. Note that this change may not be visible; it can be a simple background color update or something even less noticeable.\n\u003e \n\u003e [First Contentful Paint](https://developer.mozilla.org/en-US/docs/Glossary/First_contentful_paint) (FCP)\n\u003e The time until first significant rendering (e.g. of text, foreground or background image, canvas or SVG, etc.). Note that this content is not necessarily useful or meaningful.\n\u003e \n\u003e [First Meaningful Paint](https://developer.mozilla.org/en-US/docs/Glossary/First_meaningful_paint) (FMP)\n\u003e The time at which useful content is rendered to the screen.\n\u003e \n\u003e [Largest Contentful Paint](https://wicg.github.io/largest-contentful-paint/) (LCP)\n\u003e The render time of the largest content element visible in the viewport.\n\u003e \n\u003e [Speed index](https://developer.mozilla.org/en-US/docs/Glossary/Speed_index)\n\u003e Measures the average time for pixels on the visible screen to be painted.\n\u003e \n\u003e [Time to interactive](https://developer.mozilla.org/en-US/docs/Glossary/Time_to_interactive)\n\u003e Time until the UI is available for user interaction (i.e. 
the last [long task](https://developer.mozilla.org/en-US/docs/Glossary/Long_task) of the load process finishes).\n\nI think the relevant metrics on this list for us are `First Contentful Paint`, `Largest Contentful Paint`, and `Time to interactive`. `First Meaningful Paint` is deprecated (you can see that if you follow the link) and they recommend: \"[...] consider using the [LargestContentfulPaint API](https://wicg.github.io/largest-contentful-paint/) instead.\".\n\n`First paint` would include changes that \"may not be visible\", so I'm not particularly fond of this metric.\n\n`Speed index` seems to be very much website-specific. By that, I mean that the network wouldn't play a role in this metric. We would measure the performance of the website itself. I would argue that this is not something we want.\n\nBesides the above metrics, we should still measure `timeToFirstByte`. According to https://web.dev/ttfb/ the metric would be the time difference between `startTime` and `responseStart`:\n\n![image](https://user-images.githubusercontent.com/11836793/224770610-1a02a082-96e6-4198-8af6-4682d76f2d41.png)\n\nIn the above graph you can also see the two timestamps `domContentLoadedEventStart` and `domContentLoadedEventEnd`. So I would think that the `domContentLoaded` metric would just be the difference between the two. However, this seems to only account for the processing time of the HTML ([+ deferred JS scripts](https://developer.mozilla.org/en-US/docs/Web/API/Window/DOMContentLoaded_event)).\n\nWe could instead define `domContentLoaded` as the time difference between `startTime` and `domContentLoadedEventEnd`.\n\n## Run\n\nYou need to provide many configuration parameters to `tiros`. See this help page:\n\n```text\nNAME:\n   tiros run\n\nUSAGE:\n   tiros run [command options] [arguments...]\n\nOPTIONS:\n   --websites value [ --websites value ]          Websites to test against. 
Example: 'ipfs.io' or 'filecoin.io [$TIROS_RUN_WEBSITES]\n   --region value                                 In which region does this tiros task run in [$TIROS_RUN_REGION]\n   --settle-times value [ --settle-times value ]  a list of times to settle in seconds (default: 10, 1200) [$TIROS_RUN_SETTLE_TIMES]\n   --times value                                  number of times to test each URL (default: 3) [$TIROS_RUN_TIMES]\n   --dry-run                                      Whether to skip DB interactions (default: false) [$TIROS_RUN_DRY_RUN]\n   --db-host value                                On which host address can this clustertest reach the database [$TIROS_RUN_DATABASE_HOST]\n   --db-port value                                On which port can this clustertest reach the database (default: 0) [$TIROS_RUN_DATABASE_PORT]\n   --db-name value                                The name of the database to use [$TIROS_RUN_DATABASE_NAME]\n   --db-password value                            The password for the database to use [$TIROS_RUN_DATABASE_PASSWORD]\n   --db-user value                                The user with which to access the database to use [$TIROS_RUN_DATABASE_USER]\n   --db-sslmode value                             The sslmode to use when connecting the the database [$TIROS_RUN_DATABASE_SSL_MODE]\n   --kubo-api-port value                          port to reach the Kubo API (default: 5001) [$TIROS_RUN_KUBO_API_PORT]\n   --kubo-gateway-port value                      port to reach the Kubo Gateway (default: 8080) [$TIROS_RUN_KUBO_GATEWAY_PORT]\n   --chrome-cdp-port value                        port to reach the Chrome DevTools Protocol port (default: 3000) [$TIROS_RUN_CHROME_CDP_PORT]\n   --cpu value                                    CPU resources for this measurement run (default: 2) [$TIROS_RUN_CPU]\n   --memory value                                 Memory resources for this measurement run (default: 4096) [$TIROS_RUN_MEMORY]\n   --help, -h                        
             show help\n```\n\n## Development\n\nTo test the tool locally, you need to start a database, kubo node, and headless chrome. You can do all of this by running:\n\n```shell\ndocker compose up -d\n```\n\nThen you need to point `tiros` to your local deployment. You can do this by\n_sourcing_ the included [`.env.local`](./.env.local) file:\n\n```shell\nsource .env.local\n```\n\nFinally, run `tiros` via:\n\n```shell\ngo build -o tiros .\n./tiros run\n\n# OR\n\ngo run . run\n```\n\nAfter the run has finished, you can check the local database for the measurement data. Run:\n\n```shell\ndocker exec -it tiros-db-1 psql -U tiros_test -d tiros_test\n```\n\nto connect to the local database. If prompted for a password enter `password` or\nwhatever is set in the [`.env.local`](./.env.local) file for the `TIROS_RUN_DATABASE_PASSWORD` environment variable.\n\nExample output:\n\n```\n$ docker exec -it tiros-db-1 psql -U tiros_test -d tiros_test\npsql (14.6 (Debian 14.6-1.pgdg110+1))\nType \"help\" for help.\n\ntiros_test=# select * from runs;\n id | region |   websites    |    version     | times | cpu | memory |          updated_at           |          created_at           |          finished_at          | ipfs_impl \n----+--------+---------------+----------------+-------+-----+--------+-------------------------------+-------------------------------+-------------------------------+-----------\n  1 | local  | {filecoin.io} | 0.19.0-1963219 |     1 |   2 |   4096 | 2024-03-26 09:26:07.948483+00 | 2024-03-26 09:25:30.600963+00 | 2024-03-26 09:26:07.948482+00 | KUBO\n  2 | local  | {filecoin.io} | 0.19.0-1963219 |     1 |   2 |   4096 | 2024-03-26 
09:32:05.247122+00 | 2024-03-26 09:31:28.844582+00 | 2024-03-26 09:32:05.247122+00 | KUBO\n(2 rows)\n\n```\n\n### Migrations\n\nTo create a new migration, run:\n\n```shell\nmigrate create -ext sql -dir migrations -seq create_measurements_table\n```\n\nTo generate the database models, run:\n\n```shell\nmake models\n```\n\n## Alternative IPFS Implementation\n\nAn alternative IPFS implementation needs to support a couple of things:\n\n1. The [`/api/v0/repo/gc`](https://docs.ipfs.tech/reference/kubo/rpc/#api-v0-repo-gc) endpoint\n2. The [`/api/v0/version`](https://docs.ipfs.tech/reference/kubo/rpc/#api-v0-version) endpoint\n3. Expose a [rudimentary IPFS Gateway](https://docs.ipfs.tech/reference/http/gateway/) that at least supports resolving IPNS links\n\n## Maintainers\n\n[@dennis-tra](https://github.com/dennis-tra).\n\n## Contributing\n\nFeel free to dive in! [Open an issue](https://github.com/probe-lab/tiros/issues/new) or submit PRs.\n\n## License\n\n[MIT](LICENSE) © Dennis Trautwein\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprobe-lab%2Ftiros","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fprobe-lab%2Ftiros","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprobe-lab%2Ftiros/lists"}