{"id":16334011,"url":"https://github.com/nmattia/cio","last_synced_at":"2026-04-28T18:04:10.003Z","repository":{"id":94700634,"uuid":"145323021","full_name":"nmattia/cio","owner":"nmattia","description":"cio: cached HTTP requests for a smooth Jupyter experience!","archived":false,"fork":false,"pushed_at":"2018-08-20T21:59:56.000Z","size":14,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":4,"default_branch":"master","last_synced_at":"2024-12-26T16:42:20.442Z","etag":null,"topics":["haskell","http","jupyter","leveldb"],"latest_commit_sha":null,"homepage":"","language":"Haskell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nmattia.png","metadata":{"files":{"readme":"README.ipynb","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-08-19T17:16:21.000Z","updated_at":"2019-12-13T10:04:47.000Z","dependencies_parsed_at":"2023-04-27T04:45:24.864Z","dependency_job_id":null,"html_url":"https://github.com/nmattia/cio","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nmattia%2Fcio","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nmattia%2Fcio/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nmattia%2Fcio/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nmattia%2Fcio/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nmattia","download_url":"https://codeload.github.com/nmattia/cio/tar.gz/refs/heads/master","host":{"name":"G
itHub","url":"https://github.com","kind":"github","repositories_count":239321400,"owners_count":19619697,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["haskell","http","jupyter","leveldb"],"created_at":"2024-10-10T23:37:10.527Z","updated_at":"2025-11-01T20:30:22.469Z","avatar_url":"https://github.com/nmattia.png","language":"Haskell","funding_links":[],"categories":[],"sub_categories":[],"readme":"{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"# cio: cached HTTP requests for a smooth Jupyter experience!\\n\",\n    \"\\n\",\n    \"This library provides a thin wrapper around the [wreq](http://serpentine.com/wreq) library (a simple HTTP client library). It is meant to be used with [Jupyter](http://jupyter.org/): all requests will be stored _on disk_ and served from the cache subsequently, even if your kernel gets restarted. The cache lookups are near-instantaneous thanks to the amazing [LevelDB](http://leveldb.org/) library. 
You can use `cio` just like you would `wreq` -- instead of importing `Network.Wreq`, import `CIO` (which stands for Cached IO):\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"{-# LANGUAGE OverloadedStrings #-}\\n\",\n    \"\\n\",\n    \"import CIO\\n\",\n    \"import Data.Aeson.Lens\\n\",\n    \"import Control.Lens\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"Then use the functions you are used to, like `get`:\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 2,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"\\\"Nicolas Mattia\\\"\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    }\n   ],\n   \"source\": [\n    \"get \\\"https://api.github.com/users/nmattia\\\" \u003c\u0026\u003e\\n\",\n    \"    (^.responseBody.key \\\"name\\\"._String)\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Building cio\\n\",\n    \"\\n\",\n    \"The simplest way to build this library is to use Nix. 
To get started, clone the cio repository ([nmattia/cio](https://github.com/nmattia/cio)), then run the following:\\n\",\n    \"\\n\",\n    \"``` shell\\n\",\n    \"$ nix-shell\\n\",\n    \"helpers:\\n\",\n    \"\u003e cio_build\\n\",\n    \"\u003e cio_ghci\\n\",\n    \"\u003e cio_notebook\\n\",\n    \"\u003e cio_readme_gen\\n\",\n    \"```\\n\",\n    \"\\n\",\n    \"The helper functions will respectively build `cio`, start a `ghci` session for `cio`, start a Jupyter notebook with `cio` loaded, and regenerate the README (this file is a Jupyter notebook!).\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Using cio\\n\",\n    \"\\n\",\n    \"Three functions are provided on top of `wreq`:\\n\",\n    \"* `get :: String -\u003e CIO Response` performs a (cached) request to the given URL.\\n\",\n    \"* `getWith :: Options -\u003e String -\u003e CIO Response` performs a (cached) request to the given URL using the provided `wreq` [`Options`](http://hackage.haskell.org/package/wreq-0.5.2.1/docs/Network-Wreq.html#t:Options).\\n\",\n    \"* `getAllWith :: Options -\u003e String -\u003e Producer CIO Response` performs several (cached) requests by lazily following the `Link` headers (see for instance [GitHub's pagination mechanism](https://developer.github.com/v3/guides/traversing-with-pagination/)).\\n\",\n    \"\\n\",\n    \"Let's see what happens when a request is performed twice. 
First let's write a function for timing the requests:\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 3,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"import Control.Monad.IO.Class\\n\",\n    \"import Data.Time\\n\",\n    \"\\n\",\n    \"timeIt :: CIO a -\u003e CIO (NominalDiffTime, a)\\n\",\n    \"timeIt act = do\\n\",\n    \"    start \u003c- liftIO $ getCurrentTime\\n\",\n    \"    res \u003c- act\\n\",\n    \"    stop \u003c- liftIO $ getCurrentTime\\n\",\n    \"    pure (diffUTCTime stop start, res)\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"Then we'll generate a unique string which we'll use as a dummy parameter in order to force `cio` to perform the request the first time, so that we can time it:\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 4,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"import Data.UUID (toText)\\n\",\n    \"import System.Random (randomIO)\\n\",\n    \"\\n\",\n    \"uuid \u003c- toText \u003c$\u003e randomIO\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"Finally we use `getWith` and set the `dummy` query parameter to the `UUID` we just generated and time the request:\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 5,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"(1.214306799s,\\\"Nicolas Mattia\\\")\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    }\n   ],\n   \"source\": [\n    \"timeIt $ getWith (param \\\"dummy\\\" .~ [uuid] $ defaults) \\\"https://api.github.com/users/nmattia\\\" \u003c\u0026\u003e\\n\",\n    \"    (^.responseBody.key \\\"name\\\"._String)\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"That's a pretty long time! 
When playing around with data in a Jupyter notebook, waiting around for requests to complete is a real productivity and creativity killer. Let's see what `cio` can do for us:\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 6,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"(0.000248564s,\\\"Nicolas Mattia\\\")\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    }\n   ],\n   \"source\": [\n    \"timeIt $ getWith (param \\\"dummy\\\" .~ [uuid] $ defaults) \\\"https://api.github.com/users/nmattia\\\" \u003c\u0026\u003e\\n\",\n    \"    (^.responseBody.key \\\"name\\\"._String)\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"Pretty nice! You might have noticed that the `CIO` results were printed out, as `Show a =\u003e IO a` would be in GHCi. As mentioned before, `cio` is optimized for Jupyter workflows, and as such all `Show`-able results will be printed directly to the notebook's output. Lists of `Show`-ables will be pretty-printed, which we'll demonstrate by playing with `cio`'s other cool feature: lazily following page links.\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 7,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"import Data.Conduit\\n\",\n    \"import Data.Conduit.Combinators as C\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"To fetch data lazily, `cio` uses the [`conduit` library](http://hackage.haskell.org/package/conduit). The `getAllWith` function is a `Producer` of `Response`s (sorry, a `ConduitT i Response CIO ()`) which are served from the cache when possible. 
Here we ask GitHub to give us only two results per page, and `cio` will iterate the pages until the five expected items have been fetched (if you do the math that's about 3 pages):\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 8,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"\\\"jgm/pandoc\\\"\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"\\\"koalaman/shellcheck\\\"\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"\\\"PostgREST/postgrest\\\"\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"\\\"purescript/purescript\\\"\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"\\\"elm/compiler\\\"\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    }\n   ],\n   \"source\": [\n    \"sourceToList $ \\n\",\n    \"    getAllWith \\n\",\n    \"        (defaults \\n\",\n    \"        \u0026 param \\\"q\\\" .~ [\\\"language:haskell\\\"] \\n\",\n    \"        \u0026 param \\\"sort\\\" .~ [\\\"stars\\\"]\\n\",\n    \"        \u0026 param \\\"per_page\\\" .~ [\\\"2\\\"])\\n\",\n    \"        \\\"https://api.github.com/search/repositories\\\"\\n\",\n    \"    .| awaitForever (C.yieldMany . 
(\\n\",\n    \"        ^..responseBody\\n\",\n    \"        .key \\\"items\\\"\\n\",\n    \"        .values\\n\",\n    \"        .key \\\"full_name\\\"\\n\",\n    \"        ._String))\\n\",\n    \"    .| C.take 5\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"## What if something goes wrong?\\n\",\n    \"\\n\",\n    \"What's the second hardest thing in computer science, besides naming and off-by-one errors? Cache invalidation, of course. For the cache's sake, all your requests should be idempotent, but unfortunately that's not always possible. Here `cio` doesn't assume anything and instead lets you dirty cache entries yourself, using either of these two functions:\\n\",\n    \"\\n\",\n    \"* `dirtyReq :: String -\u003e CIO ()`, like `get`, but instead of fetching the response it dirties the entry in the cache.\\n\",\n    \"* `dirtyReqWith :: Options -\u003e String -\u003e CIO ()`, like `getWith`, but instead of fetching the response it dirties the entry in the cache.\\n\",\n    \"\\n\",\n    \"If things went _really_ wrong, you can always wipe the cache entirely...\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"## ... 
but where's the cache?\\n\",\n    \"\\n\",\n    \"The cache is set globally (reminder: this is a Jupyter-optimized workflow):\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 9,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"\\\"requests.cache\\\"\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    }\n   ],\n   \"source\": [\n    \"getCacheFile\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"If you need a different cache file you can either change the global cache file:\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 10,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/html\": [\n       \"\u003cstyle\u003e/* Styles used for the Hoogle display in the pager */\\n\",\n       \".hoogle-doc {\\n\",\n       \"display: block;\\n\",\n       \"padding-bottom: 1.3em;\\n\",\n       \"padding-left: 0.4em;\\n\",\n       \"}\\n\",\n       \".hoogle-code {\\n\",\n       \"display: block;\\n\",\n       \"font-family: monospace;\\n\",\n       \"white-space: pre;\\n\",\n       \"}\\n\",\n       \".hoogle-text {\\n\",\n       \"display: block;\\n\",\n       \"}\\n\",\n       \".hoogle-name {\\n\",\n       \"color: green;\\n\",\n       \"font-weight: bold;\\n\",\n       \"}\\n\",\n       \".hoogle-head {\\n\",\n       \"font-weight: bold;\\n\",\n       \"}\\n\",\n       \".hoogle-sub {\\n\",\n       \"display: block;\\n\",\n       \"margin-left: 0.4em;\\n\",\n       \"}\\n\",\n       \".hoogle-package {\\n\",\n       \"font-weight: bold;\\n\",\n       \"font-style: italic;\\n\",\n       \"}\\n\",\n       \".hoogle-module {\\n\",\n       \"font-weight: bold;\\n\",\n       \"}\\n\",\n       \".hoogle-class {\\n\",\n       \"font-weight: bold;\\n\",\n       \"}\\n\",\n       \".get-type {\\n\",\n       \"color: green;\\n\",\n       \"font-weight: bold;\\n\",\n       
\"font-family: monospace;\\n\",\n       \"display: block;\\n\",\n       \"white-space: pre-wrap;\\n\",\n       \"}\\n\",\n       \".show-type {\\n\",\n       \"color: green;\\n\",\n       \"font-weight: bold;\\n\",\n       \"font-family: monospace;\\n\",\n       \"margin-left: 1em;\\n\",\n       \"}\\n\",\n       \".mono {\\n\",\n       \"font-family: monospace;\\n\",\n       \"display: block;\\n\",\n       \"}\\n\",\n       \".err-msg {\\n\",\n       \"color: red;\\n\",\n       \"font-style: italic;\\n\",\n       \"font-family: monospace;\\n\",\n       \"white-space: pre;\\n\",\n       \"display: block;\\n\",\n       \"}\\n\",\n       \"#unshowable {\\n\",\n       \"color: red;\\n\",\n       \"font-weight: bold;\\n\",\n       \"}\\n\",\n       \".err-msg.in.collapse {\\n\",\n       \"padding-top: 0.7em;\\n\",\n       \"}\\n\",\n       \".highlight-code {\\n\",\n       \"white-space: pre;\\n\",\n       \"font-family: monospace;\\n\",\n       \"}\\n\",\n       \".suggestion-warning { \\n\",\n       \"font-weight: bold;\\n\",\n       \"color: rgb(200, 130, 0);\\n\",\n       \"}\\n\",\n       \".suggestion-error { \\n\",\n       \"font-weight: bold;\\n\",\n       \"color: red;\\n\",\n       \"}\\n\",\n       \".suggestion-name {\\n\",\n       \"font-weight: bold;\\n\",\n       \"}\\n\",\n       \"\u003c/style\u003e\u003cspan class='get-type'\u003esetCacheFile :: FilePath -\u003e IO ()\u003c/span\u003e\"\n      ],\n      \"text/plain\": [\n       \"setCacheFile :: FilePath -\u003e IO ()\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    }\n   ],\n   \"source\": [\n    \":t setCacheFile\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"or run your `CIO` code manually:\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 11,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/html\": [\n       \"\u003cstyle\u003e/* Styles used for the 
Hoogle display in the pager */\\n\",\n       \".hoogle-doc {\\n\",\n       \"display: block;\\n\",\n       \"padding-bottom: 1.3em;\\n\",\n       \"padding-left: 0.4em;\\n\",\n       \"}\\n\",\n       \".hoogle-code {\\n\",\n       \"display: block;\\n\",\n       \"font-family: monospace;\\n\",\n       \"white-space: pre;\\n\",\n       \"}\\n\",\n       \".hoogle-text {\\n\",\n       \"display: block;\\n\",\n       \"}\\n\",\n       \".hoogle-name {\\n\",\n       \"color: green;\\n\",\n       \"font-weight: bold;\\n\",\n       \"}\\n\",\n       \".hoogle-head {\\n\",\n       \"font-weight: bold;\\n\",\n       \"}\\n\",\n       \".hoogle-sub {\\n\",\n       \"display: block;\\n\",\n       \"margin-left: 0.4em;\\n\",\n       \"}\\n\",\n       \".hoogle-package {\\n\",\n       \"font-weight: bold;\\n\",\n       \"font-style: italic;\\n\",\n       \"}\\n\",\n       \".hoogle-module {\\n\",\n       \"font-weight: bold;\\n\",\n       \"}\\n\",\n       \".hoogle-class {\\n\",\n       \"font-weight: bold;\\n\",\n       \"}\\n\",\n       \".get-type {\\n\",\n       \"color: green;\\n\",\n       \"font-weight: bold;\\n\",\n       \"font-family: monospace;\\n\",\n       \"display: block;\\n\",\n       \"white-space: pre-wrap;\\n\",\n       \"}\\n\",\n       \".show-type {\\n\",\n       \"color: green;\\n\",\n       \"font-weight: bold;\\n\",\n       \"font-family: monospace;\\n\",\n       \"margin-left: 1em;\\n\",\n       \"}\\n\",\n       \".mono {\\n\",\n       \"font-family: monospace;\\n\",\n       \"display: block;\\n\",\n       \"}\\n\",\n       \".err-msg {\\n\",\n       \"color: red;\\n\",\n       \"font-style: italic;\\n\",\n       \"font-family: monospace;\\n\",\n       \"white-space: pre;\\n\",\n       \"display: block;\\n\",\n       \"}\\n\",\n       \"#unshowable {\\n\",\n       \"color: red;\\n\",\n       \"font-weight: bold;\\n\",\n       \"}\\n\",\n       \".err-msg.in.collapse {\\n\",\n       \"padding-top: 0.7em;\\n\",\n       \"}\\n\",\n       
\".highlight-code {\\n\",\n       \"white-space: pre;\\n\",\n       \"font-family: monospace;\\n\",\n       \"}\\n\",\n       \".suggestion-warning { \\n\",\n       \"font-weight: bold;\\n\",\n       \"color: rgb(200, 130, 0);\\n\",\n       \"}\\n\",\n       \".suggestion-error { \\n\",\n       \"font-weight: bold;\\n\",\n       \"color: red;\\n\",\n       \"}\\n\",\n       \".suggestion-name {\\n\",\n       \"font-weight: bold;\\n\",\n       \"}\\n\",\n       \"\u003c/style\u003e\u003cspan class='get-type'\u003erunCIOWith :: forall a. FilePath -\u003e CIO a -\u003e IO a\u003c/span\u003e\"\n      ],\n      \"text/plain\": [\n       \"runCIOWith :: forall a. FilePath -\u003e CIO a -\u003e IO a\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    }\n   ],\n   \"source\": [\n    \":t runCIOWith\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"## one more thing...\\n\",\n    \"\\n\",\n    \".. nope, that's all! Enjoy!\"\n   ]\n  }\n ],\n \"metadata\": {\n  \"kernelspec\": {\n   \"display_name\": \"Haskell\",\n   \"language\": \"haskell\",\n   \"name\": \"haskell\"\n  },\n  \"language_info\": {\n   \"codemirror_mode\": \"ihaskell\",\n   \"file_extension\": \".hs\",\n   \"name\": \"haskell\",\n   \"version\": \"8.2.2\"\n  }\n },\n \"nbformat\": 4,\n \"nbformat_minor\": 2\n}\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnmattia%2Fcio","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnmattia%2Fcio","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnmattia%2Fcio/lists"}