{"id":19660227,"url":"https://github.com/takumakanari/embulk-input-http","last_synced_at":"2025-07-26T08:37:50.578Z","repository":{"id":28284126,"uuid":"31796413","full_name":"takumakanari/embulk-input-http","owner":"takumakanari","description":"Embulk plugin for http input","archived":false,"fork":false,"pushed_at":"2021-12-22T01:43:35.000Z","size":310,"stargazers_count":22,"open_issues_count":1,"forks_count":6,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-07-22T10:33:03.413Z","etag":null,"topics":["embulk","embulk-input-plugin","http","java"],"latest_commit_sha":null,"homepage":null,"language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/takumakanari.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-03-07T01:09:37.000Z","updated_at":"2025-07-08T12:04:38.000Z","dependencies_parsed_at":"2022-08-26T15:00:33.668Z","dependency_job_id":null,"html_url":"https://github.com/takumakanari/embulk-input-http","commit_stats":null,"previous_names":[],"tags_count":17,"template":false,"template_full_name":null,"purl":"pkg:github/takumakanari/embulk-input-http","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/takumakanari%2Fembulk-input-http","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/takumakanari%2Fembulk-input-http/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/takumakanari%2Fembulk-input-http/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/takumakanari%2Fembulk-input-http/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/takumakanari","downlo
ad_url":"https://codeload.github.com/takumakanari/embulk-input-http/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/takumakanari%2Fembulk-input-http/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267141100,"owners_count":24041979,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-26T02:00:08.937Z","response_time":62,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["embulk","embulk-input-plugin","http","java"],"created_at":"2024-11-11T15:45:43.125Z","updated_at":"2025-07-26T08:37:50.537Z","avatar_url":"https://github.com/takumakanari.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Embulk::Input::Http\n\n[![CircleCI](https://circleci.com/gh/takumakanari/embulk-input-http.svg?style=svg)](https://circleci.com/gh/takumakanari/embulk-input-http)\n\nInput HTTP plugin for [Embulk](https://github.com/embulk/embulk).\nFetch data via HTTP.\n\n\n## Installation\n\nRun this command with your embulk binary.\n\n```ruby\n$ embulk gem install embulk-input-http\n```\n\n## Usage\n\nSpecify in your config.yml file.\n\n```yaml\nin:\n  type: http\n  url: http://express.heartrails.com/api/json\n  params:\n    - {name: method, value: getStations}\n    - {name: x, value: 135.0}\n    - {name: y, value: \"{30..35}.0\", expand: true}\n  method: get\n```\n\n- **type**: specify this plugin as 
`http`\n- **url**: base URL of the HTTP endpoint, such as an API (required)\n- **params**: name/value pairs sent as query parameters (optional)\n- **pager**: configuration for paginating requests (optional)\n- **method**: HTTP method; `get` is used by default (optional)\n- **user_agent**: User-Agent string sent in the request header (optional)\n- **request_headers**: extra request headers as key-value pairs (optional)\n- **request_body**: request body content, used only if method is post and params are empty (optional)\n- **charset**: charset specified in the request header (optional, default: utf8)\n- **basic_auth**: username/password for basic authentication (optional)\n- **open_timeout**: timeout in msec for opening a connection (optional, default: 2000)\n- **read_timeout**: timeout in msec for reading content via HTTP (optional, default: 10000)\n- **max_retries**: maximum number of retries when a request fails (optional, default: 5)\n- **retry_interval**: wait interval in msec for retries (optional, default: 10000)\n- **request_interval**: wait time in msec before each request (optional, default: 0)\n- **interval\\_includes\\_response\\_time**: yes/no; if yes and `request_interval` is set, the response time is counted as part of the interval before the next request (optional, default: no)\n- **input\\_direct**: if false, dumps content to a temp file first, to avoid read timeouts caused by processing large data while downloading from the remote server (optional, default: true)\n\n### Defining multiple requests in `params`\n\nDefine multiple requests in `params` by using `values` or `brace expansion` with `expand: true`.\n\nA simple example using a `values` array:\n\n```yaml\nparams:\n  - {name: id, values: [5, 4, 3, 2, 1]}\n  - {name: name, values: [\"John\", \"Paul\", \"George\", \"Ringo\"], expand: true}\n```\n\nThe same `values` can also be written with `brace expansion` as follows:\n\n```yaml\nparams:\n  - {name: id, value: \"{5,4,3,2,1}\", expand: true}\n  - {name: name, value: \"{John,Paul,George,Ringo}\", expand: true}\n```\n\n### Basic 
authentication\n\nThe following configures the username/password for basic authentication.\n\n```yaml\nbasic_auth:\n  user: MyUser\n  password: MyPassword\n```\n\n### Paginate by `pager`\n\nConfigure `pager` as follows to paginate a request:\n\n```yaml\nin:\n  type: http\n  url: http://express.heartrails.com/api/json\n  pager: {from_param: from, to_param: to, start: 1, step: 1000, pages: 10}\n```\n\nThe properties of `pager` are as follows:\n\n- **from_param**: parameter name of 'from' index\n- **to_param**: parameter name of 'to' index (optional)\n- **pages**: total number of pages\n- **start**: first index number (optional, 0 is used by default)\n- **step**: size to increment (optional, 1 is used by default)\n\n#### Examples of using `pager`\n\n1. Combination of `from` and `to`\n\n    ```yaml\n    pager: {from_param: from, to_param: to, pages: 4, start: 1, step: 10}\n    ```\n\n    1. ?from=1\u0026to=10\n    1. ?from=11\u0026to=20\n    1. ?from=21\u0026to=30\n    1. ?from=31\u0026to=40\n\n1. Increment page parameter\n\n    ```yaml\n    params:\n      - {name: size, value: 100}\n    pager: {from_param: page, pages: 4, start: 1, step: 1}\n    ```\n\n    1. ?page=1\u0026size=100\n    1. ?page=2\u0026size=100\n    1. ?page=3\u0026size=100\n    1. 
?page=4\u0026size=100\n\n\n## Example\n\n### Fetch JSON via HTTP API\n\n```yaml\nin:\n  type: http\n  url: http://express.heartrails.com/api/json\n  params:\n    - {name: method, value: getStations}\n    - {name: x, value: 135.0}\n    - {name: y, value: \"{35,34,33,32,31}.0\", expand: true}\n  request_headers: {X-Some-Key1: some-value1, X-Some-key2: some-value2}\n  parser:\n    type: json\n    root: $.response.station\n    schema:\n      - {name: name, type: string}\n      - {name: next, type: string}\n      - {name: prev, type: string}\n      - {name: distance, type: string}\n      - {name: lat, type: double, path: x}\n      - {name: lng, type: double, path: y}\n      - {name: line, type: string}\n      - {name: postal, type: string}\n```\n\n### Fetch CSV\n\n```yaml\nin:\n  type: http\n  url: http://192.168.33.10:8085/sample.csv\n  parser:\n    charset: UTF-8\n    newline: CRLF\n    type: csv\n    delimiter: ','\n    quote: '\"'\n    escape: ''\n    skip_header_lines: 1\n    columns:\n    - {name: id, type: long}\n    - {name: account, type: long}\n    - {name: time, type: timestamp, format: '%Y-%m-%d %H:%M:%S'}\n    - {name: purchase, type: timestamp, format: '%Y%m%d'}\n    - {name: comment, type: string}\n```\n\n## TODO\n- HTTP-proxy\n- Guess\n\n## Patch\n\nWelcome!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftakumakanari%2Fembulk-input-http","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftakumakanari%2Fembulk-input-http","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftakumakanari%2Fembulk-input-http/lists"}