{"id":15653161,"url":"https://github.com/eonu/arx","last_synced_at":"2025-04-16T05:57:24.829Z","repository":{"id":52420151,"uuid":"177364951","full_name":"eonu/arx","owner":"eonu","description":"A Ruby interface for querying academic papers on the arXiv search API.","archived":false,"fork":false,"pushed_at":"2021-04-29T19:32:28.000Z","size":163,"stargazers_count":31,"open_issues_count":3,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-29T05:05:28.852Z","etag":null,"topics":["academic-paper","api","api-wrapper","archive","arxiv","arxiv-api","arxiv-org","gem","ruby"],"latest_commit_sha":null,"homepage":"https://arxiv.org/help/api","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/eonu.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-03-24T03:09:31.000Z","updated_at":"2024-11-28T07:24:34.000Z","dependencies_parsed_at":"2022-09-13T11:00:40.978Z","dependency_job_id":null,"html_url":"https://github.com/eonu/arx","commit_stats":null,"previous_names":[],"tags_count":11,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eonu%2Farx","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eonu%2Farx/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eonu%2Farx/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eonu%2Farx/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/eonu","download_url":"https://codeload.github.com/eonu/arx/tar.gz/refs/heads/master","host":{
"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248799985,"owners_count":21163404,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["academic-paper","api","api-wrapper","archive","arxiv","arxiv-api","arxiv-org","gem","ruby"],"created_at":"2024-10-03T12:44:51.108Z","updated_at":"2025-04-16T05:57:24.810Z","avatar_url":"https://github.com/eonu.png","language":"Ruby","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Arx\n\n[![Ruby Version](https://img.shields.io/badge/ruby-%3E=%202.5-red.svg)](https://github.com/eonu/arx/blob/503a1c95ac450dbc20623491060c3fc32d213627/arx.gemspec#L19)\n[![Gem](https://img.shields.io/gem/v/arx.svg)](https://rubygems.org/gems/arx)\n[![License](https://img.shields.io/github/license/eonu/arx.svg)](https://github.com/eonu/arx/blob/master/LICENSE)\n\n[![Maintainability](https://api.codeclimate.com/v1/badges/e94073dfa8c3e2442298/maintainability)](https://codeclimate.com/github/eonu/arx/maintainability)\n[![Build Status](https://travis-ci.com/eonu/arx.svg?branch=master)](https://travis-ci.com/eonu/arx)\n[![Coverage Status](https://coveralls.io/repos/github/eonu/arx/badge.svg?branch=feature%2Fcoveralls)](https://coveralls.io/github/eonu/arx?branch=feature%2Fcoveralls)\n\n**A Ruby interface for querying academic papers on the arXiv search API.**\n\n[arXiv](https://arxiv.org/) provides an advanced search utility on their website, as well as an extensive [search API](https://arxiv.org/help/api) that allows for the external querying of academic papers hosted on their website.\n\nAlthough 
[Scholastica](https://github.com/scholastica) offer a great [Ruby gem](https://github.com/scholastica/arxiv) for retrieving papers from arXiv through the search API, this gem only allows for the retrieval of one paper at a time, and only supports searching for papers by ID.\n\n\u003e Arx is a gem that allows for quick and easy querying of the arXiv search API, without having to worry about manually writing your own search query strings or parsing the resulting XML query response to find the data you need.\n\n## Examples\n\n1. Suppose we wish to search for papers in the `cs.FL` (Formal Languages and Automata Theory) category whose title contains `\"Buchi Automata\"`, not authored by `Tomáš Babiak`, sorted by submission date (latest first).\n\n  ```ruby\n  require 'arx'\n\n  papers = Arx(sort_by: :submitted_at) do |query|\n    query.category('cs.FL')\n    query.title('Buchi Automata').and_not.author('Tomáš Babiak')\n  end\n  ```\n\n2. Suppose we wish to retrieve the main category of the paper with arXiv ID `1809.09415`, the name of the first author and the date it was published.\n\n  ```ruby\n  require 'arx'\n\n  paper = Arx('1809.09415')\n  paper.authors.first.name\n  #=\u003e \"Christof Löding\"\n  paper.categories.first.full_name # or paper.primary_category.full_name\n  #=\u003e \"Formal Languages and Automata Theory\"\n  paper.published_at\n  #=\u003e #\u003cDateTime: 2018-09-25T11:40:39+00:00 ((2458387j,42039s,0n),+0s,2299161j)\u003e\n  ```\n\n## Features\n\n- Ruby classes `Arx::Paper`, `Arx::Author` and `Arx::Category` that wrap the resulting Atom XML query result from the search API.\n- Supports querying by a paper's ID, title, author(s), abstract, subject category, comment, journal reference, report number or last updated date.\n- Provides a small DSL for writing queries.\n- Supports searching fields by exact match.\n\n## Installation\n\nTo install Arx, run the following in your terminal:\n\n```console\ngem install arx\n```\n\n## Documentation\n\nThe 
documentation for Arx is hosted on [![rubydoc.info](https://img.shields.io/badge/docs-rubydoc.info-blue.svg)](https://www.rubydoc.info/github/eonu/arx/master/toplevel).\n\n## Contributing\n\nAll contributions to Arx are greatly appreciated. Contribution guidelines can be found [here](/CONTRIBUTING.md).\n\n## Usage\n\nBefore you start using Arx, you'll have to ensure that the gem is required (either in your current working file, or shell such as [IRB](https://en.wikipedia.org/wiki/Interactive_Ruby_Shell)):\n\n```ruby\nrequire 'arx'\n```\n\n### Building search queries\n\nQuery requests submitted to the arXiv search API are typically of the following form (where the query string is indicated in bold):\n\n[http://export.arxiv.org/api/query?**search_query=ti:%22Buchi+Automata%22+AND+cat:%22cs.FL%22**](http://export.arxiv.org/api/query?search_query=ti:%22Buchi+Automata%22+AND+cat:%22cs.FL%22)\n\n\u003e This particular query searches for papers whose title includes the string `Buchi Automata`, and are in the *Formal Languages and Automata Theory* (`cs.FL`) category.\n\nObviously writing out queries like this can quickly become time-consuming and tedious.\n\n---\n\nThe `Arx::Query` class provides a small DSL for writing these query strings.\n\n#### Sorting criteria and order\n\nThe order in which search results are returned can be modified through the `sort_by` and `sort_order` keyword arguments (in the `Arx::Query` initializer):\n\n- `sort_by` accepts the symbols: `:relevance`, `:updated_at` or `:submitted_at`\n\n- `sort_order` accepts the symbols: `:ascending` or `:descending`\n\n```ruby\n# Sort by submission date in ascending order (earliest first)\nArx::Query.new(sort_by: :submitted_at, sort_order: :ascending)\n#=\u003e sortBy=submittedDate\u0026sortOrder=ascending\n```\n\n**Note**: The default setting is to sort by `:relevance` in `:descending` order:\n\n```ruby\nArx::Query.new #=\u003e sortBy=relevance\u0026sortOrder=descending\n```\n\n#### Paging\n\nThe arXiv API 
offers a paging mechanism that allows you to get chunks of the result set at a time. It can be used through the `start` and `max_results` keyword arguments (in the `Arx::Query` initializer):\n\n- `start` is the index of the first returned result (using 0-based indexing)\n\n- `max_results` is the maximum number of results returned by the query\n\n```ruby\n# Get results 10-29\nArx::Query.new(start: 10, max_results: 20)\n#=\u003e start=10\u0026max_results=20\n```\n\n**Note**: The default values are those of the arXiv API: `start` defaults to `0` and `max_results` defaults to `10`:\n\n```ruby\nArx::Query.new #=\u003e start=0\u0026max_results=10\n```\n\n#### Searching by ID\n\nThe arXiv search API doesn't only support searching for papers by metadata fields, but also by ID. When searching by ID, a different URL query string parameter `id_list` is used (instead of `search_query` as seen before).\n\nAlthough the `id_list` can be used to *\"search by ID\"*, it is better to **think of it as restricting the search space to the papers with the provided IDs**:\n\n| `search_query` present? | `id_list` present? 
| Returns                                              |\n| ----------------------- | ------------------ | ---------------------------------------------------- |\n| Yes                     | No                 | Articles that match `search_query`                   |\n| No                      | Yes                | Articles that are in `id_list`                       |\n| Yes                     | Yes                | Articles in `id_list` that also match `search_query` |\n\nTo search by ID, simply pass the arXiv paper identifiers (ID) or URLs into the `Arx::Query` initializer method:\n\n```ruby\nArx::Query.new('https://arxiv.org/abs/1711.05738', '1809.09415')\n#=\u003e sortBy=relevance\u0026sortOrder=descending\u0026id_list=1711.05738,1809.09415\n```\n\n#### Searching by metadata fields\n\nThe arXiv search API supports searches for the following paper metadata fields:\n\n```ruby\nFIELDS = {\n  title: 'ti',                   # Title\n  author: 'au',                  # Author\n  abstract: 'abs',               # Abstract\n  comment: 'co',                 # Comment\n  journal: 'jr',                 # Journal reference\n  category: 'cat',               # Subject category\n  report: 'rn',                  # Report number\n  updated_at: 'lastUpdatedDate', # Last updated date\n  submitted_at: 'submittedDate', # Submission date\n  all: 'all'                     # All (of the above)\n}\n```\n\nEach of these fields has an instance method defined under the `Arx::Query` class. For example:\n\n```ruby\n# Papers whose title contains the string \"Buchi Automata\".\nq = Arx::Query.new\nq.title('Buchi Automata')\n#=\u003e sortBy=relevance\u0026sortOrder=descending\u0026search_query=ti:%22Buchi+Automata%22\n```\n\n##### Exact matches\n\nBy default, this searches for exact matches of the provided string (by adding double quotes around the string - in the query string, this is represented by the `%22`s). 
To disable this, you can use the `exact` keyword argument (which defaults to `true`):\n\n```ruby\n# Papers whose title contains either of the words \"Buchi\" or \"Automata\".\nq = Arx::Query.new\nq.title('Buchi Automata', exact: false)\n#=\u003e sortBy=relevance\u0026sortOrder=descending\u0026search_query=ti:Buchi+Automata\n```\n\n##### Multiple values for one field\n\nSometimes you might want to search a single field for multiple values. This can be done by passing them as additional arguments (or as an `Array`):\n\n**Note**: The default logical connective used when there are multiple values for one field is `and`.\n\n```ruby\n# Papers authored by both \"Eleonora Andreotti\" and \"Dominik Edelmann\".\nq = Arx::Query.new\nq.author('Eleonora Andreotti', 'Dominik Edelmann')\n```\n\nTo change the logical connective to `or` or `and_not`, use the `connective` keyword argument:\n\n```ruby\n# Papers authored by either \"Eleonora Andreotti\" or \"Dominik Edelmann\".\nq = Arx::Query.new\nq.author('Eleonora Andreotti', 'Dominik Edelmann', connective: :or)\n```\n\n```ruby\n# Papers authored by \"Eleonora Andreotti\" and not \"Dominik Edelmann\".\nq = Arx::Query.new\nq.author('Eleonora Andreotti', 'Dominik Edelmann', connective: :and_not)\n```\n\n#### Chaining subqueries (logical connectives)\n\n**Note**: By default, subqueries (successive instance method calls) are chained with a logical `and` connective.\n\n```ruby\n# Papers authored by \"Dominik Edelmann\" in the \"Numerical Analysis\" (math.NA) category.\nq = Arx::Query.new\nq.author('Dominik Edelmann')\nq.category('math.NA')\n```\n\nTo change the logical connective used to chain subqueries, use the `and`, `or`, `and_not` instance methods between the subquery calls:\n\n```ruby\n# Papers authored by \"Eleonora Andreotti\" in neither the \"Numerical Analysis\" (math.NA) nor the \"Combinatorics\" (math.CO) category.\nq = Arx::Query.new\nq.author('Eleonora Andreotti')\nq.and_not\nq.category('math.NA', 
'math.CO', connective: :or)\n```\n\n#### Grouping subqueries\n\nSometimes you'll have a query that requires nested or grouped logic, using parentheses. This can be done using the `Arx::Query#group` method.\n\nThis method accepts a block and parenthesises the result of whichever methods are called within the block.\n\nFor example, this will allow the last query from the previous section to be written as:\n\n```ruby\n# Papers authored by \"Eleonora Andreotti\" in neither the \"Numerical Analysis\" (math.NA) nor the \"Combinatorics\" (math.CO) category.\nq = Arx::Query.new\nq.author('Eleonora Andreotti')\nq.and_not\nq.group do\n  q.category('math.NA').or.category('math.CO')\nend\n```\n\nAnother more complicated example with two grouped subqueries:\n\n```ruby\n# Papers whose title contains \"Buchi Automata\", either authored by \"Tomáš Babiak\", or in the \"Formal Languages and Automata Theory\" (cs.FL) category and not the \"Computational Complexity\" (cs.CC) category.\nq = Arx::Query.new\nq.title('Buchi Automata')\nq.group do\n  q.author('Tomáš Babiak')\n  q.or\n  q.group do\n    q.category('cs.FL').and_not.category('cs.CC')\n  end\nend\n```\n\n### Running search queries\n\nSearch queries can be executed with the `Arx()` method (alias of `Arx.search`). 
This method accepts the same parameters as the `Arx::Query` initializer - including the list of IDs.\n\n#### Without a predefined query\n\nCalling the `Arx()` method with a block allows for the construction and execution of a new query.\n\n**Note**: If running a search query this way, then the `sort_by` and `sort_order` parameters can be added as additional keyword arguments.\n\n```ruby\n# Papers in the cs.FL category whose title contains \"Buchi Automata\", not authored by Tomáš Babiak\nresults = Arx(sort_by: :submitted_at) do |query|\n  query.category('cs.FL')\n  query.title('Buchi Automata').and_not.author('Tomáš Babiak')\nend\n\nresults.size #=\u003e 18\n```\n\n#### With a predefined query\n\nThe `Arx()` method accepts a predefined `Arx::Query` object through the `query` keyword parameter.\n\n**Note**: If using the `query` parameter, the `sort_by` and `sort_order` criteria should be defined in the `Arx::Query` object initializer rather than as arguments in `Arx()`.\n\n```ruby\n# Papers in the cs.FL category whose title contains \"Buchi Automata\", not authored by Tomáš Babiak\nq = Arx::Query.new(sort_by: :submitted_at)\nq.category('cs.FL')\nq.title('Buchi Automata').and_not.author('Tomáš Babiak')\n\nresults = Arx(query: q)\nresults.size #=\u003e 18\n```\n\n#### With IDs\n\nThe `Arx()` method accepts a list of IDs as a splat parameter, just like the `Arx::Query` initializer.\n\nIf only one ID is specified, then a single `Arx::Paper` is returned:\n\n```ruby\nresult = Arx('1809.09415')\nresult.class #=\u003e Arx::Paper\n```\n\nOtherwise, an `Array` of `Arx::Paper`s is returned.\n\n### Query results\n\nSearch results are typically:\n\n- an `Array`, either empty if no papers matched the supplied query, or containing `Arx::Paper` objects.\n- a single `Arx::Paper` object (when the search method is only supplied with one ID).\n\n### Entities\n\nThe `Arx::Paper`, `Arx::Author` and `Arx::Category` classes provide a simple interface for the metadata concerning a single 
arXiv paper:\n\n#### `Arx::Paper`\n\n```ruby\npaper = Arx('1809.09415')\n#=\u003e #\u003cArx::Paper:0x00007fb657b59bd0\u003e\n\npaper.id\n#=\u003e \"1809.09415\"\npaper.id(true)\n#=\u003e \"1809.09415v1\"\npaper.url\n#=\u003e \"http://arxiv.org/abs/1809.09415\"\npaper.url(true)\n#=\u003e \"http://arxiv.org/abs/1809.09415v1\"\npaper.version\n#=\u003e 1\npaper.revision?\n#=\u003e false\n\npaper.title\n#=\u003e \"On finitely ambiguous Büchi automata\"\npaper.summary\n#=\u003e \"Unambiguous B\\\\\\\"uchi automata, i.e. B\\\\\\\"uchi automata allowing...\"\npaper.authors\n#=\u003e [#\u003cArx::Author:0x00007fb657b63108\u003e, #\u003cArx::Author:0x00007fb657b62438\u003e]\n\n# Paper's categories\npaper.primary_category\n#=\u003e #\u003cArx::Category:0x00007fb657b61830\u003e\npaper.categories\n#=\u003e [#\u003cArx::Category:0x00007fb657b60e80\u003e]\n\n# Dates\npaper.published_at\n#=\u003e #\u003cDateTime: 2018-09-25T11:40:39+00:00 ((2458387j,42039s,0n),+0s,2299161j)\u003e\npaper.updated_at\n#=\u003e #\u003cDateTime: 2018-09-25T11:40:39+00:00 ((2458387j,42039s,0n),+0s,2299161j)\u003e\n\n# Paper's comment\npaper.comment?\n#=\u003e false\npaper.comment\n#=\u003e Arx::Error::MissingField (arXiv paper 1809.09415 is missing the `comment` metadata field)\n\n# Paper's journal reference\npaper.journal?\n#=\u003e false\npaper.journal\n#=\u003e Arx::Error::MissingField (arXiv paper 1809.09415 is missing the `journal` metadata field)\n\n# Paper's PDF URL\npaper.pdf?\n#=\u003e true\npaper.pdf_url\n#=\u003e \"http://arxiv.org/pdf/1809.09415v1\"\n\n# Paper's DOI (Digital Object Identifier) URL\npaper.doi?\n#=\u003e true\npaper.doi_url\n#=\u003e \"http://dx.doi.org/10.1007/978-3-319-98654-8_41\"\n```\n\n#### `Arx::Author`\n\n```ruby\npaper = Arx('cond-mat/9609089')\n#=\u003e #\u003cArx::Paper:0x00007fb657a7b8d0\u003e\n\nauthor = paper.authors.first\n#=\u003e #\u003cArx::Author:0x00007fb657a735e0\u003e\n\nauthor.name\n#=\u003e \"F. 
Gebhard\"\n\nauthor.affiliated?\n#=\u003e true\nauthor.affiliations\n#=\u003e [\"ILL Grenoble, France\"]\n```\n\n#### `Arx::Category`\n\n```ruby\npaper = Arx('cond-mat/9609089')\n#=\u003e #\u003cArx::Paper:0x00007fb657b59bd0\u003e\n\ncategory = paper.primary_category\n#=\u003e #\u003cArx::Category:0x00007fb6570609b8\u003e\n\ncategory.name\n#=\u003e \"cond-mat\"\ncategory.full_name\n#=\u003e \"Condensed Matter\"\n```\n\n## Acknowledgements\n\nA large portion of this library is based on the brilliant work done by [Scholastica](https://github.com/scholastica) in their [`arxiv`](https://github.com/scholastica/arxiv) gem for retrieving individual papers from arXiv through the search API.\n\nArx was created mostly due to the seemingly inactive nature of Scholastica's repository. Additionally, it would have been infeasible to contribute such large changes to an already well-established gem, especially since https://scholasticahq.com/ appears to be dependent upon this gem.\n\nNevertheless, a special thanks goes out to Scholastica for providing the influence for Arx.\n\n## Contributors\n\nAll contributions to this repository are greatly appreciated. 
Contribution guidelines can be found [here](/CONTRIBUTING.md).\n\n\u003ctable\u003e\n\t\u003cthead\u003e\n\t\t\u003ctr\u003e\n\t\t\t\u003cth align=\"center\"\u003e\n        \u003ca href=\"https://github.com/eonu\"\u003e\n          \u003cimg src=\"https://avatars0.githubusercontent.com/u/24795571?s=460\u0026v=4\" alt=\"Edwin Onuonga\" width=\"60px\"\u003e\n          \u003cbr/\u003eeonu\n          \u003cbr/\u003e\u003csub\u003e(Edwin Onuonga)\u003c/sub\u003e\n        \u003c/a\u003e\n        \u003cbr/\u003e\n        \u003ca href=\"mailto:ed@eonu.net\"\u003e✉️\u003c/a\u003e\n        \u003ca href=\"https://eonu.net\"\u003e🌍\u003c/a\u003e\n\t\t\t\u003c/th\u003e\n      \u003cth align=\"center\"\u003e\n        \u003ca href=\"https://github.com/xuanxu\"\u003e\n          \u003cimg src=\"https://avatars.githubusercontent.com/u/6528?v=4\" alt=\"xuanxu\" width=\"60px\"\u003e\n          \u003cbr/\u003exuanxu\n          \u003cbr/\u003e\u003csub\u003e(Juanjo Bazán)\u003c/sub\u003e\n        \u003c/a\u003e\n        \u003cbr/\u003e\n        \u003ca href=\"mailto:jjbazan@gmail.com\"\u003e✉️\u003c/a\u003e\n        \u003ca href=\"http://juanjobazan.com/\"\u003e🌍\u003c/a\u003e\n\t\t\t\u003c/th\u003e\n\t\t\t\u003c!-- Add more \u003cth\u003e\u003c/th\u003e blocks for more contributors --\u003e\n\t\t\u003c/tr\u003e\n\t\u003c/thead\u003e\n\u003c/table\u003e\n\n---\n\n\u003cp align=\"center\"\u003e\n  \u003cb\u003eArx\u003c/b\u003e \u0026copy; 2019-2020, Edwin Onuonga - Released under the \u003ca href=\"http://mit-license.org/\"\u003eMIT\u003c/a\u003e License.\u003cbr/\u003e\n  \u003cem\u003eAuthored and maintained by Edwin Onuonga.\u003c/em\u003e\n\u003c/p\u003e","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feonu%2Farx","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Feonu%2Farx","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feonu%2Farx/lists"}