{"id":13937777,"url":"https://github.com/evilmartians/fias","last_synced_at":"2025-04-10T00:19:49.560Z","repository":{"id":6447380,"uuid":"7686855","full_name":"evilmartians/fias","owner":"evilmartians","description":"Ruby wrapper for the Russian FIAS database (Федеральная Информационная Адресная Система)","archived":false,"fork":false,"pushed_at":"2019-01-18T12:00:41.000Z","size":219,"stargazers_count":83,"open_issues_count":8,"forks_count":16,"subscribers_count":45,"default_branch":"master","last_synced_at":"2025-04-08T10:51:41.334Z","etag":null,"topics":["address","geo","geodata","ruby","russian"],"latest_commit_sha":null,"homepage":"","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/evilmartians.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2013-01-18T13:15:38.000Z","updated_at":"2024-04-05T12:02:36.000Z","dependencies_parsed_at":"2022-08-24T14:07:02.049Z","dependency_job_id":null,"html_url":"https://github.com/evilmartians/fias","commit_stats":null,"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/evilmartians%2Ffias","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/evilmartians%2Ffias/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/evilmartians%2Ffias/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/evilmartians%2Ffias/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/evilmartians","download_url":"https://codeload.github.com/evilmartians/fias/tar.gz/refs/heads/master","host"
:{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247829452,"owners_count":21002994,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["address","geo","geodata","ruby","russian"],"created_at":"2024-08-07T23:03:52.604Z","updated_at":"2025-04-10T00:19:49.543Z","avatar_url":"https://github.com/evilmartians.png","language":"Ruby","funding_links":[],"categories":["Ruby"],"sub_categories":[],"readme":"# FIAS\n\n[![Build Status](https://travis-ci.org/evilmartians/fias.svg)](http://travis-ci.org/evilmartians/fias)\n[![Code Climate](https://codeclimate.com/github/evilmartians/fias/badges/gpa.svg)](https://codeclimate.com/github/evilmartians/fias)\n[![Test Coverage](https://codeclimate.com/github/evilmartians/fias/badges/coverage.svg)](https://codeclimate.com/github/evilmartians/fias)\n\nRuby wrapper for the Russian [ФИАС](http://fias.nalog.ru) database.\n\nDesigned for usage with Ruby on Rails and a PostgreSQL backend.\n\n\u003ca href=\"https://evilmartians.com/?utm_source=fias-gem\"\u003e\n\u003cimg src=\"https://evilmartians.com/badges/sponsored-by-evil-martians.svg\" alt=\"Sponsored by Evil Martians\" width=\"236\" height=\"54\"\u003e\n\u003c/a\u003e\n\nThink twice before you decide to use a standalone copy of FIAS database in your project. 
[КЛАДР в облаке](https://kladr-api.ru/) could also be a solution.\n\n![Мухосраново](http://i.imgur.com/JCwgqjL.jpg) ![Марс](http://i.imgur.com/j8WPtOI.jpg)\n\n## Installation\n\nAdd this line to your application's `Gemfile`:\n\n```ruby\ngem 'fias'\n```\n\nAnd then execute:\n\n    $ bundle\n\nOr install it yourself:\n\n    $ gem install fias\n\n## Import into PostgreSQL\n\n**Warning!** Do not run the import on a 32-bit operating system, because you are likely to hit a memory limit exception.\n\n    $ mkdir -p tmp/fias \u0026\u0026 cd tmp/fias\n    $ bundle exec rake fias:download | xargs wget\n    $ unrar e fias_dbf.rar\n    $ bundle exec rake fias:create_tables fias:import DATABASE_URL=postgres://localhost/fias\n\nIf you get the error \"Errno::EMFILE: Too many open files @ rb_sysopen\", raise the open-file limit to 512 or more before starting the rake tasks:\n\n    ulimit -S -n 512\n\nThe rake task accepts options through ENV variables:\n\n* `TABLES` to specify a comma-separated list of tables to import or create. See `Fias::Import::Dbf::TABLES` for the list of key names. Use `houses` as an alias for HOUSE* tables and `nordocs` for NORDOC* tables. In most cases you'll need only the `address_objects` table.\n* `PREFIX` for the database table prefix ('fias_' by default).\n* `FIAS_PATH` to specify the location of the DBF files ('tmp/fias' by default).\n* `DATABASE_URL` to set database credentials (required explicitly even in a Ruby on Rails project).\n\nThis gem uses `COPY FROM STDIN BINARY` to import data. At the moment it works with PostgreSQL only.\n\n## Notes about FIAS\n\n1. The FIAS address objects table contains a lot of fields which are useless in most cases (tax office ID, legal document ID, etc.).\n2. The address objects table contains a lot of historical records (more than 50%, in fact), which are useless in most cases.\n3. Every record in the address objects table can have multiple parents. 
For example, \"Nevsky prospekt\" in Saint Petersburg has two parents: Saint Petersburg (active) and Leningrad (historical name of the city, inactive). Most hierarchy libraries accept just one parent per record.\n4. Using a UUID field as the primary key, as FIAS does, is not a good idea if you want to use the `ancestry` or `closure_tree` gems to navigate the record tree.\n5. Typical production SQL server settings are optimized for reading, so the import in a production environment could take a very long time.\n\n## Notes on initial import workflow\n\n1. Use the raw FIAS tables only as a temporary data source for creating or updating your application's primary address objects table.\n2. The only requirement is to keep the AOGUID, PARENTGUID and AOID fields in the target table. You will need them for updating.\n3. Keep your address objects table immutable. This lets you work with huge amounts of address data locally; send the result to the production environment via a SQL dump.\n4. FIAS contains some duplicates. Duplicates are records which have different UUIDs but the same name, abbreviation and nesting level. It is up to you to decide how to deal with them: remove them completely or just mark them as hidden. For example, Krasnodar has many identically named streets located in different districts.\n5. [closure_tree](https://github.com/mceachen/closure_tree) works great as a hierarchy backend. Use [pg_closure_tree_rebuild](https://github.com/gzigzigzeo/pg_closure_tree_rebuild) to rebuild the hierarchy table from scratch.\n\n[See example](examples/create.rb).\n\n## Toponyms\n\nEvery FIAS address object has two fields: `formalname`, which holds the toponym (the name of a geographical object), and `shortname`, which holds its type (street, city, etc.). 
FIAS contains the list of all available `shortname` values and their corresponding long forms in the `address_object_types` table (SOCRBASE.DBF).\n\n### Canonical forms\n\nIn real life people use a lot of type name variations. For example, 'проспект' can be written as 'пр' or 'пр-кт'.\n\nYou can convert any variation to a canonical form:\n\n```ruby\nFias::Name::Canonical.canonical('поселок')\n# =\u003e [\n#  'поселок', # FIAS canonical full name\n#  'п',       # FIAS canonical short name (as in address_objects table)\n#  'п.',      # Short name with dot if needed\n#  'пос',     # Alias\n#  'посёлок'  # Alias\n# ]\n```\n\nSee [fias.rb](lib/fias.rb) for a list of settings.\n\n### Append type to toponym\n\nUse `Fias::Name::Append` to build toponym names in conformity with the rules of grammar:\n\n```ruby\nFias::Name::Append.append('Санкт-Петербург', 'г')\n# =\u003e ['г. Санкт-Петербург', 'город Санкт-Петербург']\n\nFias::Name::Append.append('Невский', 'пр')\n# =\u003e ['Невский пр-кт', 'Невский проспект']\n\nFias::Name::Append.append('Чечня', 'республика')\n# =\u003e ['Респ. Чечня', 'Республика Чечня']\n\nFias::Name::Append.append('Чеченская', 'республика')\n# =\u003e ['Чеченская Респ.', 'Чеченская Республика']\n```\n\nYou can pass any form of type name: full, short, an alias, with or without the dot.\n\n### Extract a toponym\n\nSometimes you need to extract a toponym and its type from a plain string:\n\n```ruby\nFias::Name::Extract.extract('Город Санкт-Петербург')\n# =\u003e ['Санкт-Петербург', 'город', 'г', 'г.']\n\nFias::Name::Extract.extract('ул. 
Казачий Вал')\n# =\u003e ['Казачий Вал', 'улица', 'ул', 'ул.']\n```\n\n### Extract house number\n\nSometimes street names come mixed up with house numbers, and you need to extract the house number from a string to clean it up for indexing:\n\n```ruby\nFias::Name::HouseNumber.extract('Ново-Садовая ул,303а')\n# =\u003e ['Ново-Садовая ул', '303а']\n\nFias::Name::HouseNumber.extract('пр.Энергетиков 72/2')\n# =\u003e ['пр.Энергетиков', '72/2']\n```\n\n## Searching\n\nSuppose you have a set of structured addresses:\n\n```ruby\n[\n  { region: 'Еврейская АОбл', city: 'г. Биробиджан', street: 'Шолом-Алейхема' },\n  { city: 'Санкт-Петербург', street: 'Лермонтовский проспект' }\n]\n```\n\nYou need to find a FIAS item for each address in the set.\n\nYour project may use a full-text search engine (Sphinx, Elasticsearch) or just a SQL database. The search principles are the same, but the implementation differs. This library contains helper modules and base classes to facilitate searching.\n\n### Indexing\n\nEach toponym consists of words; some of them are considered \"special\". Such words can have synonyms or alternative forms, can be omitted by the user, or can be written differently in the FIAS database itself.\n\nExamples:\n\n* \"50 лет Октября\" == \"50-летия Октября\"\n* \"1-ая Советская\" == \"1 Советская\" || \"Советская 1-я\"\n* \"Большой Проспект П.С.\" == \"Большой Проспект Петроградской\"\n* \"имени Максима Горького\" == \"им. Горького\" || \"Горького\"\n* \"ул. 
Цюрупы\" == \"Цурюпы\" || \"Цюрупа\" || \"Цорюпы\" || \"Цорупа\" (that's my favorite!)\n\nYou should treat them as equal when searching.\n\nNote that we are talking about toponym names with their types already extracted (see \"Extract a toponym\" above).\n\n#### Splitting the words\n\nWords are split according to a set of simple rules designed to make it easier to detect synonyms and identify optional parts.\n\n```ruby\nFias::Name::Split.split(\"50 лет Октября\")\n# =\u003e [\"50 лет\", \"октября\"]\n\nFias::Name::Split.split(\"Ю.Р.Г.Эрвье\")\n# =\u003e [\"ю.р.г.\", \"эрвье\"]\n```\n\n#### Finding synonyms and optional words\n\nSuppose FIAS has a street named `им. академика И.П.Павлова`. Most people will refer to it as just `Павлова` street, some will write it as `имени Павлова`, and some as `академика Павлова`. Basically, nobody except the FIAS database refers to it by the exact original name.\n\n```ruby\nFias::Name::Synonyms.expand('им. академика И.П.Павлова')\n\n# =\u003e [[\"им\", \"имени\", \"им.\", \"\"],\n# [\"ак.\", \"академика\", \"\"],\n# [\"и.п.\", \"\"],\n# [\"павлова\"]]\n```\n\n`expand` returns all possible forms for each word. Empty strings mark optional words.\n\n```ruby\nFias::Name::Synonyms.tokens('им. академика И.П.Павлова')\n\n# =\u003e [\"им\", \"имени\", \"им.\", \"ак.\", \"академика\", \"и.п.\", \"павлова\"]\n```\n\n`tokens` returns a flat array of all words.\n\nYou can also calculate all possible name combinations:\n\n```ruby\nFias::Name::Synonyms.forms('им. И.П.Павлова')\n# =\u003e [\n#   'и.п. им павлова',\n#   'им павлова',\n#   'и.п. имени павлова',\n#   'имени павлова',\n#   'и.п. им. павлова',\n#   'им. павлова',\n#   'и.п. 
павлова',\n#   'павлова'\n# ]\n```\n\n#### Generating search index\n\nThe search index needs:\n* name tokens (result of `Fias::Name::Synonyms.tokens`)\n* name forms (result of `Fias::Name::Synonyms.forms`)\n* ancestor ids\n\nSee [indexing example](examples/generate_index.rb).\n\n### Querying\n\nA search runs in three steps:\n\n1. Preparation: sanitizing request values, splitting the toponym name and type, etc.\n2. Querying: finding possible candidates in the address object tree.\n3. Decision: determining the most suitable result based on its similarity to the request.\n\n#### Defining an in-app query class\n\nWe'll use the `sequel` gem in this example.\n\n```ruby\nclass Query\n  include Fias::Query\n\n  def find(tokens)\n    return [] if tokens.blank? # An empty array has no element type, so Sequel fails.\n\n    op = Sequel.pg_array_op(:tokens)\n\n    DB[:address_objects]\n      .select(:id, :name, :abbr, :parent_id, :ancestry, :forms, :tokens)\n      .where(op.overlaps(tokens))\n      .to_a\n  end\nend\n```\n\n`#find` accepts a split object name (the result of `Fias::Name::Split.split`). It finds all address objects whose tokens overlap with the given set of tokens and returns an array of hashes with the keys selected above.\n\n* `:abbr` - FIAS shortname value.\n* `:ancestry` - array of ancestor IDs.\n* `:forms` - object name forms (`Fias::Name::Synonyms.forms`)\n* `:tokens` - object name tokens (`Fias::Name::Synonyms.tokens`)\n\n#### Query params\n\n```ruby\nquery = Query.new(\n  region: 'Еврейская АОбл', city: 'г. 
Биробиджан', street: 'Шолом-Алейхема'\n)\n\nquery.params.sanitized\n# =\u003e {\n#   :region =\u003e [\"Еврейская\", \"автономная область\", \"Аобл\", \"Аобл\"],\n#   :city   =\u003e [\"Биробиджан\", \"город\", \"г\", \"г.\"],\n#   :street =\u003e [\"Шолом-Алейхема\"]\n# }\n```\n\nAllowed params are: `%i(region district city subcity street)`\n\n#### Result\n\n```ruby\nquery.perform\n#\n# [[13213, {:id=\u003e72344, :name=\u003e\"Шолом-Алейхема\", :abbr=\u003e\"ул\", :parent_id=\u003e184027, :ancestry=\u003e[184027, 12550],\n#   :forms=\u003e[\"шолом-алейхема\"], :tokens=\u003e[\"шолом-алейхема\"], :key=\u003e:street}]]\n```\n\nThe result is an array.\n\n* Each element contains two values: the equality factor and the found object.\n* If the array contains more than one element, the query results are ambiguous. All elements will have the same factor.\n* The array is empty if nothing was found.\n\n## Notes\n\n1. People make mistakes, and so do their search requests. Our goal is to minimize the impact of those mistakes. Everything above (name forms, synonyms, etc.) exists to understand human input better. More than 50,000 real addresses written by people were used to collect the types of mistakes and derive these rules.\n2. That's why requests are slow.\n3. In real applications there could be a lot of similar queries. It's okay to cache request results in the database to avoid repeated queries. Cached items do not need a TTL because FIAS changes rarely.\n4. In many cases a human can resolve ambiguous results or find the result manually. It may be wise to have some kind of admin interface in your app for that.\n\n## Contributors\n\n* Victor Sokolov (@gzigzigzeo)\n* Vlad Bokov (@razum2um)\n\nSpecial thanks to @gazay.\n\n## Contributing\n\n1. Fork it ( https://github.com/evilmartians/fias/fork )\n2. Create your feature branch (`git checkout -b my-new-feature`)\n3. Commit your changes (`git commit -am 'Add some feature'`)\n4. Push to the branch (`git push origin my-new-feature`)\n5. 
Create a new Pull Request\n\n## License\n\nThe MIT License\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fevilmartians%2Ffias","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fevilmartians%2Ffias","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fevilmartians%2Ffias/lists"}