{"id":17220507,"url":"https://github.com/keithdoggett/spatial_stats","last_synced_at":"2025-04-13T22:31:49.904Z","repository":{"id":59156012,"uuid":"243868790","full_name":"keithdoggett/spatial_stats","owner":"keithdoggett","description":"Spatial Statistics Library for ActiveRecord/PostGIS","archived":false,"fork":false,"pushed_at":"2024-06-10T19:34:45.000Z","size":453,"stargazers_count":7,"open_issues_count":1,"forks_count":1,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-03-27T12:52:34.800Z","etag":null,"topics":["postgis","rails","ruby","spatial","spatial-stats","stats"],"latest_commit_sha":null,"homepage":"https://keithdoggett.github.io/spatial_stats/","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/keithdoggett.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-02-28T22:54:34.000Z","updated_at":"2024-06-10T19:34:43.000Z","dependencies_parsed_at":"2024-10-15T03:52:39.496Z","dependency_job_id":"4c2c132c-cd9b-41bf-b1e5-bc2e32831f90","html_url":"https://github.com/keithdoggett/spatial_stats","commit_stats":{"total_commits":108,"total_committers":4,"mean_commits":27.0,"dds":0.08333333333333337,"last_synced_commit":"27770867b306c17aa08c5822aeeb5fc2c8dd7ede"},"previous_names":[],"tags_count":10,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/keithdoggett%2Fspatial_stats","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/keithdoggett%2Fspatial_stats/tags","releases_url
":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/keithdoggett%2Fspatial_stats/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/keithdoggett%2Fspatial_stats/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/keithdoggett","download_url":"https://codeload.github.com/keithdoggett/spatial_stats/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248429076,"owners_count":21101782,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["postgis","rails","ruby","spatial","spatial-stats","stats"],"created_at":"2024-10-15T03:52:29.007Z","updated_at":"2025-04-13T22:31:49.476Z","avatar_url":"https://github.com/keithdoggett.png","language":"Ruby","readme":"![Spatial Stats](/assets/ruby.svg)\n\n[![Build Status](https://travis-ci.com/keithdoggett/spatial_stats.svg?branch=master)](https://travis-ci.com/keithdoggett/spatial_stats)\n\n[Docs](https://keithdoggett.github.io/spatial_stats)\n\n# SpatialStats\n\nSpatialStats is an ActiveRecord/Rails plugin that utilizes PostGIS to compute weights/statistics of spatial data sets in Rails Apps.\n\n## Installation\n\nAdd this line to your application's Gemfile:\n\n```ruby\ngem 'spatial_stats'\n```\n\nAnd then execute:\n\n```bash\n$ bundle\n```\n\nOr install it yourself as:\n\n```bash\n$ gem install spatial_stats\n```\n\n## Usage\n\n### Weights\n\nWeights define the spatial relation between members of a dataset. 
Contiguous operations are supported for `polygons` and `multipolygons`, and distant operations are supported for `points`.\n\nTo compute weights, you need an `ActiveRecord::Relation` scope and a geometry field. From there, you can pick what type of weight operation to compute (`knn`, `queen neighbors`, etc.).\n\n#### Compute Queen Weights\n\n```ruby\n# County table has the following fields: avg_income: float, geom: multipolygon.\nscope = County.all\ngeom_field = :geom\nweights = SpatialStats::Weights::Contiguous.queen(scope, geom_field)\n# =\u003e #\u003cSpatialStats::Weights::WeightsMatrix\u003e\n```\n\n#### Compute KNN of Centroids\n\nThe field being queried does not have to be defined in the schema, but could be computed during the query for scope.\n\nThis example finds the inverse distance weighted, 5 nearest neighbors for the centroid of each county.\n\n```ruby\nscope = County.all.select(\"*, st_centroid(geom) as geom\")\nweights = SpatialStats::Weights::Distant.idw_knn(scope, :geom, 5)\n# =\u003e #\u003cSpatialStats::Weights::WeightsMatrix\u003e\n```\n\n#### Define WeightsMatrix without Query\n\nWeight matrices can be defined by a hash that describes each key's neighbor and weight.\n\nExample: Define WeightsMatrix and get the matrix in row_standardized format.\n\n```ruby\nweights = {\n    1 =\u003e [{ id: 2, weight: 1 }, { id: 4, weight: 1 }],\n    2 =\u003e [{ id: 1, weight: 1 }],\n    3 =\u003e [{ id: 4, weight: 1 }],\n    4 =\u003e [{ id: 1, weight: 1 }, { id: 3, weight: 1 }]\n}\nkeys = weights.keys\nwm = SpatialStats::Weights::WeightsMatrix.new(weights)\n#  =\u003e #\u003cSpatialStats::Weights::WeightsMatrix:0x0000561e205677c0 @keys=[1, 2, 3, 4], @weights={1=\u003e[{:id=\u003e2, :weight=\u003e1}, {:id=\u003e4, :weight=\u003e1}], 2=\u003e[{:id=\u003e1, :weight=\u003e1}], 3=\u003e[{:id=\u003e4, :weight=\u003e1}], 4=\u003e[{:id=\u003e1, :weight=\u003e1}, {:id=\u003e3, :weight=\u003e1}]}, @n=4\u003e\n\nwm = wm.standardize\n#  =\u003e 
#\u003cSpatialStats::Weights::WeightsMatrix:0x0000561e205677c0 @keys=[1, 2, 3, 4], @weights={1=\u003e[{:id=\u003e2, :weight=\u003e0.5}, {:id=\u003e4, :weight=\u003e0.5}], 2=\u003e[{:id=\u003e1, :weight=\u003e1}], 3=\u003e[{:id=\u003e4, :weight=\u003e1}], 4=\u003e[{:id=\u003e1, :weight=\u003e0.5}, {:id=\u003e3, :weight=\u003e0.5}]}, @n=4\u003e\n\nwm.dense\n# =\u003e Numo::DFloat[\n#    [0, 0.5, 0, 0.5],\n#    [1, 0, 0, 0],\n#    [0, 0, 0, 1],\n#    [0.5, 0, 0.5, 0]\n#   ]\n\nwm.sparse\n# =\u003e #\u003cSpatialStats::Weights::CSRMatrix @m=4, @n=4, @nnz=6\u003e\n```\n\n### Lagged Variables\n\nSpatially lagged variables can be computed with a weights matrix and a 1-D vector (`Array`).\n\n#### Compute a Lagged Variable\n\n```ruby\nweights = {\n    1 =\u003e [{ id: 2, weight: 1 }, { id: 4, weight: 1 }],\n    2 =\u003e [{ id: 1, weight: 1 }],\n    3 =\u003e [{ id: 4, weight: 1 }],\n    4 =\u003e [{ id: 1, weight: 1 }, { id: 3, weight: 1 }]\n}\nwm = SpatialStats::Weights::WeightsMatrix.new(weights).standardize\nvec = [1, 2, 3, 4]\nlagged_var = SpatialStats::Utils::Lag.neighbor_sum(wm, vec)\n# =\u003e [3.0, 1.0, 4.0, 2.0]\n```\n\n### Global Stats\n\nGlobal stats compute a single value for the whole dataset, such as how clustered the observations are within the region.\n\nMost `stat` classes take three parameters: `scope`, `data_field`, and `weights`. All `stat` classes have a `stat` method that computes the target statistic. This method is also aliased to the common name of the statistic, such as `i` for `Moran` or `c` for `Geary`.\n\n#### Compute Moran's I\n\n```ruby\nscope = County.all\nweights = SpatialStats::Weights::Contiguous.rook(scope, :geom)\nmoran = SpatialStats::Global::Moran.new(scope, :avg_income, weights)\n# =\u003e \u003cSpatialStats::Global::Moran\u003e\n\nmoran.stat\n# =\u003e 0.834\n\nmoran.i\n# =\u003e 0.834\n```\n\n#### Compute Moran's I without Querying Data\n\nThe statistic can also be calculated from an array of data instead of a queried database field. 
The order of the data must correspond to the order of `weights.keys`.\n\n```ruby\nscope = County.all\nweights = SpatialStats::Weights::Contiguous.rook(scope, :geom)\n\nfield = nil\nmoran = SpatialStats::Global::Moran.new(scope, field, weights)\n# =\u003e \u003cSpatialStats::Global::Moran\u003e\n\n# data is automatically standardized on input\ndata = [1,2,3,4,5,6]\nmoran.x = data\n\nmoran.stat\n# =\u003e 0.521\n```\n\n#### Compute Moran's I Z-Score\n\n```ruby\nscope = County.all\nweights = SpatialStats::Weights::Contiguous.rook(scope, :geom)\nmoran = SpatialStats::Global::Moran.new(scope, :avg_income, weights)\n# =\u003e \u003cSpatialStats::Global::Moran\u003e\n\nmoran.z_score\n# =\u003e 3.2\n```\n\n#### Run a Permutation Test on Moran's I\n\nAll stat classes have the `mc` method, which takes `permutations` and `seed` as its parameters. `mc` runs a permutation test on the class and returns the pseudo p-value.\n\n```ruby\nscope = County.all\nweights = SpatialStats::Weights::Contiguous.rook(scope, :geom)\nmoran = SpatialStats::Global::Moran.new(scope, :avg_income, weights)\n# =\u003e \u003cSpatialStats::Global::Moran\u003e\n\nmoran.mc(999, 123_456)\n# =\u003e 0.003\n```\n\n#### Get Summary of Permutation Test\n\nAll stat classes have the `summary` method, which takes `permutations` and `seed` as its parameters. `summary` runs `stat` and `mc`, then combines the results into a hash.\n\n```ruby\nscope = County.all\nweights = SpatialStats::Weights::Contiguous.rook(scope, :geom)\nmoran = SpatialStats::Global::Moran.new(scope, :avg_income, weights)\n# =\u003e \u003cSpatialStats::Global::Moran\u003e\n\nmoran.summary(999, 123_456)\n# =\u003e {stat: 0.834, p: 0.003}\n```\n\n### Local Stats\n\nLocal stats compute a value for each observation in the dataset, such as how similar its neighbors are to itself. 
Local stats operate similarly to global stats, except that almost every operation will return an array of length `n` where `n` is the number of observations in the dataset.\n\nMost `stat` classes take three parameters: `scope`, `data_field`, and `weights`. All `stat` classes have a `stat` method that computes the target statistic. This method is also aliased to the common name of the statistic, such as `i` for `Moran` or `c` for `Geary`.\n\n#### Compute Moran's I\n\n```ruby\nscope = County.all\nweights = SpatialStats::Weights::Contiguous.rook(scope, :geom)\nmoran = SpatialStats::Local::Moran.new(scope, :avg_income, weights)\n# =\u003e \u003cSpatialStats::Local::Moran\u003e\n\nmoran.stat\n# =\u003e [0.888, 0.675, 0.2345, -0.987, -0.42, ...]\n\nmoran.i\n# =\u003e [0.888, 0.675, 0.2345, -0.987, -0.42, ...]\n```\n\n#### Compute Moran's I without Querying Data\n\nThe statistic can also be calculated from an array of data instead of a queried database field. The order of the data must correspond to the order of `weights.keys`.\n\n```ruby\nscope = County.all\nweights = SpatialStats::Weights::Contiguous.rook(scope, :geom)\n\nfield = nil\nmoran = SpatialStats::Local::Moran.new(scope, field, weights)\n# =\u003e \u003cSpatialStats::Local::Moran\u003e\n\n# data is automatically standardized on input\ndata = [1,2,3,4,5,6]\nmoran.x = data\n\nmoran.stat\n# =\u003e [0.521, 0.123, -0.432, -0.56, 
...]\n```\n\n#### Compute Moran's I Z-Scores\n\nNote: Many classes do not have a variance or expectation method implemented, in which case `z_score` raises a `NotImplementedError`.\n\n```ruby\nscope = County.all\nweights = SpatialStats::Weights::Contiguous.rook(scope, :geom)\nmoran = SpatialStats::Local::Moran.new(scope, :avg_income, weights)\n# =\u003e \u003cSpatialStats::Local::Moran\u003e\n\nmoran.z_score\n# =\u003e [0.65, 1.23, 0.42, 3.45, -0.34, ...]\n```\n\n#### Run a Permutation Test on Moran's I\n\nAll stat classes have the `mc` method, which takes `permutations` and `seed` as its parameters. `mc` runs a permutation test on the class and returns the pseudo p-values.\n\n```ruby\nscope = County.all\nweights = SpatialStats::Weights::Contiguous.rook(scope, :geom)\nmoran = SpatialStats::Local::Moran.new(scope, :avg_income, weights)\n# =\u003e \u003cSpatialStats::Local::Moran\u003e\n\nmoran.mc(999, 123_456)\n# =\u003e [0.24, 0.13, 0.53, 0.023, 0.65, ...]\n```\n\n#### Get Summary of Permutation Test\n\nAll stat classes have the `summary` method, which takes `permutations` and `seed` as its parameters. 
`summary` runs `stat`, `mc`, and `groups`, then combines the results into an array of hashes, one per key in `weights.keys`.\n\n```ruby\nscope = County.all\nweights = SpatialStats::Weights::Contiguous.rook(scope, :geom)\nmoran = SpatialStats::Local::Moran.new(scope, :avg_income, weights)\n# =\u003e \u003cSpatialStats::Local::Moran\u003e\n\nmoran.summary(999, 123_456)\n# =\u003e [{key: 1, stat: 0.521, p: 0.24, group: 'HH'}, ...]\n```\n\n## Contributing\n\nOnce cloned, run the following commands to set up the test database.\n\n```bash\ncd ./spatial_stats\nbundle install\ncd test/dummy\nrake db:create\nrake db:migrate\n```\n\nIf you are getting an error, you may need to set the following environment variables.\n\n```bash\n$PGUSER # default \"postgres\"\n$PGPASSWORD # default \"\"\n$PGHOST # default \"127.0.0.1\"\n$PGPORT # default \"5432\"\n$PGDATABASE # default \"spatial_stats_test\"\n```\n\nIf the dummy app is set up correctly, run the following:\n\n```bash\ncd ../..\nrake\n```\n\nThis will run the tests. If they all pass, then your environment is set up correctly.\n\nNote: It is recommended to have GEOS installed and linked to RGeo. You can test this by running the following:\n\n```bash\ncd test/dummy\nrails c\n\nRGeo::Geos.supported?\n# =\u003e true\n```\n\n## Path Forward\n\nSummaries of milestones for v1.x and v2.0. These lists are subject to change. If you have an additional feature you want to see for either milestone, open up an issue or PR.\n\n### v1.x\n\n1. Global Measurements\n   - `Geary`'s C\n   - `GetisOrd`\n2. Local Measurements\n   - `Join Count`\n3. Utilities\n   - Add support for .gal/.swm file imports\n   - Add support for Rate variables\n   - Add support for Bayes smoothing\n   - ~Add support for Bonferroni Bounds and FDR~\n4. 
General\n   - ~Add new stat constructors that only rely on a weights matrix and data vector~\n   - Point Pattern Analysis Module\n   - Regression Module\n\n### v2.0\n\n- Break gem into core `spatial_stats` that will not include queries module and `spatial_stats-activerecord`. This will remove the dependency on rails for the core gem.\n- Create `spatial_stats-import/geojson/shp` gem that will allow importing files and generating a `WeightsMatrix`. Will likely rely on `RGeo` or another spatial lib.\n\n### Other TODOs\n\n- Update Docs to show `from_observations` when version is bumped\n- Refactor `MultivariateGeary` so that it can be used without `activerecord` by adding `from_observations` and supporting methods.\n\n## License\n\nThe gem is available as open source under the terms of the [BSD-3-Clause](https://opensource.org/licenses/BSD-3-Clause).\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkeithdoggett%2Fspatial_stats","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkeithdoggett%2Fspatial_stats","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkeithdoggett%2Fspatial_stats/lists"}