{"id":26501270,"url":"https://github.com/mohiteamit/rim-weighting","last_synced_at":"2026-05-21T14:08:35.621Z","repository":{"id":277447178,"uuid":"932090200","full_name":"mohiteamit/rim-weighting","owner":"mohiteamit","description":"Random Iterative Method (RIM) weighting for survey data, using iterative proportional fitting to align sample distributions with known population margins.","archived":false,"fork":false,"pushed_at":"2025-02-23T00:03:32.000Z","size":68,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-20T16:45:36.619Z","etag":null,"topics":["deming-stephan","iterative-proportional-fitting","pandas","pyspark","python","random-iteration-algorithm","rim-weighting","rms","root-mean-square","statistics","survey-data","weighting"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mohiteamit.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-02-13T10:52:19.000Z","updated_at":"2025-02-23T00:03:35.000Z","dependencies_parsed_at":"2025-02-14T01:28:15.209Z","dependency_job_id":"f8ed4290-2379-4faa-a759-7c76b3233b32","html_url":"https://github.com/mohiteamit/rim-weighting","commit_stats":null,"previous_names":["mohiteamit/rim-weighting"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/mohiteamit/rim-weighting","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mohiteamit%2Frim-weighting","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/Gi
tHub/repositories/mohiteamit%2Frim-weighting/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mohiteamit%2Frim-weighting/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mohiteamit%2Frim-weighting/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mohiteamit","download_url":"https://codeload.github.com/mohiteamit/rim-weighting/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mohiteamit%2Frim-weighting/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33303183,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-21T12:23:38.849Z","status":"ssl_error","status_checked_at":"2026-05-21T12:22:11.673Z","response_time":62,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deming-stephan","iterative-proportional-fitting","pandas","pyspark","python","random-iteration-algorithm","rim-weighting","rms","root-mean-square","statistics","survey-data","weighting"],"created_at":"2025-03-20T16:39:56.783Z","updated_at":"2026-05-21T14:08:35.599Z","avatar_url":"https://github.com/mohiteamit.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# RIMWeightingPandas\n\n**RIMWeightingPandas** is a Python implementation of the **Random Iterative Method (RIM)** weighting algorithm, as described in 
the paper:\n\n- Deming, W. E., \u0026 Stephan, F. F. (1940). *On a least squares adjustment of a sampled frequency table when the expected marginal totals are known*. Annals of Mathematical Statistics, 11, 427-444.\n\nThis method is used to adjust the weights of survey data so that they align with known marginal totals, helping to improve the accuracy of weighted statistical analysis.\n\n## Key Features:\n- Adjusts survey data to match known target distributions (marginal totals).\n- Implements the **Random Iterative Method (RIM)** as described by Deming and Stephan.\n- Provides convergence checks to ensure that weights stabilize within specified bounds.\n- Includes functionality to handle pre-existing weights and apply iterative adjustments.\n\n## Usage\n\n### Importing and Initializing the Class\n\nFirst, import the class and prepare your dataset.\n\n```python\nimport pandas as pd\nfrom RIMWeightingPandas import RIMWeightingPandas\n\n# Example DataFrame (survey data)\ndata = pd.DataFrame({\n    'age_group': ['18-25', '26-35', '36-45', '46-60', '60+'],\n    'gender': ['M', 'F', 'M', 'F', 'M'],\n    'weight': [1, 1, 1, 1, 1]  # Pre-existing weights (optional)\n})\n\n# Define the specification (target proportions for each category)\nspec = {\n    'age_group': {'18-25': 0.2, '26-35': 0.2, '36-45': 0.2, '46-60': 0.2, '60+': 0.2},\n    'gender': {'M': 0.5, 'F': 0.5}\n}\n\n# Initialize the RIMWeightingPandas object\nrim = RIMWeightingPandas(data, spec, pre_weight='weight')\n\n# Apply RIM weighting\nweighted_data = rim.apply_weights()\n```\n\n### Detailed Explanation of Methods\n\n- **`__init__(data, spec, pre_weight=None, tolerance=0.001, weight_col_name='rim_weight')`**: Initializes the RIMWeightingPandas object.\n  - `data`: The survey data as a Pandas DataFrame.\n  - `spec`: A dictionary of target proportions for each category (marginal totals).\n  - `pre_weight`: Optional column name containing existing weights in the data. 
Defaults to `None`.\n  - `tolerance`: Convergence threshold. Defaults to `0.001`.\n  - `weight_col_name`: The name of the weight column in the DataFrame. Defaults to `'rim_weight'`.\n\n- **`apply_weights(max_iterations=30, min_weight=0.5, max_weight=1.5)`**: Applies the RIM weighting algorithm.\n  - `max_iterations`: Maximum number of iterations allowed.\n  - `min_weight`: Minimum allowable weight for each observation. Adjusted weights that fall below this value are clipped.\n  - `max_weight`: Maximum allowable weight for each observation. Adjusted weights that exceed this value are clipped.\n  - **Returns**: The DataFrame with the adjusted weight column (`rim_weight` by default).\n\n- **`weighting_efficiency()`**: Computes the RIM weighting efficiency as a percentage, based on the formula:\n\n  $$\n  \\text{Efficiency (\\%)} = 100 \\times \\frac{\\left( \\sum_j P_j R_j \\right)^2}{\\left( \\sum_j P_j \\right) \\left( \\sum_j P_j R_j^2 \\right)}\n  $$\n\n  where $P_j$ are the pre-weights and $R_j$ are the adjusted rim weights.\n\n- **`generate_summary()`**: Generates a summary of the unweighted and weighted counts per variable, including:\n  - Unweighted counts \u0026 percentages.\n  - Weighted counts \u0026 percentages.\n  - Min/Max weights per category.\n\n# RIMWeightingPySpark\n- This code was written by ChatGPT o3-mini-high, based on RIMWeightingPandas.\n- This code is untested.\n\n# Example Output\n\nAfter applying the RIM weighting, you can generate a summary of your data:\n\n```python\nrim.generate_summary()\n```\n\nThis will display a formatted summary showing the unweighted and weighted counts for each variable.\n\n# Installation\n\nTo install, use the following command:\n\n```bash\npip install git+https://github.com/mohiteamit/rim-weighting.git\n```\n\nAlternatively, you can clone the repository and install it manually:\n\n```bash\ngit clone https://github.com/mohiteamit/rim-weighting.git\ncd rim-weighting\npip install .\n```\n\n# Contributing\n\nI welcome contributions! 
If you'd like to help improve this project, feel free to fork the repository and submit a pull request.\n\n# License\n\nThis project is licensed under the **MIT License**. You can freely use, modify, and distribute this code. However, **you must provide appropriate credit** by mentioning this repository in any usage of the code. For more details, see the `LICENSE` file.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmohiteamit%2Frim-weighting","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmohiteamit%2Frim-weighting","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmohiteamit%2Frim-weighting/lists"}