{"id":20486287,"url":"https://github.com/undp-data/geo-gim-model","last_synced_at":"2025-10-10T17:07:34.614Z","repository":{"id":142769739,"uuid":"608226989","full_name":"UNDP-Data/geo-gim-model","owner":"UNDP-Data","description":null,"archived":false,"fork":false,"pushed_at":"2023-11-03T18:00:03.000Z","size":30151,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-10-10T17:07:31.746Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/UNDP-Data.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-03-01T15:27:12.000Z","updated_at":"2023-03-20T18:17:49.000Z","dependencies_parsed_at":null,"dependency_job_id":"e132709b-601e-46cf-816a-98cd2096022a","html_url":"https://github.com/UNDP-Data/geo-gim-model","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"purl":"pkg:github/UNDP-Data/geo-gim-model","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/UNDP-Data%2Fgeo-gim-model","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/UNDP-Data%2Fgeo-gim-model/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/UNDP-Data%2Fgeo-gim-model/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/UNDP-Data%2Fgeo-gim-model/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/UNDP-Data","download_url":"https://codeload.github.com/U
NDP-Data/geo-gim-model/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/UNDP-Data%2Fgeo-gim-model/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279004815,"owners_count":26083783,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-10T02:00:06.843Z","response_time":62,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-15T16:35:57.659Z","updated_at":"2025-10-10T17:07:34.597Z","avatar_url":"https://github.com/UNDP-Data.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"## Geo GIM Model Package\nThe GIM Computer Vision Package\n\nTo use this package, you need to have docker installed on your computer. 
You can download docker from [here](https://www.docker.com/products/docker-desktop).\n\n### Setting up the directory structure\nThe directory structure should be as follows. Create any directories that are missing.\n\n```\n.\n├── root\n|   ├── data\n|   |   ├── local\n|   |   ├── processed\n|   |   ├── volumes ── ebs_inference_storage\n|   |   ├── train\n|   |   ├── val\n|   ├── gim_cv\n|   ├── MODELS\n|   ├── INFER\n|   ├── TRAIN\n|   ├── PREDICTIONS\n|   ├── tests\n|   ├── saved_models\n|   ├── README.md\n|   ├── .gitignore\n|   ├── .gitattributes\n|   ├── .dockerignore\n|   ├── Dockerfile\n|   ├── docker-compose.yml\n|   ├── requirements.txt\n```\n\nThe sample dataset is `Medellin_40cm.tif`, which is set in the `datasets.py` module of the `gim_cv` package.\n\nTo run on the sample dataset, ensure that the raw raster and the mask are at `TRAIN/raster/Medellin_40cm.tif` and `TRAIN/mask/Medellin_ground_truth.tif` respectively.\nBy default, this data is identified by the `train_tif` tag in the `datasets.py` module.\nWith the default settings and the container running (see the next section), you can run the training script as follows:\n\n```\ndocker exec -it \u003cCONTAINER-ID\u003e python3 training_segmentalist.py -d train_tif --epochs 10 --batch_size 32 -lr 0.001\n```\n### Setting up the docker container\nTo build the container, run the following command in the root directory:\n```\ndocker-compose build\n```\n\nTo start the container in detached mode, run the following command:\n```\ndocker-compose up -d\n```\n\nYou can also run the container in the foreground (attached mode) by running the following command:\n```\ndocker-compose up\n```\nThis lets you see the container logs in the terminal. At this point the container is running the `training_segmentalist.py` script. 
You can stop the container by pressing `Ctrl+C`.\n### Training the model\nTo train the Segmentalist model using custom data, run the following command in the root directory:\n```\ndocker exec -it \u003cCONTAINER-ID\u003e python3 training_segmentalist.py -tt /Path/To/First/Raster.tif /Path/To/First/Mask.tif /Path/To/Second/Raster.tif /Path/To/Second/Mask.tif --epochs 10 --batch_size 32 -lr 0.001\n```\nThe `CONTAINER-ID` can be found by running the following command:\n```\ndocker ps\n```\nThe `CONTAINER-ID` is the first column in the output of the above command. The `training_segmentalist.py` script accepts the following arguments:\n```\nUsage: python3 training_segmentalist.py [-h] [-tt TRAINING_GEOTIFFS [TRAINING_GEOTIFFS ...]] [-d DATASETS]\n                             [-tsr TARGET_SPATIAL_RESOLUTION] [-pp | -npp] [-ds | -nds] [-lc | -nlc]\n                             [-ecbam | -necbam] [-dcbam | -ndcbam] [-sag | -nsag | -csag | -ncsag]\n                             [-lb LAYER_BLOCKS] [-ldb LAST_DECODER_LAYER_BLOCKS] [-if INITIAL_FILTERS]\n                             [-rf RESIDUAL_FILTERS] [-ik INITIAL_KERNEL_SIZE] [-hk HEAD_KERNEL_SIZE]\n                             [-cd CARDINALITY]\n\noptional arguments:\n  -h, --help            show this help message and exit\n  -tt TRAINING_GEOTIFFS [TRAINING_GEOTIFFS ...], --training-geotiff TRAINING_GEOTIFFS [TRAINING_GEOTIFFS ...]\n                        A list of geotiff files to train on\n  -d DATASETS, --datasets DATASETS\n                        Comma delimited string of dataset tags. Available datasets are:\n                        train_tif\n  -tsr TARGET_SPATIAL_RESOLUTION, --target-spatial-res TARGET_SPATIAL_RESOLUTION\n                        spatial resolution to resample to. 
native resolution for all datasets if == 0.\n  -pp, --input-pyramid  Enable input pyramid\n  -npp, --no-input-pyramid\n                        Disable input pyramid\n  -ds, --deep-supervision\n                        Enable deep supervision\n  -nds, --no-deep-supervision\n                        Disable deep supervision\n  -lc, --lambda-conv    Replace main 3x3 convolutions in residual blocks with Lambda convolutions\n  -nlc, --no-lambda-conv\n                        Don't replace main 3x3 convolutions in residual blocks with Lambda convolutions\n  -ecbam, --encoder-cbam\n                        enable CBAM blocks in encoder residual blocks\n  -necbam, --no-encoder-cbam\n                        disable CBAM blocks in encoder residual blocks\n  -dcbam, --decoder-cbam\n                        enable CBAM blocks in decoder residual blocks\n  -ndcbam, --no-decoder-cbam\n                        disable CBAM blocks in decoder residual blocks\n  -sag, --attention-gate\n                        Enable spatial attention gate\n  -nsag, --no-attention-gate\n                        Disable spatial attention gate\n  -csag, --channel-spatial-attention-gate\n                        Enable channel-spatial attention gate\n  -ncsag, --no-channel-spatial-attention-gate\n                        Disable channel-spatial attention gate\n  -lb LAYER_BLOCKS, --layer-blocks LAYER_BLOCKS\n                        Comma-delimited list of the number of residual blocks per layer of the encoder. The last number fixes\n                        those in the bridge which is unique. The decoder mirrors these blocks excluding the bridge. 
The final\n                        block of the decoder uses the last_layer_decoder_blocks argument to fix the number of residual convblocks.\n  -ldb LAST_DECODER_LAYER_BLOCKS, --last-decoder-layer-blocks LAST_DECODER_LAYER_BLOCKS\n                        The number of residual conv blocks in the final decoder block.\n  -if INITIAL_FILTERS, --initial-filters INITIAL_FILTERS\n                        The number of filters in the first large-kernel-size ResNet convolution in the encoder.\n  -rf RESIDUAL_FILTERS, --residual-filters RESIDUAL_FILTERS\n                        Comma-delimited list of the number of filters in the residual convolutions in the encoder and decoder.\n  -ik -initial-kernel-size The kernel size for the initial convolution in the encoder block. Usually ResNet style 7x7.\n    -hk -head-kernel-size The kernel size for the head (final segmentation layer). Typically 1x1.\n    -cd -cardinality The cardinality of ResNeXt grouped convolutions in main blocks (if lambda_conv is false).\n    -act -activation String name of activation function used throughout.\n    -dsmp -downsample Mechanism used for downsampling feature maps: \"pool\" or \"strides\".\n    -s -patch-size No. of pixels per image patch used for training.\n    -ot -overlap-tiles Flag to toggle on overlapping tiles in training data (half-step).\n    -not -no-overlap-tiles Flag to toggle off overlapping tiles in training data (half-step) (no overlapping).\n    -ep -epochs No. of training epochs.\n    -bs -batch_size Batch size.\n    -l -loss-fn Loss function name as string (looks in building_age.losses). Optionally provide kwargs afterwards using a colon to delineate the beginning of comma-separated keyword args, e.g. 
`custom_loss_fn:gamma=1.5,alpha=0.2`.\n    -opt -optimiser Gradient descent optimizer (adam, sgd or ranger).\n    -swa -stochastic-weight-averaging Apply stochastic weight averaging to optimizer.\n    -nswa -no-stochastic-weight-averaging Do not apply stochastic weight averaging to optimizer.\n    -dswa -duration-swa No. of epochs before last where SWA is applied.\n    -pswa -period-swa Period in epochs over which to average weights with SWA.\n    -vl -use-val Switch: evaluate on validation data every epoch and track this.\n    -p -patience Patience.\n    -rs -seed Random seed.\n    -vf -val-frac Validation fraction.\n    -tf -test-frac Test fraction.\n    -fa -fancy-augs Flag whether to use fancy augmentations (albumentations + FancyPCA).\n    -lr -lr-init Initial learning rate.\n    -lrmin -lr-min Minimum learning rate if reduce LR on plateau callback used.\n    -lrf -lr-reduce-factor Multiplicative LR reduction factor for reduce LR on plateau callback.\n    -lrp -lr-reduce-patience Epochs patience for LR reduction application if reduce LR on plateau.\n    -ocp -use-ocp Enable one-cycle policy (not used atm).\n    -ba -balanced-oversample Oversample training arrays to balance different datasets. Makes an \"epoch\" much longer.\n    -md -model-dir Directory to save model checkpoints to.\n    -dt -dump-test-data Dump test arrays to zarr.\n    -da -dump-first-batches Precalculate first chunk of training array and dump to disk for inspection.\n    -c -use-cache Try to read preprocessed arrays from file if serialised.\n    -sc -save-to-cache Save preprocessed arrays to file for future training runs.\n ```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fundp-data%2Fgeo-gim-model","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fundp-data%2Fgeo-gim-model","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fundp-data%2Fgeo-gim-model/lists"}