{"id":50605317,"url":"https://github.com/doguilmak/inferencevision","last_synced_at":"2026-06-05T22:01:47.901Z","repository":{"id":242267980,"uuid":"798780763","full_name":"doguilmak/InferenceVision","owner":"doguilmak","description":"Provide geographic coordinates from bounding boxes.","archived":false,"fork":false,"pushed_at":"2026-01-16T22:31:26.000Z","size":69496,"stargazers_count":0,"open_issues_count":0,"forks_count":2,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-01-17T10:08:45.317Z","etag":null,"topics":["coordinate-reference-system","object-detection","question-answering-assistant","remote-sensing","ultralytics"],"latest_commit_sha":null,"homepage":"https://doguilmak.github.io/InferenceVision/","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/doguilmak.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-05-10T13:07:15.000Z","updated_at":"2026-01-16T22:31:29.000Z","dependencies_parsed_at":"2025-04-12T16:49:30.549Z","dependency_job_id":"623c1aae-01b2-4823-ba78-95ae142eaf24","html_url":"https://github.com/doguilmak/InferenceVision","commit_stats":null,"previous_names":["doguilmak/inferencevision"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/doguilmak/InferenceVision","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/doguilmak%2FInferenceVision","tags_url":"https://repos.ecosyste.ms/api/v1/hosts
/GitHub/repositories/doguilmak%2FInferenceVision/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/doguilmak%2FInferenceVision/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/doguilmak%2FInferenceVision/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/doguilmak","download_url":"https://codeload.github.com/doguilmak/InferenceVision/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/doguilmak%2FInferenceVision/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33961252,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-05T02:00:06.157Z","response_time":120,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["coordinate-reference-system","object-detection","question-answering-assistant","remote-sensing","ultralytics"],"created_at":"2026-06-05T22:01:47.834Z","updated_at":"2026-06-05T22:01:47.891Z","avatar_url":"https://github.com/doguilmak.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cimg  src=\"https://github.com/doguilmak/InferenceVision/blob/main/assets/Inference%20Vision%20Cover.png\" alt=\"github.com/doguilmak/InferenceVision\"/\u003e\r\n\r\nIn contemporary scientific research and applications, there is an increasing 
demand for accurate geospatial analysis to address various real-world challenges, ranging from environmental monitoring to urban planning and disaster response. The ability to precisely locate and identify objects within geographic areas plays a pivotal role in such endeavors. In this scientific project, we aim to enhance geospatial analysis by integrating object detection techniques with geographic coordinate calculations. Please check our [website](https://doguilmak.github.io/InferenceVision/) for the library. You'll find a wealth of information and materials available to enrich your knowledge and learning experience.\r\n\r\n\u003cbr\u003e\r\n\r\nStay tuned for regular updates on our progress and new developments:\r\n\r\n\u003cdetails\u003e\r\n\r\n\u003csummary\u003eLatest updates...\u003c/summary\u003e\r\n\r\n\u003cbr\u003e\r\n\r\n\u003cb\u003eAugust 2025\u003c/b\u003e\r\n\u003col\u003e\r\n\t\u003cli\u003eThe fine-tuned GPT-Neo-1.3B InferenceVision Q\u0026A LLM model is now available on \u003ca href=\"https://huggingface.co/doguilmak/inferencevision-gpt-neo-1.3B\"\u003eHuggingFace\u003c/a\u003e.\u003c/li\u003e\r\n\u003c/ol\u003e\r\n\r\n\u003cb\u003eMay 2025\u003c/b\u003e\r\n\u003col\u003e\r\n\t\u003cli\u003eThe fine-tuned pythia-1B InferenceVision Q\u0026A LLM model is now available on \u003ca href=\"https://huggingface.co/doguilmak/inferencevision-pythia-1B\"\u003eHuggingFace\u003c/a\u003e.\u003c/li\u003e\r\n\u003c/ol\u003e\r\n\r\n\u003cb\u003eDecember 2024\u003c/b\u003e\r\n\u003col\u003e\r\n\t\u003cli\u003eIntroduced the advanced language model for technical Q\u0026A with InferenceVision!\u003c/li\u003e\r\n\u003c/ol\u003e\r\n\r\n\u003cb\u003eAugust 2024\u003c/b\u003e\r\n\u003col\u003e\r\n\t\u003cli\u003eLaunched InferenceVision version 1.1!\u003c/li\u003e\r\n\t\u003cli\u003eLaunched InferenceVision version 1.2!\u003c/li\u003e\r\n\u003c/ol\u003e\r\n\r\n\u003cb\u003eJune 2024\u003c/b\u003e\r\n\u003col\u003e\r\n\t\u003cli\u003eLaunched InferenceVision version 
1.0!\u003c/li\u003e\r\n\u003c/ol\u003e\r\n\r\n\u003c/details\u003e\r\n\r\n\u003cbr\u003e\r\n\r\n## **Problem Statement**\r\nTraditional methods of geospatial analysis often rely on manual identification and mapping of objects within geographical regions. However, these methods are time-consuming, labor-intensive, and prone to errors. Moreover, they may lack the scalability required for large-scale analyses. Therefore, there is a need for automated solutions that can accurately detect and locate objects within geographic areas, enabling efficient and scalable geospatial analysis.\r\n\r\n\u003cbr\u003e\r\n\r\n## **Project Objective**\r\nOur project seeks to address the aforementioned challenges by developing an automated system that combines object detection algorithms with geographic coordinate calculations. By integrating these components, we aim to achieve the following objectives:\r\n\r\n1. **Object Detection:** Utilize state-of-the-art object detection algorithms, such as YOLO (You Only Look Once), to automatically identify and localize objects within satellite or aerial imagery.\r\n\r\n2. **Geographic Coordinate Calculation:** Develop algorithms to calculate the geographic coordinates (latitude and longitude) of detected objects relative to a given bounding polygon. This involves converting normalized center coordinates of objects within the bounding polygon to precise geographic coordinates.\r\n\r\n3. **Integration and Visualization:** Integrate object detection results with calculated geographic coordinates to create a comprehensive geospatial dataset. 
Visualize the detected objects and their geographic locations on maps for further analysis and interpretation.\r\n\r\n\u003cbr\u003e\r\n\r\n## **Scientific Significance**\r\nThe proposed project has several scientific implications and contributions:\r\n\r\n- **Automation and Efficiency:** By automating the process of object detection and geographic coordinate calculation, our system significantly reduces the time and effort required for geospatial analysis, thereby enhancing efficiency and scalability.\r\n\r\n- **Accuracy and Precision:** Through the integration of advanced algorithms, our system ensures high accuracy and precision in object detection and geographic coordinate calculation, leading to reliable and trustworthy results.\r\n\r\n- **Versatility and Adaptability:** The developed system is versatile and adaptable to various applications, including environmental monitoring, urban planning, agriculture, and disaster response. It provides researchers and practitioners with a powerful tool for analyzing geospatial data in diverse contexts.\r\n\r\n\u003cbr\u003e\r\n\r\n## **Methodology**\r\n\r\nIn this section, we outline the methodology employed for deriving geographic coordinates from input data within the InferenceVision framework. This methodological approach combines advanced techniques in satellite image analysis, object detection, and geographic coordinate calculation to enable precise geospatial analysis and visualization. Let's delve into the steps involved:\r\n\r\n\u003cimg  src=\"https://github.com/doguilmak/InferenceVision/blob/main/assets/Inference%20Vision%20Intro.png\" alt=\"github.com/doguilmak/InferenceVision\"/\u003e\r\n\r\n**Given a set of inputs, the calculation unfolds as follows:**\r\n\r\n\u003cbr\u003e\r\n\r\n**1- Transform VHR Satellite Image Coordinates to WGS 84 (EPSG:4326) and Extract Polygon Coordinates:** The target Coordinate Reference System (CRS) is WGS 84, representing a geographic coordinate system. 
Converting to this CRS standardizes the data. Resampling uses nearest-neighbor interpolation, which can give the reprojected image a blocky appearance. Transformed coordinates are kept to 9 decimal places by default. First, we transform the image coordinates to WGS 84. The polygon geometries are denoted $G$, and the following transformation converts them to the geographic coordinate system:\r\n\r\n\u003cbr\u003e\r\n\r\n$$ G_{EPSG:4326} = transform(G_{dataset}, CRS_{dataset}) $$\r\n\r\n\u003cbr\u003e\r\n\r\nThen, we extract the polygon coordinates, which define the geographical extent. Its top-left ($TL$) and bottom-right ($BR$) corners serve as reference points for computing the geographic coordinates of the normalized centers.\r\n\r\n\u003cbr\u003e\r\n\r\n\u003cbr\u003e\r\n\r\n**2- Calculate Normalized Centers:** In the second stage, the model runs inference on the image and produces detections. The center coordinates of the detected objects are then calculated from the edge coordinates of their bounding boxes ($x_{min}$, $y_{min}$, $x_{max}$, $y_{max}$). $x_{min}$ and $x_{max}$ are the pixel coordinates of the left and right edges of a bounding box on the x-axis, and $y_{min}$ and $y_{max}$ are the pixel coordinates of its top and bottom edges on the y-axis. The center point of the object is determined by the following formula:\r\n\r\n\u003cbr\u003e\r\n\r\n$$ (x_{center}, y_{center}) = \\left(\\frac{x_{min} + x_{max}}{2}, \\frac{y_{min} + y_{max}}{2}\\right) $$\r\n\r\n\u003cbr\u003e\r\n\r\nThe centroids of the bounding boxes are then normalized, which expresses object locations as fractions of the image size rather than as pixel coordinates. 
Normalization can be expressed as:\r\n\r\n\u003cbr\u003e\r\n\r\n$$N_x = \\frac{x_{center}}{W}$$\r\n\r\n$$N_y = \\frac{y_{center}}{H}$$\r\n\r\nWhere:\r\n- **$N_x$**: The normalized coordinate of the center point along the x-axis.\r\n- **$N_y$**: The normalized coordinate of the center point along the y-axis.\r\n- **$x_{center}$**: The x pixel coordinate of the center of the bounding box.\r\n- **$y_{center}$**: The y pixel coordinate of the center of the bounding box.\r\n- **$W$**: The total width of the raster image in pixels.\r\n- **$H$**: The total height of the raster image in pixels.\r\n\r\n\u003cbr\u003e\r\n\r\n\u003cbr\u003e\r\n\r\n**3- Calculate Geographic Coordinates:** Finally, the geographic coordinates are calculated using the normalized center coordinates. In this stage, the corner coordinates of the extracted polygon are taken as reference. The normalized values are used to determine the actual locations in the geographic area by interpolating between these corner coordinates. 
The calculation is carried out with the following formulas, in which latitude varies along the image's y-axis and longitude along its x-axis:\r\n\r\n\u003cbr\u003e\r\n\r\n$$ lat = lat_{TL} + (lat_{BR} - lat_{TL}) \\times N_{y} $$\r\n\r\n$$ lon = lon_{TL} + (lon_{BR} - lon_{TL}) \\times N_{x} $$\r\n\r\n   \r\n   **Where:**\r\n\r\n   - $lat$ represents latitude.\r\n   - $lon$ represents longitude.\r\n   - $N_{x}$ and $N_{y}$ are the normalized center coordinates along the x-axis and y-axis.\r\n   - $lat_{TL}, lon_{TL}, lat_{BR},$ and $lon_{BR}$ are the latitude and longitude of the top-left and bottom-right corners of the polygon, respectively.\r\n\r\n\u003cbr\u003e\r\n\r\n\u003cimg  src=\"https://github.com/doguilmak/InferenceVision/blob/main/docs/DifferenceMap.gif\" alt=\"github.com/doguilmak/InferenceVision\"/\u003e\r\n\r\n\u003cbr\u003e\r\n\r\n**NOTE: The input image must have a CRS set to ensure accurate geographic coordinate calculation.**\r\n\r\n\u003cbr\u003e\r\n\r\n## **Install and Use the Library**\r\n\r\n\u003cbr\u003e\r\n\r\n**1- Install the library. Run the following commands to clone `inference_vision` from GitHub:**\r\n\r\n    git clone https://github.com/doguilmak/InferenceVision.git\r\n    cd InferenceVision\r\n\r\n\r\n**2- Install the requirements using the `requirements.txt` file.**\r\n\r\n    pip install -r requirements.txt -q\r\n\r\n**3- Once the installation is complete, import the `InferenceVision` class from the library.**\r\n\r\n    from inference_vision import InferenceVision\r\n\r\n**4- Here's a simple example demonstrating how to use `InferenceVision`:**\r\n\r\n    inference = InferenceVision(\r\n        tif_path=\"path/to/image.tif\",\r\n        model_path=\"path/to/model.pt\"\r\n    )\r\n\r\n    inference.process_image(build_csv=True, csv_filename=\"output.csv\")\r\n\r\n\u003cbr\u003e\r\n\r\n**In addition, you can see how to use the `inference_vision` library step by step in an [IPython Notebook](https://github.com/doguilmak/InferenceVision/blob/main/usage/InferenceVision_Usage.ipynb) environment.**\r\n\r\n\u003cbr\u003e\r\n\r\n### 
**Debugging and Future Improvements**\r\n\r\n**Debugging:** In case of any errors or unexpected behavior during image processing, carefully review the input data, model configuration, and method calls. Use debugging tools such as print statements, logging, or interactive debugging to identify and resolve issues.\r\n\r\n\u003cbr\u003e\r\n\r\n**Future Improvements:** Consider incorporating additional features or enhancements to further optimize the performance and usability of the `InferenceVision` class. Potential improvements may include support for alternative object detection models, integration with other geospatial libraries, or optimization of computational efficiency.\r\n\r\n\u003cbr\u003e\r\n\r\n## **Advanced Language Models for InferenceVision**\r\n\r\nTo enhance your experience with **InferenceVision**, we've integrated two advanced language models—[**InferenceVision-Pythia-1B**](https://huggingface.co/doguilmak/inferencevision-pythia-1B) and [**InferenceVision-GPTNeo-1.3B**](https://huggingface.co/doguilmak/inferencevision-gpt-neo-1.3B)—that deliver intelligent, context-aware support for technical topics. These models are designed to assist with questions related to geospatial analysis, object detection, and geographic coordinate calculations, helping users understand the technical foundation of the platform.\r\n\r\n### **Model Overview**\r\n\r\n#### [**InferenceVision-Pythia-1B**](https://huggingface.co/doguilmak/inferencevision-pythia-1B)\r\nBased on the **EleutherAI/pythia-1b** architecture, this high-capacity model is fine-tuned specifically for the InferenceVision domain. It excels at generating detailed responses to complex questions involving object detection, spatial data handling, and coordinate systems. 
The model’s training was tailored to domain-specific documentation and technical prompts so that its answers stay relevant to the project.\r\n\r\n#### [**InferenceVision-GPTNeo-1.3B**](https://huggingface.co/doguilmak/inferencevision-gpt-neo-1.3B)\r\nBuilt on the **EleutherAI/gpt-neo-1.3B** model, this transformer-based language model has been fine-tuned using a structured Q\u0026A dataset customized for InferenceVision. It is intended to answer questions about geospatial workflows, coordinate transformations, and detection pipelines, serving as an assistant for navigating technical content.\r\n\r\n### **Key Features:**\r\n- **Domain-Specific Tuning**: Both models are fine-tuned on project-specific content to keep answers relevant to InferenceVision.\r\n- **Large-Scale Performance**: With 1B+ parameters, these models handle complex language tasks and detailed technical queries.\r\n- **Q\u0026A Optimization**: Structured to deliver targeted support for object detection, spatial analysis, and platform workflows.\r\n- **Context-Aware Responses**: Capable of understanding and responding to intricate prompts across geospatial and AI domains.\r\n\r\nFor a hands-on guide on fine-tuning and using these models with **InferenceVision**, check out the [interactive notebook](https://github.com/doguilmak/InferenceVision/blob/main/usage/InferenceVision_LLM_QA.ipynb).\r\n\r\n\u003cbr\u003e\r\n\r\n## **Conclusion**\r\nThe calculation described above derives geographic coordinates from the given inputs, a pivotal step within the `InferenceVision` framework. It transforms normalized center coordinates into geographic coordinates, supporting geospatial analysis and visualization. Geographic coordinates, namely latitude and longitude, are indispensable for pinpointing specific locations on Earth's surface. 
The process outlined here converts normalized center coordinates, which are relative positions within a bounding area, into coordinates that can be plotted on a map for further analysis. In conclusion, our scientific project aims to advance the field of geospatial analysis by combining object detection with geographic coordinate calculation, providing researchers and practitioners with an efficient, accurate, and versatile solution for addressing complex geospatial challenges.\r\n\r\n\u003cbr\u003e\r\n\r\n## **Citation**\r\n\r\nFor a detailed exploration of related work, refer to the research article available at [IEEE](https://ieeexplore.ieee.org/document/10642920). Presented at IEEE IGARSS 2024 in Athens, our article delves into the application of object detection techniques in geospatial contexts, highlighting the use of Very High Resolution (VHR) satellite imagery for analyzing disaster impacts.\r\n\r\n\u003cbr\u003e\r\n\r\n**BibTeX**:\r\n\r\n    @INPROCEEDINGS{10642920,\r\n      author = {Ilmak, Dogu and Iban, Muzaffer Can and Zafer Şeker, Dursun},\r\n      title = {A Geospatial Dataframe of Collapsed Buildings in Antakya City after the 2023 Kahramanmaraş Earthquakes Using Object Detection Based on YOLO and VHR Satellite Images},\r\n      booktitle = {IGARSS 2024 - 2024 IEEE International Geoscience and Remote Sensing Symposium},\r\n      year = {2024},\r\n      pages = {3915-3919},\r\n      keywords = {YOLO; Buildings; Urban areas; Earthquakes; Geoscience and remote sensing; Satellite images; Sensors; Geospatial analysis; Context modeling; Deep Learning; Object Detection; Very High-Resolution Satellite Imagery; Remote Sensing; Earthquake Damage Assessment},\r\n      doi = {10.1109/IGARSS53475.2024.10642920}\r\n    
}\r\n\r\n\r\n\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdoguilmak%2Finferencevision","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdoguilmak%2Finferencevision","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdoguilmak%2Finferencevision/lists"}