{"id":24800520,"url":"https://github.com/codeofrahul/us_visa_approval_prediction_mlops","last_synced_at":"2025-03-25T00:45:44.133Z","repository":{"id":267638400,"uuid":"901839618","full_name":"CodeofRahul/US_visa_approval_prediction_MLOPS","owner":"CodeofRahul","description":"This project implements an end-to-end Machine Learning Operations (MLOps) pipeline to predict the approval status of US visa applications. By leveraging a comprehensive dataset containing various applicant and job-related features, I aim to build a robust model capable of accurately predicting whether a visa application will be certified or denied.","archived":false,"fork":false,"pushed_at":"2025-03-12T10:09:54.000Z","size":41051,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-12T11:23:28.732Z","etag":null,"topics":["cicd-pipeline","machinelearning","mlops-project","mlops-workflow","predictive-analysis"],"latest_commit_sha":null,"homepage":"","language":"Jupyter 
Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/CodeofRahul.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-12-11T12:14:13.000Z","updated_at":"2025-03-12T10:15:21.000Z","dependencies_parsed_at":"2024-12-11T15:19:56.657Z","dependency_job_id":"e27558da-0b7d-46f1-aee9-6e6c9fb3a981","html_url":"https://github.com/CodeofRahul/US_visa_approval_prediction_MLOPS","commit_stats":null,"previous_names":["codeofrahul/us_visa_approval_prediction_mlops"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CodeofRahul%2FUS_visa_approval_prediction_MLOPS","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CodeofRahul%2FUS_visa_approval_prediction_MLOPS/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CodeofRahul%2FUS_visa_approval_prediction_MLOPS/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CodeofRahul%2FUS_visa_approval_prediction_MLOPS/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/CodeofRahul","download_url":"https://codeload.github.com/CodeofRahul/US_visa_approval_prediction_MLOPS/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245377959,"owners_count":20605375,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","reposit
ories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cicd-pipeline","machinelearning","mlops-project","mlops-workflow","predictive-analysis"],"created_at":"2025-01-30T03:19:06.001Z","updated_at":"2025-03-25T00:45:44.117Z","avatar_url":"https://github.com/CodeofRahul.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![MLOps](https://img.shields.io/badge/MLOps-Enabled-green)]()\n[![Pipeline](https://img.shields.io/badge/Pipeline-Automated-blue)]()\n[![CI/CD](https://img.shields.io/badge/CI%2FCD-GitHub%20Actions-blue?logo=github)](https://github.com/features/actions)\n[![Machine Learning](https://img.shields.io/badge/Machine%20Learning-Enabled-orange)]()\n[![License](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![Python Version](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/release/python-380/)\n\n\n# US_visa_approval_prediction_MLOPS\n\nThis project implements an end-to-end **Machine Learning Operations (MLOps)** pipeline to predict the approval status of US visa applications. By leveraging a comprehensive dataset containing various applicant and job-related features, I aim to build a robust model capable of accurately predicting whether a visa application will be certified or denied. 
This project showcases a full MLOps workflow, including data ingestion, validation, transformation, model training, evaluation, CI/CD, and deployment, all within a structured and scalable architecture.\n\n\n## Problem Statement:\n**US visa approval status** \u003cbr\u003e\nGiven a set of features (continent, education, job experience, training, employment, current age, etc.),\nI have to predict whether the visa application will be approved or not.\n\n**Data Structure and Features:**\n\n- **continent:** The continent of the employee.\n- **education_of_employee:** The highest level of education attained by the employee.\n- **has_job_experience:** A binary variable indicating whether the employee has job experience (Y/N).   \n- **requires_job_training:** A binary variable indicating whether the job requires training (Y/N).\n- **no_of_employees:** The number of employees in the company.\n- **yr_of_estab:** The year the company was established.\n- **region_of_employment:** The region of employment in the US.\n- **prevailing_wage:** The prevailing wage offered for the position.\n- **unit_of_wage:** The unit of the prevailing wage (Hour/Year).\n- **full_time_position:** A binary variable indicating whether the position is full-time (Y/N).   
\n- **case_status:** The target variable, indicating whether the visa application was certified or denied.\n\n**Solution Scope:** \u003cbr\u003e\nThis can be used in real life by US visa applicants to improve their resume and how well they meet the criteria for the approval process.\n\n**Solution Approach:** \u003cbr\u003e\n1.\tMachine Learning: ML classification algorithms\n2.\tDeep Learning: Custom ANN with sigmoid activation function\n\n**Solution Proposed:** \u003cbr\u003e\nI will be using ML:\n1.\tLoad the data from the DB.\n2.\tPerform EDA and feature engineering to select the desirable features.\n3.\tFit the ML classification algorithms and find out which one performs better.\n4.\tSelect the top few and tune their hyperparameters.\n5.\tSelect the best model based on the desired metrics.\n\n## Key Features\n\n* **Automated MLOps Pipeline:** A complete pipeline for data processing, model training, and deployment, ensuring reproducibility and efficiency.\n* **Data Validation and Transformation:** Rigorous data validation to maintain data quality and effective transformation for optimal model performance.\n* **Model Training and Evaluation:** Utilization of advanced machine learning techniques to build and evaluate predictive models.\n* **Continuous Integration/Continuous Deployment (CI/CD):** Automated deployment using CI/CD pipelines for seamless updates.\n* **Scalable Architecture:** Modular design for easy expansion and maintenance.\n* **MongoDB Integration:** Utilizes MongoDB for data storage and retrieval, demonstrating database connectivity.\n* **Comprehensive Logging and Exception Handling:** Robust logging and exception handling for improved monitoring and debugging.\n\n\n```Powershell\nTo create a file using CMD/PowerShell: type \u003cfilename\u003e (type template.py)\n```\n\n\n- To create environment = `conda create -p visa python=3.8 -y`\n- To check available envs = `conda env list`\n- To check available envs (alternative) = `conda info --envs`\n- To activate environment = `conda 
activate visa`\n- To install requirements.txt = `pip install -r requirements.txt`\n- To check installed packages = `pip list`\n- To show details about a package = `pip show package_name`\n- To install a package = `pip install package_name`\n- To uninstall a package = `pip uninstall package_name`\n\n\n```python\n\u003e\u003e\u003e from pathlib import Path\n\u003e\u003e\u003e path = \"test/test.py\"\n\u003e\u003e\u003e Path(path)\nWindowsPath('test/test.py')\n```\n\n## Git commands \n\n- To add all files = `git add .`\n- To add a particular file = `git add \u003cfile_name\u003e`\n- To commit = `git commit -m \"commit message\"`\n- To push the code = `git push origin main`\n\n\n- MongoDB : https://account.mongodb.com/account/login\n\n\n## Workflow:\n\n1. constants\n2. entity\n3. components\n4. pipelines\n5. Main file\n\n### Export the environment variables\n```bash\n\nexport MONGODB_URL=\"mongodb+srv://\u003cusername\u003e:\u003cpassword\u003e....\"\n\nexport AWS_ACCESS_KEY_ID=\"\u003cAWS_ACCESS_KEY_ID\u003e\"\n\nexport AWS_SECRET_ACCESS_KEY=\"\u003cAWS_SECRET_ACCESS_KEY\u003e\"\n\n```\n\n\n\n### How Amazon S3 works\n\nAmazon S3 stores data as objects within buckets. An object is a file and any metadata that describes the file. A bucket is a container for objects. To store your data in Amazon S3, you first create a bucket and specify a bucket name and AWS Region. Then, you upload your data to that bucket as objects in Amazon S3. Each object has a key (or key name), which is the unique identifier for the object within the bucket.\n\nS3 provides features that you can configure to support your specific use case. For example, you can use S3 Versioning to keep multiple versions of an object in the same bucket, which allows you to restore objects that are accidentally deleted or overwritten. Buckets and the objects in them are private and can only be accessed with explicitly granted access permissions. 
You can use bucket policies, AWS Identity and Access Management (IAM) policies, S3 Access Points, and access control lists (ACLs) to manage access.\n\n## AWS-CICD-Deployment-with-Github-action \n\n### 1. Login to AWS console.\n\n### 2. Create IAM user for deployment \n\n```\n\n# With specific access\n\n1. EC2 access: EC2 is a virtual machine\n\n2. ECR: Elastic Container Registry, used to save your Docker image in AWS\n\n\n# Description: About the deployment\n\n1. Build the Docker image of the source code\n\n2. Push your Docker image to ECR\n\n3. Launch your EC2 instance\n\n4. Pull your image from ECR in EC2\n\n5. Launch your Docker image in EC2\n\n# Policy:\n\n1. AmazonEC2ContainerRegistryFullAccess\n\n2. AmazonEC2FullAccess\n\n```\n\n### 3. Create ECR repo to store/save docker image\n\n```\nSave the URI: 315865595366.dkr.ecr.us-east-1.amazonaws.com/visarepo\n```\n\n### 4. Create EC2 machine (Ubuntu)\n\n### 5. Open EC2 and Install docker in EC2 Machine:\n\n```\n# optional\n\nsudo apt-get update -y\n\nsudo apt-get upgrade\n\n# required\n\ncurl -fsSL https://get.docker.com -o get-docker.sh\n\nsudo sh get-docker.sh\n\nsudo usermod -aG docker ubuntu\n\nnewgrp docker\n```\n\n### 6. Configure EC2 as self-hosted runner:\n\n```\nSettings \u003e Actions \u003e Runners \u003e New self-hosted runner \u003e choose OS \u003e then run the commands one by one\n```\n\n### 7. Setup github secrets:\n\n- AWS_ACCESS_KEY_ID\n- AWS_SECRET_ACCESS_KEY\n- AWS_DEFAULT_REGION\n- ECR_REPO\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcodeofrahul%2Fus_visa_approval_prediction_mlops","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcodeofrahul%2Fus_visa_approval_prediction_mlops","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcodeofrahul%2Fus_visa_approval_prediction_mlops/lists"}