{"id":26517040,"url":"https://github.com/vicperal/ai-genai_projects","last_synced_at":"2026-04-02T02:49:55.521Z","repository":{"id":280990640,"uuid":"943841188","full_name":"vicperal/AI-GenAI_projects","owner":"vicperal","description":"Python projects about LLM and ML use cases. I am using modules such as Pandas, Numpy, Plotly, scikit-learn, Transformers, Flask, JSON, etc. to analyze data, predict, generate insights and create text from models such as LLMs, linear regression, assembly methods, etc. Server- Front-End using Flask","archived":false,"fork":false,"pushed_at":"2025-03-20T09:38:09.000Z","size":106,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-20T10:34:05.872Z","etag":null,"topics":["assembly","clinical-trials","flask","json","linear-regression","llm","ml","numpy","pandas","plotly","price-prediction","python","rag","random-forest","scikit-learn","sentimental-analysis","sql","text-summarization","tokens-counter","transformers"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vicperal.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-03-06T10:58:59.000Z","updated_at":"2025-03-20T09:38:12.000Z","dependencies_parsed_at":null,"dependency_job_id":"a14d8bce-9138-41e7-91fc-e7445da42bbb","html_url":"https://github.com/vicperal/AI-GenAI_projects","commit_stats":null,"previous_names":["vicperal/ai-genai_projects"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https:
//repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vicperal%2FAI-GenAI_projects","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vicperal%2FAI-GenAI_projects/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vicperal%2FAI-GenAI_projects/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vicperal%2FAI-GenAI_projects/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/vicperal","download_url":"https://codeload.github.com/vicperal/AI-GenAI_projects/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244759961,"owners_count":20505716,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["assembly","clinical-trials","flask","json","linear-regression","llm","ml","numpy","pandas","plotly","price-prediction","python","rag","random-forest","scikit-learn","sentimental-analysis","sql","text-summarization","tokens-counter","transformers"],"created_at":"2025-03-21T08:17:50.054Z","updated_at":"2025-12-30T19:58:28.926Z","avatar_url":"https://github.com/vicperal.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# My AI/ML and GenAI projects in Python\n\n# Just a few words... \nWelcome to my GitHub repository! Here you will find several Python projects I have been working on for some time, which helped me connect key principles and fundamental Advanced Analytics \u0026 Data Science concepts, with a special focus on ML and LLMs.  
\n\nAs I explore the Data Science \u0026 AI field, I really believe it is important to play, test, build, make mistakes, learn and have fun! \u003cb\u003e #Never_Stop_Learning \u003c/b\u003e🚀\n\n# 1. Multi-ML comparison with a Flask web server and frontend\n\n\u003cb\u003e multi-ML_comparison_Flask_frontend.py\u003c/b\u003e is an application that lets you choose among several ML models; it uses public data from http://api.worldbank.org for model training. It contains both server and frontend code: index.html shows real and predicted values and compares the models' performance statistics. The application plots 3D graphs with Plotly and displays them in a web app built with Flask.\n\nThis project uses modules such as Pandas, NumPy, Plotly, scikit-learn, Flask, JSON, etc. to analyze the relationship between data from databases and calculate predictions using models such as linear regression, ensemble methods, etc. It displays the graphs in a web app built with Flask.\n\n# 2. LLM use case projects\n\n2.1. \u003cb\u003e app contador de tokens_transf_huggingface.py:\u003c/b\u003e \n\n\u003cb\u003e Tokens counter \u003c/b\u003e. Python code to calculate the number of tokens in a text using the Hugging Face library.\n\n2.2. \u003cb\u003e app RAG_transformer_huggingface.py:\u003c/b\u003e\n\n\u003cb\u003e RAG application \u003c/b\u003e. The code uses two different models to implement the RAG system:\n\nFor computing embeddings: the model used is 'paraphrase-MiniLM-L6-v2', which is a variant of the all-MiniLM-L6-v2 model. This model maps sentences and paragraphs to a vector space. It is used to encode both the knowledge base and the user's query into vectors, allowing similarity search to retrieve the relevant context.\n\nFor generating the response: the model used is 'google/flan-t5-base', which is an instruction-tuned version of the T5 (Text-to-Text Transfer Transformer) model developed by Google. 
This model is used to generate the final response based on the retrieved context and the user's question.\n\n2.3. \u003cb\u003eapp sentiment analysis_transf_huggingface.py:\u003c/b\u003e\n\n \u003cb\u003e Sentiment analysis \u003c/b\u003e. This code performs sentiment analysis on a given text using the Hugging Face Transformers library. \n\n2.4. \u003cb\u003eapp text summary_transformers_huggingface.py:\u003c/b\u003e\n\n \u003cb\u003e Text summarization \u003c/b\u003e. This code summarizes a given text using the Hugging Face Transformers library. \n\n# 3. ML-based price prediction in three simulation scenarios of demand/competition\n\n\u003cb\u003eML price prediction_demand_competition scenarios.py:\u003c/b\u003e ML application to predict the price under 3 competition scenarios (severe, mid, low). index.html is generated to plot the 3D graphs that show the predicted price evolution.\nIt uses a linear regression model; the graphs are created with Plotly, and the web server and frontend with Flask.\n\n# 4. ML use case in Pharma: drug efficacy prediction from random clinical trial data\n\n\u003cb\u003eML_drug_efficacy prediction_clinical_study_SQL_randomDB.py:\u003c/b\u003e ML model for a clinical study that predicts drug efficacy based on dose and age, using a linear regression model.\n\n\u003cb\u003e DISCLAIMER: the model is trained on a random clinical study dataset created solely to validate the end-to-end process of ML model creation \u003c/b\u003e\n\n# 5. Recommendation System \n\u003cb\u003eRecommendation_System.ipynb:\u003c/b\u003e application that generates a list of the 10 movies most similar to a given movie. It calculates the item similarity matrix and the user similarity matrix using cosine similarity. It uses two datasets: movies.csv contains a table with 4,800 movies and ratings.csv contains a list of 100,835 user ratings. 
\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvicperal%2Fai-genai_projects","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvicperal%2Fai-genai_projects","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvicperal%2Fai-genai_projects/lists"}