{"id":22994000,"url":"https://github.com/vojay-dev/biasight","last_synced_at":"2026-02-13T03:49:50.134Z","repository":{"id":267982565,"uuid":"874873120","full_name":"vojay-dev/biasight","owner":"vojay-dev","description":"Gender Bias Detection on Websites using AI","archived":false,"fork":false,"pushed_at":"2025-03-16T12:07:36.000Z","size":3404,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-03-16T13:22:04.411Z","etag":null,"topics":["ai","fastapi","gemini","gender-bias","gender-equality","google-cloud","python","vertex-ai"],"latest_commit_sha":null,"homepage":"https://biasight.com/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vojay-dev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-10-18T16:05:43.000Z","updated_at":"2025-03-16T12:07:39.000Z","dependencies_parsed_at":null,"dependency_job_id":"bdc9f476-73bc-44eb-bddd-99e9fca002b5","html_url":"https://github.com/vojay-dev/biasight","commit_stats":null,"previous_names":["vojay-dev/biasight"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vojay-dev%2Fbiasight","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vojay-dev%2Fbiasight/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vojay-dev%2Fbiasight/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vojay-dev%2Fbiasight/manifests","owner_url":"https://repos.ecosy
ste.ms/api/v1/hosts/GitHub/owners/vojay-dev","download_url":"https://codeload.github.com/vojay-dev/biasight/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246819283,"owners_count":20839086,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","fastapi","gemini","gender-bias","gender-equality","google-cloud","python","vertex-ai"],"created_at":"2024-12-15T05:16:33.978Z","updated_at":"2026-02-13T03:49:45.115Z","avatar_url":"https://github.com/vojay-dev.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# BiaSight - Gender Bias Detection on Websites using AI\n\n![logo](doc/logo.png)\n\nWords matter. In a world where gender inequality persists despite decades of progress, BiaSight addresses one of the\nmost pervasive yet often overlooked aspects of discrimination: the language we use in our digital spaces. BiaSight uses\nthe power of Google's cutting-edge AI, including Gemini, to analyze and improve the inclusivity of online content.\n\nWhile content creators and website authors often focus on performance, usability, and visual appeal, the impact of words\non discrimination against women and girls and how this impacts equality is frequently underestimated. 
BiaSight aims to\nchange this by providing an intuitive, AI-driven analysis of web content across various equality categories, much like\nhow Google PageSpeed Insights has become an indispensable tool for web performance optimization.\n\nThe vision of BiaSight is to make gender-inclusive language as integral to web development as responsive design or SEO\noptimization and to inspire creators for change.\n\nRemember, words matter. They shape perceptions, influence behaviors, and can either reinforce or challenge the gender\ninequalities that persist in our society.\n\n**Try it yourself**: [biasight.com](https://biasight.com/)\n\nThis project was created as part of the [She Builds AI Hackathon 2024](https://womentechmakers.devpost.com/).\n\n![mockup](doc/mockup.png)\n\n---\n\n## Backend\n\nThe BiaSight backend is a powerful engine built with FastAPI and Python. It leverages BeautifulSoup to extract readable\ncontent from web pages, preparing it for analysis. Using Jinja templating, prompt generation is modularized, allowing\nseamless integration of web content into advanced prompts for Google’s Gemini LLM.\n\nTo ensure both accurate and deterministic results, Gemini is configured to use JSON mode for structured output and a\nlow-temperature setting is applied to minimize variability in its generation. Pydantic ensures robust data modeling and\nvalidation, while Poetry manages dependencies efficiently. Docker streamlines deployment, and Ruff, combined with\nGitHub Actions, maintains high code quality through automated testing and linting.\n\nFor optimal performance and user experience, the backend employs a TTLCache, reducing analysis time by caching recent\nresults. 
This architecture fosters easy and secure extensibility, allowing for future enhancements and integrations as\nBiaSight continues to evolve.\n\n## Frontend\n\nThe frontend is powered by Vue 3 and Vite, supported by daisyUI and Tailwind CSS for efficient frontend development.\nTogether, these tools provide users with a sleek and modern interface for seamless interaction with the backend.\n\n![architecture](doc/architecture.png)\n\nThis is the backend part of the project. **Frontend**: [biasight-ui](https://github.com/vojay-dev/biasight-ui)\n\n---\n\n\n## Tech stack\n\n- Python 3.12 + [FastAPI](https://fastapi.tiangolo.com/) for API development\n- [Jinja](https://jinja.palletsprojects.com/) templating for modular prompt generation\n- [Pydantic](https://docs.pydantic.dev/latest/) for data modeling and validation\n- [Poetry](https://python-poetry.org/) for dependency management\n- [Docker](https://www.docker.com/) for deployment\n- [Gemini](https://cloud.google.com/vertex-ai/docs/generative-ai/model-reference/gemini) via [VertexAI](https://cloud.google.com/vertex-ai) for evaluating web content\n- [BeautifulSoup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/) for extracting content from web pages\n- [Ruff](https://docs.astral.sh/ruff/) as linter and code formatter together with [pre-commit](https://pre-commit.com/) hooks\n- [GitHub Actions](https://github.com/features/actions) to automatically run tests and the linter on every push\n\n## Makefile\n\nThe project includes a `Makefile` with common tasks like setting up the virtual environment with Poetry, running the\nservice locally and within Docker, running tests, the linter, and more. 
Simply run:\n```sh\nmake help\n```\nto get an overview of all available tasks.\n\n![make help](doc/make-help.png)\n\n## Configuration\n\n**Prerequisite**\n\n- GCP project with VertexAI API enabled and access to Gemini (recommended: `gemini-1.5-flash-002` or `gemini-1.5-pro-002`)\n- JSON credentials file for GCP Service Account with VertexAI permissions\n\nThe API is configured via environment variables. If a `.env` file is present in the project root, it will be loaded\nautomatically. You can copy the `.env.dist` file from the repository as a basis.\n\nThe following variables must be set:\n\n- `GCP_PROJECT_ID`: The ID of the Google Cloud Platform (GCP) project used for VertexAI and Gemini.\n- `GCP_LOCATION`: The location used for prediction processes.\n- `GCP_SERVICE_ACCOUNT_FILE`: The path to the service account file used for authentication with GCP.\n\n**Gemini model**\n\nThe default model used for Gemini is `gemini-1.5-flash-002`. To use a different model, simply adjust the `GCP_GEMINI_MODEL`\nconfig in the `.env` file. 
For this use case, the Flash model delivers good and cost-efficient results.\n\n## Project setup\n\n**(Optional) Configure poetry to use in-project virtualenvs**:\n```sh\npoetry config virtualenvs.in-project true\n```\n\n**Install dependencies**:\n```sh\npoetry install\n```\n\n**Run**:\n\n*Please check the Configuration section to ensure all requirements are met.*\n```sh\ncurl -s -X POST localhost:8000/analyze \\\n  -H 'Content-Type: application/json' \\\n  -d '{\"uri\": \"https://womentechmakers.devpost.com/\"}' | jq .\n```\n\n![example](doc/example.png)\n\n## Docker\n\nAll Docker commands are also encapsulated in the `Makefile` for convenience.\n\n### Build\n\n```sh\ndocker build -t biasight .\n```\n\n### Run\n\n```sh\ndocker run -d --rm --name biasight -p 9091:9091 biasight\ncurl -s -X POST localhost:9091/analyze \\\n  -H 'Content-Type: application/json' \\\n  -d '{\"uri\": \"https://womentechmakers.devpost.com/\"}' | jq .\ndocker stop biasight\n```\n\n### Save image for deployment\n\n```sh\ndocker save biasight:latest | gzip \u003e biasight_latest.tar.gz\n```\n\n## Gemini interaction\n\nGemini interaction is encapsulated in the `GeminiClient` class, which is designed to ensure high-quality prompt responses and to\navoid unnecessary parsing issues. 
The `GeminiClient` class uses the Gemini JSON format mode.\n\nSee: [https://ai.google.dev/gemini-api/docs/structured-output?lang=python](https://ai.google.dev/gemini-api/docs/structured-output?lang=python)\n\nThis ensures Gemini replies with valid JSON, while the schema is attached to the individual prompt, for example:\n```\nReturn your analysis in this JSON format:\n\n{\n  \"summary\": str,\n  \"stereotyping_feedback\": str,\n  \"stereotyping_score\": int,\n  \"stereotyping_example\": str,\n  \"representation_feedback\": str,\n  \"representation_score\": int,\n  \"representation_example\": str,\n  \"language_feedback\": str,\n  \"language_score\": int,\n  \"language_example\": str,\n  \"framing_feedback\": str,\n  \"framing_score\": int,\n  \"framing_example\": str,\n  \"positive_aspects\": str,\n  \"improvement_suggestions\": str,\n  \"male_to_female_mention_ratio\": float,\n  \"gender_neutral_language_percentage\": float\n}\n```\n\nThis approach is then combined with Pydantic models to ensure the correctness of datatypes and the overall structure:\n```py\n    def analyze(self, text: str) -\u003e AnalyzeResult:\n        prompt = self._render_template(text)\n        chat: ChatSession = self.gemini_client.start_chat()\n        chat_response: str = self.gemini_client.get_chat_response(chat, prompt)\n\n        analyze_result = AnalyzeResult.model_validate(from_json(chat_response))\n\n        # overall score is calculated via Python instead of using the LLM to ensure deterministic results\n        analyze_result.overall_score = self._calculate_score(analyze_result)\n\n        return analyze_result\n```\n\nThis is a good example of how to programmatically interact with Gemini, ensure the quality of the responses, and use an LLM\nto cover core business logic.\n\n## Score calculation\n\nAs a first step, a score is assigned by Gemini for each of the four bias categories: stereotyping, representation,\nlanguage, and framing. 
The base score is then calculated as the average of the four category scores.\n\nThen, two additional factors are applied:\n\n* **Ratio Boost**: A bonus is added to the base score based on the male-to-female mention ratio. The closer the ratio is\n  to 1 (meaning equal mentions), the higher the bonus, with a maximum boost of 30% when the ratio is exactly 1.\n* **Neutral Language Boost**: A bonus is added based on the percentage of gender-neutral language used in the text. The\n  higher the percentage of gender-neutral language, the higher the bonus, with a maximum boost of 10% when the language\n  is 100% gender-neutral.\n\nThese boosts aim to acknowledge and reward content with a more balanced gender representation and inclusive language.\n\nThe final overall score is then rounded and clamped to the range from 1 (extremely biased) to 100 (completely free of bias), providing a\ncomprehensive evaluation of the content's inclusivity.\n\n$$\n\\begin{align*}\n\\text{Base Score} \u0026= \\frac{\\text{Stereotyping Score} + \\text{Representation Score} + \\text{Language Score} + \\text{Framing Score}}{4} \\\\\n\\\\\n\\text{Ratio Boost} \u0026= \\begin{cases}\n30 \\times (1 - |1 - \\text{Male To Female Mention Ratio}|) \u0026 \\text{if } \\text{Male To Female Mention Ratio} \u003e 0 \\\\\n0 \u0026 \\text{if } \\text{Male To Female Mention Ratio} = 0\n\\end{cases} \\\\\n\\\\\n\\text{Neutral Language Boost} \u0026= \\frac{\\text{Gender Neutral Language Percentage}}{100} \\times 10 \\\\\n\\\\\n\\text{Boosted Score} \u0026= \\text{Base Score} \\times \\left( 1 + \\frac{\\text{Ratio Boost}}{100} + \\frac{\\text{Neutral Language Boost}}{100} \\right) \\\\\n\\\\\n\\text{Final Score} \u0026= \\text{round}(\\text{max}(1, \\text{min}(100, \\text{Boosted Score})))\n\\end{align*}\n$$\n\n## Example\n\n```sh\ncurl -s -X POST localhost:8000/analyze \\\n  -H 'Content-Type: application/json' \\\n  -d '{\"uri\": \"https://womentechmakers.devpost.com/\"}' | jq .\n```\n\n```json\n{\n  \"uri\": \"https://womentechmakers.devpost.com/\",\n  \"result\": {\n    \"summary\": 
\"The webpage shows a strong commitment to gender equality through its focus on a hackathon addressing UN SDG 5.  However, while the language used is largely inclusive, the high number of mentions related to women and girls compared to men could be perceived as unbalanced.  Further improvements could enhance the overall inclusivity.\",\n    \"overall_score\": 96,\n    \"stereotyping_feedback\": \"The webpage avoids reinforcing traditional gender stereotypes. The focus is on addressing gender inequality, not perpetuating it.\",\n    \"stereotyping_score\": 95,\n    \"stereotyping_example\": \"The hackathon's theme directly challenges gender inequality by focusing on UN SDG 5.\",\n    \"representation_feedback\": \"While the hackathon aims for inclusivity, the overwhelming focus on women and girls in the description might inadvertently overshadow the participation of other genders.\",\n    \"representation_score\": 75,\n    \"representation_example\": \"The repeated emphasis on \\\"women and girls\\\" in the description and prize categories.\",\n    \"language_feedback\": \"The language used is largely gender-neutral and inclusive, using terms like \\\"participants\\\" instead of gendered terms. However, the frequent mention of \\\"women and girls\\\" could be balanced.\",\n    \"language_score\": 85,\n    \"language_example\": \"The use of \\\"participants\\\" instead of gender-specific terms like \\\"participants\\\" and the explicit statement that the hackathon is open to all genders.\",\n    \"framing_feedback\": \"The framing of the hackathon positively promotes gender equality and empowerment.  
There is no victim-blaming or minimization of women's experiences.\",\n    \"framing_score\": 90,\n    \"framing_example\": \"The hackathon's focus on UN SDG 5 and its emphasis on addressing real-world challenges faced by women and girls.\",\n    \"positive_aspects\": \"The webpage's clear commitment to gender equality through its focus on a hackathon addressing UN SDG 5 is commendable. The use of inclusive language and the explicit statement welcoming participants of all genders are positive steps.\",\n    \"improvement_suggestions\": \"1. Balance the focus on women and girls with more inclusive language that acknowledges the participation and contributions of all genders. 2.  Highlight success stories and contributions from participants of all genders in promotional materials. 3.  Ensure that judging criteria are equally applicable and unbiased towards all participants regardless of gender.\",\n    \"male_to_female_mention_ratio\": 0.1,\n    \"gender_neutral_language_percentage\": 80.0\n  }\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvojay-dev%2Fbiasight","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvojay-dev%2Fbiasight","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvojay-dev%2Fbiasight/lists"}