https://github.com/dimagi/language-dashboard



# LRL Evaluation Portal - Low-Resource Language Model Benchmarking

A comprehensive web dashboard for visualizing and analyzing the performance of Large Language Models (LLMs) across low-resource languages. This application provides interactive visualizations, comparative analysis, and detailed metrics for evaluating AI-generated text quality.

## Overview

This project aims to narrow the gap in AI development and evaluation for low-resource languages by systematically assessing the quality of sentences generated by different LLMs across multiple African languages. Expert reviewers rate the AI-generated text along several critical dimensions, and their feedback shows where that text is currently strong and where it falls short in each language.

## Features

### Round 3 Analysis (Q4 2025)
- **8 Nigerian Languages**: Bura-Pabir, Fulani, Hausa, Igbo, Marghi, Nigerian Pidgin, Shuwa Arabic, Yoruba
- **3 Metrics** (1-7 scale):
  - **Clarity**: How easy it is to read and understand the text
  - **Naturalness**: Whether the sentence sounds like native speech
  - **Correctness**: Technical accuracy (spelling, grammar, verb tenses)
- **Primary/Secondary Data Sources**: Compare evaluations from different reviewers
- **Interactive Charts**: Bar charts with error bars for each language
- **Model Filtering**: Filter by provider (Anthropic, Google, OpenAI) and individual models
- **Language Navigation**: Browse by individual language or view all languages
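
The bar charts with error bars described above need a per-model mean and a spread estimate for each metric. The following is a minimal sketch of that aggregation, not the project's actual code: the row shape `{ model, score }`, the function name `summarizeByModel`, and the choice of standard error of the mean for the error bars are all assumptions made for illustration.

```javascript
// Hypothetical row shape: { model: string, score: number }, score on the 1-7 scale.
// Returns one summary per model: sample count, mean, and standard error of the mean.
function summarizeByModel(rows) {
  const groups = new Map();
  for (const { model, score } of rows) {
    if (!groups.has(model)) groups.set(model, []);
    groups.get(model).push(score);
  }

  return [...groups].map(([model, scores]) => {
    const n = scores.length;
    const mean = scores.reduce((sum, s) => sum + s, 0) / n;
    // Sample variance (n - 1 denominator); treated as 0 when only one score exists.
    const variance =
      n > 1 ? scores.reduce((sum, s) => sum + (s - mean) ** 2, 0) / (n - 1) : 0;
    const stderr = Math.sqrt(variance / n);
    return { model, n, mean, stderr };
  });
}

// Example with placeholder model names (not real evaluation data).
const summaries = summarizeByModel([
  { model: "model-a", score: 4 },
  { model: "model-a", score: 6 },
  { model: "model-b", score: 7 },
]);
console.log(summaries);
```

Each summary object could then feed one bar (the `mean`) and its error bar (the `stderr`) in a chart for a single language.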

### Round 2 Analysis
- **12 African Languages** organized by geographic regions:
  - **East African**: Amharic, Swahili, Luo
  - **West African**: Yoruba, Hausa, Kanuri, Twi, Wolof, Yemba
  - **Southern African**: Chichewa
  - **Central African**: Luganda, Ewondo
- **5 Metrics**:
  - **Readability** (1-7 scale): How easy it is to read and understand the translation
  - **Adequacy** (1-7 scale): How accurately the translation captures the original meaning
  - **Grammatical Correct (%)** (0-100%): Percentage of grammatically correct sentences
  - **Real Words (%)** (0-100%): Percentage of sentences with only real words
  - **Notable Error (%)** (0-100%): Percentage of sentences with notable errors (lower is better)
- **Primary/Secondary Reviewer**: Compare evaluations from different reviewers
- **Language Group Filtering**: Browse by geographic region or individual language
- **Interactive Charts**: Bar charts with error bars, percentage scales for percentage metrics
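
Because Round 2 mixes 1-7 scale metrics with 0-100% metrics, a chart needs to pick its axis range and value format per metric. One way to sketch this is a small lookup table. The metric names and ranges below come from the list above; the object layout and the helper names `axisDomain` and `formatValue` are assumptions, not the project's own API.

```javascript
// Metric names and ranges follow the Round 2 list; everything else is illustrative.
const ROUND2_METRICS = {
  "Readability": { domain: [1, 7], isPercent: false, higherIsBetter: true },
  "Adequacy": { domain: [1, 7], isPercent: false, higherIsBetter: true },
  "Grammatical Correct (%)": { domain: [0, 100], isPercent: true, higherIsBetter: true },
  "Real Words (%)": { domain: [0, 100], isPercent: true, higherIsBetter: true },
  "Notable Error (%)": { domain: [0, 100], isPercent: true, higherIsBetter: false },
};

// Look up a metric's configuration, failing loudly on unknown names.
function metricConfig(metric) {
  const config = ROUND2_METRICS[metric];
  if (!config) throw new Error(`Unknown metric: ${metric}`);
  return config;
}

// Y-axis range for the chart: [1, 7] for scale metrics, [0, 100] for percentages.
function axisDomain(metric) {
  return metricConfig(metric).domain;
}

// Format a value for tooltips: one decimal plus "%" for percentages, two decimals otherwise.
function formatValue(metric, value) {
  return metricConfig(metric).isPercent ? `${value.toFixed(1)}%` : value.toFixed(2);
}

console.log(axisDomain("Notable Error (%)"), formatValue("Adequacy", 5.25));
```

In a Recharts chart, the array returned by `axisDomain` could be passed to the `domain` prop of `YAxis`, and `higherIsBetter` could drive how models are ranked for the "Notable Error (%)" metric.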

### Common Features
- **Dark/Light Theme**: Toggle between themes
- **Responsive Design**: Optimized for desktop and mobile devices
- **Summary Statistics**:
  - Overall Leader (best performing model)
  - Languages Analyzed count
  - Models Compared count
  - Total Samples count
- **Data Export**: Download filtered data as CSV or JSON
- **Round Toggle**: Switch between Round 2 and Round 3 analyses
- **Metric Tooltips**: Hover over metrics to see definitions
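
As an illustration of the "Overall Leader" statistic, the sketch below picks the model with the best average score across all rows in the current view. How the dashboard actually ranks models is not stated here, so the averaging rule, the row shape `{ model, score }`, and the name `overallLeader` are assumptions.

```javascript
// Hypothetical row shape: { model: string, score: number }.
// Returns { model, mean } for the best model, or null when there are no rows.
function overallLeader(rows, higherIsBetter = true) {
  const totals = new Map();
  for (const { model, score } of rows) {
    const t = totals.get(model) ?? { sum: 0, n: 0 };
    t.sum += score;
    t.n += 1;
    totals.set(model, t);
  }

  let leader = null;
  for (const [model, { sum, n }] of totals) {
    const mean = sum / n;
    // For "lower is better" metrics (e.g. Notable Error), the smallest mean wins.
    const isBetter =
      leader === null || (higherIsBetter ? mean > leader.mean : mean < leader.mean);
    if (isBetter) leader = { model, mean };
  }
  return leader;
}

// Example with placeholder model names (not real evaluation data).
const sample = [
  { model: "model-a", score: 5 },
  { model: "model-a", score: 6 },
  { model: "model-b", score: 6 },
  { model: "model-b", score: 7 },
];
console.log(overallLeader(sample));        // { model: "model-b", mean: 6.5 }
console.log(overallLeader(sample, false)); // { model: "model-a", mean: 5.5 }
```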

## Project Structure

```
web/
├── public/
│   ├── data/                        # Round 3 CSV files
│   └── round2/                      # Round 2 CSV files (by language)
├── src/
│   ├── components/
│   │   ├── Controls.jsx             # Round 3 filtering controls
│   │   ├── ControlsRound2.jsx       # Round 2 filtering controls
│   │   ├── LanguageChart.jsx        # Round 3 chart component
│   │   ├── LanguageChartRound2.jsx  # Round 2 chart component
│   │   └── Toast.jsx                # Toast notification component
│   ├── pages/
│   │   ├── LandingPage.jsx          # Landing page with analysis cards
│   │   ├── Dashboard.jsx            # Round 3 dashboard
│   │   └── DashboardRound2.jsx      # Round 2 dashboard
│   ├── utils/
│   │   ├── data.js                  # Round 3 data loading and processing
│   │   └── dataRound2.js            # Round 2 data loading and processing
│   ├── App.jsx                      # Main app component with routing
│   └── main.jsx                     # Entry point
└── package.json
```

## Getting Started

### Prerequisites

- Node.js (v18 or higher)
- npm or yarn

### Installation

1. Navigate to the `web` directory:
```bash
cd web
```

2. Install dependencies:
```bash
npm install
# or
yarn install
```

### Development

Start the development server:
```bash
npm run dev
# or
yarn dev
```

The application will be available at `http://localhost:5173` (Vite's default dev server port).

### Building for Production

Build the application:
```bash
npm run build
# or
yarn build
```

Preview the production build:
```bash
npm run preview
# or
yarn preview
```

## Usage

### Landing Page

The landing page provides:
- **Project Motivation**: Overview of the project goals
- **Available Analyses**: Cards for each round of analysis
- Click on a round card to see a preview with:
  - Languages included
  - Model performance summary
  - Navigation options (View Full Analysis, Browse by Language/Group)

### Dashboard Navigation

#### Round 3 Dashboard
- **Metrics**: Toggle between Clarity, Naturalness, and Correctness
- **Data Source**: Switch between Primary and Secondary evaluations
- **Providers**: Filter by Anthropic, Google, or OpenAI
- **Models**: Select specific models within each provider
- **Language Selector**: When viewing a single language, switch between languages without returning to the landing page

#### Round 2 Dashboard
- **Metrics**: Toggle between Readability, Adequacy, Grammatical Correct (%), Real Words (%), and Notable Error (%)
- **Reviewer**: Switch between Primary and Secondary reviewers
- **Providers**: Filter by Anthropic, Google, or OpenAI
- **Models**: Select specific models within each provider
- **Language Group/Language Selector**: Browse by geographic region or individual language
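
The provider and model filters on both dashboards can be thought of as a simple predicate over the loaded rows. The sketch below shows one possible version. The row fields `provider` and `model`, the function name `filterRows`, and the rule that an empty selection means "no restriction" are assumptions for illustration only.

```javascript
// Hypothetical row shape: { provider: string, model: string, ... }.
// `providers` and `models` are Sets of selected values; an empty Set applies no filter.
function filterRows(rows, { providers = new Set(), models = new Set() } = {}) {
  return rows.filter(
    (row) =>
      (providers.size === 0 || providers.has(row.provider)) &&
      (models.size === 0 || models.has(row.model))
  );
}

// Example with placeholder model names (providers are the three named above).
const allRows = [
  { provider: "Anthropic", model: "model-a" },
  { provider: "Google", model: "model-b" },
  { provider: "OpenAI", model: "model-c" },
  { provider: "OpenAI", model: "model-d" },
];
const visible = filterRows(allRows, {
  providers: new Set(["OpenAI"]),
  models: new Set(["model-d"]),
});
console.log(visible); // only the OpenAI / model-d row remains
```

Keeping the selections in Sets makes toggling a checkbox a constant-time add or delete, and the same predicate works for Round 2 and Round 3 rows as long as both carry these two fields.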

### Features

- **Interactive Charts**: Hover over bars to see detailed statistics
- **Summary Cards**: View overall leader, language count, model count, and total samples
- **Download Data**: Export current filtered view as CSV or JSON
- **Round Toggle**: Switch between Round 2 and Round 3 from the dashboard
- **Theme Toggle**: Switch between dark and light themes
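
To make the export step concrete, here is a dependency-free sketch of serializing the currently filtered rows to CSV and JSON. It is not the project's implementation: the helper names `toCsv` and `toJson` are hypothetical, and in this codebase PapaParse's `Papa.unparse` would be a natural alternative for the CSV part.

```javascript
// Quote a CSV field when it contains a comma, double quote, or line break (RFC 4180 style).
function escapeCsvField(value) {
  const text = String(value ?? "");
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Serialize an array of flat objects to CSV, using the first row's keys as the header.
function toCsv(rows) {
  if (rows.length === 0) return "";
  const headers = Object.keys(rows[0]);
  const lines = [headers.map(escapeCsvField).join(",")];
  for (const row of rows) {
    lines.push(headers.map((h) => escapeCsvField(row[h])).join(","));
  }
  return lines.join("\r\n");
}

// Pretty-printed JSON for the same rows.
function toJson(rows) {
  return JSON.stringify(rows, null, 2);
}

// Example with placeholder values (not real evaluation data).
const exportRows = [{ language: "Hausa", model: "model-a", mean: 5.5 }];
console.log(toCsv(exportRows));
console.log(toJson(exportRows));
```

In the browser, the resulting string would typically be wrapped in a `Blob` and offered through a temporary download link. That part is omitted here so the sketch also runs under Node.js.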

## Technologies Used

- **React 19**: UI framework
- **Vite**: Build tool and dev server
- **React Router DOM**: Client-side routing
- **Recharts**: Interactive chart library
- **PapaParse**: CSV parsing
- **Lodash**: Data manipulation utilities
- **Lucide React**: Icon library

## License

BSD 3-Clause License

Copyright (c) 2025, Dimagi Inc., Cory Zue

See [LICENSE](../LICENSE) for full license text.

## Contributing

This is an internal project for evaluating LLM performance on low-resource languages. For questions or contributions, please contact the project maintainers.