https://github.com/hyparam/csv-validator

# Guardrails CSV Validator

![csv-validator](csv-validator.jpg)

[![apache license](https://img.shields.io/badge/License-Apache2-blue.svg)](https://opensource.org/licenses/Apache-2.0)

| | |
| --- | --- |
| Developed by | Hyperparam |
| Date of development | Feb 15, 2024 |
| Validator type | Format |
| Blog | |
| License | Apache 2 |
| Input/Output | Output |

## Description

### Intended Use

A CSV validator for [Guardrails AI](https://www.guardrailsai.com/).

This validator checks CSV text for structural issues, such as rows whose number of columns differs from the other rows, or quoted fields whose quote characters are not properly matched.
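
As a rough illustration of the two kinds of issues named above, the sketch below uses only Python's standard `csv` module. It is not the validator's actual implementation, and the function name `find_csv_issues` is hypothetical.

```python
import csv
import io


def find_csv_issues(text: str, delimiter: str = ",") -> list[str]:
    """Return human-readable descriptions of problems found in `text`.

    Hypothetical helper for illustration only; not part of CsvMatch.
    """
    issues = []
    # strict=True makes the csv module raise csv.Error on malformed quoting.
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter, strict=True)
    expected = None
    try:
        for row_no, row in enumerate(reader, start=1):
            if expected is None:
                # The first row defines the expected column count.
                expected = len(row)
            elif len(row) != expected:
                issues.append(
                    f"row {row_no}: expected {expected} columns, got {len(row)}"
                )
    except csv.Error as exc:
        # Raised, for example, when a quoted field is never closed.
        issues.append(f"quote error: {exc}")
    return issues


good = "name,email\njohn,john@example.com\njane,jane@example.com"
short_row = "name,email\njohn\njane,jane@example.com"
open_quote = 'name,email\n"john,john@example.com\njane,jane@example.com'

print(find_csv_issues(good))
print(find_csv_issues(short_row))
print(find_csv_issues(open_quote))
```

The first row sets the expected column count, so a header with the wrong number of fields would cause every later row to be reported instead.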

### Requirements

* Dependencies:
  - guardrails-ai>=0.4.0

## Installation

```bash
$ guardrails hub install hub://hyparam/csv_validator
```

## Usage Examples

### Validating string output via Python

In this example, we apply the validator to a string output generated by an LLM.

```python
# Import Guard and Validator
from guardrails.hub import CsvMatch
from guardrails import Guard

# Setup Guard
guard = Guard().use(
    CsvMatch
)

guard.validate("name,email\njohn,john@example.com\njane,jane@example.com")  # Validator passes
guard.validate("name,email\njohn\njane,jane@example.com")  # Validator fails
```

### Validating JSON output via Python

In this example, we apply the validator to a string field of a JSON output generated by an LLM.

```python
# Import Guard and Validator
from pydantic import BaseModel, Field
from guardrails.hub import CsvMatch
from guardrails import Guard

# Initialize Validator
val = CsvMatch()

# Create Pydantic BaseModel
class DbBackup(BaseModel):
    db_name: str
    data: str = Field(validators=[val])

# Create a Guard to check for valid Pydantic output
guard = Guard.from_pydantic(output_class=DbBackup)

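# Run LLM output generating JSON through guard.
# A raw string keeps \n as a JSON escape sequence rather than a literal newline,
# which is not allowed inside a JSON string.
guard.parse(r"""
{
    "db_name": "USERS",
    "data": "name,email\njohn,john@example.com\njane,jane@example.com"
}
""")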
```
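
A JSON string cannot contain a literal newline, so CSV rows stored in a JSON field such as `data` are separated by the two-character escape `\n`. The sketch below uses only the standard `json` module to show how such a field decodes into multi-line CSV text. The variable names are illustrative and it does not involve Guardrails at all.

```python
import json

# Raw string: the JSON text contains a backslash followed by "n".
llm_output = r'{"db_name": "USERS", "data": "name,email\njohn,john@example.com"}'

record = json.loads(llm_output)
# After decoding, the field holds real newlines, one CSV row per line.
rows = record["data"].splitlines()
print(rows)

# A literal newline inside a JSON string is rejected by the parser.
try:
    json.loads('{"data": "name,email\njohn"}')
    literal_newline_ok = True
except json.JSONDecodeError:
    literal_newline_ok = False
print(literal_newline_ok)
```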

## API Reference

**`__init__(self, delimiter=",", on_fail="noop")`**


Initializes a new instance of the CsvMatch class.

**Parameters**
- **`delimiter`** *(str)*: String delimiter for csv. Defaults to `,`.
- **`on_fail`** *(str, Callable)*: The policy to enact when a validator fails. If `str`, must be one of `reask`, `fix`, `filter`, `refrain`, `noop`, `exception` or `fix_reask`. Otherwise, must be a function that is called when the validator fails.
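
To see why the delimiter matters, the following sketch counts columns per row with Python's standard `csv` module. It only illustrates the idea and makes no claim about how `CsvMatch` parses its input; `column_counts` is a hypothetical helper.

```python
import csv
import io


def column_counts(text: str, delimiter: str = ",") -> list[int]:
    """Return the number of fields in each row of `text` (illustrative helper)."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    return [len(row) for row in reader]


# Semicolon-separated data whose second row happens to contain a comma.
data = "name;note\njohn;hello, world\njane;hi"

with_comma = column_counts(data)           # parsed with the default delimiter
with_semicolon = column_counts(data, ";")  # parsed with the matching delimiter
print(with_comma, with_semicolon)
```

Parsed with the default comma, the rows appear to have inconsistent column counts; parsed with `;`, every row has two columns.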



**`validate(self, value, metadata) -> ValidationResult`**


Validates the given `value` using the rules defined in this validator, relying on the `metadata` provided to customize the validation process. `guard.parse(...)` invokes this method automatically, so the validation logic is applied to the value being validated.

Note:

1. Do not call this method directly. Instead, invoke `guard.parse(...)`, which calls it internally for each associated validator.
2. When invoking `guard.parse(...)`, pass a `metadata` dictionary that includes the keys and values required by each validator. If `guard` is associated with multiple validators, combine all necessary metadata into a single dictionary.

**Parameters**
- **`value`** *(Any)*: The input value to validate.
- **`metadata`** *(dict)*: A dictionary containing metadata required for validation. No additional metadata keys are needed for this validator.