Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/os-climate/financial-entity-cleaner

cleaning for entity matching

# financial-entity-cleaner
The financial-entity-cleaner is a library that is part of the Entity-Matching project developed by the OS-Climate Foundation. Its main purpose is to provide methods for validating and standardizing data used in the banking industry, in order to help solve the entity matching problem: determining whether two entities in a data set refer to the same real-world object.

Currently, the library provides three main components:
- a validator for banking identifiers (SEDOL, ISIN and LEI),
- a validator for country information, and
- a cleaner for company names.
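
To give a feel for what validating a banking identifier involves, here is a minimal standalone sketch of an ISIN check. It does not use the library's API and is not its implementation; the function name `is_valid_isin` is hypothetical. It assumes the standard ISIN layout (two-letter prefix, nine alphanumeric characters, one check digit) and the Luhn checksum applied after letters are expanded to numbers (A=10 ... Z=35).

```python
import re

# Shape of an ISIN: 2 letters, 9 alphanumeric characters, 1 numeric check digit.
_ISIN_SHAPE = re.compile(r"[A-Z]{2}[A-Z0-9]{9}[0-9]")


def is_valid_isin(isin: str) -> bool:
    """Return True if `isin` has the ISIN shape and a correct check digit.

    Hypothetical helper for illustration only; not part of financial-entity-cleaner.
    """
    candidate = isin.strip().upper()
    if not _ISIN_SHAPE.fullmatch(candidate):
        return False

    # Expand every character to its base-36 value: digits stay, A=10 ... Z=35.
    digits = "".join(str(int(ch, 36)) for ch in candidate)

    # Luhn checksum: starting from the rightmost digit, double every second digit
    # and subtract 9 from any result above 9.
    total = 0
    for position, ch in enumerate(reversed(digits)):
        value = int(ch)
        if position % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0
```

For example, `is_valid_isin("US0378331005")` passes the checksum, while changing the last digit to `4` makes it fail. A check like this only confirms that the identifier is well formed, not that it is actually assigned to a security.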

## Install from PyPI

```
pip install financial-entity-cleaner
```

## How to use the library

The following Jupyter notebooks show how to use the library:

- [How to clean a company's name](https://github.com/os-climate/financial-entity-cleaner/blob/main/notebooks/how-to/Clean%20up%20company's%20name.ipynb)
- [How to normalize country information](https://github.com/os-climate/financial-entity-cleaner/blob/main/notebooks/how-to/Normalize%20country%20information.ipynb)
- [How to validate banking IDs such as LEI, ISIN and SEDOL](https://github.com/os-climate/financial-entity-cleaner/blob/main/notebooks/how-to/Clean%20up%20and%20validate%20banking%20IDs.ipynb)
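
The notebooks above document the library's actual API. As a rough, library-independent illustration of why name cleaning helps entity matching, the sketch below normalizes company names so that spelling variants compare equal. Everything in it is an assumption made for illustration: the function name `normalize_company_name`, the tiny `LEGAL_SUFFIXES` table, and the specific cleaning rules are hypothetical and are not taken from financial-entity-cleaner.

```python
import re
import unicodedata

# Hypothetical, deliberately tiny mapping of legal-form variants to one spelling.
LEGAL_SUFFIXES = {
    "inc": "inc",
    "incorporated": "inc",
    "corp": "corp",
    "corporation": "corp",
    "ltd": "ltd",
    "limited": "ltd",
}


def normalize_company_name(name: str) -> str:
    """Return a simplified form of `name` for comparison purposes (illustrative only)."""
    # Fold accented characters to plain ASCII, e.g. "Société" becomes "Societe".
    folded = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    lowered = folded.lower()
    # Drop periods so abbreviations such as "S.A." collapse to "sa".
    without_periods = lowered.replace(".", "")
    # Replace any remaining punctuation with spaces, then split on whitespace.
    tokens = re.sub(r"[^a-z0-9\s]", " ", without_periods).split()
    # Map a trailing legal-form token to its canonical spelling.
    if tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens[-1] = LEGAL_SUFFIXES[tokens[-1]]
    return " ".join(tokens)


# Two spellings of the same company reduce to the same string.
same_entity = normalize_company_name("Apple Inc.") == normalize_company_name("APPLE, INCORPORATED")
```

After normalization, two records can be compared with a plain equality test or fed to a fuzzy matcher. A real cleaner needs far larger, country-specific tables of legal forms than this sketch uses.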