https://github.com/dsfsi/project-state-capture

Zondo Commission or State Capture Commission Transcripts
dsfsi-datasets natural-language-processing nlp south-africa


# South African State Capture Commission Transcripts - Zondo Commission

Give Feedback 📑: [DSFSI Resource Feedback Form](https://docs.google.com/forms/d/e/1FAIpQLSf7S36dyAUPx2egmXbFpnTBuzoRulhL5Elu-N1eoMhaO7v10w/formResponse)

## About the State Capture Commission

The Judicial Commission of Inquiry into Allegations of State Capture, Corruption and Fraud in the Public Sector including Organs of State, better known as the Zondo Commission or State Capture Commission, is a public inquiry established in January 2018 by former President Jacob Zuma to investigate allegations of state capture, corruption, and fraud in the public sector in South Africa.[2][3]

Source: [https://en.wikipedia.org/wiki/Zondo_Commission](https://en.wikipedia.org/wiki/Zondo_Commission)

## About Dataset

We extracted plaintext versions of the published transcripts (from [https://www.statecapture.org.za/site/transcripts](https://www.statecapture.org.za/site/transcripts)). The text has had only minimal cleaning, but we believe it can be used for textual analysis.

| file/folder | description | url |
|-------------|-------------|-----|
| data/interim | Folder with individual *.txt* files of extracted transcripts, one per day. | [/data/interim/](/data/interim/) |
| state-capture-transcripts-day-1-399.txt.zip | Zip file with all transcripts. | [state-capture-transcripts-day-1-399.txt.zip](/data/state-capture-transcripts-day-1-399.txt.zip) |
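
As a starting point for analysis, the zip archive can be read without unpacking it to disk. The snippet below is a minimal sketch, not part of this repository: `read_transcripts` is a hypothetical helper, and it assumes the archive contains one or more UTF-8 `.txt` members (the internal layout of the archive is not documented here).

```python
import io
import zipfile
from typing import IO, Dict, Union


def read_transcripts(archive: Union[str, IO[bytes]]) -> Dict[str, str]:
    """Return {member name: text} for every .txt member of a zip archive.

    Hypothetical helper: assumes the members are UTF-8 text files.
    """
    texts: Dict[str, str] = {}
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if name.lower().endswith(".txt"):
                # errors="replace" keeps going if a file contains stray bytes.
                texts[name] = zf.read(name).decode("utf-8", errors="replace")
    return texts
```

For example, `read_transcripts("data/state-capture-transcripts-day-1-399.txt.zip")` would return a dictionary keyed by the file names inside the archive, which can then be passed to whatever text-analysis tooling you prefer.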

## TODOs

* Clean up the data
* Extract sentences
* Tag conversations by who is talking (speaker)
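
The last two items could be approached roughly as follows. This is only a sketch under stated assumptions, not the project's implementation: it assumes each speaker turn starts on a new line with an upper-case label followed by a colon (for example `CHAIRPERSON:`), and it uses a naive punctuation-based sentence splitter. `SPEAKER_RE`, `tag_speakers`, and `split_sentences` are hypothetical names.

```python
import re
from typing import List, Tuple

# Assumed turn format: an upper-case label and a colon at the start of a line,
# e.g. "CHAIRPERSON: Good morning." The real transcripts may differ.
SPEAKER_RE = re.compile(r"^([A-Z][A-Z .'\-]{1,60}):\s*(.*)$")


def tag_speakers(text: str) -> List[Tuple[str, str]]:
    """Group lines into (speaker, utterance) turns."""
    turns: List[Tuple[str, str]] = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        match = SPEAKER_RE.match(line)
        if match:
            turns.append((match.group(1).strip(), match.group(2)))
        elif turns:
            # No label: treat the line as a continuation of the previous turn.
            speaker, said = turns[-1]
            turns[-1] = (speaker, (said + " " + line).strip())
    return turns


def split_sentences(utterance: str) -> List[str]:
    """Naive split after '.', '?' or '!' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.?!])\s+", utterance.strip()) if s]
```

Lines that appear before the first recognised speaker label (such as page headers) are dropped by this sketch. Abbreviations like "Mr." will also trigger false sentence breaks, so a proper sentence tokenizer is likely a better choice once the data has been cleaned.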

## Authors

* **Tsholofelo Gomba**
* **Vukosi Marivate** - [@vukosi](https://twitter.com/vukosi)

See also the list of [contributors](https://github.com/dsfsi/project-state-capture/contributors) who participated in this project.

## Citation

TBA

## License

Data is licensed under CC BY-SA 4.0.

Code is licensed under the MIT License.