https://github.com/ssaishruthi/insight_donation
This repo contains the program given for the Insight data engineer program
- Host: GitHub
- URL: https://github.com/ssaishruthi/insight_donation
- Owner: SSaishruthi
- Created: 2018-02-13T05:12:31.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2018-02-13T07:23:12.000Z (about 7 years ago)
- Last Synced: 2025-01-29T00:49:25.951Z (3 months ago)
- Topics: src
- Language: Jupyter Notebook
- Size: 18.6 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Insight_donation
This repo contains the program given for the Insight data engineer program.

## Problem Statement
Identify repeat donors and calculate the following for each combination of recipient, zip code, and calendar year:
1. Total dollars received
2. Total number of contributions received
3. Donation amount in a given percentile

## Input
1. Contribution file
2. Percentile file

## Language
Python
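
As a rough illustration of the three statistics named in the problem statement, here is a minimal Python sketch. It assumes the repeat-donor contributions have already been selected, that each record is a `(recipient, zip_code, year, amount)` tuple (a hypothetical layout), and that the percentile is an integer computed with the nearest-rank method. The README does not say which percentile method the actual program uses, so treat that choice as an assumption. The sketch uses only the standard library so it runs without extra installs.

```python
import math
from collections import defaultdict


def nearest_rank_percentile(amounts, percentile):
    """Nearest-rank percentile of a non-empty list (assumed method)."""
    ordered = sorted(amounts)
    # Ordinal rank is ceil(P * N / 100), 1-indexed; P is assumed to be an integer.
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(contributions, percentile):
    """Group amounts by (recipient, zip code, year) and compute the statistics.

    `contributions` is an iterable of (recipient, zip_code, year, amount)
    tuples; this record layout is hypothetical.
    """
    groups = defaultdict(list)
    for recipient, zip_code, year, amount in contributions:
        groups[(recipient, zip_code, year)].append(amount)

    return {
        key: {
            "percentile_amount": nearest_rank_percentile(amounts, percentile),
            "total": sum(amounts),    # total dollars received
            "count": len(amounts),    # number of contributions received
        }
        for key, amounts in groups.items()
    }
```

Calling `summarize(records, 30)` returns one dictionary entry per recipient, zip code, and year combination.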
## Libraries needed
1. Pandas
2. Numpy
3. re
4. csv
5. os
6. datetime
7. decimal

## Input filter conditions
Eliminate a record if it satisfies any of the following conditions:
1. The zip code field is empty or contains fewer than five digits (only the first five digits of the zip code are kept)
2. The Other_id field has a value
3. The committee id is empty
4. The amount field is empty
5. The date field is malformed or does not fall in the specified range
6. The name field is malformed

## Input needed
1. Start and end year
2. Current year for which the final output is to be processed

*** The paths of the input and output files need to be updated before running ***

*** Both the Jupyter notebook and the .py file are in the src folder ***
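
The input filter conditions listed above could be sketched roughly as follows. This is not the code from the src folder. The dictionary keys (`zip_code`, `other_id`, `cmte_id`, `amount`, `date`, `name`), the `MMDDYYYY` date format, and the rule that a name is improper when it is empty or contains a digit are all assumptions made for illustration. The date range is interpreted as the start and end year given as input.

```python
import re
from datetime import datetime


def clean_zip(raw_zip):
    """Return the first five digits of a zip code, or None if unusable."""
    match = re.match(r"\d{5}", (raw_zip or "").strip())
    return match.group(0) if match else None


def parse_date(raw_date):
    """Parse an MMDDYYYY string (assumed format); return None if malformed."""
    raw_date = (raw_date or "").strip()
    if len(raw_date) != 8 or not raw_date.isdigit():
        return None
    try:
        return datetime.strptime(raw_date, "%m%d%Y")
    except ValueError:
        return None


def keep_record(record, start_year, end_year):
    """Return True if a record passes every filter condition.

    `record` is a dict of strings with hypothetical field names.
    """
    if clean_zip(record.get("zip_code")) is None:
        return False  # zip code empty or shorter than five digits
    if (record.get("other_id") or "").strip():
        return False  # Other_id has a value
    if not (record.get("cmte_id") or "").strip():
        return False  # committee id is empty
    if not (record.get("amount") or "").strip():
        return False  # amount is empty
    date = parse_date(record.get("date"))
    if date is None or not start_year <= date.year <= end_year:
        return False  # malformed date or year outside the given range
    name = (record.get("name") or "").strip()
    if not name or any(ch.isdigit() for ch in name):
        return False  # assumed definition of an improper name
    return True
```

For example, `keep_record(row, 2015, 2018)` can be used as the predicate when iterating over rows read from the contribution file.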