https://github.com/adamrossnelson/collegeapps
Repo to support research related to college applications.
- Host: GitHub
- URL: https://github.com/adamrossnelson/collegeapps
- Owner: adamrossnelson
- License: mit
- Created: 2018-04-09T16:28:09.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2018-05-24T23:22:39.000Z (almost 7 years ago)
- Last Synced: 2025-01-10T04:54:24.887Z (4 months ago)
- Language: Stata
- Size: 160 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# CollegeApps
Repo to support research related to college applications.
## File Descriptions
**Gathering data**
* `ColAppScrape.do` - Do-file that prepares a smaller version of `IPEDSDirInfo02to16.dta`.
* `ColAppScrape.jpynb` - Notebook designed to web scrape paper/PDF undergraduate college applications. Uses the list of web domains provided by `ColAppScrape.do`.
* `ColEarlyDecScrape.ipynb` - Notebook that provided a prototype for `ColAppScrape.jpynb`.
**Processing data**
* `ColAppTextProc.ipynb` - Notebook that pre-processes text data scraped by `ColAppScrape.jpynb`.
* `ColAppTextProcDemo.ipynb` - Notebook that provides a prototype for `ColAppTextProc.ipynb`.
**Testing data**
* `ColAppTester.ipynb` - Notebook that applies machine learning to predict whether a PDF is an undergraduate application for admission.
**Subfolder `App_Rec_Train`**
* Scratch space used to prepare training data.
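**Illustrative sketch**
The README does not say which pre-processing steps or which model the notebooks use. As a rough illustration of the process-then-classify idea described above, here is a minimal sketch that assumes a simple alphabetic tokenizer and a small multinomial naive Bayes classifier built from the Python standard library. The function names (`tokenize`, `train`, `predict`) and the four training sentences are hypothetical and invented for this example; they are not taken from the repo.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters only."""
    return re.findall(r"[a-z]+", text.lower())


def train(labeled_docs):
    """Collect per-label token counts for a multinomial naive Bayes model."""
    token_counts = {}       # label -> Counter of tokens seen under that label
    doc_counts = Counter()  # label -> number of training documents
    for text, label in labeled_docs:
        doc_counts[label] += 1
        token_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = set()
    for counts in token_counts.values():
        vocab.update(counts)
    return token_counts, doc_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    token_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        # Log prior from the share of training documents with this label.
        score = math.log(doc_counts[label] / total_docs)
        total_tokens = sum(token_counts[label].values())
        for tok in tokenize(text):
            # Add-one (Laplace) smoothing so unseen tokens do not zero out a label.
            score += math.log(
                (token_counts[label][tok] + 1) / (total_tokens + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data: True means "is an application for admission".
TRAINING = [
    ("Application for undergraduate admission. Applicant name, "
     "date of birth, high school attended.", True),
    ("Freshman application for admission: intended major, "
     "applicant signature, application fee.", True),
    ("Campus parking map and visitor permit information.", False),
    ("Annual financial report of the board of trustees.", False),
]

model = train(TRAINING)
```

Calling `predict(model, text)` on text extracted from a PDF returns `True` or `False`. A real classifier would need far more training data than this toy set, and a library model may well be a better fit than this hand-rolled one.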