{"id":22901530,"url":"https://github.com/p5quared/datafest23","last_synced_at":"2026-04-24T23:34:18.375Z","repository":{"id":155398266,"uuid":"628331249","full_name":"p5quared/Datafest23","owner":"p5quared","description":"First place submission (for all categories) for ASA Datafest '23.","archived":false,"fork":false,"pushed_at":"2026-02-11T14:25:44.000Z","size":1987,"stargazers_count":0,"open_issues_count":4,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2026-04-04T10:19:49.566Z","etag":null,"topics":["pandas","statistics"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/p5quared.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-04-15T16:03:32.000Z","updated_at":"2025-05-27T16:55:01.000Z","dependencies_parsed_at":"2024-01-13T23:45:27.652Z","dependency_job_id":"d6351b57-9ec6-4d2b-b809-494fc4060559","html_url":"https://github.com/p5quared/Datafest23","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/p5quared/Datafest23","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/p5quared%2FDatafest23","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/p5quared%2FDatafest23/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/p5quared%2FDatafest23/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/p5quared%2FDatafest23/manifests","owner_url":"https://repos.
ecosyste.ms/api/v1/hosts/GitHub/owners/p5quared","download_url":"https://codeload.github.com/p5quared/Datafest23/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/p5quared%2FDatafest23/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32245149,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-24T13:21:15.438Z","status":"ssl_error","status_checked_at":"2026-04-24T13:21:15.005Z","response_time":64,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["pandas","statistics"],"created_at":"2024-12-14T01:34:32.487Z","updated_at":"2026-04-24T23:34:18.359Z","avatar_url":"https://github.com/p5quared.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Datafest '23\nDatafest is a data-focused hackathon organized by the [American Statistical Association (ASA)](https://ww2.amstat.org/education/datafest/datafestinabox.cfm).\nIn it, teams are given a dataset and a problem to solve. The teams have limited time to analyze the data and compile\na presentation to be studied and judged.\n\nThis year, the dataset came from the American Bar Association's organization for [free legal answers](https://ny.freelegalanswers.org/),\na program to provide free legal advice to people who cannot afford a lawyer. 
The dataset contains information about\nthe users, the questions they asked, and the answers they received. The problem centered around better understanding\ntheir userbase and its needs, and coming up with ways to improve the platform's effectiveness.\n\n# Our Analysis\n![Introduction](slides/Intro.png)\n## Introduction - User Demographics\nIn analyzing the user demographics, we came across a couple of interesting findings, or data points we considered to be unexpected.\nThe three categories were:\n* Gender\n* Question Categories\n* User Geography\n\n### Gender\nWomen are overrepresented, with over 2/3 of users self-identified as female, versus about 50% of the general population.\n### Categories\nThe most common category by quite a wide margin is family law, with over 1/3 of all questions being in this category.\nThis might explain the high number of female users.\n### Geography\nStudying the geographic distribution of users, we found that most of New England and the Pacific Northwest are\nunderrepresented, while the South and Midwest are represented proportionally, and two outliers, Wyoming and Indiana, were\nhighly overrepresented.\n\n![Equality of Access](slides/EqAccess.png)\n## Equality of Access\nIn this section, we looked closer at the geographic distribution of our users, and found that there are some areas\nwhere access to legal advice is not equal. Our analysis concluded that one of the primary reasons for this disparity\nis the financial qualification rules for the program. The program is only available to people who make less than some\namount of money, and the amount of money varies by state. 
In states like California and Washington, the income limit\nis disproportionately low compared to the median income in the state, while in states like Wyoming and Indiana, the\nincome limit is disproportionately high compared to the median income in the state.\n\nTherefore, to improve access to legal advice, we recommend that the program adjust its income limits to be more\nproportional to the median income in each state. This would allow the program to reach more people in states with\nhigher median incomes.\n\n![Supply and Demand Equilibrium](slides/DemandEq.png)\n## Supply and Demand Equilibrium\nIn this section, we looked at the supply and demand of legal advice. We assumed that the ratio of 'questions\nasked to answers given' is a good indicator of the supply and demand of legal advice. We found that the ratio is not\nfixed, and varies quite a lot through the year, and differently across categories. For example, we can see the two most\nin-demand periods for legal advice pertaining to income maintenance fall in March and September, each month coinciding with\nthe tax filing deadline and the extended tax filing deadline, which makes sense.\n\nFrom this chart, we can infer the best times to recruit new volunteers to answer questions, and the best times to recruit\nmore users to ask questions. For example, lawyers seem to be much more available to volunteer during October and December\nin general, which implies these might generally be the best times to recruit new volunteers to the program.\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fp5quared%2Fdatafest23","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fp5quared%2Fdatafest23","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fp5quared%2Fdatafest23/lists"}