https://github.com/arogozhnikov/sloths
when pandas is too much hassle
- Host: GitHub
- URL: https://github.com/arogozhnikov/sloths
- Owner: arogozhnikov
- Created: 2023-11-16T07:22:38.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-01-11T01:00:28.000Z (over 1 year ago)
- Last Synced: 2025-02-08T09:46:52.293Z (4 months ago)
- Size: 7.81 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# sloths
when pandas is too much hassle
The sloths package targets the scenario where you want a dict of dicts.
Or a dict of dicts of dicts of lists.

In 'normal' Python, this turns into a mess after a couple of levels.
**Comment from later:** I realized that this concept is missing a lot of functionality
needed to support my critical scenarios, so I'll be reworking the concept.

**before**
```python
from collections import defaultdict

# this one is ugly
year2company2employee2diplomas = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))

# this one is more or less ok
for year in years:
    for company in companies:
        for name, surname, diploma in get_diplomas_for_year_and_company(year, company):
            year2company2employee2diplomas[year][company][name, surname].append(diploma)

# when we need to iterate the data, it is just terrible
for year, company2employee2diplomas in year2company2employee2diplomas.items():
    for company, employee2diplomas in company2employee2diplomas.items():
        for (name, surname), diplomas in employee2diplomas.items():
            for diploma in diplomas:
                finally_we_can_do_something(year, company, name, surname, diploma)

# that's especially terrible if e.g. we only needed a list of all achievements for a company.
```

Now, pandas does not help much until you have collected all the data. Appending data to pandas on the go is quite a bad idea
(and it has other issues, like automatic type conversion, which you don't want applied to your data without seeing the effect).
Sloth essentially works as universal storage: you throw data in now and change its shape later.
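
To make the reshaping pain concrete, here is a small plain-Python sketch of collecting all diplomas per company from the nested structure. The sample data is made up for illustration, and no sloths code is involved; it only shows how much traversal a simple regrouping needs.

```python
from collections import defaultdict

# Hypothetical sample data: year -> company -> (name, surname) -> [diplomas]
year2company2employee2diplomas = {
    2022: {"Acme": {("Ada", "Lovelace"): ["BSc"]}},
    2023: {
        "Acme": {("Ada", "Lovelace"): ["MSc"], ("Alan", "Turing"): ["PhD"]},
        "Globex": {("Grace", "Hopper"): ["PhD"]},
    },
}

# Regrouping by company means walking every level, even the ones we discard.
company2diplomas = defaultdict(list)
for company2employee2diplomas in year2company2employee2diplomas.values():
    for company, employee2diplomas in company2employee2diplomas.items():
        for diplomas in employee2diplomas.values():
            company2diplomas[company].extend(diplomas)

for company, diplomas in company2diplomas.items():
    print(f"{company} has in total {len(diplomas)}")
```

Every new grouping needs its own hand-written loop nest like this one, tied to the exact nesting order chosen when the data was collected.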
**after**
```python
sloth = Sloth()
sloth[year][company].append_at((name, surname), diploma)

for company, diplomas in sloth.iterate('year:company:name surname:[diploma] -> company [diploma]'):
    print(f'{company} has in total {len(diplomas)}')
# and that's it, diplomas are grouped by company
```

Nested collections are created automatically, and a list is also created automatically (since we pointed at this by using `append_at`).
There is no need to think about this in advance anymore.
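
As a rough illustration of the auto-creation idea, the sketch below uses a plain `dict` subclass. The class name `AutoNested` is hypothetical and this is not how sloths is implemented; it only mimics the `append_at` call shown above and does not cover `iterate`.

```python
class AutoNested(dict):
    """Hypothetical stand-in for illustration, not the sloths implementation."""

    def __missing__(self, key):
        # dict.__getitem__ calls this for an absent key: create the next level.
        child = AutoNested()
        self[key] = child
        return child

    def append_at(self, key, value):
        # Create the list at `key` on first use, then append to it.
        self.setdefault(key, []).append(value)


store = AutoNested()
store[2023]["Acme"].append_at(("Ada", "Lovelace"), "MSc")
store[2023]["Acme"].append_at(("Ada", "Lovelace"), "PhD")
print(store)
```

The intermediate levels appear on first access, and the leaf becomes a list only because `append_at` was the call used to write to it.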