
# pandas-groupby-axis-1
A use case for pandas' `groupby(axis=1)` (now deprecated; see [pandas#51203]), plus two example workarounds that use 2 and 4 transpose operations, respectively.

- [proj.py] shows a simple example of using `groupby(axis=1)`
- [The `tt` branch][tt diff] shows one workaround for the deprecation, using 2 transposes:
  - Transpose before `.groupby`
  - Rewrite the function passed to [DataFrameGroupBy.apply] so that it operates on transposed data
  - Transpose again after `.apply`
- [The `tttt` branch][tttt diff] shows a simpler workaround that leaves the function's logic unchanged but uses 4 transposes:
  - Transpose before `.groupby`
  - Transpose at the start of the function passed to [DataFrameGroupBy.apply]
  - Transpose at the end of the function passed to [DataFrameGroupBy.apply]
  - Transpose again after `.apply`
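
The two workarounds above can be sketched on a toy frame as follows. This is a minimal illustration, not the code from [proj.py] or either branch: the `project` and `project_t` helpers and their arithmetic (`roy = prv_end - prv_ytd`, `projected = cur_ytd + roy`) are hypothetical stand-ins chosen only to give each group something to compute, and they do not reproduce the numbers in `proj.csv`. The input values are copied from the `driver` and `pedestrian` columns of the Atlantic and Bergen rows of `ytds.csv`.

```python
import pandas as pd

# Toy frame with two-level columns (type, metric), shaped like ytds.csv.
columns = pd.MultiIndex.from_product(
    [["driver", "pedestrian"], ["cur_ytd", "prv_end", "prv_ytd"]]
)
df = pd.DataFrame(
    [[2, 17, 1, 1, 13, 2], [2, 21, 0, 3, 12, 2]],
    index=pd.Index(["Atlantic", "Bergen"], name="county"),
    columns=columns,
)


def project(g):
    """Hypothetical per-group logic: rows are counties, columns are (type, metric)."""
    g = g.droplevel(0, axis=1)  # keep only the metric level of the columns
    roy = g["prv_end"] - g["prv_ytd"]
    return pd.DataFrame({"roy": roy, "projected": g["cur_ytd"] + roy})


def project_t(g):
    """Same logic for a transposed group: rows are (type, metric), columns are counties."""
    g = g.droplevel(0)  # keep only the metric level of the row index
    roy = g.loc["prv_end"] - g.loc["prv_ytd"]
    return pd.DataFrame([roy, g.loc["cur_ytd"] + roy], index=["roy", "projected"])


# Deprecated form being replaced: df.groupby(level=0, axis=1).apply(project)

# 2 transposes: one before .groupby, one after .apply; the function itself
# is rewritten against the transposed layout.
out_tt = df.T.groupby(level=0).apply(project_t).T

# 4 transposes: the function keeps its original orientation and is wrapped
# in a transpose on the way in and another on the way out.
out_tttt = df.T.groupby(level=0).apply(lambda g: project(g.T).T).T
```

Both variants should yield one row per county and a `(type, {roy, projected})` column MultiIndex. The 4-transpose form trades extra data movement for leaving the per-group function untouched.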

Input table

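[ytds.csv](ytds.csv)

| | crashes | | | cyclist | | | driver | | | passenger | | | pedestrian | | |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| **county** | cur_ytd | prv_end | prv_ytd | cur_ytd | prv_end | prv_ytd | cur_ytd | prv_end | prv_ytd | cur_ytd | prv_end | prv_ytd | cur_ytd | prv_end | prv_ytd |
| Atlantic | 3 | 36 | 3 | 0 | 2 | 0 | 2 | 17 | 1 | 0 | 7 | 0 | 1 | 13 | 2 |
| Bergen | 5 | 36 | 2 | 0 | 1 | 0 | 2 | 21 | 0 | 0 | 4 | 0 | 3 | 12 | 2 |
| Camden | 3 | 41 | 3 | 0 | 5 | 0 | 0 | 19 | 1 | 0 | 7 | 2 | 3 | 11 | 0 |
| Cape May | 0 | 7 | 1 | 0 | 0 | 0 | 0 | 4 | 1 | 0 | 1 | 0 | 0 | 2 | 0 |
| Essex | 2 | 50 | 6 | 0 | 2 | 0 | 0 | 23 | 1 | 0 | 5 | 1 | 2 | 24 | 4 |
| Gloucester | 1 | 33 | 2 | 0 | 1 | 0 | 1 | 22 | 2 | 0 | 7 | 1 | 0 | 5 | 0 |
| Hudson | 2 | 25 | 2 | 0 | 3 | 0 | 1 | 11 | 1 | 0 | 3 | 0 | 1 | 10 | 1 |
| Hunterdon | 1 | 4 | 1 | 0 | 1 | 1 | 0 | 3 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| Mercer | 2 | 31 | 1 | 0 | 0 | 0 | 2 | 16 | 0 | 0 | 3 | 0 | 0 | 12 | 1 |
| Middlesex | 6 | 62 | 8 | 0 | 2 | 0 | 2 | 32 | 5 | 2 | 9 | 0 | 4 | 21 | 3 |
| Monmouth | 5 | 38 | 5 | 0 | 4 | 1 | 1 | 18 | 0 | 3 | 7 | 0 | 2 | 9 | 4 |
| Morris | 0 | 22 | 2 | 0 | 0 | 0 | 0 | 14 | 1 | 0 | 4 | 0 | 0 | 4 | 1 |
| Ocean | 6 | 41 | 3 | 0 | 1 | 0 | 2 | 28 | 2 | 0 | 7 | 0 | 4 | 8 | 1 |
| Passaic | 1 | 24 | 1 | 0 | 0 | 0 | 0 | 15 | 1 | 0 | 1 | 0 | 1 | 9 | 0 |
| Union | 1 | 34 | 5 | 0 | 2 | 0 | 0 | 13 | 3 | 0 | 6 | 1 | 1 | 15 | 2 |
| Warren | 0 | 12 | 1 | 0 | 0 | 0 | 0 | 8 | 0 | 0 | 1 | 0 | 0 | 3 | 1 |
| Burlington | 3 | 34 | 0 | 0 | 1 | 0 | 2 | 26 | 0 | 1 | 3 | 0 | 0 | 6 | 0 |
| Cumberland | 1 | 20 | 0 | 0 | 0 | 0 | 0 | 13 | 0 | 0 | 5 | 0 | 1 | 4 | 0 |
| Salem | 0 | 11 | 0 | 0 | 0 | 0 | 0 | 8 | 0 | 0 | 2 | 0 | 0 | 2 | 0 |
| Somerset | 0 | 22 | 0 | 0 | 0 | 0 | 0 | 14 | 0 | 0 | 4 | 0 | 0 | 6 | 0 |
| Sussex | 0 | 6 | 0 | 0 | 0 | 0 | 0 | 6 | 0 | 0 | 2 | 0 | 0 | 1 | 0 |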

Output table

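[proj.csv](proj.csv)

| | crashes | | cyclist | | driver | | passenger | | pedestrian | |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| **county** | roy | projected | roy | projected | roy | projected | roy | projected | roy | projected |
| Atlantic | 33 | 36 | 2 | 2 | 17 | 19 | 6 | 6 | 11 | 12 |
| Bergen | 39 | 44 | 1 | 1 | 21 | 23 | 4 | 4 | 10 | 13 |
| Camden | 38 | 41 | 5 | 5 | 16 | 16 | 5 | 5 | 13 | 16 |
| Cape May | 5 | 5 | 0 | 0 | 3 | 3 | 1 | 1 | 2 | 2 |
| Essex | 41 | 43 | 2 | 2 | 20 | 20 | 4 | 4 | 19 | 21 |
| Gloucester | 30 | 31 | 1 | 1 | 19 | 20 | 5 | 5 | 5 | 5 |
| Hudson | 23 | 25 | 3 | 3 | 10 | 11 | 3 | 3 | 9 | 10 |
| Hunterdon | 3 | 4 | 0 | 0 | 3 | 3 | 0 | 0 | 1 | 2 |
| Mercer | 33 | 35 | 0 | 0 | 16 | 18 | 3 | 3 | 10 | 10 |
| Middlesex | 53 | 59 | 2 | 2 | 26 | 28 | 10 | 12 | 19 | 23 |
| Monmouth | 33 | 38 | 3 | 3 | 17 | 18 | 9 | 12 | 5 | 7 |
| Morris | 18 | 18 | 0 | 0 | 12 | 12 | 4 | 4 | 3 | 3 |
| Ocean | 41 | 47 | 1 | 1 | 26 | 28 | 6 | 6 | 9 | 13 |
| Passaic | 23 | 24 | 0 | 0 | 13 | 13 | 1 | 1 | 9 | 10 |
| Union | 27 | 28 | 2 | 2 | 9 | 9 | 5 | 5 | 12 | 13 |
| Warren | 10 | 10 | 0 | 0 | 7 | 7 | 1 | 1 | 2 | 2 |
| Burlington | 34 | 37 | 1 | 1 | 26 | 28 | 4 | 5 | 5 | 5 |
| Cumberland | 19 | 20 | 0 | 0 | 12 | 12 | 5 | 5 | 5 | 6 |
| Salem | 10 | 10 | 0 | 0 | 7 | 7 | 2 | 2 | 2 | 2 |
| Somerset | 20 | 20 | 0 | 0 | 13 | 13 | 4 | 4 | 5 | 5 |
| Sussex | 5 | 5 | 0 | 0 | 5 | 5 | 2 | 2 | 1 | 1 |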

[pandas#51203]: https://github.com/pandas-dev/pandas/issues/51203
[proj.py]: proj.py
[DataFrameGroupBy.apply]: https://pandas.pydata.org/docs/reference/api/pandas.core.groupby.DataFrameGroupBy.apply.html
[tt diff]: https://github.com/ryan-williams/pandas-groupby-axis-1/commit/tt
[tttt diff]: https://github.com/ryan-williams/pandas-groupby-axis-1/commit/tttt