What does the group_keys argument to pandas.groupby actually do?

The group_keys parameter of groupby matters mainly when you call apply. With group_keys=True, the group labels are added to the result as an extra outer index level; with group_keys=False, that level is omitted and the result keeps only the index of the original rows. The difference is most visible when the applied function returns rows or columns of each group rather than a single aggregated value.

For example, with a DataFrame df that has an integer column x:

In [21]: gby = df.groupby('x', group_keys=True).apply(lambda g: g['x'])

In [22]: gby
Out[22]: 
x   
0  0    0
2  3    2
   4    2
3  1    3
   2    3
Name: x, dtype: int64

In [23]: gby_k = df.groupby('x', group_keys=False).apply(lambda g: g['x'])

In [24]: gby_k
Out[24]: 
0    0
3    2
4    2
1    3
2    3
Name: x, dtype: int64

One use for this: with group_keys=True the result is a Series with a MultiIndex whose outer level is named after the grouping column, so you can group again on that level of the hierarchy.

In [27]: gby.groupby(level="x").sum()
Out[27]: 
x
0    0
2    4
3    6
Name: x, dtype: int64
