pandas_paddles.contexts
Module summary
Factories for closures wrapping dataframe and series context.
Classes
ClosureFactoryBase | Abstract base-class for generating DataFrame and Series context closures. |
DataframeContext | Build callable to access columns and operators. |
SeriesContext | Build callable for series attributes and operators. |
Details
Classes
- class pandas_paddles.contexts.ClosureFactoryBase(closures=None)
Bases: object
Abstract base-class for generating DataFrame and Series context closures.
- Parameters
closures (Optional[Iterable[pandas_paddles.closures.ClosureBase]]) – Iterable of callables to extract attributes from data frames or series.
- __init__(closures=None)
- Parameters
closures (Optional[Iterable[pandas_paddles.closures.ClosureBase]]) – Iterable of callables to extract attributes from data frames or series.
- class pandas_paddles.contexts.DataframeContext(closures=None)
Bases: pandas_paddles.contexts.ClosureFactoryBase
Build callable to access columns and operators.
Use the global instance like:
    from pandas_paddles import DF
    df.loc[DF["x"] < 3]
All operations (item/attribute access, method calls) are passed to the data frame of the context.
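As a rough illustration of how such forwarding can work, the following pure-Python sketch records item access and two operators as a chain of one-argument steps and replays them when the object is called. The class name `Deferred` and its internals are hypothetical; this is not the pandas_paddles implementation, and it operates on plain dicts rather than data frames to stay dependency-free.

```python
class Deferred:
    """Hypothetical stand-in for a context object such as DF (not library code)."""

    def __init__(self, steps=()):
        # Each step is a one-argument callable applied to the previous result.
        self._steps = tuple(steps)

    def _extend(self, step):
        # Return a new object so that existing expressions stay reusable.
        return Deferred(self._steps + (step,))

    def __getitem__(self, key):
        return self._extend(lambda obj: obj[key])

    def __lt__(self, other):
        return self._extend(lambda obj: obj < other)

    def __mul__(self, other):
        return self._extend(lambda obj: obj * other)

    def __call__(self, root):
        # Replay the recorded steps against the object passed in.
        result = root
        for step in self._steps:
            result = step(result)
        return result


X = Deferred()
small = X["x"] < 3      # nothing is evaluated yet
doubled = X["x"] * 2

records = [{"x": 1}, {"x": 2}, {"x": 3}, {"x": 4}]
kept = [r for r in records if small(r)]
print(kept)
print(doubled({"x": 4}))
```

Because `small` is an ordinary callable, any API that accepts a function of its own receiver can take it in place of a `lambda`. The sketch evaluates one record at a time, whereas a data frame expression would act on whole columns.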
This is useful in combination with loc, iloc, assign() and other methods that accept a callable taking the data frame to act on as its single argument.
Examples
    import pandas as pd
    from pandas_paddles import DF

    df = pd.DataFrame({"x": [1, 2, 3, 4]})
    df.loc[DF["x"] <= 2]
    # Out:
    #    x
    # 0  1
    # 1  2
Usage with assign():
    df.assign(y=DF["x"] * 2)
    # Out:
    #    x  y
    # 0  1  2
    # 1  2  4
    # 2  3  6
    # 3  4  8
- Parameters
closures (Optional[Iterable[pandas_paddles.closures.ClosureBase]]) – Iterable of callables to extract attributes from data frames or series.
- __init__(closures=None)
- Parameters
closures (Optional[Iterable[pandas_paddles.closures.ClosureBase]]) – Iterable of callables to extract attributes from data frames or series.
- wrapped_cls
alias of pandas.core.frame.DataFrame
- class pandas_paddles.contexts.SeriesContext(closures=None)
Bases: pandas_paddles.contexts.ClosureFactoryBase
Build callable for series attributes and operators.
Use the global instance like:
    from pandas_paddles import S
    s[S < 0]
All operations (item/attribute access, method calls) are passed to the series of the context.
This is useful in combination with loc, iloc, and other methods that accept a callable taking the series to act on as argument, e.g., .agg() after a group-by.
Examples
Usage with [], .loc[] or .iloc[]:
    import pandas as pd
    from pandas_paddles import S

    s = pd.Series(range(10))
    s[S <= 2]
    # Out:
    # 0    0
    # 1    1
    # 2    2
    # dtype: int64
Aggregating a single grouped column with groupby(...)[col].agg():
    df = pd.DataFrame({
        "x": [1, 2, 3, 4],
        "y": ["a", "b", "b", "a"],
        "z": [0.1, 0.5, 0.6, 0.9],
    })
    df.groupby("y")["x"].agg(S.max() - S.min())
    # Out:
    # y
    # a    3
    # b    1
    # Name: x, dtype: int64
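An expression such as `S.max() - S.min()` has to record two method calls and a subtraction, yet still be callable with each group's series. One conceivable way to tell "record this call" from "evaluate now" is to check the argument against a wrapped class, which may be what `wrapped_cls` is for; that reading is an assumption. The `SeriesExpr` and `Values` classes below are hypothetical stand-ins written in pure Python, not library code.

```python
class Values:
    """Hypothetical minimal stand-in for a series: just max() and min()."""

    def __init__(self, data):
        self.data = list(data)

    def max(self):
        return max(self.data)

    def min(self):
        return min(self.data)


class SeriesExpr:
    """Hypothetical deferred expression over a Values object."""

    wrapped_cls = Values  # assumption: used to detect the "evaluate now" call

    def __init__(self, fn=lambda obj: obj, label="S"):
        self._fn = fn
        self._label = label

    def __getattr__(self, name):
        # Only reached for names not found normally; keep private names out.
        if name.startswith("_"):
            raise AttributeError(name)
        fn = self._fn
        return SeriesExpr(lambda obj: getattr(fn(obj), name), f"{self._label}.{name}")

    def __call__(self, *args, **kwargs):
        fn = self._fn
        if len(args) == 1 and not kwargs and isinstance(args[0], self.wrapped_cls):
            # Called with the wrapped object: evaluate the recorded expression.
            return fn(args[0])
        # Otherwise record a method call to be made later.
        arg_text = ", ".join(
            [repr(a) for a in args] + [f"{k}={v!r}" for k, v in kwargs.items()]
        )
        return SeriesExpr(
            lambda obj: fn(obj)(*args, **kwargs), f"{self._label}({arg_text})"
        )

    def __sub__(self, other):
        fn = self._fn
        if isinstance(other, SeriesExpr):
            return SeriesExpr(
                lambda obj: fn(obj) - other(obj), f"{self._label} - {other._label}"
            )
        return SeriesExpr(lambda obj: fn(obj) - other, f"{self._label} - {other!r}")

    def __repr__(self):
        return self._label


S = SeriesExpr()
spread = S.max() - S.min()

# Emulate a group-by: one Values object per group key.
groups = {"a": Values([1, 4]), "b": Values([2, 3])}
per_group = {key: spread(values) for key, values in groups.items()}
print(repr(spread), per_group)
```

The sketch also keeps a textual label for each expression, similar in spirit to the column labels that appear in the aggregation outputs in this section; how pandas_paddles derives those labels is not shown here.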
Applying multiple aggregations to a single column:
    df.groupby("y")["x"].agg([
        S.max() - S.min(),
        S.min(),
    ])
    # Out:
    #    S.max() - S.min()  S.min()
    # y
    # a                  3        1
    # b                  1        2
Aggregating multiple columns (Note: You must wrap the S-expressions in a list even when using only one expression!):
    df.groupby("y").agg([S.min()])
    # Out:
    #         x       z
    #   S.min() S.min()
    # y
    # a       1     0.1
    # b       2     0.5
Multiple S-expressions work the same:
    df.groupby("y").agg([S.min(), S.mean()])
    # Out:
    #         x                z
    #   S.min() S.mean() S.min() S.mean()
    # y
    # a       1      2.5     0.1     0.50
    # b       2      2.5     0.5     0.55
S-expressions can also be passed in a dict argument to .agg() (Again, they always need to be wrapped in a list!):
    df.groupby("y").agg({
        "x": [S.min(), S.mean()],
        "z": [S.max(), S.max() - S.min()],
    })
    # Out:
    #         x                z
    #   S.min() S.mean() S.max() S.max() - S.min()
    # y
    # a       1      2.5     0.9               0.8
    # b       2      2.5     0.6               0.1
- Parameters
closures (Optional[Iterable[pandas_paddles.closures.ClosureBase]]) – Iterable of callables to extract attributes from data frames or series.
- __init__(closures=None)
- Parameters
closures (Optional[Iterable[pandas_paddles.closures.ClosureBase]]) – Iterable of callables to extract attributes from data frames or series.
- wrapped_cls
alias of pandas.core.series.Series