pandas_paddles.contexts

Module summary

Factories for closures wrapping dataframe and series context.

Classes

ClosureFactoryBase([closures])

Abstract base-class for generating DataFrame and Series context closures.

DataframeContext([closures])

Build callable to access columns and operators.

SeriesContext([closures])

Build callable for series attributes and operators.

Details

Classes

class pandas_paddles.contexts.ClosureFactoryBase(closures=None)

Bases: object

Abstract base-class for generating DataFrame and Series context closures.

Parameters

closures (Optional[Iterable[pandas_paddles.closures.ClosureBase]]) – Iterable of callables to extract attributes from data frames or series.

__init__(closures=None)
Parameters

closures (Optional[Iterable[pandas_paddles.closures.ClosureBase]]) – Iterable of callables to extract attributes from data frames or series.
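The idea of a factory that collects closures and replays them later can be illustrated with a small pure-Python sketch. This is not the pandas_paddles implementation: the class name ToyContext and its behavior are hypothetical, and unlike the real contexts it records only item and attribute access, not method calls or operators.

```python
from typing import Any, Callable, Iterable, Optional


class ToyContext:
    """Hypothetical stand-in for a closure factory (not the library's code)."""

    def __init__(self, closures: Optional[Iterable[Callable[[Any], Any]]] = None):
        # Keep the recorded steps; an empty chain returns the root object unchanged.
        self._closures = list(closures or [])

    def __getattr__(self, name: str) -> "ToyContext":
        # Only called for missing attributes: record the access instead of performing it.
        if name.startswith("_"):
            raise AttributeError(name)
        return ToyContext(self._closures + [lambda obj: getattr(obj, name)])

    def __getitem__(self, key: Any) -> "ToyContext":
        # Record the item access as one more step in the chain.
        return ToyContext(self._closures + [lambda obj: obj[key]])

    def __call__(self, root: Any) -> Any:
        # Replay every recorded step, starting from the object passed in.
        obj = root
        for closure in self._closures:
            obj = closure(obj)
        return obj


T = ToyContext()
get_first_real = T["values"][0].real  # nothing is evaluated yet
result = get_first_real({"values": [3 + 4j, 1j]})
```

Here get_first_real is an ordinary callable, so it could be handed to any API that expects a function of one argument.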

class pandas_paddles.contexts.DataframeContext(closures=None)

Bases: pandas_paddles.contexts.ClosureFactoryBase

Build callable to access columns and operators.

Use the global instance like:

from pandas_paddles import DF
df.loc[DF["x"] < 3]

All operations (item/attribute access, method calls) are passed to the data frame of the context.

This is useful in combination with loc, iloc, assign(), and other methods that accept a callable taking the data frame as its single argument.
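For comparison, the callable-accepting pandas APIs mentioned above can be exercised with plain lambdas. The sketch below uses only pandas itself, so it makes no assumptions about pandas_paddles; a DF expression is meant to stand in for lambdas of this kind.

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3, 4]})

# loc accepts a callable that receives the data frame and returns a boolean mask.
small = df.loc[lambda d: d["x"] < 3]

# assign() accepts one callable per new column; each receives the data frame.
doubled = df.assign(y=lambda d: d["x"] * 2)
```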

Examples

Usage with loc or iloc:

import pandas as pd
from pandas_paddles import DF

df = pd.DataFrame({"x": [1, 2, 3, 4]})
df.loc[DF["x"] <= 2]
# Out:
#    x
# 0  1
# 1  2

Usage with assign():

df.assign(y = DF["x"] * 2)
# Out:
#    x  y
# 0  1  2
# 1  2  4
# 2  3  6
# 3  4  8

Parameters

closures (Optional[Iterable[pandas_paddles.closures.ClosureBase]]) – Iterable of callables to extract attributes from data frames or series.

__init__(closures=None)
Parameters

closures (Optional[Iterable[pandas_paddles.closures.ClosureBase]]) – Iterable of callables to extract attributes from data frames or series.

wrapped_cls

alias of pandas.core.frame.DataFrame

class pandas_paddles.contexts.SeriesContext(closures=None)

Bases: pandas_paddles.contexts.ClosureFactoryBase

Build callable for series attributes and operators.

Use the global instance like:

from pandas_paddles import S
s[S < 0]

All operations (item/attribute access, method calls) are passed to the series of the context.

This is useful in combination with loc, iloc, and other methods that accept a callable taking the series as its argument, e.g., .agg() after a group-by.
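As a point of reference, the same series-level APIs work with plain lambdas. The sketch below relies only on pandas, not on pandas_paddles, and shows the kind of callable an S expression is meant to replace.

```python
import pandas as pd

s = pd.Series(range(10))

# Indexing with a callable: it receives the series and returns a boolean mask.
head = s[lambda x: x <= 2]

df = pd.DataFrame({
    "x": [1, 2, 3, 4],
    "y": ["a", "b", "b", "a"],
})

# agg() on a grouped column calls the function once per group with that group's series.
spread = df.groupby("y")["x"].agg(lambda x: x.max() - x.min())
```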

Examples

Usage with [], .loc[] or .iloc[]:

import pandas as pd
from pandas_paddles import S
s = pd.Series(range(10))
s[S <= 2]
# Out:
# 0    0
# 1    1
# 2    2
# dtype: int64

Aggregating a single grouped column with groupby(...)[col].agg():

df = pd.DataFrame({
    "x": [1, 2, 3, 4],
    "y": ["a", "b", "b", "a"],
    "z": [0.1, 0.5, 0.6, 0.9],
})
df.groupby("y")["x"].agg(S.max() - S.min())
# Out:
# y
# a    3
# b    1
# Name: x, dtype: int64

Applying multiple aggregations to a single column:

df.groupby("y")["x"].agg([
    S.max() - S.min(),
    S.mean(),
])
# Out:
#    S.max() - S.min()  S.mean()
# y
# a                  3       2.5
# b                  1       2.5

Aggregating multiple columns (note that S-expressions must be wrapped in a list, even when there is only one expression):

df.groupby("y").agg([S.min()])
# Out:
#         x       z
#   S.min() S.min()
# y
# a       1     0.1
# b       2     0.5

Multiple S-expressions work the same:

df.groupby("y").agg([S.min(), S.mean()])
# Out:
#         x                z
#   S.min() S.mean() S.min() S.mean()
# y
# a       1      2.5     0.1     0.50
# b       2      2.5     0.5     0.55

S-expressions can also be passed in a dict argument to .agg() (again, they must always be wrapped in a list):

df.groupby("y").agg({
    "x": [S.min(), S.mean()],
    "z": [S.max(), S.max() - S.min()],
})
# Out:
#         x                z
#   S.min() S.mean() S.max() S.max() - S.min()
# y
# a       1      2.5     0.9               0.8
# b       2      2.5     0.6               0.1

Parameters

closures (Optional[Iterable[pandas_paddles.closures.ClosureBase]]) – Iterable of callables to extract attributes from data frames or series.

__init__(closures=None)
Parameters

closures (Optional[Iterable[pandas_paddles.closures.ClosureBase]]) – Iterable of callables to extract attributes from data frames or series.

wrapped_cls

alias of pandas.core.series.Series