pandas_paddles.axis

Module summary

Select axis labels (columns or index) of a data frame.

Functions

intersect_indices(left, right)

union_indices(left, right)

Classes

BaseOp()

API definition of the closure object.

BinaryOp(left, right, op)

Combine two selection operators with a binary operator.

ColumnSelectionComposer([op, sample_size])

Compose callable to select or sort columns.

DtypeComposer(axis[, sample_size])

Compose callable to select columns by dtype.

DtypesOp(dtypes[, sample_size])

Select columns by dtype.

EllipsisOp()

Select all labels (i.e. columns or rows).

IndexSelectionComposer([op])

Compose callable to select or sort index labels.

LabelComposer(axis[, op, level])

Compose callable to select columns by name.

LabelPredicateOp(meth, args, kwargs[, level])

Select labels by a predicate, e.g. startswith.

LabelSelectionOp(labels[, level])

Explicitly select labels.

LeveledComposer(axis)

Compose callable to access multi-level index labels.

OpComposerBase(axis, op)

Base-class for composing column/row selection operations.

Selection([included, excluded, mask])

Container for selection along a data frame axis with combination logic.

SelectionComposerBase(axis[, op])

Base class to compose callable to select or sort axis labels (index and columns).

UnaryOp(wrapped, op)

Apply unary operator on selection operator.

Details

Classes

class pandas_paddles.axis.BaseOp

Bases: object

API definition of the closure object.

__init__()

class pandas_paddles.axis.BinaryOp(left, right, op)

Bases: pandas_paddles.axis.BaseOp

Combine two selection operators with a binary operator.

Used to implement, e.g.:

sel_1 | sel_2
__init__(left, right, op)

class pandas_paddles.axis.ColumnSelectionComposer(op=None, sample_size=None)

Bases: pandas_paddles.axis.SelectionComposerBase

Compose callable to select or sort columns.

This acts as global entrypoint.

Use the global instance like:

# Move columns x, z to left
from pandas_paddles import C
df.loc[:, C["x", "z"] | ...]

Other use-cases:

  • Select slices of columns, e.g., when handling Excel-like named columns (A, B, …):

    df.loc[:, C["B":"E"] | C["P":"S"]]
    
  • Select by dtype:

    df.loc[:, C.dtype == str]
    df.loc[:, C.dtype == int]
    df.loc[:, C.dtype.isin((str, int))]
    

    Note that for “non-trivial” dtypes (i.e. those stored in object-typed columns, e.g. str), a subsample of the data frame is tested explicitly. The sample size can be set with sample_size.

  • Select all columns starting with "PRE":

    df.loc[:, C.startswith("PRE")]
    # or just move them to the left and keep the remaining columns in
    # the data frame
    df.loc[:, C.startswith("PRE") | ...]
    
  • Access the level of a multi-index with:

    C.levels[0]
    C.levels["level-name"]
    

Selections can be combined with & (intersection) and | or + (union). In an intersection, the order of the right-most operand takes precedence; in a union, the order of the left-most operand does. For example, the following selects all columns with first-level label “b”, starting with the columns with second-level labels “Y” and “Z”, followed by all other second-level labels under first-level “b”:

C.levels[0]["b"] & (C.levels[1]["Y"] | ...)
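
The ordering rule can be illustrated without pandas. The following is a minimal sketch on plain lists of label tuples. The helper names union_ordered and intersect_ordered are hypothetical; they are not the library's union_indices or intersect_indices, whose internals are not described here.

```python
from typing import Hashable, List, Sequence


def union_ordered(left: Sequence[Hashable], right: Sequence[Hashable]) -> List[Hashable]:
    """Union that keeps the order of ``left`` and appends what only ``right`` has."""
    seen = set(left)
    result = list(left)
    for label in right:
        if label not in seen:
            seen.add(label)
            result.append(label)
    return result


def intersect_ordered(left: Sequence[Hashable], right: Sequence[Hashable]) -> List[Hashable]:
    """Intersection that follows the order of ``right``."""
    keep = set(left)
    return [label for label in right if label in keep]


columns = [("a", "X"), ("b", "X"), ("b", "Y"), ("b", "Z")]
first_level_b = [c for c in columns if c[0] == "b"]
second_level_y = [c for c in columns if c[1] == "Y"]

# Counterpart of ``sel | ...``: the "Y" columns first, then every remaining column.
y_then_rest = union_ordered(second_level_y, columns)
# Counterpart of ``b_sel & (...)``: only "b" columns, in the order of the right operand.
result = intersect_ordered(first_level_b, y_then_rest)
print(result)  # [('b', 'Y'), ('b', 'X'), ('b', 'Z')]
```

Because the union on the right puts "Y" first and the intersection follows the right operand, the "b" columns come out with "Y" leading.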

Inversion (negation) of selections is possible with ~, e.g. to select all but first-level label “b”:

~C.levels[0]["b"]

This can also be applied to composed selections:

~(C.levels[0]["b"] | C.levels[1]["X", "Y"])
__init__(op=None, sample_size=None)
get_other_op(other)

Get/create a wrapped operation for composing operations.

property sample_size

Sample size for dtype determination of object-typed columns.
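
Why a sample size matters for object-typed columns can be sketched in plain Python. This is an illustration under assumptions, not the library's implementation: the function name sample_matches_type is hypothetical, and taking the first sample_size values is an assumption about how the subsample is drawn.

```python
from typing import Sequence, Tuple


def sample_matches_type(
    values: Sequence[object],
    types: Tuple[type, ...],
    sample_size: int = 10,
) -> bool:
    """Return True if the first ``sample_size`` values are all instances of ``types``."""
    # Assumption: the sample is simply the head of the column.
    sample = list(values[:sample_size])
    return bool(sample) and all(isinstance(value, types) for value in sample)


names = ["ann", "bob", "cy"]
mixed = ["ann", 1, "cy"]

print(sample_matches_type(names, (str,)))                 # True
print(sample_matches_type(mixed, (str,), sample_size=1))  # True: the sample misses the int
print(sample_matches_type(mixed, (str,), sample_size=3))  # False
```

A larger sample is slower but less likely to misclassify a column that holds mixed values.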

class pandas_paddles.axis.DtypeComposer(axis, sample_size=10)

Bases: object

Compose callable to select columns by dtype.

__init__(axis, sample_size=10)

class pandas_paddles.axis.DtypesOp(dtypes, sample_size=10)

Bases: pandas_paddles.axis.BaseOp

Select columns by dtype.

Parameters
  • dtypes (Sequence) –

  • sample_size (int) –

__init__(dtypes, sample_size=10)
Parameters
  • dtypes (Sequence) –

  • sample_size (int) –

class pandas_paddles.axis.EllipsisOp

Bases: pandas_paddles.axis.BaseOp

Select all labels (i.e. columns or rows).

__init__()

class pandas_paddles.axis.IndexSelectionComposer(op=None)

Bases: pandas_paddles.axis.SelectionComposerBase

Compose callable to select or sort index labels.

Note

Use ColumnSelectionComposer (C) if you want to select columns.

Use the global instance like:

# Move rows x, z to the top
from pandas_paddles import I
df.loc[I["x", "z"] | ...]

Other use-cases:

  • Select slices of rows:

    df.loc[I["B":"E"] | I["P":"S"]]
    
  • Select all rows with index starting with "PRE":

    df.loc[I.startswith("PRE")]
    # or just move them to the top and keep the remaining rows in
    # the data frame
    df.loc[I.startswith("PRE") | ...]
    
  • Access the level of a multi-index with:

    I.levels[0]
    I.levels["level-name"]
    

Selections can be combined with & (intersection) and | or + (union). In an intersection, the order of the right-most operand takes precedence; in a union, the order of the left-most operand does. For example, the following selects all rows with first-level label “b”, starting with the rows with second-level labels “Y” and “Z”, followed by all other second-level labels under first-level “b”:

I.levels[0]["b"] & (I.levels[1]["Y"] | ...)

Inversion (negation) of selections is possible with ~, e.g. to select all but first-level label “b”:

~I.levels[0]["b"]

This can also be applied to composed selections:

~(I.levels[0]["b"] | I.levels[1]["X", "Y"])
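
The slice use-case listed above relies on label slices that include both endpoints, as label-based slicing in pandas does. A minimal pure-Python sketch of a union of two such slices follows. The helper label_slice is hypothetical and assumes unique labels.

```python
from typing import List, Sequence


def label_slice(labels: Sequence[str], start: str, stop: str) -> List[int]:
    """Positions from label ``start`` to label ``stop``, both endpoints included."""
    first = labels.index(start)
    last = labels.index(stop)
    return list(range(first, last + 1))


rows = list("ABCDEFGH")

# Counterpart of a union of two slices: left slice first, then the right one.
# ``dict.fromkeys`` drops duplicates while keeping first-seen order.
positions = list(dict.fromkeys(label_slice(rows, "B", "C") + label_slice(rows, "F", "G")))
selected = [rows[i] for i in positions]
print(selected)  # ['B', 'C', 'F', 'G']
```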
__init__(op=None)
get_other_op(other)

Get/create a wrapped operation for composing operations.

class pandas_paddles.axis.LabelComposer(axis, op=None, level=None)

Bases: pandas_paddles.axis.OpComposerBase

Compose callable to select columns by name.

Columns can be selected by name or by string predicates, which are passed through to pd.Series.str:

  • startswith

  • endswith

  • contains

  • match

__init__(axis, op=None, level=None)
get_other_op(other)

Get/create a wrapped operation for composing operations.

class pandas_paddles.axis.LabelPredicateOp(meth, args, kwargs, level=None)

Bases: pandas_paddles.axis.BaseOp

Select labels by a predicate, e.g. startswith.
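
How a predicate name plus its arguments can be turned into a selection may be sketched as follows. This is a stand-in written with the standard library only: the library passes predicates through to pd.Series.str, whereas the PREDICATES table and the function select_by_predicate below are hypothetical. The sketch assumes that contains behaves like a regex search and match like a regex match anchored at the start.

```python
import re
from typing import Callable, Dict, List, Optional, Sequence

# Hypothetical pure-Python counterparts of the four string predicates.
PREDICATES: Dict[str, Callable[..., bool]] = {
    "startswith": lambda label, pat: label.startswith(pat),
    "endswith": lambda label, pat: label.endswith(pat),
    "contains": lambda label, pat: re.search(pat, label) is not None,
    "match": lambda label, pat: re.match(pat, label) is not None,
}


def select_by_predicate(
    labels: Sequence[str],
    meth: str,
    args: tuple = (),
    kwargs: Optional[dict] = None,
) -> List[int]:
    """Return the positions of all labels for which the named predicate holds."""
    predicate = PREDICATES[meth]
    kwargs = kwargs or {}
    return [i for i, label in enumerate(labels) if predicate(label, *args, **kwargs)]


columns = ["PRE_a", "b", "PRE_c", "d_PRE"]
print(select_by_predicate(columns, "startswith", ("PRE",)))  # [0, 2]
print(select_by_predicate(columns, "contains", ("PRE",)))    # [0, 2, 3]
```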

__init__(meth, args, kwargs, level=None)

class pandas_paddles.axis.LabelSelectionOp(labels, level=None)

Bases: pandas_paddles.axis.BaseOp

Explicitly select labels.

__init__(labels, level=None)

class pandas_paddles.axis.LeveledComposer(axis)

Bases: object

Compose callable to access multi-level index labels.
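
Selecting on one level of a multi-level index, by position or by name, can be sketched on plain tuples. The function select_on_level and the level names below are hypothetical and only illustrate the idea behind levels[0] and levels["level-name"]. The sketch assumes that a string level refers to a level name.

```python
from typing import Collection, List, Sequence, Tuple, Union


def select_on_level(
    labels: Sequence[Tuple[str, ...]],
    names: Sequence[str],
    level: Union[int, str],
    wanted: Collection[str],
) -> List[int]:
    """Positions of all labels whose entry on ``level`` is in ``wanted``."""
    # A string is looked up among the level names, an int is used as a position.
    pos = names.index(level) if isinstance(level, str) else level
    return [i for i, label in enumerate(labels) if label[pos] in wanted]


columns = [("a", "X"), ("a", "Y"), ("b", "X"), ("b", "Y")]
names = ["group", "kind"]

print(select_on_level(columns, names, 0, {"b"}))       # [2, 3]
print(select_on_level(columns, names, "kind", {"X"}))  # [0, 2]
```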

__init__(axis)

class pandas_paddles.axis.OpComposerBase(axis, op)

Bases: object

Base-class for composing column/row selection operations.

This class wraps around the actual operation and overloads the relevant operators (+, &, |, and ~) and defers the evaluation of the operators until called (by the context data-frame in .loc[]).

Parameters

axis (Literal['columns', 'index']) –
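
The deferred-evaluation pattern described above can be sketched in a few lines of plain Python. This is not the library's code: the class Deferred and the helper starts_with are hypothetical, and the sketch is called with a list of labels, whereas the real composer is called with the context data frame by .loc[].

```python
from typing import Callable, List, Sequence

Labels = Sequence[str]


class Deferred:
    """Hypothetical stand-in for a composer: wraps a function ``labels -> positions``."""

    def __init__(self, op: Callable[[Labels], List[int]]):
        self.op = op

    @staticmethod
    def _wrap(other) -> "Deferred":
        # ``...`` stands for "all labels".
        if other is Ellipsis:
            return Deferred(lambda labels: list(range(len(labels))))
        return other

    def __or__(self, other) -> "Deferred":
        other = self._wrap(other)

        def union(labels: Labels) -> List[int]:
            left = self.op(labels)
            return left + [i for i in other.op(labels) if i not in left]

        return Deferred(union)

    __add__ = __or__  # ``+`` is an alias for union

    def __and__(self, other) -> "Deferred":
        other = self._wrap(other)

        def intersection(labels: Labels) -> List[int]:
            left = set(self.op(labels))
            return [i for i in other.op(labels) if i in left]

        return Deferred(intersection)

    def __invert__(self) -> "Deferred":
        def negation(labels: Labels) -> List[int]:
            excluded = set(self.op(labels))
            return [i for i in range(len(labels)) if i not in excluded]

        return Deferred(negation)

    def __call__(self, labels: Labels) -> List[str]:
        # Nothing is evaluated until the composed object is called.
        return [labels[i] for i in self.op(labels)]


def starts_with(prefix: str) -> Deferred:
    return Deferred(lambda labels: [i for i, l in enumerate(labels) if l.startswith(prefix)])


columns = ["a", "PRE_b", "c", "PRE_d"]
selector = starts_with("PRE") | ...
print(selector(columns))                # ['PRE_b', 'PRE_d', 'a', 'c']
print((~starts_with("PRE"))(columns))   # ['a', 'c']
```

Building the expression only nests closures; the labels are inspected when the final object is called, which is what lets the same selector be reused on different frames.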

__init__(axis, op)
Parameters

axis (Literal['columns', 'index']) –

get_other_op(other)

Get/create a wrapped operation for composing operations.

class pandas_paddles.axis.Selection(included=None, excluded=None, *, mask=None)

Bases: object

Container for selection along a data frame axis with combination logic.

Parameters
  • included (Optional[List[int]]) –

  • excluded (Optional[List[int]]) –

  • mask (Optional[Sequence[int]]) –

__init__(included=None, excluded=None, *, mask=None)

If mask is passed, included and excluded must be None!

Parameters
  • included (Optional[List[int]]) – List of indices included in the selection.

  • excluded (Optional[List[int]]) – List of indices excluded from the selection.

  • mask (Optional[Sequence[int]]) – Boolean array that will be converted to list of included indices: All indices with corresponding truthy/non-zero value will be included in the selection.
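
The documented constructor rules can be sketched as follows. SelectionSketch is a hypothetical stand-in, not the library's Selection class, and raising ValueError for a conflicting call is an assumption; the documentation above only states that included and excluded must be None when mask is passed.

```python
from typing import List, Optional, Sequence


class SelectionSketch:
    """Hypothetical container mirroring the documented constructor arguments."""

    def __init__(
        self,
        included: Optional[List[int]] = None,
        excluded: Optional[List[int]] = None,
        *,
        mask: Optional[Sequence[int]] = None,
    ):
        if mask is not None:
            if included is not None or excluded is not None:
                # Assumption: the exception type is chosen for this sketch.
                raise ValueError("Pass either mask or included/excluded, not both.")
            # Every truthy/non-zero entry becomes an included index.
            included = [i for i, flag in enumerate(mask) if flag]
        self.included = included
        self.excluded = excluded


sel = SelectionSketch(mask=[True, False, 1, 0, True])
print(sel.included)  # [0, 2, 4]
```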

class pandas_paddles.axis.SelectionComposerBase(axis, op=None)

Bases: pandas_paddles.axis.LabelComposer

Base class to compose callable to select or sort axis labels (index and columns).

__init__(axis, op=None)
get_other_op(other)

Get/create a wrapped operation for composing operations.

class pandas_paddles.axis.UnaryOp(wrapped, op)

Bases: pandas_paddles.axis.BaseOp

Apply unary operator on selection operator.

Used to implement, e.g., negation:

~sel
__init__(wrapped, op)