Polars Service

Polars DataFrame and LazyFrame type annotations that integrate with pydantic’s validation system, so they can be set as attributes on pydantic models without pydantic treating them as ‘arbitrary types’.

Services

join_dfs(dfs: t.Iterable[_types.PolarsMaybeLazyDfT], join_on: str, how: HowJoinT = 'inner') -> _types.PolarsLazyDfT

join an iterable of polars DataFrame or LazyFrame objects into a single LazyFrame

Parameters:
  • dfs – iterable of polars objects

  • join_on – the column we are joining on, all dataframes must have this column

  • how – the type of join: ‘inner’, ‘outer’, or ‘left’. If ‘left’, the first dataframe in the iterable is the ‘left’ one

Returns:

the joined polars LazyFrame

Return type:

pl.LazyFrame

stack_dfs(dfs: t.List[_types.PolarsMaybeLazyDfT]) -> _types.PolarsLazyDfT

vertically stack an iterable of polars DataFrame or LazyFrame objects into a single LazyFrame

Parameters:

dfs – iterable of polars objects

Returns:

the stacked polars LazyFrame

Return type:

pl.LazyFrame

get_json_safe_column_dict(df: _types.PolarsMaybeLazyDfT) -> _types.ColumnarDictT

dump the polars LazyFrame or DataFrame into a json-encodeable dict with keys being column names and values being lists containing the data for the given column.

Parameters:

df – LazyFrame or DataFrame instance

Returns:

a dict of lists

get_json_safe_column_dict_lazy_peek(lazy_df: _types.PolarsMaybeLazyDfT, num_rows: int = 3) -> _types.ColumnarDictT

dump the first n rows of a polars LazyFrame into a json-encodeable dict with keys being column names and values being lists containing the data for the given column.

Parameters:
  • lazy_df – LazyFrame instance

  • num_rows – the number of rows to collect

Returns:

a dict of lists

get_json_safe_row_dicts(df: _types.PolarsMaybeLazyDfT) -> _types.RowDictsT

dump the polars LazyFrame or DataFrame into a json-encodeable list of dictionaries

Parameters:

df – LazyFrame or DataFrame instance

Returns:

a list of dicts

get_json_safe_row_dicts_lazy_peek(lazy_df: _types.PolarsLazyDfT, num_rows: int = 3) -> _types.RowDictsT

dump the first n rows of a polars LazyFrame into a json-encodeable list of dictionaries, without having to collect the entire LazyFrame.

Parameters:
  • lazy_df – LazyFrame instance

  • num_rows – the number of rows to collect

Returns:

a list of dicts

Types

type annotations and type adapters for polars classes

PolarsDfT

An annotated type representing a polars DataFrame.

This extra metadata allows us to include polars collected dataframes in the pydantic ecosystem, e.g. as a field on a model. The pydantic behavior is as follows:

  • DataFrame instance will be parsed as a DataFrame instance, unchanged

  • LazyFrame instance will be collected to be a DataFrame.

  • list of dicts will be parsed as a DataFrame. (row orientation)

  • dict of lists will be parsed as a DataFrame (columnar orientation)

  • serialization will return a row orientation of json-encodeable dicts

Notes: this is not safe for roundtrip serialization, i.e. a serialized instance of this type will not necessarily deserialize to be equal to the original instance

alias of DataFrame[DataFrame]

PolarsDfTypeAdapter: TypeAdapter[DataFrame] = <pydantic.type_adapter.TypeAdapter object>

type adapter for PolarsDfT

PolarsLazyDfT

An annotated type representing a polars LazyFrame.

This extra metadata allows us to include polars lazy dataframes in the pydantic ecosystem, e.g. as a field on a model. The pydantic behavior is as follows:

  • LazyFrame instance will be parsed as a LazyFrame instance, unchanged

  • DataFrame instance will be cast to a LazyFrame.

  • list of dicts will be parsed as a LazyFrame. (row orientation)

  • dict of lists will be parsed as a LazyFrame (columnar orientation)

  • serialization will return a row orientation of json-encodeable dicts, limited to the first 3 rows

Notes: this is not safe for roundtrip serialization, i.e. a serialized instance of this type will not necessarily deserialize to be equal to the original instance

alias of LazyFrame[LazyFrame]

PolarsLazyDfTypeAdapter: TypeAdapter[LazyFrame] = <pydantic.type_adapter.TypeAdapter object>

type adapter for PolarsLazyDfT

class PolarsDfTV

common type var that could be either a lazy or collected dataframe

alias of TypeVar('PolarsDfTV', ~polars.dataframe.frame.DataFrame, ~polars.lazyframe.frame.LazyFrame)

PolarsMaybeLazyDfT

An annotated type representing a union of polars DataFrame or LazyFrame.

This extra metadata allows us to include polars collected or lazy dataframes in the pydantic ecosystem, e.g. as a field on a model. The pydantic behavior is as follows:

  • A LazyFrame instance will be kept as a LazyFrame instance, unchanged

  • DataFrame instance will be parsed as a DataFrame instance, unchanged

  • list of dicts will be parsed as a DataFrame. (row orientation)

  • dict of lists will be parsed as a DataFrame (columnar orientation)

  • serialization will return a row orientation of json-encodeable dicts; if lazy, only the first 3 rows

Notes: this is not safe for roundtrip serialization, i.e. a serialized instance of this type will not necessarily deserialize to be equal to the original instance

alias of DataFrame[DataFrame] | LazyFrame[LazyFrame][DataFrame[DataFrame] | LazyFrame[LazyFrame]]

PolarsMaybeLazyDfTypeAdapter: TypeAdapter[DataFrame | LazyFrame] = <pydantic.type_adapter.TypeAdapter object>

type adapter for PolarsMaybeLazyDfT