vayu.pandas_utils
Requires the vayulib[data] extra.
pandas_utils
concat_frame_from_dir
concat_frame_from_dir(
path,
prefix: str = None,
extension="parquet",
progress=False,
) -> DataFrame
Concatenate all dataframes in a directory
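As a rough illustration of the behaviour described above, the following pandas-only sketch globs a directory for files matching a prefix and extension and concatenates them. `concat_from_dir_sketch` is a hypothetical stand-in written for this example and is not vayu's implementation; file ordering, index handling, and the CSV fallback are all assumptions.

```python
import tempfile
from pathlib import Path
from typing import Optional

import pandas as pd


def concat_from_dir_sketch(
    path, prefix: Optional[str] = None, extension: str = "parquet"
) -> pd.DataFrame:
    """Hypothetical stand-in for illustration; not vayu's implementation."""
    # Match e.g. "part_*.parquet"; with no prefix this is "*.parquet".
    pattern = f"{prefix or ''}*.{extension}"
    files = sorted(Path(path).glob(pattern))
    # Assumption: only parquet and CSV are handled in this sketch.
    reader = pd.read_parquet if extension == "parquet" else pd.read_csv
    frames = [reader(f) for f in files]
    # pd.concat raises on an empty list, so return an empty frame instead.
    return pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()


# Usage: CSV is used here so the example needs no parquet engine.
with tempfile.TemporaryDirectory() as tmp:
    pd.DataFrame({"a": [1, 2]}).to_csv(Path(tmp) / "part_0.csv", index=False)
    pd.DataFrame({"a": [3]}).to_csv(Path(tmp) / "part_1.csv", index=False)
    pd.DataFrame({"a": [99]}).to_csv(Path(tmp) / "other.csv", index=False)
    combined = concat_from_dir_sketch(tmp, prefix="part_", extension="csv")
```

Here `other.csv` is skipped because it does not match the prefix, so `combined` holds only the rows from the two `part_` files.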
Source code in vayu/pandas_utils.py
slice_frame
slice_frame(
interval: Interval,
df: DataFrame,
level: Optional[int] = None,
key: Optional[Hashable] = None,
axis: int = 0,
exclude: bool = False,
) -> DataFrame
Slice a dataframe using the given interval.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `interval` | `Interval` | The interval to slice | required |
| `df` | `DataFrame` | The dataframe to slice. | required |
| `level` | `Optional[int]` | The index or column level to slice on | `None` |
| `key` | `Optional[Hashable]` | The row index or column to slice on | `None` |
| `axis` | `int` | The axis to slice on (0 for index, 1 for columns) | `0` |
| `exclude` | `bool` | If True, exclude the interval instead of including it | `False` |
Notes
- If key is not specified, the level (or the index, if level is None) should be sorted in increasing order.
- If neither key nor level is specified, the 0th level of the index is used for slicing.
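The slicing modes described above can be sketched with plain pandas. This is a hypothetical stand-in, not vayu's implementation: a `(start, end)` pair replaces the `Interval` type, both ends are assumed to be inclusive, and the `level` and `axis` parameters are left out. The index branch uses label slicing, which is one reason a sorted index would be required.

```python
import pandas as pd


def slice_frame_sketch(start, end, df, key=None, exclude=False):
    """Hypothetical stand-in; (start, end) replaces vayu's Interval type."""
    if key is not None:
        # Slice on a column: between() is inclusive on both ends by default.
        selected = df[df[key].between(start, end)]
    else:
        # Slice on the index: label slicing assumes a sorted index.
        selected = df.loc[start:end]
    if exclude:
        # Keep everything that was NOT selected (assumes a unique index).
        return df.loc[~df.index.isin(selected.index)]
    return selected


df = pd.DataFrame(
    {"t": [1, 2, 3, 4, 5], "v": [10, 20, 30, 40, 50]},
    index=pd.date_range("2024-01-01", periods=5, freq="D"),
)
start, end = pd.Timestamp("2024-01-02"), pd.Timestamp("2024-01-04")

inside = slice_frame_sketch(start, end, df)
outside = slice_frame_sketch(start, end, df, exclude=True)
by_key = slice_frame_sketch(2, 3, df, key="t")
```

With this data, `inside` keeps the three middle rows, `outside` keeps the first and last rows, and `by_key` keeps the rows where `t` is 2 or 3.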
Source code in vayu/pandas_utils.py
is_frame_empty
get_frame_window
get_frame_window(
df: DataFrame, column: str = None, level: int = 0
) -> Optional[TimeWindow]
Get the time window of a dataframe.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `df` | `DataFrame` | dataframe | required |
| `column` | `str` | If specified, the window is computed from the min and max of the column. | `None` |
| `level` | `int` | Window is computed from the min and max of the index at the specified level. | `0` |
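A minimal sketch of the min/max computation described in the table, using only pandas. `frame_window_sketch` is hypothetical: it returns a plain `(min, max)` tuple in place of vayu's `TimeWindow`, and returning `None` for an empty frame is an assumption suggested by the `Optional` return type, not something the source states.

```python
import pandas as pd


def frame_window_sketch(df, column=None, level=0):
    """Hypothetical stand-in; returns a (min, max) tuple, not a TimeWindow."""
    if df.empty:
        # Assumption: an empty frame has no window.
        return None
    if column is not None:
        values = df[column]
    else:
        # Works for both a flat Index (level 0) and a MultiIndex.
        values = df.index.get_level_values(level)
    return values.min(), values.max()


df = pd.DataFrame(
    {"seen_at": pd.to_datetime(["2024-03-05", "2024-03-01", "2024-03-09"])},
    index=pd.date_range("2024-01-01", periods=3, freq="D"),
)

index_window = frame_window_sketch(df)
column_window = frame_window_sketch(df, column="seen_at")
empty_window = frame_window_sketch(df.iloc[0:0])
```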
Source code in vayu/pandas_utils.py
split_frame
split_frame(
obj: Union[Series, DataFrame],
n: Optional[Union[int, float, datetime]] = 0.5,
) -> Union[
Tuple[Series, Series], Tuple[DataFrame, DataFrame]
]
Split a dataframe or series into two parts.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `obj` | `Union[Series, DataFrame]` | The object to split | required |
| `n` | `Optional[Union[int, float, datetime]]` | The index to split at. If float, it is treated as a fraction of the length of the object. If int, it is treated as an index. If datetime, it is treated as a timestamp. | `0.5` |
Returns:
| Type | Description |
|---|---|
| `Union[Tuple[Series, Series], Tuple[DataFrame, DataFrame]]` | A tuple of two objects, the first part and the second part. |
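The three interpretations of `n` can be sketched as follows. `split_frame_sketch` is a hypothetical pandas-only stand-in, not vayu's implementation. Two details are assumptions: the fractional position is rounded down, and for a datetime the row at exactly that timestamp goes into the second part. The `n=None` case is not handled here.

```python
from datetime import datetime

import pandas as pd


def split_frame_sketch(obj, n=0.5):
    """Hypothetical stand-in; works for both Series and DataFrame."""
    if isinstance(n, datetime):
        # Timestamp split: assumes a datetime-like index.
        return obj[obj.index < n], obj[obj.index >= n]
    if isinstance(n, float):
        # Fraction of the length, rounded down to a position.
        n = int(len(obj) * n)
    # Integer position split.
    return obj.iloc[:n], obj.iloc[n:]


s = pd.Series(range(10), index=pd.date_range("2024-01-01", periods=10, freq="D"))

half_a, half_b = split_frame_sketch(s)                        # fraction
pos_a, pos_b = split_frame_sketch(s, 3)                       # position
ts_a, ts_b = split_frame_sketch(s, datetime(2024, 1, 4))      # timestamp
```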
Source code in vayu/pandas_utils.py
select_frame
Select rows from a dataframe based on conditions.