Column Operations
Operations for adding, dropping, renaming, and transforming columns.
Overview
Column operations modify the structure of a DataFrame by adding, removing, or transforming columns. All operations return the TransformPlan instance for method chaining.
import polars as pl  # needed for pl.Float64 below

from transformplan import TransformPlan

plan = (
    TransformPlan()
    .col_rename("old_name", "new_name")
    .col_drop("temp_column")
    .col_cast("price", pl.Float64)
    .col_add("status", value="active")
)
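Chaining works because every operation returns the plan itself. The pattern can be sketched in a few lines; `MiniPlan`, its `steps` list, and the `_record` helper are hypothetical stand-ins for illustration and are not TransformPlan's actual internals.

```python
from typing import Any


class MiniPlan:
    """Hypothetical stand-in that records operations instead of running them."""

    def __init__(self) -> None:
        self.steps: list[tuple[str, dict[str, Any]]] = []

    def _record(self, name: str, **params: Any) -> "MiniPlan":
        # Store the step, then return self so the next call can be chained.
        self.steps.append((name, params))
        return self

    def col_rename(self, old: str, new: str) -> "MiniPlan":
        return self._record("col_rename", old=old, new=new)

    def col_drop(self, column: str) -> "MiniPlan":
        return self._record("col_drop", column=column)


plan = MiniPlan().col_rename("old_name", "new_name").col_drop("temp_column")
step_names = [name for name, _ in plan.steps]
```

Because each method returns the same object, the order of calls in the chain is the order in which the steps are recorded.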
Class Reference
ColumnOps
Mixin providing column-level operations.
col_drop
Drop a column from the DataFrame.
Returns:
| Type | Description |
|---|---|
| Self | Self for method chaining. |
col_rename
Rename a column.
Returns:
| Type | Description |
|---|---|
| Self | Self for method chaining. |
col_cast
Cast a column to a different dtype.
Returns:
| Type | Description |
|---|---|
| Self | Self for method chaining. |
col_reorder
Reorder columns. Unlisted columns are dropped.
Returns:
| Type | Description |
|---|---|
| Self | Self for method chaining. |
col_select
Keep only the specified columns (order preserved).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| columns | Sequence[str] | Columns to keep. | required |
Returns:
| Type | Description |
|---|---|
| Self | Self for method chaining. |
Source code in transformplan/ops/column.py
col_duplicate
Duplicate a column under a new name.
Returns:
| Type | Description |
|---|---|
| Self | Self for method chaining. |
col_fill_null
Fill null values in a column.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| column | str | Column to fill. | required |
| value | Any | Value to fill nulls with (if strategy is None). | None |
| strategy | FillNullStrategy \| None | Fill strategy: one of 'forward', 'backward', 'mean', 'min', 'max', 'zero', 'one'. | None |
Returns:
| Type | Description |
|---|---|
| Self | Self for method chaining. |
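To make the difference between a fixed `value` and a `strategy` concrete, here is a minimal pure-Python sketch on a list in which `None` stands for null. It assumes the strategies behave like the Polars fill strategies of the same names; the `fill_null` helper is hypothetical and is not the library's implementation.

```python
from statistics import mean


def fill_null(values, value=None, strategy=None):
    """Return a copy of `values` with None entries filled (illustrative only)."""
    if strategy is None:
        # No strategy: replace every null with the given constant.
        return [value if v is None else v for v in values]
    if strategy == "forward":
        out, last = [], None
        for v in values:
            last = v if v is not None else last
            out.append(last)  # leading nulls stay null: nothing to carry yet
        return out
    if strategy == "backward":
        # Backward fill is a forward fill over the reversed list.
        return fill_null(values[::-1], strategy="forward")[::-1]
    aggregates = {"mean": mean, "min": min, "max": max}
    if strategy in aggregates:
        # Assumes at least one non-null value is present.
        present = [v for v in values if v is not None]
        return fill_null(values, value=aggregates[strategy](present))
    if strategy in ("zero", "one"):
        return fill_null(values, value=0 if strategy == "zero" else 1)
    raise ValueError(f"unknown strategy: {strategy!r}")


scores = [None, 2, None, 4, None]
filled_constant = fill_null(scores, value=0)           # [0, 2, 0, 4, 0]
filled_forward = fill_null(scores, strategy="forward")    # [None, 2, 2, 4, 4]
filled_backward = fill_null(scores, strategy="backward")  # [2, 2, 4, 4, None]
```

Note that directional strategies can leave nulls at the edges of the column, which a constant `value` never does.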
col_drop_null
Drop rows with null values in specified columns.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| columns | str \| Sequence[str] \| None | Column(s) to check for nulls. If None, checks all columns. | None |
Returns:
| Type | Description |
|---|---|
| Self | Self for method chaining. |
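The three accepted shapes of `columns` (a single name, a sequence of names, or None for all columns) can be illustrated on plain rows. This is a sketch of the described semantics only; `drop_null_rows` is a hypothetical helper, and rows are modelled as dictionaries with `None` for null.

```python
def drop_null_rows(rows, columns=None):
    """Keep rows that have no None in the checked columns (illustrative only)."""
    if isinstance(columns, str):
        columns = [columns]  # a single name is treated as a one-item list

    def is_complete(row):
        checked = row.keys() if columns is None else columns
        return all(row[name] is not None for name in checked)

    return [row for row in rows if is_complete(row)]


rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": None, "email": "c@example.com"},
]
by_email = drop_null_rows(rows, "email")  # row with id=2 is dropped
by_all = drop_null_rows(rows)             # only the first row survives
```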
col_drop_zero
Drop rows where the specified column is zero.
Returns:
| Type | Description |
|---|---|
| Self | Self for method chaining. |
col_add
Add a new column with a constant value or expression.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| new_column | str | Name of the new column. | required |
| expr | str \| float \| None | Column name to copy from, or None for constant value. | None |
| value | Any | Constant value to fill the column with. | None |
Returns:
| Type | Description |
|---|---|
| Self | Self for method chaining. |
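The two documented modes, copying an existing column via `expr` or filling with a constant `value`, can be sketched on a table modelled as a dictionary of lists. The `add_column` helper is hypothetical, and the sketch deliberately leaves out a numeric `expr`, whose behaviour the reference above does not spell out.

```python
def add_column(table, new_column, expr=None, value=None):
    """Return a copy of `table` (dict of equal-length lists) with one more column."""
    n_rows = len(next(iter(table.values()), []))
    if isinstance(expr, str):
        new_values = list(table[expr])  # copy the named column
    else:
        new_values = [value] * n_rows   # repeat the constant for every row
    return {**table, new_column: new_values}


table = {"price": [9.5, 12.0]}
with_status = add_column(table, "status", value="pending")
with_backup = add_column(table, "price_backup", expr="price")
```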
col_add_uuid
Add a column with unique random identifiers.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| column | str | Name of the new column. | required |
| length | int | Length of the identifier string. | 16 |
Returns:
| Type | Description |
|---|---|
| Self | Self for method chaining. |
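One plausible way to produce fixed-length random identifiers is to truncate the hex form of a UUID4. The sketch below assumes that approach purely for illustration; the reference does not say how the library generates its identifiers, and `random_ids` is a hypothetical helper.

```python
import uuid


def random_ids(n_rows, length=16):
    """Generate one random hex identifier per row (illustrative only)."""
    # uuid4().hex has 32 hex characters, so this sketch supports 1..32.
    if not 1 <= length <= 32:
        raise ValueError("length must be between 1 and 32 in this sketch")
    return [uuid.uuid4().hex[:length] for _ in range(n_rows)]


row_ids = random_ids(3, length=16)
```

Shorter identifiers carry fewer random bits, so the chance of a collision grows as `length` shrinks.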
col_hash
Hash one or more columns into a new column.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| columns | str \| Sequence[str] | Column(s) to hash. | required |
| new_column | str | Name for the hash column. | required |
| salt | str | Optional salt to add to the hash. | '' |
Returns:
| Type | Description |
|---|---|
| Self | Self for method chaining. |
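The role of `salt` is easiest to see in a row-level sketch. The example below assumes SHA-256 over the salt followed by the column values joined with a separator; the hash algorithm, the separator, and the `hash_row` helper are all assumptions made for illustration, not details confirmed by the reference.

```python
import hashlib


def hash_row(values, salt=""):
    """Hash several values of one row into a single hex digest (illustrative only)."""
    # Joining with a separator keeps ("ab", "c") distinct from ("a", "bc").
    joined = "|".join(str(v) for v in values)
    return hashlib.sha256((salt + joined).encode("utf-8")).hexdigest()


row = ["Ada", "Lovelace", "ada@example.com"]
salted = hash_row(row, salt="my_salt")
unsalted = hash_row(row)
```

The same input and salt always give the same digest, while changing the salt changes every digest, which makes the hashes harder to match against values hashed elsewhere.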
col_coalesce
Take the first non-null value across multiple columns.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| columns | Sequence[str] | Columns to coalesce (in priority order). | required |
| new_column | str | Name for the result column. | required |
Returns:
| Type | Description |
|---|---|
| Self | Self for method chaining. |
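"Priority order" means the columns are scanned left to right and the first non-null value wins. A minimal row-level sketch of that rule follows; the `coalesce` helper and the sample rows are hypothetical, with `None` standing for null.

```python
def coalesce(*values):
    """Return the first value that is not None, or None if all are None."""
    return next((v for v in values if v is not None), None)


columns = ["primary_email", "secondary_email", "backup_email"]
rows = [
    {"primary_email": None, "secondary_email": "s@example.com", "backup_email": "b@example.com"},
    {"primary_email": None, "secondary_email": None, "backup_email": None},
    {"primary_email": "p@example.com", "secondary_email": "s2@example.com", "backup_email": None},
]
contact_email = [coalesce(*(row[name] for name in columns)) for row in rows]
```

A row in which every listed column is null stays null in the result.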
Examples
Basic Column Operations
import polars as pl  # needed for pl.Float64
from transformplan import TransformPlan

# Drop a column
plan = TransformPlan().col_drop("temp")
# Rename a column
plan = TransformPlan().col_rename("old", "new")
# Cast to a different type
plan = TransformPlan().col_cast("price", pl.Float64)
Column Selection
# Keep only specific columns (in order)
plan = TransformPlan().col_select(["id", "name", "value"])
# Reorder columns (drops unlisted columns)
plan = TransformPlan().col_reorder(["value", "name", "id"])
Adding Columns
# Add column with constant value
plan = TransformPlan().col_add("status", value="pending")
# Copy from existing column
plan = TransformPlan().col_add("price_backup", expr="price")
# Add unique identifiers
plan = TransformPlan().col_add_uuid("row_id", length=16)
Handling Null Values
# Fill nulls with a value
plan = TransformPlan().col_fill_null("score", value=0)
# Fill with strategy
plan = TransformPlan().col_fill_null("value", strategy="forward")
# Drop rows with nulls
plan = TransformPlan().col_drop_null(columns=["required_field"])
Advanced Operations
# Create hash from multiple columns
plan = TransformPlan().col_hash(
    columns=["first_name", "last_name", "email"],
    new_column="user_hash",
    salt="my_salt",
)
# Take first non-null from multiple columns
plan = TransformPlan().col_coalesce(
    columns=["primary_email", "secondary_email", "backup_email"],
    new_column="contact_email",
)
# Duplicate a column
plan = TransformPlan().col_duplicate("original", "copy")