Filtering API
API reference for filtering DataFrame rows based on conditions.
filter()
Filter rows based on a boolean expression.
filter(expr: Expr): DataFrame<T>
Parameters:
expr - Boolean expression determining which rows to keep
Returns: DataFrame containing only rows where expression is true
Example:
import { col } from "molniya";
df.filter(col("age").gte(18))
df.filter(col("status").eq("active"))
where()
Alias for filter().
where(expr: Expr): DataFrame<T>
Example:
df.where(col("score").gt(90))
Comparison Operators
eq()
Equal comparison.
eq(other: Expr | number | string | boolean | null): ComparisonExpr
Example:
col("status").eq("active")
col("id").eq(col("ref_id"))
neq()
Not equal comparison.
neq(other: Expr | number | string | boolean | null): ComparisonExpr
Example:
col("type").neq("deleted")
gt()
Greater than.
gt(other: Expr | number): ComparisonExpr
Example:
col("age").gt(18)
col("price").gt(col("cost"))
gte()
Greater than or equal.
gte(other: Expr | number): ComparisonExpr
Example:
col("score").gte(60)
lt()
Less than.
lt(other: Expr | number): ComparisonExpr
Example:
col("price").lt(100)
lte()
Less than or equal.
lte(other: Expr | number): ComparisonExpr
Example:
col("quantity").lte(10)
Logical Operators
and()
Logical AND - all conditions must be true.
and(...exprs: Expr[]): LogicalExpr
Example:
import { and, col } from "molniya";
and(
  col("age").gte(18),
  col("status").eq("active")
)
or()
Logical OR - at least one condition must be true.
or(...exprs: Expr[]): LogicalExpr
Example:
import { or, col } from "molniya";
or(
  col("category").eq("electronics"),
  col("category").eq("computers")
)
not()
Logical NOT - negate a condition.
not(expr: Expr): LogicalExpr
Example:
import { not, col } from "molniya";
not(col("deleted").eq(true))
Null Checks
isNull()
Check if value is null.
isNull(): ComparisonExpr
Example:
col("email").isNull()
isNotNull()
Check if value is not null.
isNotNull(): ComparisonExpr
Example:
col("email").isNotNull()
String Filters
contains()
Check if string contains substring.
contains(substring: string): ComparisonExpr
Example:
col("email").contains("@company.com")
startsWith()
Check if string starts with prefix.
startsWith(prefix: string): ComparisonExpr
Example:
col("phone").startsWith("+1")
endsWith()
Check if string ends with suffix.
endsWith(suffix: string): ComparisonExpr
Example:
col("file").endsWith(".csv")
like()
SQL-style pattern matching.
like(pattern: string): ComparisonExpr
Patterns:
% - matches any sequence of characters
_ - matches any single character
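The two wildcards can be pictured as a translation into a regular expression. The sketch below is a standalone illustration, not Molniya's implementation: `likeToRegExp` is a hypothetical helper, and it assumes case-sensitive matching of the whole value with no escape character for literal `%` or `_`.

```typescript
// Hypothetical helper (not part of Molniya): translate a SQL-style LIKE
// pattern into an anchored RegExp.
function likeToRegExp(pattern: string): RegExp {
  const source = pattern.replace(/[%_]|[.*+?^${}()|[\]\\]/g, (ch) => {
    if (ch === "%") return "[\\s\\S]*"; // any sequence of characters, including none
    if (ch === "_") return "[\\s\\S]"; // exactly one character
    return "\\" + ch; // escape regex metacharacters so they match literally
  });
  // Anchor both ends so the pattern must match the entire value.
  return new RegExp("^" + source + "$");
}

const startsWithJohn = likeToRegExp("John%");
const usCode = likeToRegExp("US-___");

const johnnyMatches = startsWithJohn.test("Johnny");
const shortCodeMatches = usCode.test("US-12");
```

Under these assumptions `johnnyMatches` is true and `shortCodeMatches` is false, because `_` requires exactly one character per position.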
Example:
col("name").like("John%") // Starts with John
col("code").like("US-___") // US- followed by 3 chars
regexpMatch()
Regular expression matching.
regexpMatch(pattern: string): ComparisonExpr
Example:
col("email").regexpMatch("^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$")
Date/Time Filters
Date Comparisons
import { col } from "molniya";
// Compare dates
col("order_date").gt(new Date("2024-01-01"))
col("created_at").gte(Date.now() - 86400000) // Last 24 hours
Year/Month/Day Extraction
import { col, year, month, day } from "molniya";
// Filter by date component
year(col("date")).eq(2024)
month(col("date")).eq(1) // January
day(col("date")).eq(15) // 15th of month
Between Filters
between()
Check if value is within range (inclusive).
between(lower: Expr | number, upper: Expr | number): ComparisonExpr
Example:
col("age").between(18, 65)
col("price").between(col("min_price"), col("max_price"))
In Filters
isIn()
Check if value is in a list.
isIn(values: (string | number | boolean)[]): ComparisonExpr
Example:
col("status").isIn(["active", "pending", "verified"])
col("category").isIn(["electronics", "computers", "phones"])
Filter Chaining
Multiple filters can be chained:
df.filter(col("age").gte(18))
  .filter(col("status").eq("active"))
  .filter(col("country").eq("US"))
Equivalent to:
import { and, col } from "molniya";
df.filter(and(
  col("age").gte(18),
  col("status").eq("active"),
  col("country").eq("US")
))
Complex Filter Examples
Multi-Condition Filter
import { and, or, col } from "molniya";
df.filter(and(
  or(
    col("category").eq("electronics"),
    col("category").eq("computers")
  ),
  col("price").gte(100),
  col("in_stock").eq(true),
  col("deleted_at").isNull()
))
Date Range Filter
import { and, col } from "molniya";
const startDate = new Date("2024-01-01");
const endDate = new Date("2024-12-31");
df.filter(and(
  col("order_date").gte(startDate),
  col("order_date").lte(endDate)
))
Search Filter
import { or, col } from "molniya";
const searchTerm = "laptop";
df.filter(or(
  col("name").lower().contains(searchTerm),
  col("description").lower().contains(searchTerm),
  col("tags").lower().contains(searchTerm)
))
Performance Notes
- Filters are pushed down to data sources when possible
- Chained filter() calls are combined with AND automatically
- Filter before expensive operations like joins
- String comparisons use dictionary encoding for efficiency
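The note about combining filters with AND can be modeled with plain arrays. This is a standalone TypeScript sketch, not Molniya's implementation: `Row`, `Predicate`, and `allOf` are hypothetical names introduced only for illustration, and the sample rows are made up.

```typescript
// Plain-array model of filter chaining; none of these names come from Molniya.
type Row = { age: number; status: string; country: string };
type Predicate = (row: Row) => boolean;

// Combine predicates so that a row passes only if every predicate passes.
function allOf(...preds: Predicate[]): Predicate {
  return (row) => preds.every((p) => p(row));
}

// Made-up sample data.
const rows: Row[] = [
  { age: 25, status: "active", country: "US" },
  { age: 16, status: "active", country: "US" },
  { age: 40, status: "inactive", country: "US" },
  { age: 31, status: "active", country: "DE" },
];

const isAdult: Predicate = (r) => r.age >= 18;
const isActive: Predicate = (r) => r.status === "active";
const inUS: Predicate = (r) => r.country === "US";

// Three chained passes over the data...
const chained = rows.filter(isAdult).filter(isActive).filter(inUS);
// ...select the same rows as a single pass with the combined predicate.
const combined = rows.filter(allOf(isAdult, isActive, inUS));
```

In this model both `chained` and `combined` hold only the first row. The chained form builds an intermediate array per step while the combined form visits each row once, which is one plausible reason an engine would merge chained filters.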