Selects

SELECT statements are likely the most common operation you'll perform with Iceaxe. They're used to fetch data from the database and receive them in Python instances or primitives. Let's consider the following table definition to show how we can interact with different types of data:

from iceaxe import DBConnection, select, TableBase, Field

class Employee(TableBase):
    id: int = Field(primary_key=True)
    name: str
    age: int

Let's start off with the following users in the database:

id	name	age
1	Marcus Holloway	28
2	Astrid Keller	42
3	Jasper Okafor	36

If we want to get an exact match for a single row, we can use the select function with an exact match filter for Marcus. You'd write this in the same way that you would check for equality in a Python if statement.

query = select(Employee).where(Employee.name == "Marcus Holloway")

All defined class attributes in the Employee class are available as columns, and these columns can be used in equality operations like ==, !=, >, <, >=, <=, etc. If we want, we can also preview the SQL and variables that are outputted for this query:

query.build()
> ('SELECT "employee".* FROM "employee" WHERE "employee"."name" = $1', ['Marcus Holloway'])

As you see from the build output from query.build(), we're constructing a SQL with a single WHERE filter that requests matches to $1. This is a placeholder for the actual value that is passed to the database engine separately. The database engine handles the escaping of any special values in this variable to avoid SQL injection attacks. Let's go ahead and execute this query against the database to get the actual values back:

results = await conn.exec(query)
results
> [Employee(id=1, name='Marcus Holloway', age=28)]

This will return a list of Employee instances that match the query. These instances are full Python objects; you can access and modify their attributes as you would with any other object. Once received, it's detached from the database connection unless you explicitly choose to save it. For more details on how to update data, see the page on update statements.

Chaining conditions

By default, conditions are combined with an AND operator. If you want to combine conditions with an OR operator, you can use the or_ function. For example, if you want to get all employees that are either 28 or 36 years old:

from iceaxe import and_, or_

query = select(Employee).where(or_(Employee.age == 28, Employee.age == 36))

query.build()
> ('SELECT "employee".* FROM "employee"
    WHERE ("employee"."age" = $1 OR "employee"."age" = $2)', [28, 36])

These operations nest within themselves to create more complicated selection chains. Let's extend the filter above to also get anyone who's named Marcus Holloway or Astrid Keller. We expect to only receive one result.

query = select(Employee).where(
    and_(
        or_(
            Employee.age == 28, Employee.age == 36,
        ),
        or_(
            Employee.name == "Marcus Holloway", Employee.name == "Astrid Keller"
        )
    )
)

query.build()
> ('SELECT "employee".* FROM "employee"
    WHERE (
        ("employee"."age" = $1 OR "employee"."age" = $2)
        AND
        ("employee"."name" = $3 OR "employee"."name" = $4)
    )',
    [28, 36, 'Marcus Holloway', 'Astrid Keller'])

Note that the outside and_ is optional in this case since where will take care of that for you. Still, it can sometimes make your code a bit more readable to include it for more complicated query chains.

Limit columns

In the above queries, we returned a full instance. For small queries the bandwidth of fetching an entire object is usually negligible compared to the convenience of having a full object that you can work with in other places in your pipeline. It's easier to track down references to Employee versus scattered strings that correspond to their name. Still - for larger queries or larger tables, there are times when it's way more efficient to only fetch the columns that you need. You can do this by specifying the columns you want within the initial selection:

query = select((Employee.id, Employee.name)).where(Employee.name == "Marcus Holloway")

query.build()
> ('SELECT "employee"."id", "employee"."name" FROM "employee"
    WHERE "employee"."name" = $1', ['Marcus Holloway'])

Unlike before when we passed a regular Employee class to the select function, we're now passing a tuple of the different columns that we want to fetch. For any selection of more than one object, you'll need to pass a tuple.

Filter lists

Exact equality or boolean operations are not the only way to filter data. You can also use some extensions to filter data in a more complex way. For example, let's say that we want to get all employees that are either 28 or 36 years old. We could make two separate equality filters like the example above. But it's often easier to use the in_ function to filter data based on a list of values:

from iceaxe import column

query = select(Employee).where(column(Employee.age).in_([28, 36]))

query.build()
> ('SELECT "employee".* FROM "employee" WHERE "employee"."age" = ANY($1)', [[28, 36]])

This will work at runtime with or without the call to column. But if you use static analysis like mypy or pyright to validate your code, you'll need to use the column typecast to avoid type errors. Why? In Iceaxe, the type of the column mirrors the type of the instance attribute. For the purposes of static analysis, Employee.age here validates to an integer. And the integer primitive in Python doesn't have a method called in_. By using the column typecast, you're explicitly telling the static analysis tool that you're working with a column object that supports this method accessor.

Aggregation queries

Aggregation queries allow you to perform calculations on groups of rows in your database. Iceaxe provides a simple way to construct these queries using the group_by method and aggregation functions. To make things more interesting, let's add a bit more data with some overlapping values:

id	name	age
1	Marcus Holloway	28
2	Astrid Keller	42
3	Jasper Okafor	36
4	Kaitlyn Winters	36
5	Merilyn Keller	36

Let's get the average age of all employees:

from iceaxe import func

query = select(func.avg(Employee.age))

query.build()
> ('SELECT avg("employee"."age") AS aggregate_0 FROM "employee"', [])

await conn.exec(query)
> [35.6]

Grouping rows and then executing these functions is even more powerful, because you can perform different rollups. A simple one to find the amount of employees by age.

query = select(
  (func.count(Employee.id), Employee.age),
).group_by(Employee.age)

query.build()
> ('SELECT count("employee"."id") AS aggregate_0, "employee"."age" FROM "employee" GROUP BY "employee"."age"', [])

await conn.exec(query)
> [(1, 28), (3, 36), (1, 42)]

Raw SQL

Iceaxe has the philosophy that any queries constructed by the ORM should work efficiently and be typehinted correctly. If we can't typehint it correctly, we won't offer it within the Python layer. For these cases - and more complex queries in general - read this section to learn how to manually cast queries.

If you're working on a more complex, and still want to receive the results as Iceaxe typed objects, you can do a manual casting of the results. Using the text() method of a selection query will override any other dynamic query building and just pass your SQL directly to the database engine.

from iceaxe import sql

query = select(Employee).text(
    f"""
    SELECT {sql.select(Employee)} FROM employee
    WHERE {sql(Employee.age)} = $1
    """,
    28
)
query.build()
> 'SELECT "employee"."id" AS "employee_id", "employee"."name" AS "employee_name", "employee"."age" AS "employee_age"
> FROM employee
> WHERE "employee"."age" = $1'
>, [28]

The sql class is a utility class that allows you to cast Table definitions to a SQL fstring. This gives you the ability to use the same Table definitions in your SQL as you do in your Python code, alongside basic typehinting of table columns. If you change column names, your linter should detect the conflict in this SQL as well.

There are a few different sql methods depending on where you're inserting them in your SQL.

Method	Input	Output	Description
`sql.select(Table)`	`User`	`"users"."id" AS "users_id", "users"."name" AS "users_name"`	Full table selection with aliased columns
`sql.select(Table.column)`	`User.name`	`"users"."name" AS "users_name"`	Single column selection with alias
`sql(Table)`	`User`	`"users"`	Qualified table name
`sql(Table.column)`	`User.name`	`"users"."name"`	Qualified column name
`sql.raw(Table)`	`User`	`"users"`	Raw table name
`sql.raw(Table.column)`	`User.name`	`"name"`	Raw column name

Always use sql.select for select queries, since these will internally map to our ORM convention that's required to cast the results to the correct Python object during postprocessing.

We will typecast the results as whatever you pass to the original select method, so make sure this value is always kept in sync with what you query in your manual SQL text.

The text method accept a single string of a SQL operation, then as many variables as you pass within the keyword arguments. These variables are accessible to your SQL string as one-indexed variables that correspond to the order of arguments. These are formatted as $1, $2, etc. Variables passed via Python variable will be encoded correctly to avoid SQL injection attacks so are recommended for any user input.

results = await conn.exec(query)
> [Employee(id=1, name='Marcus Holloway', age=28)]

If you have more complicated selection logic that doesn't directly reference table selections via sql.select, we also support aliasing values to directly extract them from the query.

from iceaxe import alias, func

# Simple count with alias
query = select((alias("age_int", int))).text(
    f"""
    SELECT COUNT(*) as age_int FROM employee 
    WHERE {sql(Employee.age)} = $1
    """,
    28
)

# Using functions with aliases
query = select((
    Employee,
    alias("name_length", func.length(Employee.name)),
    alias("total_employees", func.count(Employee.id))
)).text(
    f"""
    SELECT 
        {sql.select(Employee)},
        length({sql(Employee.name)}) as name_length,
        COUNT({sql(Employee.id)}) as total_employees
    FROM employee
    GROUP BY {sql(Employee.id)}
    """
)

# Complex example with multiple functions and joins
query = select((
    Employee,
    alias("avg_age", func.avg(Employee.age)),
    alias("department_size", func.count(Employee.id)),
    alias("total_salary", func.sum(Employee.salary))
)).text(
    f"""
    WITH department_stats AS (
        SELECT 
            {sql(Employee.department)},
            avg({sql(Employee.age)}) as avg_age,
            count({sql(Employee.id)}) as department_size,
            sum({sql(Employee.salary)}) as total_salary
        FROM employee
        GROUP BY {sql(Employee.department)}
    )
    SELECT 
        {sql.select(Employee)},
        ds.avg_age,
        ds.department_size,
        ds.total_salary
    FROM employee
    JOIN department_stats ds ON ds.department = {sql(Employee.department)}
    """
)

The alias function allows you to map SQL results to typed Python values, while func provides type-safe SQL function calls. When combined, they enable you to write complex queries with proper type checking and automatic deserialization.