
AI-Assisted Database Design and Query Optimization

Adrian Saycon
February 21, 2026 · 5 min read

Database work is one of the areas where AI assistance delivers the most value per interaction. Schema design decisions have long-lasting consequences, slow queries cost real money, and most developers (myself included) aren’t database specialists. Here’s how I use AI to make better database decisions.

Using AI for Schema Design

When starting a new feature, I describe the domain to the AI and ask for a schema design. The key is being specific about your use case, scale expectations, and query patterns. Here’s a real example from a recent project:

I asked Claude to design a schema for a multi-tenant SaaS application’s notification system. Instead of just getting a table definition, I got a well-reasoned design with partitioning recommendations:

-- Notification system schema (PostgreSQL)
-- Designed for multi-tenant SaaS, ~50K notifications/day

CREATE TABLE notifications (
  id UUID NOT NULL DEFAULT gen_random_uuid(),
  tenant_id UUID NOT NULL,
  user_id UUID NOT NULL,
  type VARCHAR(50) NOT NULL,
  title VARCHAR(255) NOT NULL,
  body TEXT,
  data JSONB DEFAULT '{}',
  is_read BOOLEAN DEFAULT FALSE,
  read_at TIMESTAMPTZ,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  expires_at TIMESTAMPTZ,

  -- On a partitioned table, the primary key must include the partition key
  PRIMARY KEY (id, created_at),
  CONSTRAINT fk_tenant FOREIGN KEY (tenant_id)
    REFERENCES tenants(id) ON DELETE CASCADE,
  CONSTRAINT fk_user FOREIGN KEY (user_id)
    REFERENCES users(id) ON DELETE CASCADE
) PARTITION BY RANGE (created_at);

-- Monthly partitions for efficient querying and cleanup
CREATE TABLE notifications_2026_01 PARTITION OF notifications
  FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE notifications_2026_02 PARTITION OF notifications
  FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');

-- Indexes optimized for common query patterns
CREATE INDEX idx_notifications_user_unread
  ON notifications (user_id, created_at DESC)
  WHERE is_read = FALSE;

CREATE INDEX idx_notifications_tenant_type
  ON notifications (tenant_id, type, created_at DESC);

CREATE INDEX idx_notifications_expires
  ON notifications (expires_at)
  WHERE expires_at IS NOT NULL;

The AI suggested partitioning by date because notifications are time-series data with natural cleanup boundaries. It added partial indexes for the two most common queries (unread notifications for a user, notifications by type for a tenant). These are the kinds of decisions that take experience to get right, and AI has effectively encoded that experience.
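Partitioning only pays off if partitions are created and dropped on schedule. Here is a sketch of the maintenance DDL, using the partition naming from the schema above (in production, an extension like pg_partman typically automates this):

```sql
-- Create next month's partition ahead of time (e.g. from a scheduled job)
CREATE TABLE IF NOT EXISTS notifications_2026_03 PARTITION OF notifications
  FOR VALUES FROM ('2026-03-01') TO ('2026-04-01');

-- Cleanup: dropping an old partition is near-instant,
-- unlike a DELETE that scans and rewrites rows
DROP TABLE IF EXISTS notifications_2025_12;
```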

Index Recommendations

One of my favorite workflows: paste a slow query and your table schema, and ask for index recommendations. The AI will reason about how the query is likely to execute and suggest indexes that actually help. But here’s the critical step: always verify with EXPLAIN ANALYZE.

-- Before: Full table scan, 850ms
EXPLAIN ANALYZE
SELECT o.id, o.total, u.email
FROM orders o
JOIN users u ON u.id = o.user_id
WHERE o.status = 'pending'
  AND o.created_at > NOW() - INTERVAL '7 days'
ORDER BY o.created_at DESC
LIMIT 50;

-- AI recommended this composite index:
CREATE INDEX idx_orders_status_created
  ON orders (status, created_at DESC)
  INCLUDE (total, user_id);

-- After: Index scan, 3ms
-- The INCLUDE columns enable an index-only scan,
-- avoiding heap lookups entirely.

The INCLUDE clause is something many developers miss. By including the columns that the query selects, PostgreSQL can satisfy the entire query from the index without touching the table data. The AI caught this optimization immediately.
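One caveat worth knowing: PostgreSQL can only skip the heap for pages marked all-visible in the visibility map, so index-only scans degrade on heavily updated tables until vacuum catches up. A quick way to check, using the orders query from above:

```sql
-- Confirm the plan says "Index Only Scan" and watch "Heap Fetches":
-- a high number means the visibility map is stale
EXPLAIN (ANALYZE, BUFFERS)
SELECT o.id, o.total, o.user_id
FROM orders o
WHERE o.status = 'pending'
ORDER BY o.created_at DESC
LIMIT 50;

-- If Heap Fetches is high, refresh the visibility map:
VACUUM (ANALYZE) orders;
```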

N+1 Query Detection

N+1 queries are among the most common performance killers in web applications, and AI is remarkably good at spotting them. I regularly paste ORM code and ask for N+1 analysis:

// N+1 problem - 1 query for posts + N queries for authors
const posts = await db.query("SELECT * FROM posts LIMIT 20");
for (const post of posts) {
  post.author = await db.query(
    "SELECT * FROM users WHERE id = $1", [post.author_id]
  ); // This runs 20 times!
}

// Fixed - Single query with JOIN
const posts = await db.query(`
  SELECT p.*, 
    json_build_object(
      'id', u.id,
      'name', u.name,
      'email', u.email
    ) as author
  FROM posts p
  JOIN users u ON u.id = p.author_id
  ORDER BY p.created_at DESC
  LIMIT 20
`);

For ORM-heavy codebases, AI can analyze your model relationships and identify where eager loading should replace lazy loading. This is especially valuable in Django, Rails, and Laravel projects where N+1 queries hide behind clean-looking model access.
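When a JOIN is awkward (an optional relation, or an ORM that fights you), the other standard fix is two batched queries: fetch the page of rows, then fetch all related rows in one shot with an array parameter. A sketch using PostgreSQL's ANY syntax, with the id list collected from the first query in application code:

```sql
-- Query 1: the page of posts
SELECT * FROM posts ORDER BY created_at DESC LIMIT 20;

-- Query 2: all authors at once, instead of one query per post
SELECT id, name, email
FROM users
WHERE id = ANY($1);  -- $1 = array of author_ids gathered from query 1
```

This is essentially what ORM eager loading (Django's `prefetch_related`, Rails' `includes`) does under the hood.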

Migration Script Generation

Describing schema changes in plain English and getting migration scripts is a huge time saver:

-- Generated migration: Add soft delete to orders table
-- with backfill for existing data

BEGIN;

-- Add columns
ALTER TABLE orders
  ADD COLUMN deleted_at TIMESTAMPTZ,
  ADD COLUMN deleted_by UUID REFERENCES users(id);

-- Partial index: exclude deleted rows from common queries
CREATE INDEX idx_orders_active
  ON orders (user_id, created_at DESC)
  WHERE deleted_at IS NULL;

-- Update existing queries via a view (backward compatible)
CREATE OR REPLACE VIEW active_orders AS
  SELECT * FROM orders WHERE deleted_at IS NULL;

COMMIT;
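One thing generated migrations often gloss over: on a large, busy table, a plain CREATE INDEX holds a lock that blocks writes for the duration of the build. The usual fix is CREATE INDEX CONCURRENTLY, which cannot run inside a transaction block, so it has to move outside the BEGIN/COMMIT:

```sql
-- Run outside the transaction; builds without blocking writes
CREATE INDEX CONCURRENTLY idx_orders_active
  ON orders (user_id, created_at DESC)
  WHERE deleted_at IS NULL;

-- If the build fails it leaves an INVALID index behind;
-- drop it and retry:
-- DROP INDEX CONCURRENTLY idx_orders_active;
```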

MySQL-Specific Optimizations

For MySQL projects, the AI catches engine-specific nuances that are easy to miss:

-- MySQL: Use covering indexes differently
-- InnoDB stores PKs in secondary indexes, so:
CREATE INDEX idx_orders_user_status
  ON orders (user_id, status, created_at);
-- In MySQL, this index also implicitly "includes" the
-- primary key (id) since InnoDB appends it.
-- No INCLUDE clause needed (or available) in MySQL.

-- MySQL: Optimize GROUP BY with index ordering
-- This query can use a loose index scan:
SELECT user_id, COUNT(*) as order_count
FROM orders
WHERE created_at > '2026-01-01'
GROUP BY user_id;

-- Ensure the index starts with the GROUP BY column:
CREATE INDEX idx_orders_user_created
  ON orders (user_id, created_at);
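As with PostgreSQL, verify with EXPLAIN. MySQL reports covering and loose index scans in the Extra column:

```sql
EXPLAIN
SELECT user_id, COUNT(*) AS order_count
FROM orders
WHERE created_at > '2026-01-01'
GROUP BY user_id;
-- Extra = "Using index for group-by" means the loose index scan kicked in;
-- Extra = "Using index" is a covering (tight) scan -- still far better
-- than a full table scan.
```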

Performance Measurement

Always measure before and after. I use this simple pattern for tracking optimization results:

-- PostgreSQL: Enable timing and analyze
\timing on

-- Run query with full analysis
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT ...

-- Key metrics to track:
-- Planning Time: How long query planning takes
-- Execution Time: Actual query run time
-- Buffers: shared hit vs shared read (cache effectiveness)
-- Rows: estimated vs actual (if far off, run ANALYZE)
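For finding which queries to optimize in the first place, pg_stat_statements is the standard tool. It requires `shared_preload_libraries = 'pg_stat_statements'` in the server config; column names below are from PostgreSQL 13+:

```sql
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Top queries by total time -- good candidates to paste into the AI
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```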

AI won’t replace understanding your data. But it dramatically accelerates the cycle of designing schemas, identifying bottlenecks, generating optimizations, and producing the migration code to implement them. I’ve gone from spending half a day on database optimization sessions to finishing in under an hour, with better results because the AI catches patterns I’d miss.


Written by

Adrian Saycon

A developer with a passion for emerging technologies, Adrian Saycon focuses on transforming the latest tech trends into great, functional products.
