Adzbyte

Building a Data Importer for WordPress: Batch Processing Without Timeouts

Adrian Saycon
March 7, 2026 · 3 min read

Import 500 posts into WordPress through a single PHP request and you’ll hit the 30-second execution limit before you’re halfway through. I built a batch importer that processes content in chunks via AJAX, tracks progress with checksums, and handles media attachments — here’s the architecture.

The Problem with Bulk Imports

WordPress’s wp_insert_post() does a lot of work per call: sanitizing content, building slugs, updating term relationships, clearing caches, firing hooks. A single post insert can take 100-500ms depending on content size and the number of active plugins. Multiply that by hundreds of posts and you’ll exceed PHP’s max_execution_time well before finishing.

The solution: process in small batches (5-10 items), triggered by sequential AJAX calls from the browser. Each batch runs within the time limit, and the client-side JavaScript orchestrates the sequence.

The Data Format

Content lives in JSON manifests — one per post type:

// posts.json
[
  {
    "slug": "my-first-post",
    "checksum": "a1b2c3d4e5f67890",
    "post_type": "post",
    "title": "My First Post",
    "content": "<p>Post content here...</p>",
    "status": "publish",
    "featured_image": "media/my-first-post/featured.jpg",
    "taxonomy": {
      "category": ["Development"]
    }
  }
]

The checksum field is critical — it’s a hash of the slug that uniquely identifies each entry. Before importing, we check if a post with that checksum already exists. This prevents duplicates on re-import and lets us update existing content.

Server-Side: The AJAX Handler

add_action('wp_ajax_import_batch', 'handle_import_batch');

function handle_import_batch() {
    check_ajax_referer('data_importer_nonce', 'nonce');

    if (!current_user_can('manage_options')) {
        wp_send_json_error('Unauthorized');
    }

    $items = json_decode(stripslashes($_POST['items'] ?? ''), true);

    if (!is_array($items)) {
        wp_send_json_error('Invalid items payload');
    }
    $results = [];

    foreach ($items as $item) {
        $results[] = process_import_item($item);
    }

    wp_send_json_success(['results' => $results]);
}

function process_import_item($item) {
    // Check for existing post by checksum (any status, so drafts match too)
    $existing = get_posts([
        'post_type'   => $item['post_type'],
        'post_status' => 'any',
        'meta_key'    => '_import_checksum',
        'meta_value'  => $item['checksum'],
        'numberposts' => 1,
    ]);

    $post_data = [
        'post_title'   => $item['title'],
        'post_content' => $item['content'],
        'post_status'  => $item['status'],
        'post_type'    => $item['post_type'],
        'post_name'    => $item['slug'],
    ];

    if (!empty($existing)) {
        $post_data['ID'] = $existing[0]->ID;
        // Pass true so failures return a WP_Error instead of 0
        $post_id = wp_update_post($post_data, true);
        $action = 'updated';
    } else {
        $post_id = wp_insert_post($post_data, true);
        $action = 'created';
    }

    if (is_wp_error($post_id)) {
        return ['slug' => $item['slug'], 'status' => 'error', 'message' => $post_id->get_error_message()];
    }

    update_post_meta($post_id, '_import_checksum', $item['checksum']);

    // Handle taxonomy
    if (!empty($item['taxonomy'])) {
        foreach ($item['taxonomy'] as $tax => $terms) {
            wp_set_object_terms($post_id, $terms, $tax);
        }
    }

    return ['slug' => $item['slug'], 'status' => $action, 'post_id' => $post_id];
}
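The JavaScript below expects an importerData object carrying the AJAX URL and nonce. A minimal sketch of how the admin page might provide it — the script handle, file path, and page hook are hypothetical placeholders for your plugin's layout:

```php
// Hypothetical enqueue for the importer admin page. The handle
// 'data-importer', the importer.js path, and the 'tools_page_data-importer'
// hook are placeholders — adjust them to your plugin.
add_action('admin_enqueue_scripts', function ($hook) {
    if ($hook !== 'tools_page_data-importer') {
        return;
    }
    wp_enqueue_script('data-importer', plugins_url('importer.js', __FILE__), [], '1.0', true);
    wp_localize_script('data-importer', 'importerData', [
        'ajaxUrl' => admin_url('admin-ajax.php'),
        'nonce'   => wp_create_nonce('data_importer_nonce'),
    ]);
});
```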

Client-Side: Batch Orchestration

The JavaScript splits the full dataset into chunks and processes them sequentially:

async function importAll(items, batchSize = 5) {
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }

  let completed = 0;

  for (const batch of batches) {
    const formData = new FormData();
    formData.append('action', 'import_batch');
    formData.append('nonce', importerData.nonce);
    formData.append('items', JSON.stringify(batch));

    const response = await fetch(importerData.ajaxUrl, {
      method: 'POST',
      body: formData,
    });

    const result = await response.json();

    if (result.success) {
      completed += batch.length;
      updateProgress(completed, items.length);
      result.data.results.forEach(logResult);
    } else {
      logError(`Batch failed: ${result.data}`);
    }
  }
}

function updateProgress(current, total) {
  const pct = Math.round((current / total) * 100);
  document.getElementById('progress-bar').style.width = pct + '%';
  document.getElementById('progress-text').textContent = `${current} / ${total}`;
}

Handling Media Attachments

Media is the slowest part. Downloading and processing images through media_handle_sideload() can take seconds per file. I process media separately from posts — first import all posts, then run a second pass for media:

function attach_featured_image($post_id, $image_path, $upload_dir) {
    $file_path = $upload_dir . '/' . $image_path;

    if (!file_exists($file_path)) {
        return new WP_Error('missing_file', "File not found: {$image_path}");
    }

    $filetype = wp_check_filetype(basename($file_path));
    $attachment = [
        'post_mime_type' => $filetype['type'],
        'post_title'     => sanitize_file_name(basename($file_path)),
        'post_content'   => '',
        'post_status'    => 'inherit',
    ];

    $attach_id = wp_insert_attachment($attachment, $file_path, $post_id);
    if (!$attach_id || is_wp_error($attach_id)) {
        return new WP_Error('attach_failed', "Could not create attachment for: {$image_path}");
    }
    require_once ABSPATH . 'wp-admin/includes/image.php';
    $metadata = wp_generate_attachment_metadata($attach_id, $file_path);
    wp_update_attachment_metadata($attach_id, $metadata);
    set_post_thumbnail($post_id, $attach_id);

    return $attach_id;
}
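The second pass itself is just another walk over the manifest: look each post up by its checksum and hand it to attach_featured_image(). A sketch (run_media_pass and the $upload_dir argument are illustrative names, not part of the importer above):

```php
// Sketch of the media pass: re-read the manifest, find each imported
// post by its checksum, and attach its featured image. $upload_dir is
// assumed to point at the directory holding the extracted media/ tree.
function run_media_pass(array $items, $upload_dir) {
    foreach ($items as $item) {
        if (empty($item['featured_image'])) {
            continue;
        }
        $posts = get_posts([
            'post_type'   => $item['post_type'],
            'post_status' => 'any',
            'meta_key'    => '_import_checksum',
            'meta_value'  => $item['checksum'],
            'numberposts' => 1,
        ]);
        if ($posts) {
            attach_featured_image($posts[0]->ID, $item['featured_image'], $upload_dir);
        }
    }
}
```

Like the post pass, this can be chunked behind the same AJAX pattern if the media set is large.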

Production Tips

  • Batch size of 5 is the sweet spot — large enough to be efficient, small enough to stay under time limits.
  • Log everything. Store import results so you can audit what was created, updated, or skipped.
  • Make it idempotent. Running the import twice should produce the same result. Checksums make this possible.
  • Handle failures gracefully. If one item fails, the batch should continue with the rest. Return per-item status.
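One way to keep a transient network hiccup from aborting the whole run is a small retry wrapper around each batch request — a sketch (the attempt count and delay are arbitrary defaults):

```javascript
// Retry an async operation a few times with a fixed delay between
// attempts. Wrapping the per-batch fetch in this keeps one flaky
// request from killing the rest of the import.
async function withRetry(fn, attempts = 3, delayMs = 1000) {
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (i < attempts - 1) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```

Inside importAll, the fetch/parse pair for a batch would then run as `await withRetry(() => sendBatch(batch))`, with sendBatch throwing on a non-OK response.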

Written by

Adrian Saycon

A developer with a passion for emerging technologies, Adrian Saycon focuses on transforming the latest tech trends into great, functional products.
