Building a Data Importer for WordPress: Batch Processing Without Timeouts

Import 500 posts into WordPress through a single PHP request and you’ll hit PHP’s default 30-second execution limit before you’re halfway through. I built a batch importer that processes content in chunks via AJAX, tracks progress with checksums, and handles media attachments — here’s the architecture.
The Problem with Bulk Imports
WordPress’s wp_insert_post() does a lot of work per call: sanitizing content, building slugs, updating term relationships, clearing caches, firing hooks. A single post insert can take 100-500ms depending on content size and the number of active plugins. Multiply that by hundreds of posts and you’ll exceed PHP’s max_execution_time well before finishing.
The solution: process in small batches (5-10 items), triggered by sequential AJAX calls from the browser. Each batch runs within the time limit, and the client-side JavaScript orchestrates the sequence.
The Data Format
Content lives in JSON manifests — one per post type:
// posts.json
[
  {
    "slug": "my-first-post",
    "checksum": "a1b2c3d4e5f67890",
    "post_type": "post",
    "title": "My First Post",
    "content": "<p>Post content here...</p>",
    "status": "publish",
    "featured_image": "media/my-first-post/featured.jpg",
    "taxonomy": {
      "category": ["Development"]
    }
  }
]
The checksum field is critical — it’s a hash of the slug that uniquely identifies each entry. Before importing, we check if a post with that checksum already exists. This prevents duplicates on re-import and lets us update existing content.
Server-Side: The AJAX Handler
add_action('wp_ajax_import_batch', 'handle_import_batch');
function handle_import_batch() {
    check_ajax_referer('data_importer_nonce', 'nonce');

    if (!current_user_can('manage_options')) {
        wp_send_json_error('Unauthorized');
    }

    // wp_unslash() reverses WordPress's automatic slashing of request data.
    $items = json_decode(wp_unslash($_POST['items'] ?? ''), true);
    if (!is_array($items)) {
        wp_send_json_error('Invalid payload');
    }

    $results = [];
    foreach ($items as $item) {
        $results[] = process_import_item($item);
    }
    wp_send_json_success(['results' => $results]);
}
function process_import_item($item) {
    // Look up an existing post by checksum. 'post_status' => 'any' matters:
    // get_posts() defaults to published posts only, so drafts would be
    // silently duplicated on re-import without it.
    $existing = get_posts([
        'post_type'   => $item['post_type'],
        'post_status' => 'any',
        'meta_key'    => '_import_checksum',
        'meta_value'  => $item['checksum'],
        'numberposts' => 1,
    ]);

    $post_data = [
        'post_title'   => $item['title'],
        'post_content' => $item['content'],
        'post_status'  => $item['status'],
        'post_type'    => $item['post_type'],
        'post_name'    => $item['slug'],
    ];

    // Pass $wp_error = true so failures return a WP_Error instead of 0,
    // which is_wp_error() below would otherwise miss.
    if (!empty($existing)) {
        $post_data['ID'] = $existing[0]->ID;
        $post_id = wp_update_post($post_data, true);
        $action  = 'updated';
    } else {
        $post_id = wp_insert_post($post_data, true);
        $action  = 'created';
    }

    if (is_wp_error($post_id)) {
        return ['slug' => $item['slug'], 'status' => 'error', 'message' => $post_id->get_error_message()];
    }

    update_post_meta($post_id, '_import_checksum', $item['checksum']);

    // Assign taxonomy terms, creating any that don't exist yet.
    if (!empty($item['taxonomy'])) {
        foreach ($item['taxonomy'] as $tax => $terms) {
            wp_set_object_terms($post_id, $terms, $tax);
        }
    }

    return ['slug' => $item['slug'], 'status' => $action, 'post_id' => $post_id];
}
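On the wire, wp_send_json_success() wraps its argument, so the handler responds with `{ "success": true, "data": { "results": [...] } }`. A small client-side reducer (an illustrative helper, not part of the original importer) can tally the per-item outcomes for an audit summary:

```javascript
// Tally how many items were created, updated, or errored in a batch.
// Each result object carries a `status` field, as returned by
// process_import_item() on the server.
function summarizeResults(results) {
  return results.reduce((tally, r) => {
    tally[r.status] = (tally[r.status] || 0) + 1;
    return tally;
  }, {});
}
```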
Client-Side: Batch Orchestration
The JavaScript splits the full dataset into chunks and processes them sequentially. The importerData object (AJAX URL and nonce) is passed from PHP to the page via wp_localize_script():
async function importAll(items, batchSize = 5) {
    // Split the full dataset into fixed-size chunks.
    const batches = [];
    for (let i = 0; i < items.length; i += batchSize) {
        batches.push(items.slice(i, i + batchSize));
    }

    let completed = 0;
    for (const batch of batches) {
        const formData = new FormData();
        formData.append('action', 'import_batch');
        formData.append('nonce', importerData.nonce);
        formData.append('items', JSON.stringify(batch));

        const response = await fetch(importerData.ajaxUrl, {
            method: 'POST',
            body: formData,
        });
        if (!response.ok) {
            logError(`Batch failed: HTTP ${response.status}`);
            continue;
        }

        const result = await response.json();
        if (result.success) {
            completed += batch.length;
            updateProgress(completed, items.length);
            result.data.results.forEach(logResult);
        } else {
            logError(`Batch failed: ${result.data}`);
        }
    }
}
function updateProgress(current, total) {
    const pct = Math.round((current / total) * 100);
    document.getElementById('progress-bar').style.width = pct + '%';
    document.getElementById('progress-text').textContent = `${current} / ${total}`;
}
Handling Media Attachments
Media is the slowest part. Downloading and processing images through media_handle_sideload() can take seconds per file. I process media separately from posts — first import all posts, then run a second pass for media:
function attach_featured_image($post_id, $image_path, $upload_dir) {
    $file_path = $upload_dir . '/' . $image_path;
    if (!file_exists($file_path)) {
        return new WP_Error('missing_file', "File not found: {$image_path}");
    }

    $filetype   = wp_check_filetype(basename($file_path));
    $attachment = [
        'post_mime_type' => $filetype['type'],
        'post_title'     => sanitize_file_name(basename($file_path)),
        'post_content'   => '',
        'post_status'    => 'inherit',
    ];

    $attach_id = wp_insert_attachment($attachment, $file_path, $post_id);
    if (is_wp_error($attach_id)) {
        return $attach_id;
    }

    // wp_generate_attachment_metadata() lives in an admin include that
    // isn't always loaded, so pull it in explicitly.
    require_once ABSPATH . 'wp-admin/includes/image.php';
    $metadata = wp_generate_attachment_metadata($attach_id, $file_path);
    wp_update_attachment_metadata($attach_id, $metadata);

    set_post_thumbnail($post_id, $attach_id);
    return $attach_id;
}
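The media pass can reuse the same orchestration pattern as importAll: after all posts are imported, collect only the items that declare a featured_image and re-batch them with a smaller batch size, since each image can take seconds server-side. A sketch, with buildMediaBatches as a hypothetical helper name:

```javascript
// Second-pass batching: keep only items that declare a featured image,
// then chunk them. A smaller batch size than the post pass (2 vs. 5)
// keeps each AJAX request under the execution limit.
function buildMediaBatches(items, batchSize = 2) {
  const withMedia = items.filter((item) => item.featured_image);
  const batches = [];
  for (let i = 0; i < withMedia.length; i += batchSize) {
    batches.push(withMedia.slice(i, i + batchSize));
  }
  return batches;
}
```

Each resulting batch would then be posted to a media-specific AJAX action that calls attach_featured_image() per item.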
Production Tips
- Batch size of 5 is the sweet spot — large enough to be efficient, small enough to stay under time limits.
- Log everything. Store import results so you can audit what was created, updated, or skipped.
- Make it idempotent. Running the import twice should produce the same result. Checksums make this possible.
- Handle failures gracefully. If one item fails, the batch should continue with the rest. Return per-item status.
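One way to harden the batch loop against transient failures (a slow request, a momentary 502) is a generic retry wrapper around each request. This sketch is an assumption layered on top of the importer, not part of the original; `fn` stands in for the fetch call inside importAll:

```javascript
// Retry an async operation up to `attempts` times, waiting a little
// longer before each retry (linear backoff). Rethrows the last error
// if every attempt fails.
async function withRetry(fn, attempts = 3, delayMs = 500) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < attempts) {
        await new Promise((resolve) => setTimeout(resolve, delayMs * attempt));
      }
    }
  }
  throw lastError;
}
```

A batch that still fails after all retries can be logged and skipped, so the rest of the import proceeds, consistent with the per-item status approach above.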
Written by
Adrian Saycon
A developer with a passion for emerging technologies, Adrian Saycon focuses on transforming the latest tech trends into great, functional products.


