Chunked File Uploads – Practical Implementation

03 January 2025 / Article / Satrio

In modern web applications, handling large file uploads is a common requirement. Whether you're building a file sharing service, a media platform, or a document management system, you'll likely face challenges with uploading large files. Traditional file upload methods can be problematic, leading to timeouts, memory issues, and poor user experience. In this post, we'll explore how to implement a robust chunk-based file upload system using Go and JavaScript.

The Challenge with Large File Uploads

Traditional file uploads work well for small files, but they break down when dealing with larger files. Some common issues include:

  • Browser timeouts during long uploads
  • Server memory constraints when handling large files
  • No upload progress indication
  • Failed uploads requiring complete restart

Chunk-based uploading solves these problems by breaking large files into smaller, manageable pieces and uploading them sequentially.

Understanding Chunk Upload

The core idea behind chunk-based uploads is simple: instead of sending the entire file at once, we split it into smaller chunks, upload each chunk separately, and reassemble them on the server. This approach offers several benefits:

  • Better memory management (both client and server-side)
  • Reliable upload progress tracking
  • Possibility to pause and resume uploads
  • Better error handling and recovery

Let's see how to implement this in practice.

Implementation Overview

Our implementation consists of two main parts:

  1. A frontend JavaScript application that handles file chunking and upload
  2. A Go backend server that receives chunks and reassembles them

Let's dive into each part.

Frontend Implementation

First, let's look at how we handle file chunking in the browser. We'll set a chunk size of 512KB:

const CHUNK_SIZE = 512 * 1000; // 512KB chunks

function handleFiles() {
  selectedFile = this.files[0];
  let fullChunks = Math.floor(selectedFile.size / CHUNK_SIZE);

  // Update UI with file information
  const fileSizeEl = document.getElementById("file-size");
  fileSizeEl.innerHTML = "File size: " + selectedFile.size;

  const chunkCountEl = document.getElementById("chunk-count");
  chunkCountEl.innerHTML = "Chunk count: " + fullChunks;
}
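
Since handleFiles reads this.files[0], it is meant to run as the change handler of a file input. A minimal sketch of the wiring, where the element id "file-input" and the module-level selectedFile variable are assumptions to adjust to your own markup:

// Assumed wiring: "file-input" is the id of the <input type="file"> element,
// and selectedFile is declared once at module scope.
let selectedFile = null;
document.getElementById("file-input").addEventListener("change", handleFiles);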

For each file upload, we generate a unique identifier using UUID to track all chunks belonging to the same file:

function generateUUID() {
  let d = new Date().getTime();
  let d2 =
    (typeof performance !== "undefined" &&
      performance.now &&
      performance.now() * 1000) ||
    0;
  return "xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx".replace(/[xy]/g, function (c) {
    let r = Math.random() * 16;
    if (d > 0) {
      r = (d + r) % 16 | 0;
      d = Math.floor(d / 16);
    } else {
      r = (d2 + r) % 16 | 0;
      d2 = Math.floor(d2 / 16);
    }
    return (c === "x" ? r : (r & 0x3) | 0x8).toString(16);
  });
}
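
As a possible simplification (not part of the original code), browsers running in a secure context expose the built-in crypto.randomUUID(), which returns a v4 UUID directly. The helper name generateFileId below is illustrative:

// Prefer the built-in UUID generator when available, otherwise fall back to
// the manual implementation above.
function generateFileId() {
  if (typeof crypto !== "undefined" && crypto.randomUUID) {
    return crypto.randomUUID();
  }
  return generateUUID();
}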

The upload process handles both single files and chunked uploads based on file size:

async function handleClickUpload() {
  try {
    const fileId = generateUUID();
    const fileSize = selectedFile.size;
    const fileName = selectedFile.name;
    const fullChunks = Math.floor(fileSize / CHUNK_SIZE);
    const remainedChunk = fileSize % CHUNK_SIZE;
    let uploadCount = 0;

    if (fullChunks > 0) {
      // Handle chunked upload
      for (let i = 0; i < fullChunks; i++) {
        const data = new FormData();
        const offset = CHUNK_SIZE * i;
        const limit = CHUNK_SIZE * (i + 1);
        const metadata = {
          order: i,
          fileId,
          offset,
          limit,
          fileSize,
          fileName,
        };

        const chunkedFile = selectedFile.slice(offset, limit);
        data.append("file", chunkedFile);
        data.append("metadata", JSON.stringify(metadata));

        const res = await fetch(chunkUploadURL, {
          method: "POST",
          body: data,
        });
        if (!res.ok) {
          throw new Error("Response status: " + res.status);
        }
        const json = await res.json();

        const fileUploadedCount = document.getElementById("uploaded-count");
        uploadCount++;
        fileUploadedCount.innerHTML = "File uploaded count: " + uploadCount;
      }

      // Upload the trailing partial chunk, if any
      if (remainedChunk > 0) {
        const data = new FormData();
        const offset = fileSize - remainedChunk;
        const limit = fileSize;
        const metadata = {
          order: fullChunks,
          fileId,
          offset,
          limit,
          fileSize,
          fileName,
        };

        const chunkedFile = selectedFile.slice(offset, limit);
        data.append("file", chunkedFile);
        data.append("metadata", JSON.stringify(metadata));

        const res = await fetch(chunkUploadURL, {
          method: "POST",
          body: data,
        });
        if (!res.ok) {
          throw new Error("Response status: " + res.status);
        }
        const json = await res.json();
      }
    } else {
      // Handle single file upload
      // ...
    }
  } catch (err) {
    console.error("error click upload", err);
  }
}
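
An alternative worth considering (not part of the original code): computing the chunk count with Math.ceil lets a single loop cover the trailing partial chunk, since File.slice() clamps the end offset to the file size. The function name uploadInChunks is illustrative; CHUNK_SIZE and chunkUploadURL are the same constants used above:

// Sketch: one loop over Math.ceil(size / CHUNK_SIZE) chunks, no separate
// remainder branch needed.
async function uploadInChunks(file, fileId) {
  const totalChunks = Math.ceil(file.size / CHUNK_SIZE);
  for (let i = 0; i < totalChunks; i++) {
    const offset = i * CHUNK_SIZE;
    const limit = Math.min(offset + CHUNK_SIZE, file.size);

    const data = new FormData();
    data.append("file", file.slice(offset, limit));
    data.append(
      "metadata",
      JSON.stringify({
        order: i,
        fileId,
        offset,
        limit,
        fileSize: file.size,
        fileName: file.name,
      })
    );

    const res = await fetch(chunkUploadURL, { method: "POST", body: data });
    if (!res.ok) throw new Error("Response status: " + res.status);
  }
}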

Backend Implementation

Our Go backend uses the Gin framework to handle the uploads. Here's the core structure:

type Metadata struct {
	Order    int    `json:"order"`
	FileId   string `json:"fileId"`
	Offset   int    `json:"offset"`
	Limit    int    `json:"limit"`
	FileSize int    `json:"fileSize"`
	FileName string `json:"fileName"`
}
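
A minimal sketch of how this struct might be decoded inside the upload handler. The "metadata" form field name matches the frontend code above; the rest of the handler body is omitted here:

// Inside the Gin handler: read the metadata JSON from the multipart form
// and unmarshal it into the Metadata struct.
var metadata Metadata
if err := json.Unmarshal([]byte(c.PostForm("metadata")), &metadata); err != nil {
	c.String(http.StatusBadRequest, "invalid metadata: %s", err.Error())
	return
}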

The server handles two types of uploads:

  1. Single file uploads for smaller files
  2. Chunk uploads for larger files

For chunked uploads, we store each chunk temporarily and reassemble them when all chunks are received:

r.POST("/split-upload", func(c *gin.Context) {
	// Handle file chunk and metadata
	// ...

	if metadata.FileSize == metadata.Limit {
		// All chunks received, start reassembly
		chunks, err := filepath.Glob(filepath.Join("./uploads/temp", fmt.Sprintf("*_%s", metadata.FileId)))
		if err != nil {
			c.String(http.StatusInternalServerError, "error listing chunks: %s", err.Error())
			return
		}

		// Sort chunks by order (the order number prefixes each chunk filename)
		sort.Slice(chunks, func(i, j int) bool {
			orderI, _ := strconv.Atoi(strings.SplitN(filepath.Base(chunks[i]), "_", 2)[0])
			orderJ, _ := strconv.Atoi(strings.SplitN(filepath.Base(chunks[j]), "_", 2)[0])
			return orderI < orderJ
		})

		// Merge chunks into final file
		finalPath := filepath.Join("./uploads", fmt.Sprintf("merged_%s", metadata.FileName))
		finalFile, err := os.Create(finalPath)
		if err != nil {
			c.String(http.StatusBadRequest, "error merging file: %s", err.Error())
			return
		}
		defer finalFile.Close()

		for _, chunk := range chunks {
			chunkFile, err := os.Open(chunk)
			if err != nil {
				c.String(http.StatusBadRequest, "error open chunk file: %s", err.Error())
				return
			}
			_, err = io.Copy(finalFile, chunkFile)
			chunkFile.Close()
			if err != nil {
				c.String(http.StatusBadRequest, "error merging chunk file: %s", err.Error())
				return
			}
		}

		// Cleanup temporary chunks
		// ...
	}
})
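
The chunk-receiving step is elided above ("Handle file chunk and metadata"). A rough sketch of what it might look like, under the assumption that each chunk is saved as "<order>_<fileId>" in ./uploads/temp, which is the naming the glob pattern and the sort above rely on:

// Read the chunk from the multipart form and persist it to the temp folder,
// prefixed with its order so reassembly can sort the chunks later.
fileHeader, err := c.FormFile("file")
if err != nil {
	c.String(http.StatusBadRequest, "missing file chunk: %s", err.Error())
	return
}
dst := filepath.Join("./uploads/temp", fmt.Sprintf("%d_%s", metadata.Order, metadata.FileId))
if err := c.SaveUploadedFile(fileHeader, dst); err != nil {
	c.String(http.StatusInternalServerError, "error saving chunk: %s", err.Error())
	return
}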

Key Implementation Details

  1. Chunk Size Selection: We chose 512KB as our chunk size. This balances network efficiency with memory usage and provides good upload progress granularity. You can make the chunk size larger or smaller based on your requirements.
  2. Metadata Management: Each chunk upload includes metadata about its position and the original file, allowing proper reassembly.
  3. File Reassembly: The server reassembles chunks in order, using the metadata to ensure correct positioning.
  4. Error Handling: Both frontend and backend include error handling for failed uploads and invalid chunks.

Best Practices and Considerations

When implementing chunk uploads, consider:

  1. Proper Cleanup: Remove temporary chunks after successful reassembly (a sketch follows this list)
  2. Progress Tracking: Implement clear progress indication for users
  3. Error Recovery: Handle network issues and failed uploads gracefully
  4. Validation: Verify chunk integrity and file completeness
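
The cleanup step is left as a comment in the handler above. One way it might look, assuming chunks is the sorted slice of temporary chunk paths from the reassembly code:

// Remove the temporary chunk files once the merged file has been written.
// Failures are only logged, since the merged file already exists.
for _, chunk := range chunks {
	if err := os.Remove(chunk); err != nil {
		log.Printf("failed to remove temp chunk %s: %v", chunk, err)
	}
}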

Future Improvements

This implementation could be enhanced with:

  1. Parallel chunk uploads for faster transfer
  2. Resume capability for interrupted uploads
  3. MD5 checksums for chunk verification (a possible shape is sketched after this list)
  4. Better progress reporting and error handling
  5. Cleanup of abandoned uploads
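
For the checksum idea, one possible shape (an assumption, not part of the current code) is for the client to send an MD5 hash alongside each chunk and for the server to recompute it before accepting the chunk. The verifyChunk helper below is illustrative:

import (
	"crypto/md5"
	"encoding/hex"
	"io"
)

// verifyChunk recomputes the MD5 of the uploaded chunk and compares it to the
// checksum supplied by the client.
func verifyChunk(r io.Reader, expectedMD5 string) (bool, error) {
	h := md5.New()
	if _, err := io.Copy(h, r); err != nil {
		return false, err
	}
	return hex.EncodeToString(h.Sum(nil)) == expectedMD5, nil
}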

Conclusion

Chunk-based file uploading provides a robust solution for handling large files in web applications. While the implementation requires more complexity than traditional uploads, the benefits in reliability and user experience make it worthwhile for applications dealing with large files.

The code demonstrated here provides a foundation that you can build upon based on your specific needs. Consider factors like your expected file sizes, user experience requirements, and server capabilities when adapting this solution.

Remember to properly test the implementation with various file sizes and types, and implement appropriate security measures before deploying to production.

Codebase

You can read the full code in my repository:

GitHub repo: halosatrio/split-upload