Stirling PDF File History Specification
Overview
Stirling PDF implements a client-side file history system using IndexedDB storage. File metadata, including version history and tool chains, is stored as StirlingFileStub objects that travel alongside the actual file data. This enables version tracking, tool history, and file lineage management without modifying PDF content.
Storage Architecture
IndexedDB-Based Storage
File history is stored in the browser's IndexedDB using the fileStorage service, providing:
- Persistent storage: Survives browser sessions and page reloads
- Large capacity: Supports files up to 100GB+ with full metadata (actual limits depend on the browser's storage quota)
- Fast queries: Optimized for file browsing and history lookups
- Type safety: Structured TypeScript interfaces
Core Data Structures
interface StirlingFileStub extends BaseFileMetadata {
  id: FileId; // Unique file identifier (UUID)
  quickKey: string; // Deduplication key: name|size|lastModified
  thumbnailUrl?: string; // Generated thumbnail blob URL
  processedFile?: ProcessedFileMetadata; // PDF page data and processing results
  // File Metadata
  name: string;
  size: number;
  type: string;
  lastModified: number;
  createdAt: number;
  // Version Control
  isLeaf: boolean; // True if this is the latest version
  versionNumber?: number; // Version number (1, 2, 3, etc.)
  originalFileId?: string; // UUID of the root file in version chain
  parentFileId?: string; // UUID of immediate parent file
  // Tool History
  toolHistory?: ToolOperation[]; // Complete sequence of applied tools
}
interface ToolOperation {
  toolName: string; // Tool identifier (e.g., 'compress', 'sanitize')
  timestamp: number; // When the tool was applied
}
interface StoredStirlingFileRecord extends StirlingFileStub {
  data: ArrayBuffer; // Actual file content
  fileId: FileId; // Duplicate for indexing
}
Version Management System
Version Progression
- v1: Original uploaded file (first version)
- v2: First tool applied to original
- v3: Second tool applied (inherits from v2)
- v4: Third tool applied (inherits from v3)
- etc.
Leaf Node System
Only the latest version of each file family is marked as isLeaf: true:
- Leaf files: Show in default file list, available for tool processing
- History files: Hidden by default, accessible via history expansion
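The leaf/history split above can be sketched as two small selectors. This is a minimal illustration, not the project's actual code: StubLite, selectLeafStubs, and selectFamily are hypothetical names, and the sketch assumes a v1 stub carries no originalFileId, so its own id serves as the family key (this matches the fallback used in createChildStub).

```typescript
// Minimal subset of StirlingFileStub needed for this sketch (hypothetical type).
interface StubLite {
  id: string;
  name: string;
  isLeaf: boolean;
  versionNumber?: number;
  originalFileId?: string;
}

// Default view: only the latest version of each file family.
function selectLeafStubs(stubs: StubLite[]): StubLite[] {
  return stubs.filter((stub) => stub.isLeaf);
}

// History expansion: every version that shares a root, oldest first.
// Assumption: a v1 stub has no originalFileId, so its own id is the family key.
function selectFamily(stubs: StubLite[], rootId: string): StubLite[] {
  return stubs
    .filter((stub) => (stub.originalFileId ?? stub.id) === rootId)
    .sort((a, b) => (a.versionNumber ?? 1) - (b.versionNumber ?? 1));
}

const sampleStubs: StubLite[] = [
  { id: "c", name: "document.pdf", isLeaf: true, versionNumber: 3, originalFileId: "a" },
  { id: "a", name: "document.pdf", isLeaf: false, versionNumber: 1 },
  { id: "x", name: "other.pdf", isLeaf: true, versionNumber: 1 },
  { id: "b", name: "document.pdf", isLeaf: false, versionNumber: 2, originalFileId: "a" },
];

const leafIds = selectLeafStubs(sampleStubs).map((stub) => stub.id);
const familyIds = selectFamily(sampleStubs, "a").map((stub) => stub.id);
```

Here leafIds holds the ids shown in the default list, while familyIds holds the full history of document.pdf in version order.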
File Relationships
document.pdf (v1, isLeaf: false)
↓ compress
document.pdf (v2, isLeaf: false)
↓ sanitize
document.pdf (v3, isLeaf: true) ← Current active version
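The chain in this diagram can be recovered from any version by following parentFileId links back to the root. The sketch below assumes stubs are available in an id-keyed lookup; VersionNode and getLineage are hypothetical names introduced for illustration only.

```typescript
// Minimal subset of the stub fields used to walk a version chain (hypothetical type).
interface VersionNode {
  id: string;
  versionNumber?: number;
  parentFileId?: string;
}

// Follow parentFileId links from a given version back to the root.
// Returns the chain oldest-first; the `seen` set guards against cyclic data.
function getLineage(byId: Record<string, VersionNode>, startId: string): VersionNode[] {
  const chain: VersionNode[] = [];
  const seen = new Set<string>();
  let current: VersionNode | undefined = byId[startId];
  while (current !== undefined && !seen.has(current.id)) {
    seen.add(current.id);
    chain.push(current);
    current = current.parentFileId !== undefined ? byId[current.parentFileId] : undefined;
  }
  return chain.reverse();
}

const nodes: Record<string, VersionNode> = {
  a: { id: "a", versionNumber: 1 },
  b: { id: "b", versionNumber: 2, parentFileId: "a" },
  c: { id: "c", versionNumber: 3, parentFileId: "b" },
};

const lineage = getLineage(nodes, "c").map((node) => node.id);
```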
Implementation Architecture
1. FileStorage Service (fileStorage.ts)
Core Methods:
// Store file with complete metadata
async storeStirlingFile(stirlingFile: StirlingFile, stub: StirlingFileStub): Promise<void>
// Load file with metadata
async getStirlingFile(id: FileId): Promise<StirlingFile | null>
async getStirlingFileStub(id: FileId): Promise<StirlingFileStub | null>
// Query operations
async getLeafStirlingFileStubs(): Promise<StirlingFileStub[]>
async getAllStirlingFileStubs(): Promise<StirlingFileStub[]>
// Version management
async markFileAsProcessed(fileId: FileId): Promise<boolean> // Set isLeaf = false
async markFileAsLeaf(fileId: FileId): Promise<boolean> // Set isLeaf = true
2. File Context Integration
FileContext manages runtime state with StirlingFileStub[] in memory:
interface FileContextState {
  files: {
    ids: FileId[];
    byId: Record<FileId, StirlingFileStub>;
  };
}
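To illustrate how the normalized ids/byId shape behaves, here is a small sketch of an immutable update and an ordered read. The helper names (addStubs, selectStubs) and the trimmed-down Stub type are assumptions for this example and are not taken from the FileContext implementation.

```typescript
// Trimmed-down stand-ins for FileId and StirlingFileStub (hypothetical).
interface Stub {
  id: string;
  name: string;
  isLeaf: boolean;
}

interface FileContextState {
  files: {
    ids: string[];
    byId: Record<string, Stub>;
  };
}

// Return a new state with the stubs added; an existing id is updated in place
// (its position in `ids` is kept) rather than appended twice.
function addStubs(state: FileContextState, stubs: Stub[]): FileContextState {
  const ids = [...state.files.ids];
  const byId = { ...state.files.byId };
  for (const stub of stubs) {
    if (!(stub.id in byId)) ids.push(stub.id);
    byId[stub.id] = stub;
  }
  return { files: { ids, byId } };
}

// Read stubs back in display order.
function selectStubs(state: FileContextState): Stub[] {
  return state.files.ids.map((id) => state.files.byId[id]);
}

const emptyState: FileContextState = { files: { ids: [], byId: {} } };
const afterUpload = addStubs(emptyState, [{ id: "a", name: "document.pdf", isLeaf: true }]);
const afterUpdate = addStubs(afterUpload, [
  { id: "a", name: "document.pdf", isLeaf: false },
  { id: "b", name: "document.pdf", isLeaf: true },
]);
```

Keeping ids as an ordered array and byId as a lookup table gives constant-time access by id while preserving list order.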
Key Operations:
- addFiles(): Stores new files with initial metadata
- addStirlingFileStubs(): Loads existing files from storage with preserved metadata
- consumeFiles(): Processes files through tools, creating new versions
3. Tool Operation Integration
Tool Processing Flow:
- Input: User selects files (marked as isLeaf: true)
- Processing: Backend processes files and returns results
- History Creation: New StirlingFileStub created with:
  - Incremented version number
  - Updated tool history
  - Parent file reference
- Storage: Both parent (marked isLeaf: false) and child (marked isLeaf: true) stored
- UI Update: FileContext updated with new file state
Child Stub Creation:
export function createChildStub(
  parentStub: StirlingFileStub,
  operation: { toolName: string; timestamp: number },
  resultingFile: File,
  thumbnail?: string
): StirlingFileStub {
  return {
    id: createFileId(),
    name: resultingFile.name,
    size: resultingFile.size,
    type: resultingFile.type,
    lastModified: resultingFile.lastModified,
    quickKey: createQuickKey(resultingFile),
    createdAt: Date.now(),
    isLeaf: true,
    // Version Control
    versionNumber: (parentStub.versionNumber || 1) + 1,
    originalFileId: parentStub.originalFileId || parentStub.id,
    parentFileId: parentStub.id,
    // Tool History
    toolHistory: [...(parentStub.toolHistory || []), operation],
    thumbnailUrl: thumbnail
  };
}
UI Integration
File Manager History Display
FileManager (FileManager.tsx) provides:
- Default View: Shows only leaf files (isLeaf: true)
- History Expansion: Click to show all versions of a file family
- History Groups: Nested display using FileHistoryGroup.tsx
FileListItem (FileListItem.tsx) displays:
- Version Badges: v1, v2, v3 indicators
- Tool Chain: Complete processing history in tooltips
- History Actions: "Show/Hide History" toggle, "Restore" for history files
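As a rough illustration of how a version badge and a tool-chain tooltip could be derived from a stub, consider the helpers below. Both function names and the "Original file" fallback label are assumptions made for this sketch; the actual FileListItem component may format these differently.

```typescript
interface ToolOperation {
  toolName: string;
  timestamp: number;
}

// Badge text such as "v1" or "v3"; a missing version number is treated as v1.
function formatVersionBadge(versionNumber?: number): string {
  return `v${versionNumber ?? 1}`;
}

// Tooltip text listing the applied tools in order, e.g. "compress → sanitize".
// The fallback label for unprocessed files is a placeholder chosen for this sketch.
function formatToolChain(toolHistory?: ToolOperation[]): string {
  if (!toolHistory || toolHistory.length === 0) return "Original file";
  return toolHistory.map((operation) => operation.toolName).join(" → ");
}

const badge = formatVersionBadge(3);
const tooltip = formatToolChain([
  { toolName: "compress", timestamp: 1 },
  { toolName: "sanitize", timestamp: 2 },
]);
```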
FileManagerContext Integration
File Selection Flow:
// Recent files (from storage)
onRecentFileSelect: (stirlingFileStubs: StirlingFileStub[]) => void
// Calls: actions.addStirlingFileStubs(stirlingFileStubs, options)
// New uploads
onFileUpload: (files: File[]) => void
// Calls: actions.addFiles(files, options)
History Management:
// Toggle history visibility
const { expandedFileIds, onToggleExpansion } = useFileManagerContext();
// Restore history file to current
const handleAddToRecents = (file: StirlingFileStub) => {
  fileStorage.markFileAsLeaf(file.id); // Make this version current
};
Data Flow
New File Upload
1. User uploads files → addFiles()
2. Generate thumbnails and page count
3. Create StirlingFileStub with isLeaf: true, versionNumber: 1
4. Store both StirlingFile + StirlingFileStub in IndexedDB
5. Dispatch to FileContext state
Tool Processing
1. User selects tool + files → useToolOperation()
2. API processes files → returns processed File objects
3. createChildStub() for each result:
- Parent marked isLeaf: false
- Child created with isLeaf: true, incremented version
4. Store all files with updated metadata
5. Update FileContext with new state
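The metadata side of these steps can be sketched as a pure function over an in-memory list of stubs. This is an illustrative simplification: applyToolResult is a hypothetical helper, the child id is passed in explicitly instead of being generated, and the child reuses the parent's name rather than reading it from a processed File object.

```typescript
interface ToolOperation {
  toolName: string;
  timestamp: number;
}

// Subset of StirlingFileStub relevant to versioning (hypothetical type).
interface VersionedStub {
  id: string;
  name: string;
  isLeaf: boolean;
  versionNumber?: number;
  originalFileId?: string;
  parentFileId?: string;
  toolHistory?: ToolOperation[];
}

// Demote the parent to a history entry and append the new leaf version.
function applyToolResult(
  stubs: VersionedStub[],
  parentId: string,
  childId: string,
  operation: ToolOperation
): VersionedStub[] {
  const parent = stubs.find((stub) => stub.id === parentId);
  if (!parent) throw new Error(`Unknown parent file: ${parentId}`);
  const child: VersionedStub = {
    id: childId,
    name: parent.name,
    isLeaf: true,
    versionNumber: (parent.versionNumber || 1) + 1,
    originalFileId: parent.originalFileId || parent.id,
    parentFileId: parent.id,
    toolHistory: [...(parent.toolHistory || []), operation],
  };
  const updated = stubs.map((stub) =>
    stub.id === parentId ? { ...stub, isLeaf: false } : stub
  );
  return [...updated, child];
}

const uploaded: VersionedStub[] = [
  { id: "a", name: "document.pdf", isLeaf: true, versionNumber: 1 },
];
const compressed = applyToolResult(uploaded, "a", "b", { toolName: "compress", timestamp: 1 });
const sanitized = applyToolResult(compressed, "b", "c", { toolName: "sanitize", timestamp: 2 });
```

After both calls, sanitized contains three stubs for document.pdf, and only the last one is a leaf.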
File Loading (Recent Files)
1. User selects from FileManager → onRecentFileSelect()
2. addStirlingFileStubs() with preserved metadata
3. Load actual StirlingFile data from storage
4. Files appear in workbench with complete history intact
Performance Optimizations
Metadata Regeneration
When loading files from storage, missing processedFile data is regenerated:
// In addStirlingFileStubs()
const needsProcessing = !record.processedFile ||
  !record.processedFile.pages ||
  record.processedFile.pages.length === 0;
if (needsProcessing) {
  const result = await generateThumbnailWithMetadata(stirlingFile);
  record.processedFile = createProcessedFile(result.pageCount, result.thumbnail);
}
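The regeneration check can be isolated into a small predicate, which makes the three cases easy to verify in isolation. The type names below are simplified stand-ins for the real record types and exist only for this sketch.

```typescript
// Simplified stand-ins for ProcessedFileMetadata and the stored record (hypothetical).
interface ProcessedFileLite {
  pages?: unknown[];
}

interface RecordLite {
  processedFile?: ProcessedFileLite;
}

// True when processedFile is absent, has no pages array, or has an empty one.
// This mirrors the three-part condition used when loading stubs from storage.
function needsProcessing(record: RecordLite): boolean {
  const pages = record.processedFile?.pages;
  return !pages || pages.length === 0;
}

const missingMetadata = needsProcessing({});
const emptyPages = needsProcessing({ processedFile: { pages: [] } });
const completeMetadata = needsProcessing({ processedFile: { pages: [{}] } });
```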
Memory Management
- Blob URL Tracking: Automatic cleanup of thumbnail URLs
- Lazy Loading: Files loaded from storage only when needed
- LRU Caching: File objects cached in memory with size limits
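As an illustration of the LRU idea, the sketch below bounds a cache by total byte size and evicts the least recently used entries first. It is a generic textbook approach built on Map insertion order, not the cache used by Stirling PDF; the class name and the byte budget are assumptions.

```typescript
// Size-bounded LRU cache. Map preserves insertion order, so the first key
// is always the least recently used entry.
class LruFileCache<V extends { size: number }> {
  private entries = new Map<string, V>();
  private totalBytes = 0;
  private maxBytes: number;

  constructor(maxBytes: number) {
    this.maxBytes = maxBytes;
  }

  has(id: string): boolean {
    return this.entries.has(id);
  }

  get(id: string): V | undefined {
    const value = this.entries.get(id);
    if (value === undefined) return undefined;
    // Re-insert to mark the entry as most recently used.
    this.entries.delete(id);
    this.entries.set(id, value);
    return value;
  }

  set(id: string, value: V): void {
    const existing = this.entries.get(id);
    if (existing !== undefined) {
      this.totalBytes -= existing.size;
      this.entries.delete(id);
    }
    this.entries.set(id, value);
    this.totalBytes += value.size;
    // Evict oldest entries until within budget; the newest entry is always kept.
    while (this.totalBytes > this.maxBytes && this.entries.size > 1) {
      const oldestKey = this.entries.keys().next().value as string;
      const oldest = this.entries.get(oldestKey)!;
      this.entries.delete(oldestKey);
      this.totalBytes -= oldest.size;
    }
  }
}

const cache = new LruFileCache<{ size: number }>(10);
cache.set("a", { size: 4 });
cache.set("b", { size: 4 });
cache.get("a"); // "a" becomes most recently used
cache.set("c", { size: 4 }); // exceeds the budget, so "b" is evicted
```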
File Deduplication
QuickKey System
Files are deduplicated using the quickKey format:
const quickKey = `${file.name}|${file.size}|${file.lastModified}`;
This prevents duplicate uploads while allowing different versions of the same logical file.
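A duplicate check built on this key might look like the following sketch. The createQuickKey body follows the format above; filterNewFiles and the FileLike type are hypothetical names added for illustration, and a plain object stands in for the browser File type.

```typescript
// Stand-in for the fields of the browser File type used by the key.
interface FileLike {
  name: string;
  size: number;
  lastModified: number;
}

function createQuickKey(file: FileLike): string {
  return `${file.name}|${file.size}|${file.lastModified}`;
}

// Drop incoming files whose key is already known, including repeats
// within the same batch. The set of existing keys is not mutated.
function filterNewFiles<T extends FileLike>(incoming: T[], existingKeys: Set<string>): T[] {
  const seen = new Set(existingKeys);
  return incoming.filter((file) => {
    const key = createQuickKey(file);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

const original: FileLike = { name: "document.pdf", size: 1000, lastModified: 1700000000000 };
const compressedCopy: FileLike = { name: "document.pdf", size: 600, lastModified: 1700000500000 };
const known = new Set([createQuickKey(original)]);
const accepted = filterNewFiles([original, compressedCopy, compressedCopy], known);
```

A processed version typically differs from its parent in size or lastModified, so it gets a distinct key even though the name is unchanged.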
Error Handling
Graceful Degradation
- Storage Failures: Files continue to work without persistence
- Metadata Issues: Missing metadata regenerated on demand
- Version Conflicts: Automatic version number resolution
Recovery Scenarios
- Corrupted Storage: Automatic cleanup and re-initialization
- Missing Files: Stubs cleaned up automatically
- Version Mismatches: Automatic version chain reconstruction
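One way version chain reconstruction could work is to recompute each version number from the depth of its parentFileId chain. The sketch below shows that idea only; the specification does not describe the actual recovery algorithm, and rebuildVersionNumbers is a hypothetical helper. If a parent stub is missing, counting starts from the oldest surviving ancestor.

```typescript
// Subset of stub fields needed to rebuild version numbers (hypothetical type).
interface ChainStub {
  id: string;
  parentFileId?: string;
  versionNumber?: number;
}

// Recompute versionNumber as 1 + number of reachable ancestors.
// The `visited` set stops the walk if the stored data contains a cycle.
function rebuildVersionNumbers(stubs: ChainStub[]): ChainStub[] {
  const byId = new Map(stubs.map((stub) => [stub.id, stub] as [string, ChainStub]));
  return stubs.map((stub) => {
    let depth = 1;
    const visited = new Set<string>([stub.id]);
    let parentId = stub.parentFileId;
    while (parentId !== undefined && byId.has(parentId) && !visited.has(parentId)) {
      visited.add(parentId);
      depth += 1;
      parentId = byId.get(parentId)!.parentFileId;
    }
    return { ...stub, versionNumber: depth };
  });
}

const inconsistent: ChainStub[] = [
  { id: "c", parentFileId: "b", versionNumber: 7 },
  { id: "b", parentFileId: "a" },
  { id: "a", versionNumber: 1 },
];
const repaired = rebuildVersionNumbers(inconsistent);
```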
Developer Guidelines
Adding File History to New Components
- Use FileContext Actions:
const { actions } = useFileActions();
await actions.addFiles(files); // For new uploads
await actions.addStirlingFileStubs(stubs); // For existing files
- Preserve Metadata When Processing:
const childStub = createChildStub(parentStub, {
  toolName: 'compress',
  timestamp: Date.now()
}, processedFile, thumbnail);
- Handle Storage Operations:
await fileStorage.storeStirlingFile(stirlingFile, stirlingFileStub);
const stub = await fileStorage.getStirlingFileStub(fileId);
Testing File History
- Upload files: Should show v1, marked as leaf
- Apply tool: Should create v2, mark v1 as non-leaf
- Check FileManager: History should show both versions
- Restore old version: Should mark old version as leaf
- Check storage: Both versions should persist in IndexedDB
Future Enhancements
Potential Improvements
- Branch History: Support for parallel processing branches
- History Export: Export complete version history as JSON
- Conflict Resolution: Handle concurrent modifications
- Cloud Sync: Sync history across devices
- Compression: Compress historical file data
API Extensions
- Batch Operations: Process multiple version chains simultaneously
- Search Integration: Search within tool history and file metadata
- Analytics: Track usage patterns and tool effectiveness
Last Updated: January 2025
Implementation: Stirling PDF Frontend v2
Storage Version: IndexedDB with fileStorage service