Balázs Szücs 65e894870c
refactor(eml-to-pdf): Improve readability, maintainability, and overall standards compliance (#4065)
# Description of Changes
refactor(eml-to-pdf): Enhance compliance with PDF/ISO standards and MIME
specifications

This commit refactors the EML-to-PDF conversion utility to improve
standards compliance, implementing requirements from multiple RFCs and
ISO specifications:

### Standards Compliance Implemented:
• **PDF Standards (ISO 32000-1:2008)**: Added PDF version validation in
`attachFilesToPdf()`
  to ensure 1.7+ compatibility for Unicode file embeddings
• **MIME Processing (RFC 2045/2046)**: Implemented case-insensitive MIME
type handling
in `processPartAdvanced()` with `toLowerCase(Locale.ROOT)` normalization
• **Content Encoding (RFC 2047)**: Enhanced `safeMimeDecode()` with
UTF-8→ISO-8859-1
  charset fallback chains for robust header decoding
• **Content-ID Processing (RFC 2392)**: Added proper Content-ID
stripping with
  `replaceAll("[<>]", "")` for embedded image references
• **Multipart Safety (RFC 2046)** (best practice, not compliance
related): Implemented recursion depth limiting (max 10 levels)
• **processMultipartAdvanced()**, setCatalogViewerPreferences used to
set PageMode.USE_ATTACHMENTS, but PDF spec 12.2 (Viewer Preferences)
requires a /ViewerPreferences dictionary for full control (e.g.,
/DisplayDocTitle). Docs suggested setting additional prefs like
/NonFullScreenPageMode to ensure attachments panel opens reliably across
viewers
• **addAttachmentAnnotationToPage**, annotations are set to
/Invisible=true but must remain interactive. PDF spec 12.5.6.15 (File
Attachment Annotations) requires /F flags to control print/view (e.g.,
NoPrint if not printable).

### Technical Improvements:
• **Coordinate System Handling**: Added rotation-aware coordinate
transformations
  in PDF annotation placement following ISO 32000-1 Section 8.3
• **Charset Fallbacks**: Implemented progressive charset detection with
UTF-8
  primary and ISO-8859-1 fallback in MIME decoding
• **Error Resilience**: Enhanced exception handling with specific error
types and
  proper resource cleanup using try-with-resources patterns
• **HTML5 Compliance**: Updated email HTML generation with proper
DOCTYPE and
  charset declarations for browser compatibility

### Security & Robustness:
• **Input Validation**: Added comprehensive null checks and boundary
validation
  throughout attachment and multipart processing
• **XSS Prevention**: All user content now processed through
`escapeHtml()` or
  `CustomHtmlSanitizer` before HTML generation

### Code Quality:
• **Method Signatures**: Updated `processMultipartAdvanced()` to include
depth
  parameter for recursion tracking
• **Switch Expressions**: Modernized switch statements to use Java 17+
arrow syntax
  where applicable
• **Documentation**: Added inline RFC/ISO references for
compliance-critical sections

All changes maintain backward compatibility while significantly
improving standards
adherence. Tested with various EML formats.

No major change. No change in tests. No change in aesthetic of the
resulting PDF. No change change in "user space" (except when user relied
on compliance of aforementioned stuff then a major improvement)

<!--
Please provide a summary of the changes, including:

- What was changed
- Why the change was made
- Any challenges encountered

Closes #(issue_number)
-->

---

## Checklist

### General

- [x] I have read the [Contribution
Guidelines](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/CONTRIBUTING.md)
- [x] I have read the [Stirling-PDF Developer
Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md)
(if applicable)
- [ ] I have read the [How to add new languages to
Stirling-PDF](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md)
(if applicable)
- [x] I have performed a self-review of my own code
- [ ] My changes generate no new warnings

### Documentation

- [ ] I have updated relevant docs on [Stirling-PDF's doc
repo](https://github.com/Stirling-Tools/Stirling-Tools.github.io/blob/main/docs/)
(if functionality has heavily changed)
- [ ] I have read the section [Add New Translation
Tags](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md#add-new-translation-tags)
(for new translation tags only)

### UI Changes (if applicable)

- [ ] Screenshots or videos demonstrating the UI changes are attached
(e.g., as comments or direct attachments in the PR)

### Testing (if applicable)

- [x] I have tested my changes locally. Refer to the [Testing
Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md#6-testing)
for more details.
2025-08-08 13:14:57 +01:00
..