Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

251 lines
10 KiB
Gherkin
Raw Normal View History

2024-05-27 16:31:00 +01:00
Feature: API Validation
chore(cucumber): add create_pdf_with_black_boxes and convert-pdf-to-image outline; remove duplicate split-pdf-by-sections (#3937) # Description of Changes - **What was changed** - Introduced `create_pdf_with_black_boxes` helper function in `environment.py` for generating test PDFs with occluded content. - Added **Scenario Outline: Convert PDF to image** to `conversion.feature` to validate PDF→image conversion workflows. - Removed the duplicate **Scenario Outline: split-pdf-by-sections with different parameters** from `general.feature`. - **Why the change was made** - To enable testing of blacked-out content scenarios and ensure our suite covers image conversion. - To eliminate redundant tests and keep the feature files DRY and maintainable. --- ## Checklist ### General - [x] I have read the [Contribution Guidelines](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/CONTRIBUTING.md) - [x] I have read the [Stirling-PDF Developer Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md) (if applicable) - [ ] I have read the [How to add new languages to Stirling-PDF](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md) (if applicable) - [x] I have performed a self-review of my own code - [x] My changes generate no new warnings ### Documentation - [ ] I have updated relevant docs on [Stirling-PDF's doc repo](https://github.com/Stirling-Tools/Stirling-Tools.github.io/blob/main/docs/) (if functionality has heavily changed) - [ ] I have read the section [Add New Translation Tags](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md#add-new-translation-tags) (for new translation tags only) ### UI Changes (if applicable) - [ ] Screenshots or videos demonstrating the UI changes are attached (e.g., as comments or direct attachments in the PR) ### Testing (if applicable) - [x] I have tested my changes locally. Refer to the [Testing Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md#6-testing) for more details.
2025-07-14 13:05:17 +02:00
@libre @positive
Scenario: Repair PDF
Given I generate a PDF file as "fileInput"
When I send the API request to the endpoint "/api/v1/misc/repair"
Then the response content type should be "application/pdf"
And the response file should have size greater than 0
And the response status code should be 200
@ocr @positive
Scenario: Process PDF with OCR
Given I generate a PDF file as "fileInput"
And the request data includes
| parameter | value |
| languages | eng |
| sidecar | false |
| deskew | true |
| clean | true |
| cleanFinal | true |
| ocrType | Normal |
| ocrRenderType | hocr |
| removeImagesAfter | false |
When I send the API request to the endpoint "/api/v1/misc/ocr-pdf"
Then the response content type should be "application/pdf"
And the response file should have size greater than 0
And the response status code should be 200
@ocr @positive
Scenario: Extract Image Scans
Given I generate a PDF file as "fileInput"
And the pdf contains 3 images of size 300x300 on 2 pages
And the request data includes
| parameter | value |
| angleThreshold | 5 |
| tolerance | 20 |
| minArea | 8000 |
| minContourArea | 500 |
| borderSize | 1 |
When I send the API request to the endpoint "/api/v1/misc/extract-image-scans"
Then the response content type should be "application/octet-stream"
And the response file should have extension ".zip"
And the response ZIP should contain 2 files
And the response file should have size greater than 0
And the response status code should be 200
@ocr @positive
Scenario: Process PDF with OCR
Given I generate a PDF file as "fileInput"
And the request data includes
| parameter | value |
| languages | eng |
| sidecar | false |
| deskew | true |
| clean | true |
| cleanFinal | true |
| ocrType | Force |
| ocrRenderType | hocr |
| removeImagesAfter | false |
When I send the API request to the endpoint "/api/v1/misc/ocr-pdf"
Then the response content type should be "application/pdf"
And the response file should have size greater than 0
And the response status code should be 200
@libre @positive
Scenario Outline: Convert PDF to various word formats
Given I generate a PDF file as "fileInput"
And the pdf contains 3 pages with random text
And the request data includes
| parameter | value |
| outputFormat | <format> |
When I send the API request to the endpoint "/api/v1/convert/pdf/word"
Then the response status code should be 200
And the response file should have size greater than 100
And the response file should have extension "<extension>"
Examples:
| format | extension |
| docx | .docx |
| odt | .odt |
| doc | .doc |
@ocr @pdfa1
Scenario: PDFA
Given I use an example file at "exampleFiles/pdfa2.pdf" as parameter "fileInput"
And the request data includes
| parameter | value |
| outputFormat | pdfa |
When I send the API request to the endpoint "/api/v1/convert/pdf/pdfa"
Then the response status code should be 200
And the response file should have extension ".pdf"
And the response file should have size greater than 100
@ocr @pdfa2
Scenario: PDFA1
Given I use an example file at "exampleFiles/pdfa1.pdf" as parameter "fileInput"
And the request data includes
| parameter | value |
| outputFormat | pdfa-1 |
When I send the API request to the endpoint "/api/v1/convert/pdf/pdfa"
Then the response status code should be 200
And the response file should have extension ".pdf"
And the response file should have size greater than 100
@compress @qpdf @positive
Scenario: Compress
Given I use an example file at "exampleFiles/ghost3.pdf" as parameter "fileInput"
And the request data includes
| parameter | value |
| optimizeLevel | 4 |
When I send the API request to the endpoint "/api/v1/misc/compress-pdf"
Then the response status code should be 200
And the response file should have extension ".pdf"
And the response file should have size greater than 100
@compress @qpdf @positive
Scenario: Compress
Given I use an example file at "exampleFiles/ghost2.pdf" as parameter "fileInput"
And the request data includes
| parameter | value |
| optimizeLevel | 1 |
| expectedOutputSize | 5KB |
When I send the API request to the endpoint "/api/v1/misc/compress-pdf"
Then the response status code should be 200
And the response file should have extension ".pdf"
And the response file should have size greater than 100
@compress @qpdf @positive
Scenario: Compress
Given I use an example file at "exampleFiles/ghost1.pdf" as parameter "fileInput"
And the request data includes
| parameter | value |
| optimizeLevel | 1 |
| expectedOutputSize | 5KB |
When I send the API request to the endpoint "/api/v1/misc/compress-pdf"
Then the response status code should be 200
And the response file should have extension ".pdf"
And the response file should have size greater than 100
@libre @positive
Scenario Outline: Convert PDF to various types
Given I generate a PDF file as "fileInput"
And the pdf contains 3 pages with random text
And the request data includes
| parameter | value |
| outputFormat | <format> |
When I send the API request to the endpoint "/api/v1/convert/pdf/<type>"
Then the response status code should be 200
And the response file should have size greater than 100
And the response file should have extension "<extension>"
Examples:
| type | format | extension |
| text | rtf | .rtf |
| text | txt | .txt |
| presentation | ppt | .ppt |
| presentation | pptx | .pptx |
| presentation | odp | .odp |
| html | html | .zip |
@image @positive
Scenario Outline: Convert PDF to image
Given I generate a PDF file as "fileInput"
And the pdf contains 3 pages with random text
And the pdf contains 3 images of size 300x300 on 3 pages
And the request data includes
| parameter | value |
| dpi | 300 |
| imageFormat | <format> |
When I send the API request to the endpoint "/api/v1/convert/pdf/img"
Then the response status code should be 200
And the response file should have size greater than 100
And the response file should have extension ".zip"
Examples:
| format |
| webp |
| png |
| jpeg |
| jpg |
| gif |
@libre @positive @topdf
Scenario Outline: Convert PDF to various types
Given I use an example file at "exampleFiles/example<extension>" as parameter "fileInput"
When I send the API request to the endpoint "/api/v1/convert/file/pdf"
Then the response status code should be 200
And the response file should have size greater than 100
And the response file should have extension ".pdf"
Examples:
| extension |
| .docx |
| .odp |
| .odt |
| .pptx |
| .rtf |
@calibre @positive @htmltopdf
Scenario: Convert HTML to PDF
Given I use an example file at "exampleFiles/example.html" as parameter "fileInput"
When I send the API request to the endpoint "/api/v1/convert/html/pdf"
Then the response status code should be 200
And the response file should have size greater than 100
And the response file should have extension ".pdf"
@calibre @positive @zippedhtmltopdf
Scenario: Convert zipped HTML to PDF
Given I use an example file at "exampleFiles/example_html.zip" as parameter "fileInput"
When I send the API request to the endpoint "/api/v1/convert/html/pdf"
Then the response status code should be 200
And the response file should have size greater than 100
And the response file should have extension ".pdf"
@calibre @positive @markdowntopdf
Scenario: Convert Markdown to PDF
Given I use an example file at "exampleFiles/example.md" as parameter "fileInput"
When I send the API request to the endpoint "/api/v1/convert/markdown/pdf"
Then the response status code should be 200
And the response file should have size greater than 100
And the response file should have extension ".pdf"
@markdown @positive
Scenario: Convert PDF to Markdown format
Given I generate a PDF file as "fileInput"
And the pdf contains 3 pages with random text
When I send the API request to the endpoint "/api/v1/convert/pdf/markdown"
Then the response status code should be 200
And the response file should have size greater than 100
And the response file should have extension ".md"
@positive @pdftocsv
Scenario: Convert PDF with tables to CSV format
Given I use an example file at "exampleFiles/tables.pdf" as parameter "fileInput"
And the request data includes
| parameter | value |
| outputFormat | csv |
| pageNumbers | all |
When I send the API request to the endpoint "/api/v1/convert/pdf/csv"
Then the response status code should be 200
And the response file should have size greater than 200
And the response file should have extension ".zip"
And the response ZIP should contain 3 files