84 Commits

Author SHA1 Message Date
Blaž Carli
6ae2fddd48
added option for disabling HTML Sanitize (#2831)
# Description of Changes

Please provide a summary of the changes, including:

- added disableSanitize: false # set to 'true' to disable Sanitize HTML,
set to false to enable Sanitize HTML; (can lead to injections in HTML)
- Some users uses this on local boxes, and uses Google Fonts, and base64
image src.


### General

- [x] I have read the [Contribution
Guidelines](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/CONTRIBUTING.md)
- [x] I have read the [Stirling-PDF Developer
Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/DeveloperGuide.md)
(if applicable)
- [x] I have read the [How to add new languages to
Stirling-PDF](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/HowToAddNewLanguage.md)
(if applicable)
- [x] I have performed a self-review of my own code
- [ ] My changes generate no new warnings

### Documentation

- [x] I have updated relevant docs on [Stirling-PDF's doc
repo](https://github.com/Stirling-Tools/Stirling-Tools.github.io/blob/main/docs/)
(if functionality has heavily changed)
- [ ] I have read the section [Add New Translation
Tags](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/HowToAddNewLanguage.md#add-new-translation-tags)
(for new translation tags only)

### Testing (if applicable)

- [ ] I have tested my changes locally. Refer to the [Testing
Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/DeveloperGuide.md#6-testing)
for more details.

---------

Co-authored-by: blaz.carli <blaz.carli@arctur.si>
Co-authored-by: Anthony Stirling <77850077+Frooodle@users.noreply.github.com>
2025-01-31 23:36:50 +00:00
Sai Kumar
b8303e3860
Pdf to image custom page selection (#2576)
# Description

Implemented custom page selection for the pdf-to-image feature, allowing
users to specify which PDF pages to convert to images.

1. Variable Renaming: Changed singleOrMultiple to imageResultType
because it supports three options: single, multiple, and custom.
2. New Field: Added pageNumbers to accept user-defined page selections.
This field appears only when custom is selected in the UI.
3. New Method: Added getPageIndicesToConvert to process and validate the
specified page numbers.
4. Method Update: Updated convertFromPdf to handle custom page numbers,
ensuring only selected pages are converted.
5. Translation Properties: Added two new English translation properties,
custom and customPageNumber, to all language files with placeholder
values. These will need to be translated into country-specific languages
in the future.

Note: If an invalid page number is provided (zero, negative, or exceeds
page count), a single image containing all PDF pages is generated.

Closes #918 

## Checklist

- [x] I have read the [Contribution
Guidelines](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/CONTRIBUTING.md)
- [x] I have performed a self-review of my own code
- [x] I have attached images of the change if it is UI based
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] If my code has heavily changed functionality I have updated
relevant docs on [Stirling-PDFs doc
repo](https://github.com/Stirling-Tools/Stirling-Tools.github.io/blob/main/docs/)
- [x] My changes generate no new warnings
- [x] I have read the section [Add New Translation
Tags](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/HowToAddNewLanguage.md#add-new-translation-tags)
(for new translation tags only)

![Screenshot 2025-01-02 at 12 31
29 AM](https://github.com/user-attachments/assets/c4ba3f31-5dd6-4a17-991e-51b86c2eb466)
![Screenshot 2025-01-02 at 12 31
49 AM](https://github.com/user-attachments/assets/3e800a95-2088-4f69-8a01-bd03d7b9e471)

---------

Co-authored-by: Sai Kumar J <saikumar@Sais-MacBook-Air.local>
Co-authored-by: Ludy <Ludy87@users.noreply.github.com>
Co-authored-by: saikumar <saikumar.jetti@gmail.com>
Co-authored-by: Anthony Stirling <77850077+Frooodle@users.noreply.github.com>
2025-01-04 18:01:13 +00:00
Anthony Stirling
9884c65b10
formattingand autowired constructors (#2557)
# Description
This pull request includes several changes aimed at improving the code
structure and removing redundant code. The most significant changes
involve reordering methods, removing unnecessary annotations, and
refactoring constructors to use dependency injection.
Autowired now comes via constructor (which also doesn't need autowired
annotation as its done by default for configuration)



## Checklist

- [ ] I have read the [Contribution
Guidelines](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/CONTRIBUTING.md)
- [ ] I have performed a self-review of my own code
- [ ] I have attached images of the change if it is UI based
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] If my code has heavily changed functionality I have updated
relevant docs on [Stirling-PDFs doc
repo](https://github.com/Stirling-Tools/Stirling-Tools.github.io/blob/main/docs/)
- [ ] My changes generate no new warnings
- [ ] I have read the section [Add New Translation
Tags](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/HowToAddNewLanguage.md#add-new-translation-tags)
(for new translation tags only)
2024-12-24 09:52:53 +00:00
reecebrowne
a72615cc86
Merge branch 'main' into bug/2490/2488/image-to-pdf 2024-12-18 10:40:54 +00:00
Reece Browne
9eed761346 Correct default fit 2024-12-18 00:36:04 +00:00
Reece Browne
12d86049f6 Add default to convert image to pdf api 2024-12-18 00:30:06 +00:00
Ludy87
af100d4190
Remove Direct Logger and Use Lombok @Slf4j 2024-12-17 10:26:18 +01:00
Anthony Stirling
0652299bec fixes 2024-12-09 20:40:59 +00:00
Anthony Stirling
833b3c45c6
Removal of Ghostscript to use qpdf and tesseract directly (#2338)
* navbar fix multi tool and compress location

* release notes and ghostscript removal

* cleanups

* formatting

* update docs

* more

* more

* docs

* release bump

* Hardening suggestions for Stirling-PDF / ghostscript (#2339)

* Protect `readLine()` against DoS

* Sanitized user-provided file names in HTTP multipart uploads

---------

Co-authored-by: pixeebot[bot] <104101892+pixeebot[bot]@users.noreply.github.com>

---------

Co-authored-by: pixeebot[bot] <104101892+pixeebot[bot]@users.noreply.github.com>
2024-11-26 20:50:35 +00:00
Omar Ahmed Hassan
afad06bed4
Extract tables from PDF to CSV using Tabula (#2312)
* Add Tabula dependency and exclude slf4j-simple

- Add tabula-java dependency to extract tables into CSV.
- Exclude slf4j-simple due to Logback

* Add a flexible CSVWriter

- Add FlexibleCSVWriter which extends CSVWriter to pass a custom CSVFormat, as CSVWriter's parameterized constructor (that allows changing CSVFormat) is protected.

* Use Tabula in extracting tables from PDF

- Use Tabula in extracting tables from PDF instead of the existing implementation

* Delete PDFTableStripper as It is unneeded

- Delete PDFTableStripper as It is unneeded as Tabula-Java is used instead.

* Use correct class in ExtractCSVController logger

* Exclude gson and bcprov-jdk15on dependencies from tabula

- Exclude gson and bcprov-jdk15on from tabula-java due to detected security vulnerabilities.
2024-11-23 23:28:44 +00:00
pixeebot[bot]
09c9944fc3
Switch order of literals to prevent NullPointerException (#2035)
Co-authored-by: pixeebot[bot] <104101892+pixeebot[bot]@users.noreply.github.com>
2024-10-18 07:15:10 +01:00
Anthony Stirling
c85463bc18
Frooodle/license (#1994) 2024-10-14 22:34:41 +01:00
Eric
b13b925bf0
Fix pdfa conversion (#1907)
* fix: use gs to convert to pdfa and return output by reading file as bytes

* feat: update translation files for pdfToPDFA.credit

* Hardening suggestions for Stirling-PDF / fix_pdfa_conversion (#1908)

Switch order of literals to prevent NullPointerException

Co-authored-by: pixeebot[bot] <104101892+pixeebot[bot]@users.noreply.github.com>

---------

Co-authored-by: pixeebot[bot] <104101892+pixeebot[bot]@users.noreply.github.com>
2024-09-15 18:01:33 +01:00
Anthony Stirling
de4144a1a4
Metadata handling for all PDF endpoints (#1894)
* Add image support to multi-tool page

Related to #278

* changes to support image types

* final touches

* final touches

* final touches

Signed-off-by: a <a>

* final touches

Signed-off-by: a <a>

* final touches

Signed-off-by: a <a>

* final touches

Signed-off-by: a <a>

* final touches

Signed-off-by: a <a>

* final touches

Signed-off-by: a <a>

* final touches

Signed-off-by: a <a>

* Update translation files (#1888)

Signed-off-by: GitHub Action <action@github.com>
Co-authored-by: GitHub Action <action@github.com>

* final touches

Signed-off-by: a <a>

---------

Signed-off-by: a <a>
Signed-off-by: GitHub Action <action@github.com>
Co-authored-by: a <a>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: GitHub Action <action@github.com>
2024-09-14 16:29:39 +01:00
Ludy
1a594b27ab
Fix: introduces the verification of the python installation (#1730)
* Fix: introduces the verification of the python installation

* Update ExtractImageScansController.java

* Update CheckProgramInstall.java
2024-08-21 11:16:29 +01:00
Ludy
58618b3a21
Add: Convert PDF to WebP (#1666)
* Add PDF to WebP

* add swagger param

* back

* creates a custom image for Docker from pymupdf

* Converting with pdf2image and Pillow instead of pymupdf

* webp remove to pdf-to-img

* remove mupdf
2024-08-20 16:17:54 +01:00
Diallo
148feda83f
Bug fix UI crash when url is unrechable (#1642)
* feat: Add URL  reachability check in ConvertWebsiteToPDF

* Add tests for URL reachability in ConvertWebsiteToPdfTest

* test: Update URL in ConvertWebsiteToPdfTest for testing
2024-08-08 20:35:15 +00:00
Ludy87
5799e61385
new workaround for remove digital signature 2024-06-01 21:30:05 +02:00
Anthony Stirling
65f9438639 deletion changes 2024-05-27 17:53:18 +01:00
Anthony Stirling
6ffa80c386 changes 2024-05-27 16:31:00 +01:00
Ludy87
cbb4ccd4b7
add: multi OAuth2 option README.md, small cosmetic repairs 2024-05-25 21:10:12 +02:00
Anthony Stirling
b860146c93 logging for #1024 and jdk bump 2024-05-17 19:18:57 +01:00
Anthony Stirling
890163053b
introduces custom settings file (#1158)
* Introducing a custom settings file

* formats

* chnages

* Update README.md
2024-05-03 20:43:48 +01:00
Anthony Stirling
e7e3b34b37
fix for #1035 (#1137)
* fix for #1035

* Update ConvertImgPDFController.java
2024-04-28 22:37:40 +01:00
Anthony Stirling
71e93e3cb5
print (WIP), fake scan (WIP) and text conversion for ultra-lite (#1098)
* Changes!

* lang

* fake scan init, print init and pdf to text for exe

* Hardening suggestions for Stirling-PDF / changes (#1099)

* Switch order of literals to prevent NullPointerException

* Introduced protections against predictable RNG abuse

---------

Co-authored-by: pixeebot[bot] <104101892+pixeebot[bot]@users.noreply.github.com>

* Update README.md

* install custom fonts

* Formats etc

* version bump

* disable WIP work

* remove chinese font

---------

Co-authored-by: pixeebot[bot] <104101892+pixeebot[bot]@users.noreply.github.com>
Co-authored-by: systo <systo@host.docker.internal>
2024-04-21 23:06:44 +01:00
phfuh
b702f5772d
Add selection for PDF/A output format (#1095)
* Create PdfToPdfARequest.java

* Change class, add output format

* Add input field for output format

* Change output format selection order
2024-04-21 08:44:05 +01:00
Eric
3dbfde534e
fix: missing pdf to html endpoint (#1043)
* fix: missing pdf to html endpoint

* refactor: remove unused variable

---------

Co-authored-by: Anthony Stirling <77850077+Frooodle@users.noreply.github.com>
2024-04-08 21:28:57 +01:00
Eric
dfb8c64f5a
fix: switch to pdftohtml for pdf to html conversions (#998)
* fix: switch to pdftohtml for pdf to html conversions

* build: include poppler-utils in dockerfile for pdftohtml
2024-03-29 17:02:33 -04:00
Anthony Stirling
a9679da719 Revert weasy 2024-03-28 19:38:56 +00:00
pavedroad
ac620082ec
chore: fix some typos (#900)
Signed-off-by: pavedroad <qcqs@outlook.com>
2024-03-12 19:42:15 -04:00
Anthony Stirling
6f72096953 more fixes 2024-02-10 00:21:00 +00:00
Anthony Stirling
5a52e3d6dd other changes 2024-02-10 00:08:54 +00:00
Anthony Stirling
96e399a617 changing html and book labels 2024-02-10 00:00:07 +00:00
Anthony Stirling
22343e507d fixes 2024-02-09 23:45:18 +00:00
Anthony Stirling
15ad46fe1c book htmk 2024-02-09 23:24:25 +00:00
Anthony Stirling
734d76a3b5 test 2024-02-06 00:00:49 +00:00
pixeebot[bot]
95471a2fba Switch order of literals to prevent NullPointerException 2024-02-02 00:29:18 +00:00
pixeebot[bot]
c8dfe10a7c Sanitized user-provided file names in HTTP multipart uploads 2024-02-01 23:48:27 +00:00
Anthony Stirling
be1904749b Add stamp, fix html, change accepts 2024-01-28 17:36:17 +00:00
Anthony Stirling
2fa68be36b pipeline fixes 2024-01-18 21:57:41 +00:00
Anthony Stirling
5281d7a49a pdfbox3 upgrade and fix 2024-01-12 23:15:27 +00:00
TieStone
81d49b722b add table support in md2pdf transport 2024-01-11 14:54:01 +08:00
TieStone
ab9e7bbb8c add table support in md2pdf transport 2024-01-11 14:54:01 +08:00
TieStone
ee223d0405 add table support 2024-01-11 14:54:01 +08:00
Anthony Stirling
ef12c2f892 Add ebook support 2024-01-09 22:39:21 +00:00
Anthony Stirling
6fe268adcb eol 2024-01-03 17:59:04 +00:00
Anthony Stirling
04acdb3b02 Fix for ANY values and settings button enablement 2024-01-01 13:57:22 +00:00
Anthony Stirling
5f771b7851 formatting 2023-12-30 19:11:27 +00:00
Anthony Stirling
cbe4bca716 add banner and remove unused class 2023-12-29 21:46:17 +00:00
Anthony Stirling
03d3235e1d Merge remote-tracking branch 'origin/main' into test 2023-12-25 13:26:13 +00:00