mupdf.git
14 months ago New image redaction option [1.23.x] [1.23.11]
Robin Watts [Wed, 31 Jan 2024 15:24:26 +0000 (15:24 +0000)]
New image redaction option

PDF_REDACT_IMAGE_REMOVE_UNLESS_INVISIBLE is a new option for redaction.
This completely removes images if the image (with clipping applied)
intrudes into the redacted area.

To implement this, the sanitize filter has been updated to keep
track of paths in all circumstances. This, in turn, allows us to
keep a clip rectangle. This is then passed into the image_filter
callback.

14 months ago Add options to redaction for line art removal.
Robin Watts [Wed, 17 Jan 2024 19:17:23 +0000 (19:17 +0000)]
Add options to redaction for line art removal.

14 months ago Bump version for patch release 1.23.11.
Sebastian Rasmussen [Thu, 8 Feb 2024 12:39:12 +0000 (20:39 +0800)]
Bump version for patch release 1.23.11.

14 months ago Fix OSS-Fuzz 66460; assert in fz_dash_lineto(). [1.23.10]
Robin Watts [Mon, 5 Feb 2024 14:38:52 +0000 (14:38 +0000)]
Fix OSS-Fuzz 66460; assert in fz_dash_lineto().

We perform the operation:

 ax += dx * d/dy     (where dx = bx - ax)

to advance ax towards bx. Unfortunately, due to the vagaries of
limited precision floating point it's possible to have both d and dy
being the same sign, with d < dy, and yet ax might "overrun" bx.

We fix this by using a function to do this 'advance' and checking
for overrun.
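
A minimal Python sketch of the "advance and check for overrun" idea described above. The helper name `advance` and its signature are illustrative assumptions, not the actual MuPDF function:

```python
def advance(a, b, d, dy):
    """Move a towards b by the fraction d/dy, clamping so it never passes b."""
    dx = b - a
    moved = a + dx * d / dy
    # Rounding can push `moved` marginally past b; clamp in the direction of travel.
    if dx >= 0:
        return min(moved, b)
    return max(moved, b)
```

The clamp costs one comparison per call and guarantees the invariant the caller relies on: the advanced coordinate always lies between a and b.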

15 months ago Fix infinite loop seen with PyMuPDF.
Robin Watts [Thu, 1 Feb 2024 11:59:46 +0000 (11:59 +0000)]
Fix infinite loop seen with PyMuPDF.

The issue is actually a division by zero in fz_dash_lineto.

Consider the following case, drawn in exquisite ascii art:

      +--------+
      |        |           (i.e. b is exactly inline with
      |        |            the bottom of the rectangle,
      +--------+ +b         but off to the right.)
            a+

Following through the code, we take neither of the first
if/else if clauses as a is horizontally on screen.

We take the next 'if', as a is below the screen, and we move
a up to the point at which 'ab' intersects the bottom of the
screen. i.e. a = b. At this point dx = 0.

We then continue through until we reach the bottom ifs, where
dx = 0, against our expectations.

The fix is much simpler than the above explanation. We just
recognise that when dx == 0 and/or dy == 0, we don't need to
move the points.
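
The guard can be sketched in Python as follows. This is an illustration of the rule above, not the code in fz_dash_lineto; the function name `slide_to_y` and its arguments are hypothetical:

```python
def slide_to_y(ax, ay, bx, by, y):
    """Slide point a along segment ab to height y.

    Degenerate segments (dx == 0 and/or dy == 0) are returned unchanged.
    """
    dx = bx - ax
    dy = by - ay
    if dx == 0 or dy == 0:
        # Nothing to interpolate, and dividing by a zero delta would fail,
        # so leave the point where it is.
        return ax, ay
    return ax + dx * (y - ay) / dy, y
```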

15 months ago Add fallback ToUnicode CMap for very broken fonts.
Tor Andersson [Wed, 10 Jan 2024 13:55:21 +0000 (14:55 +0100)]
Add fallback ToUnicode CMap for very broken fonts.

If a font has not been embedded, but uses an Identity-H CMap and has
no ToUnicode of its own that we can abuse to recover the encoding,
pretend that it uses the standard TrueType MacRoman encoding.

This glyph order more or less matches all of the standard Windows fonts which
are used for this flavor of broken PDF.

15 months ago Bug 707317: Use text string comparisons when searching named destinations.
Tor Andersson [Tue, 14 Nov 2023 20:21:22 +0000 (21:21 +0100)]
Bug 707317: Use text string comparisons when searching named destinations.

This is not strictly per specification which allows using byte by byte
comparisons, but with this change we can find named destinations using
any of the UTF-16BE, UTF-16LE, UTF-8 or PDFDocEncoding encodings.

This lets us handle remote named destinations using non-ASCII characters
more robustly.
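
A rough Python sketch of comparing names as text rather than as bytes. The function names are hypothetical, and Latin-1 is used here only as a stand-in for PDFDocEncoding (an assumption; the two agree on ASCII and most of the upper range, but not everywhere):

```python
import codecs


def decode_text_string(raw: bytes) -> str:
    """Decode a PDF text string by looking at its byte order mark."""
    if raw.startswith(codecs.BOM_UTF16_BE):
        return raw[2:].decode("utf-16-be")
    if raw.startswith(codecs.BOM_UTF16_LE):
        return raw[2:].decode("utf-16-le")
    if raw.startswith(codecs.BOM_UTF8):
        return raw[3:].decode("utf-8")
    # Assumption: Latin-1 approximates PDFDocEncoding for this illustration.
    return raw.decode("latin-1")


def same_destination(a: bytes, b: bytes) -> bool:
    """Compare two destination names after decoding, not byte by byte."""
    return decode_text_string(a) == decode_text_string(b)
```

With this, a name stored as UTF-16BE and the same name stored in a single-byte encoding compare equal, which a byte-by-byte comparison would miss.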

15 months ago Bump version for patch release 1.23.10.
Sebastian Rasmussen [Thu, 1 Feb 2024 13:57:07 +0000 (21:57 +0800)]
Bump version for patch release 1.23.10.

15 months ago Bug 707448 continued: Properly allow for Tz when redacting. [1.23.9]
Robin Watts [Wed, 10 Jan 2024 12:11:10 +0000 (12:11 +0000)]
Bug 707448 continued: Properly allow for Tz when redacting.

The previous fix was wrong. Fixed properly here, I hope.

With this, redacting in all the examples given on the bug
now appears to work.

15 months ago Bug 707448: Fix text moving after redaction.
Robin Watts [Mon, 8 Jan 2024 12:19:44 +0000 (12:19 +0000)]
Bug 707448: Fix text moving after redaction.

PDF text can (broadly) either be placed using:

 (foo) Tj

or

 [ (foo) 10 (bar) ] TJ

We were adjusting for removed text within a string in the wrong
sense when using the former. This was non-obvious, because the
numbers given in the array in the latter are SUBTRACTED rather
than added to the position, so they are implicitly negated.

Here we recast the code slightly so that the adjustments are made
the same way in either method, and we explicitly negate the
values before writing them to the array.
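
To make the sign convention concrete: numbers in a TJ array are expressed in thousandths of a text space unit and are subtracted from the current position, so a rightward shift needs a negative number. The sketch below is an illustration only; `tj_adjustment` is a hypothetical helper, and `h_scale` stands for the horizontal scaling set by Tz divided by 100:

```python
def tj_adjustment(shift, font_size, h_scale=1.0):
    """Number to write into a TJ array to move the next glyph `shift` units right.

    The value is negated because TJ numbers are subtracted from the position.
    """
    return -shift * 1000.0 / (font_size * h_scale)
```

For example, shifting 5 units right at font size 10 requires writing -500 into the array, and halving the horizontal scaling doubles the magnitude.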

15 months ago Bug707440: Add use-cid-for-unknown-unicode option to stext device.
Tor Andersson [Thu, 4 Jan 2024 13:56:41 +0000 (14:56 +0100)]
Bug707440: Add use-cid-for-unknown-unicode option to stext device.

Save raw character codes in the fz_text objects. Use these instead of
U+FFFD when the use-cid-for-unknown-unicode option is set.

With this option set, we now match Ghostscript's behaviour in more
cases.

15 months ago Bump version for patch release 1.23.9.
Sebastian Rasmussen [Mon, 8 Jan 2024 15:55:27 +0000 (16:55 +0100)]
Bump version for patch release 1.23.9.

16 months ago scripts/wrap/cpp.py: removed some extra declarations required by PyMuPDF. [1.23.8]
Julian Smith [Fri, 5 Jan 2024 09:53:22 +0000 (09:53 +0000)]
scripts/wrap/cpp.py: removed some extra declarations required by PyMuPDF.

Not required now because the functions have been made public.

16 months ago Made some previously-internal functions public.
Julian Smith [Fri, 5 Jan 2024 09:43:31 +0000 (09:43 +0000)]
Made some previously-internal functions public.

fz_copy_pixmap_rect()
fz_pixmap_size()
fz_scale_pixmap()
fz_subsample_pixmap()
fz_write_pixmap_as_jpeg()
pdf_lookup_page_loc()

(These are used by PyMuPDF, and need to be public for rebased implementation.)

16 months ago scripts/pipcl.py: avoid problems with python-3.12 and setuptools.
Julian Smith [Wed, 3 Jan 2024 16:43:32 +0000 (16:43 +0000)]
scripts/pipcl.py: avoid problems with python-3.12 and setuptools.

Python-3.12 doesn't seem to support setuptools by default, so we need to import
it lazily. This allows handling of `scripts/pymupdfwrap.py --venv` to install
setuptools into a venv, which will work.
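
A lazy import of this kind can be sketched as below. The helper `lazy_import` is hypothetical and is not taken from pipcl.py; it only shows the pattern of deferring the import until the module is actually needed:

```python
import importlib


def lazy_import(name):
    """Import a module at call time instead of at script start-up.

    A script using this can start (and, for example, set up a venv)
    even when `name` is not installed yet.
    """
    return importlib.import_module(name)
```

A caller would then write `lazy_import("setuptools")` inside the function that needs it, rather than `import setuptools` at the top of the file.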

16 months ago Makefile: fix build failures with library soft-links.
Julian Smith [Wed, 3 Jan 2024 16:42:17 +0000 (16:42 +0000)]
Makefile: fix build failures with library soft-links.

We need to use `ln -f` when creating shared library soft-links, otherwise we
fail if they already exist.

16 months ago Makefile: Use SO_VERSION only on Linux and OpenBSD.
Julian Smith [Mon, 18 Sep 2023 18:21:52 +0000 (19:21 +0100)]
Makefile: Use SO_VERSION only on Linux and OpenBSD.

16 months ago Makefile: add version numbers and installation targets for shared libraries.
Julian Smith [Tue, 5 Sep 2023 07:43:51 +0000 (08:43 +0100)]
Makefile: add version numbers and installation targets for shared libraries.

* Installation targets install-shared-* build+install C/C++/Python/C# bindings.
* On non-MacOS we append .FZ_VERSION_MINOR.FZ_VERSION_PATCH to shared
  libraries.
* On Linux we create links such as libmupdf.so -> libmupdf.so.23.1 (not
  required on OpenBSD).

For install-shared-* targets we require that USE_SYSTEM_LIBS=yes, otherwise we
fail with a diagnostic.

We install the Python files mupdf.py and _mupdf.so into the location given by
Python's sysconfig.get_path('platlib').

In existing calls of ./scripts/mupdfwrap.py:
* Add `--venv` so that we automatically get libclang and swig.
* Use `-d $(OUT)` so we use the right build directory, e.g. if $(build_prefix)
  is set.

$(OUT) is only set correctly (i.e. contains `shared-`) if Make was run with
`shared=yes`. So if $(shared) is not 'yes', rules for shared library targets
that use $(OUT) rerun make with shared=yes.

16 months ago Update make targets such that python bindings build with tesseract.
Sebastian Rasmussen [Tue, 29 Aug 2023 22:40:48 +0000 (00:40 +0200)]
Update make targets such that python bindings build with tesseract.

This takes the build_suffix into consideration when passing a lib
path to the mupdfwrap.py script so that the shared library is found.

16 months ago Bump version for patch release 1.23.8.
Sebastian Rasmussen [Fri, 5 Jan 2024 23:05:22 +0000 (00:05 +0100)]
Bump version for patch release 1.23.8.

17 months ago scripts/pipcl.py: reduce unnecessary rebuilds. [1.23.7]
Julian Smith [Sun, 5 Nov 2023 17:59:18 +0000 (17:59 +0000)]
scripts/pipcl.py: reduce unnecessary rebuilds.

We now use un-evaluated environment variables such as '$CC' so that, for
example, commands are unchanged when Pyodide sets CC to a unique path in `/tmp`.

Also avoid unnecessarily updating .so's on macos.

17 months ago Flag broken struct trees in interpreter.
Tor Andersson [Fri, 10 Nov 2023 12:42:02 +0000 (13:42 +0100)]
Flag broken struct trees in interpreter.

Avoid spamming the same error repeatedly when trying to
sync up the structure tree with marked content and the
structure tree is broken.

17 months ago When structure tree is missing, assume that is broken and continue.
Sebastian Rasmussen [Fri, 3 Nov 2023 14:00:57 +0000 (15:00 +0100)]
When structure tree is missing, assume that is broken and continue.

Previously MuPDF would throw an exception if a PDF had a cycle in
the structure tree, but now we print a warning and continue as if
the structure tree didn't exist.

17 months ago scripts/wrap/cpp.py: added swig-friendly wrapper fz_string_from_text_language2().
Julian Smith [Tue, 21 Nov 2023 12:22:46 +0000 (12:22 +0000)]
scripts/wrap/cpp.py: added swig-friendly wrapper fz_string_from_text_language2().

17 months ago Bug 707323: Fix page-breaks with restarting layout.
Robin Watts [Wed, 15 Nov 2023 18:07:15 +0000 (18:07 +0000)]
Bug 707323: Fix page-breaks with restarting layout.

Page breaks were terminating the layout process in unexpected
ways.

Also fix page-break-after with tables.

17 months ago Bug 707324: Fix stray HTML table rectangles on subsequent pages.
Robin Watts [Wed, 15 Nov 2023 17:14:02 +0000 (17:14 +0000)]
Bug 707324: Fix stray HTML table rectangles on subsequent pages.

When skipping over boxes that have previously been placed on other
pages, ensure we properly set all the table cells to be 0 height
so we don't accidentally draw them on the next page we lay out.

17 months ago Bug 707327: Fix group alpha bug.
Robin Watts [Tue, 14 Nov 2023 19:51:09 +0000 (19:51 +0000)]
Bug 707327: Fix group alpha bug.

When combining group alpha (and shape!), we need to 'union'
the values rather than blend.

Clipping A through M onto B should never result in B having
lower alpha values.
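
As an illustration, assuming 'union' means the usual coverage combination a + b - a*b on values in the range 0..1 (the commit does not spell out the formula, so treat this as an assumption):

```python
def union_alpha(a, b):
    """Combine two coverage values so the result is never below either input."""
    return a + b - a * b
```

Because a - a*b is never negative for values in 0..1, the result is always at least b, which matches the requirement that B's alpha must not decrease.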

17 months ago Bug 707045: Convert some ascii control characters to spaces.
Tor Andersson [Tue, 14 Nov 2023 18:51:09 +0000 (19:51 +0100)]
Bug 707045: Convert some ascii control characters to spaces.

Tabs, newlines, etc.

17 months ago Bump version for patch release 1.23.7.
Sebastian Rasmussen [Fri, 24 Nov 2023 14:38:58 +0000 (15:38 +0100)]
Bump version for patch release 1.23.7.

17 months ago Add interface for rearranging pages in a document. [1.23.6]
Sebastian Rasmussen [Mon, 13 Nov 2023 16:24:18 +0000 (17:24 +0100)]
Add interface for rearranging pages in a document.

Can be used to create subsets or change the order of pages.

This also exposes the same interface over js and JNI.

17 months ago Tweak CSS used for reading text files.
Robin Watts [Wed, 8 Nov 2023 11:56:20 +0000 (11:56 +0000)]
Tweak CSS used for reading text files.

Remove pre and body margins. This gives us consistent results
on every page (as CSS only applies body margins on the first
page).

Also, add pagebreak handling.

17 months ago Avoid dropping xml reference twice upon exception parsing the xml.
Sebastian Rasmussen [Thu, 2 Nov 2023 13:42:02 +0000 (14:42 +0100)]
Avoid dropping xml reference twice upon exception parsing the xml.

17 months ago Add text file document handler.
Robin Watts [Fri, 6 Oct 2023 18:33:23 +0000 (19:33 +0100)]
Add text file document handler.

Loads text, converts to HTML internally.

Contains improvements and simplifications courtesy of Tor!

17 months ago Check in generated box font source.
Tor Andersson [Wed, 25 Oct 2023 14:32:47 +0000 (16:32 +0200)]
Check in generated box font source.

17 months ago Fix MSVC builds for new NimbusBoxes-Regular font.
Robin Watts [Tue, 10 Oct 2023 08:46:58 +0000 (09:46 +0100)]
Fix MSVC builds for new NimbusBoxes-Regular font.

Also remember to drop it once done.

17 months ago Create a new subset font with box drawing characters from NimbusMono.
Tor Andersson [Mon, 9 Oct 2023 18:42:38 +0000 (20:42 +0200)]
Create a new subset font with box drawing characters from NimbusMono.

Use this as a fallback font for the U+2500 and U+2600 unicode blocks.

17 months ago Bump version for patch release 1.23.6.
Sebastian Rasmussen [Wed, 15 Nov 2023 12:27:38 +0000 (13:27 +0100)]
Bump version for patch release 1.23.6.

18 months ago Bump version for patch release 1.23.5. [1.23.5]
Julian Smith [Thu, 2 Nov 2023 16:06:24 +0000 (16:06 +0000)]
Bump version for patch release 1.23.5.

18 months ago Fix fz_reset_story once a story has completed.
Robin Watts [Sat, 4 Nov 2023 10:23:02 +0000 (10:23 +0000)]
Fix fz_reset_story once a story has completed.

Clear the complete flag, or once a story has completed
it will never start again.

18 months ago scripts/wrap/__main__.py: use pipcl.py's support for python build flags.
Julian Smith [Thu, 2 Nov 2023 15:56:37 +0000 (15:56 +0000)]
scripts/wrap/__main__.py: use pipcl.py's support for python build flags.

This fixes finding of python-config on some macos systems.

This commit is a subset of master commit da3c74cd0d3 "scripts/wrap/: fixed
pyodide builds and simplified windows python search.". We haven't cherry-picked
the whole commit because this would require other Pyodide-related changes that
are not suitable for 1.23.x.

18 months ago PyMuPDF issue 2608: CMAP mrange with surrogate chars.
Robin Watts [Wed, 18 Oct 2023 00:23:26 +0000 (17:23 -0700)]
PyMuPDF issue 2608: CMAP mrange with surrogate chars.

The file given in the bug contains a CMAP with a bfrange entry:

<63> <73> <D835DF08>

i.e. a range where the code 'base' is given as a surrogate pair.

We parse this, and because the base definition is longer than
the single 16 bit value, we break the range down into a series
of single char ranges.

We add these single char ranges (using pdf_map_one_to_many)
incrementing the last value in the range by 1 each time.

Unfortunately, pdf_map_one_to_many spots the surrogates, and
rewrites the data it is passed. And the data is then reused
for the next call into pdf_map_one_to_many.

The fix, implemented here, is to make pdf_map_one_to_many take
a local copy of the values before it modifies them.

This solves the problem.
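
The aliasing problem and its fix can be sketched in Python. The function names mirror the ones mentioned above, but the bodies are illustrative assumptions, not the MuPDF implementation:

```python
def combine_surrogates(values):
    """Return a NEW list with UTF-16 surrogate pairs merged into code points."""
    out = []
    i = 0
    while i < len(values):
        v = values[i]
        nxt = values[i + 1] if i + 1 < len(values) else None
        if 0xD800 <= v <= 0xDBFF and nxt is not None and 0xDC00 <= nxt <= 0xDFFF:
            out.append(0x10000 + ((v - 0xD800) << 10) + (nxt - 0xDC00))
            i += 2
        else:
            out.append(v)
            i += 1
    return out


def map_one_to_many(table, code, values):
    # Work on a fresh list so the caller's `values` is left untouched.
    table[code] = combine_surrogates(values)


def map_range(table, lo, hi, base):
    """Expand a bfrange into single-code entries, bumping the last value each time."""
    values = list(base)
    for code in range(lo, hi + 1):
        map_one_to_many(table, code, values)
        values[-1] += 1  # relies on `values` not having been rewritten
```

For the range from the bug report, `map_range(table, 0x63, 0x65, [0xD835, 0xDF08])` maps 0x63 to U+1D708 and each following code to the next code point. If `map_one_to_many` rewrote `values` in place, the increment would be applied to already-merged data.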

18 months ago scripts/pipcl.py: various improvements.
Julian Smith [Thu, 12 Oct 2023 18:13:09 +0000 (19:13 +0100)]
scripts/pipcl.py: various improvements.

Cope if Windows python executable path contains spaces.

Use global log level.
    Instead of passing `verbose` args between fns, we now have a global
    `g_verbose` and fns log0(), log1(), log2(). log1() only outputs if
    g_verbose >= 1 etc.

    Command-line `--verbose` increments `g_verbose`.

    PIPCL_VERBOSE sets initial verbose level.

Fix for wheel creation and multi-line license text.

Allow fn_sdist to specify a different name of file within the sdist.

Fix finding python-config on macos python-3.12 builds.
    Reorder things so when we select the last matching candidate, it is the
    best match.
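
The global log level described above might look roughly like this. It is a sketch, not the code in pipcl.py: the default level of 0 when PIPCL_VERBOSE is unset, the shared `_log` helper, and the boolean return value are assumptions made for illustration:

```python
import os
import sys

# Assumption: level 0 when PIPCL_VERBOSE is not set in the environment.
g_verbose = int(os.environ.get("PIPCL_VERBOSE", "0"))


def _log(level, text):
    """Print text if the global level is at least `level`; report whether it did."""
    if g_verbose >= level:
        print(text, file=sys.stderr)
        return True
    return False


def log0(text):
    return _log(0, text)


def log1(text):
    return _log(1, text)


def log2(text):
    return _log(2, text)
```

Keeping the level in one global avoids threading a `verbose` argument through every function; a command-line `--verbose` handler would simply increment `g_verbose`.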

18 months ago scripts/: minor changes.
Julian Smith [Fri, 6 Oct 2023 13:19:25 +0000 (14:19 +0100)]
scripts/: minor changes.

scripts/jlib.py:build(): show change to command.

scripts/wrap/cpp.py: reinit_singlethreaded(): disable diagnostic as it isn't
necessary.

scripts/wrap/__main__.py: use make -j by default. We now default to `make -j N`
where `N` is the number of cpus, from Python's multiprocessing.cpu_count().
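
A small sketch of how such a default could be computed. The helper `make_command` is hypothetical and only illustrates the use of multiprocessing.cpu_count():

```python
import multiprocessing


def make_command(target, jobs=None):
    """Build a `make -j N` command line, defaulting N to the number of cpus."""
    if jobs is None:
        jobs = multiprocessing.cpu_count()
    return ["make", "-j", str(jobs), target]
```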

18 months ago Fixes for Python wheel creation with new pipcl.py.
Julian Smith [Sat, 23 Sep 2023 09:15:17 +0000 (10:15 +0100)]
Fixes for Python wheel creation with new pipcl.py.

pyproject.toml:
    Added setuptools, required by updated pipcl.py.
setup.py:
    Update to match new pipcl.py and fix version number to pass pipcl's
    improved checking with pep-440.
scripts/pipcl.py:
    Fix handling of multiline license text.

18 months ago scripts/pipcl.py: updated to latest version in PyMuPDF.
Julian Smith [Thu, 21 Sep 2023 20:56:47 +0000 (21:56 +0100)]
scripts/pipcl.py: updated to latest version in PyMuPDF.

We need the latest pipcl.PythonFlags() to allow builds on macos with basic
python installation.

18 months ago scripts/: New/improved C++ wrappers, misc other improvements to wrappers.
Julian Smith [Tue, 5 Sep 2023 07:42:52 +0000 (08:42 +0100)]
scripts/: New/improved C++ wrappers, misc other improvements to wrappers.

Rewrote unsafe fz_search_page2() custom method:
    Our custom wrapper for fz_search_page() was unsafe because it was treating
    `int *hit_mark` as a single int out-param, when in fact it will write to an
    array of `max` ints.

    New wrapper returns a std::vector of a new struct fz_search_page2_hit
    containing quad and mark members, so can be wrapped easily by SWIG.

    Tested by --test-python.

Added custom wrapper fz_highlight_selection2().
    New custom wrapper returns std::vector<fz_quad> so works with SWIG. In
    --test-cpp, added a test of the new wrapper and enabled use of asserts.

Fix Win32 debug builds:
    Exclude locking assert/debug fns from SWIG, as they aren't available in
    release builds (and are not useful to SWIG-generated bindings anyway).

    Also add fz_lock_debug_lock() and fz_lock_debug_unlock() to windows.dev if
    we are doing a debug build, to avoid link errors.

Minor changes:

    scripts/wrap/swig.py: ignore any error from informational call of `which`.

    scripts/wrap/cpp.py: Improve diagnostic if clang not found.

    scripts/wrap/__main__.py: allow --test-csharp-gui to work for any current
    directory.

    scripts/jlib.py: build(): added asserts to track down occasional errors.

    Fixed `scripts/mupdfwrap.py --venv` on openbsd.

    scripts/wrap/swig.py: Improved a C# diagnostic.

    scripts/wrap/cpp.py: add debug diagnostics to Director `use_virtual_*()`
    methods.

    scripts/wdev.py: allow override of VS year etc using environment variables.

    scripts/wrap/cpp.py: possible fix for reported clang problem. Looks like
    clang can use a different placeholder name for anonymous structs.

    Moved C++ code for --test-cpp into a separate file,
    scripts/mupdfwrap_test.cpp.

18 months ago Use CropBox as origin for fitz space in PDF documents.
Tor Andersson [Thu, 19 Oct 2023 16:07:26 +0000 (18:07 +0200)]
Use CropBox as origin for fitz space in PDF documents.

This means the default fitz space coordinates for the default page bounding box
for all document types will have their origin at the top left.

18 months ago Change default to use CropBox rather than MediaBox. [1.23.4]
Tor Andersson [Thu, 28 Sep 2023 14:47:23 +0000 (16:47 +0200)]
Change default to use CropBox rather than MediaBox.

18 months ago Bug 707083: Remember to retain colorspaces when filtering in-line images.
Sebastian Rasmussen [Thu, 5 Oct 2023 04:06:53 +0000 (06:06 +0200)]
Bug 707083: Remember to retain colorspaces when filtering in-line images.

18 months ago Bug 707074: Detect cycles in structure trees.
Sebastian Rasmussen [Wed, 4 Oct 2023 00:50:30 +0000 (02:50 +0200)]
Bug 707074: Detect cycles in structure trees.

Without this detection this code could end up executing forever in case
structure elements were broken and referenced each other in a loop.

Also do away with the O(N^2) search and do an O(N) search at
the expense of deeper recursion and larger stack frames.
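
One common way to detect such loops is to remember the nodes on the current path while recursing. The sketch below uses a plain dict of child lists as a stand-in for structure elements; it illustrates the general technique and is not the MuPDF code:

```python
def find_cycle(children, node, path=None):
    """Return True if following `children` from `node` revisits an ancestor."""
    if path is None:
        path = set()
    if node in path:
        return True  # node is its own ancestor: a loop
    path.add(node)
    try:
        return any(find_cycle(children, kid, path) for kid in children.get(node, ()))
    finally:
        path.discard(node)  # leaving this branch; siblings may share descendants
```

Only ancestors are tracked, so a node reachable through two different parents is not reported as a cycle. The cost is recursion depth proportional to the depth of the tree.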

19 months ago jni: Add API to check if annotation has a Rect property or not.
Sebastian Rasmussen [Thu, 28 Sep 2023 13:44:45 +0000 (15:44 +0200)]
jni: Add API to check if annotation has a Rect property or not.

19 months ago Add fixed padding to Ink annotations.
Tor Andersson [Thu, 14 Sep 2023 14:29:00 +0000 (16:29 +0200)]
Add fixed padding to Ink annotations.

Avoid unselectable bboxes when the stroke is a tiny dot.

19 months ago Bump version for patch release 1.23.4.
Sebastian Rasmussen [Mon, 11 Sep 2023 22:51:33 +0000 (00:51 +0200)]
Bump version for patch release 1.23.4.

19 months ago jni/android: Avoid manipulating device object during a raised exception.
Sebastian Rasmussen [Wed, 6 Sep 2023 21:23:31 +0000 (23:23 +0200)]
jni/android: Avoid manipulating device object during a raised exception.

20 months ago Bump version for patch release 1.23.3. [1.23.3]
Sebastian Rasmussen [Tue, 5 Sep 2023 11:51:19 +0000 (13:51 +0200)]
Bump version for patch release 1.23.3.

20 months ago Bug 706863: Trigger repair if trailer info and xref length are not the same.
Sebastian Rasmussen [Mon, 4 Sep 2023 00:06:24 +0000 (02:06 +0200)]
Bug 706863: Trigger repair if trailer info and xref length are not the same.

The file from this bug has a trailer entry /Size 232, indicating
that objects are numbered 0..231, while the xref starts with
xref\n0 233, indicating that objects are numbered 0..232. The
offset in xref entry for object 231 does point to a viable PDF
object, but the offset in xref entry for object 232 points to the
offset of the xref table itself, i.e. the last object cannot be
parsed successfully!

The pdfref17 states that the /Size value is 1 greater than the
highest object number used in the file and that any object in a
cross-reference section whose number is greater than this value is
ignored and considered missing.

Previously MuPDF kept /Size in the trailer unchanged, indicating
objects 0..231, but sized the in-memory xref to be able to
contain objects 0..232. Loading and viewing the document works
fine since PDF object 232 0 R is not actually referenced
anywhere.

When saving, objects 0..231 were written to the output file
successfully, but once object 232 was being loaded into memory
this failed, causing the entire save operation to fail.

After this commit, by validating the xref size a repair is
triggered, causing both the xref and the trailer /Size key to be
ignored in preference to scanning the file for objects and
rebuilding both the xref table and the trailer.
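
The validation amounts to a simple consistency check, sketched here with a hypothetical helper (the real check lives in MuPDF's xref loading code and may consider more than these two numbers):

```python
def xref_needs_repair(trailer_size, xref_length):
    """Return True when the trailer /Size and the xref length disagree."""
    return trailer_size != xref_length
```

For the file in this bug, the trailer says /Size 232 while the xref holds 233 entries, so the check fires and the repair path rebuilds both.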

20 months ago Bump version for patch release 1.23.2. [1.23.2]
Robin Watts [Thu, 31 Aug 2023 17:47:51 +0000 (18:47 +0100)]
Bump version for patch release 1.23.2.

20 months ago scripts/wrap/: added support for building with tesseract.
Julian Smith [Thu, 31 Aug 2023 14:54:39 +0000 (15:54 +0100)]
scripts/wrap/: added support for building with tesseract.

We build base C library with tesseract if build directory contains
`-tesseract-`.

20 months ago VS: Add DebugPythonTesseract and ReleasePythonTesseract configs
Robin Watts [Thu, 31 Aug 2023 12:11:53 +0000 (13:11 +0100)]
VS: Add DebugPythonTesseract and ReleasePythonTesseract configs

For building pymupdf with tesseract support on windows.

20 months ago Bump version for patch release 1.23.1. [1.23.1]
Tor Andersson [Wed, 30 Aug 2023 12:27:47 +0000 (14:27 +0200)]
Bump version for patch release 1.23.1.

20 months ago Add missing leptonica file causing link error in python bindings.
Julian Smith [Tue, 29 Aug 2023 22:41:54 +0000 (00:41 +0200)]
Add missing leptonica file causing link error in python bindings.

20 months ago Bug 707022: Revert "Draw ink annotations with butt line caps like Acrobat."
Sebastian Rasmussen [Wed, 23 Aug 2023 11:29:34 +0000 (13:29 +0200)]
Bug 707022: Revert "Draw ink annotations with butt line caps like Acrobat."

That commit corrected ink annotation appearances to conform to the
rendering done by older versions of Acrobat Reader.

Newer and current versions of Acrobat Reader render ink annotation
strokes not using butt line caps, but round line caps.

20 months ago Bug 706864: Cache non-decrypted encryption dict/ID after repair.
Sebastian Rasmussen [Wed, 16 Aug 2023 22:38:05 +0000 (00:38 +0200)]
Bug 706864: Cache non-decrypted encryption dict/ID after repair.

When opening a document, the non-decrypted encryption dict and ID are read first,
then the rest of the document. If a repair is triggered it used to throw away
all cached objects to avoid caching non-decrypted strings/streams.
This is the correct decision for all objects except the encryption dict/ID.
If at a later point the PDF document is saved with encryption kept then
all objects will be saved using the encryption key from the non-decrypted
encryption dictionary/ID determined when opening the document, but when
saving the encryption dictionary itself it will not get any special
treatment, thus its owner/user password strings are decrypted. This
decrypted encryption dict and ID are then saved to the output file.
This means that the owner/user password strings no longer correspond
to the ones in the original document. So when mupdf-gl tries to reopen
the output file it will ask for a password, but the original owner/user
passwords will not work.

This is fixed by caching the encryption dictionary and ID in non-decrypted
form after repairing the document and clearing out all the non-decrypted
objects. Doing that leaves the objects in memory until it is time to save
the document, causing the encryption dictionary to be saved in non-decrypted
form, which means that the owner/user password strings are left unchanged,
and thus mupdf-gl will accept the original owner/user passwords when reopening
the file.

For the file in this particular bug the owner password is unknown while the
user password is the default password (i.e. the empty string).

20 months ago End implicit operation also for Popups, Links and Signatures and check boxes.
Sebastian Rasmussen [Thu, 24 Aug 2023 16:46:41 +0000 (18:46 +0200)]
End implicit operation also for Popups, Links and Signatures and check boxes.

Previously the implicit operations for these types of annotations were
not ended, which left a stray implicit operation which would never be
ended.

The same issue was present in the code called when toggling checkbox widgets.

Later on, if an application such as AppKit called pdf_undoredo_state(),
it would throw an exception because an operation is ongoing.

20 months ago Fix FZ_VERBOSE_EXCEPTIONS. [1.23.0]
Tor Andersson [Tue, 22 Aug 2023 13:53:32 +0000 (15:53 +0200)]
Fix FZ_VERBOSE_EXCEPTIONS.

Tweak macros to avoid trailing commas if "format" argument is last.

Fix crash when formatting "rethrow" message in fz_rethrow_if.

20 months ago Ensure variable is initialized.
Tor Andersson [Mon, 21 Aug 2023 10:20:18 +0000 (12:20 +0200)]
Ensure variable is initialized.

Set annot to null. If pdf_create_annot_raw fails, we still want to be able to
safely free it.

20 months ago gl: Avoid double drop of outline/doc when reloading documents.
Sebastian Rasmussen [Thu, 17 Aug 2023 19:29:36 +0000 (21:29 +0200)]
gl: Avoid double drop of outline/doc when reloading documents.

If a document was encrypted with the default password, and the user
saved the document with new encryption/password, the saving of the new
file would work fine, but once the file was being reloaded the outline
was dropped and a new password was requested. Once the password was
entered the file was reloaded again, causing the same outline to be
dropped a second time. This commit avoids this by setting the outline
pointer to NULL when the outline is dropped.

20 months ago Handle exceptions while ensuring incremental/local objects.
Robin Watts [Wed, 16 Aug 2023 11:34:24 +0000 (12:34 +0100)]
Handle exceptions while ensuring incremental/local objects.

20 months ago java: Make sure to include inner classes in jar-file.
Sebastian Rasmussen [Tue, 15 Aug 2023 15:27:22 +0000 (17:27 +0200)]
java: Make sure to include inner classes in jar-file.

Thanks to Thomas Hirsch for pointing this out.

20 months ago Use the correct file drop callback even upon exception.
Sebastian Rasmussen [Mon, 14 Aug 2023 12:44:30 +0000 (14:44 +0200)]
Use the correct file drop callback even upon exception.

20 months ago Handle memory allocation error in pkcs7 code.
Sebastian Rasmussen [Mon, 14 Aug 2023 12:36:03 +0000 (14:36 +0200)]
Handle memory allocation error in pkcs7 code.

20 months ago Avoid printing useless, possibly invalid bitmap signature.
Sebastian Rasmussen [Sun, 6 Aug 2023 22:28:53 +0000 (00:28 +0200)]
Avoid printing useless, possibly invalid bitmap signature.

Fixes oss-fuzz issue 61237

20 months ago Bug 707021: DefaultColorSpaces constructor is not static.
Sebastian Rasmussen [Sun, 13 Aug 2023 12:35:22 +0000 (14:35 +0200)]
Bug 707021: DefaultColorSpaces constructor is not static.

20 months ago Add forgotten exception rethrow.
Sebastian Rasmussen [Fri, 11 Aug 2023 01:48:34 +0000 (03:48 +0200)]
Add forgotten exception rethrow.

20 months ago Do not use freed pdf_sanitize stack entry after freeing it.
Sebastian Rasmussen [Thu, 10 Aug 2023 23:19:38 +0000 (01:19 +0200)]
Do not use freed pdf_sanitize stack entry after freeing it.

20 months ago Fix problem with ImageProb2.pdf and ImageProb2y.pdf.
Robin Watts [Mon, 14 Aug 2023 11:05:27 +0000 (12:05 +0100)]
Fix problem with ImageProb2.pdf and ImageProb2y.pdf.

insert_weight must not be called with the source location being
out of range.

This fixes oss-fuzz issues 60349, 61290 and 61502.

20 months ago Revert the reversion of the fix for Bug 706764
Robin Watts [Mon, 14 Aug 2023 11:03:36 +0000 (12:03 +0100)]
Revert the reversion of the fix for Bug 706764

Bug 706764: OSS-Fuzz 59600: Fix scaled pixmap weight generation.

This reverts commit 9a35f1652ed34ea6aaa7daef36de219b1ed8a3ce.

The next commit will address the issue with ImageProb2.pdf and
ImageProb2y.pdf. Kept separate for clarity.

20 months ago Makerules scripts/wrap/__main__.py: fix cross-building to arm64 on MacOS.
Julian Smith [Thu, 10 Aug 2023 21:08:37 +0000 (22:08 +0100)]
Makerules scripts/wrap/__main__.py: fix cross-building to arm64 on MacOS.

Makerules:
    On MacOS when building for arm64, we need to set HAVE_LIBCRYPTO to
    no, otherwise we try to link arm64 with x86 libcrypto when building
    mutool etc.

scripts/wrap/__main__.py:
    On MacOS we need to include $ARCHFLAGS in our c++ command.

21 months ago Revert "Bug 706764: OSS-Fuzz 59600: Fix scaled pixmap weight generation." [1.23.0-rc1]
Sebastian Rasmussen [Fri, 4 Aug 2023 12:55:30 +0000 (14:55 +0200)]
Revert "Bug 706764: OSS-Fuzz 59600: Fix scaled pixmap weight generation."

This reverts commit 8ad9e2393c695e8d8b850a2b735b4a6d02e05a26.

The bugfix introduced a crash in ImageProb2.pdf and ImageProb2y.pdf
in the cluster. Misbehaving on a real file is worse than misbehaving
on a fuzzed file, so for the time being we back this one out until
we have a proper fix that doesn't cause issues for normal files.

21 months ago mupdf-gl: Fix '-r' resolution option.
Tor Andersson [Fri, 4 Aug 2023 12:55:43 +0000 (14:55 +0200)]
mupdf-gl: Fix '-r' resolution option.

It went "missing" when the -Y UI scale option was added.

21 months ago Update CHANGES.
Tor Andersson [Mon, 31 Jul 2023 12:38:24 +0000 (14:38 +0200)]
Update CHANGES.

21 months ago Documentation: Latest WASM fixes reflected.
Jamie Lemon [Mon, 31 Jul 2023 15:07:59 +0000 (16:07 +0100)]
Documentation: Latest WASM fixes reflected.

21 months ago Fix compilation warning about different return types.
Sebastian Rasmussen [Tue, 1 Aug 2023 12:57:50 +0000 (14:57 +0200)]
Fix compilation warning about different return types.

21 months ago java: Add support for explicit MediaBox/CropBox/ArtBox in Page.
Tor Andersson [Fri, 28 Jul 2023 15:04:26 +0000 (17:04 +0200)]
java: Add support for explicit MediaBox/CropBox/ArtBox in Page.

21 months ago mutool: Separate resolveLink and resolveLinkDestination.
Tor Andersson [Wed, 26 Jul 2023 12:05:40 +0000 (14:05 +0200)]
mutool: Separate resolveLink and resolveLinkDestination.

21 months ago wasm: Outline iterator.
Tor Andersson [Tue, 25 Jul 2023 15:31:06 +0000 (17:31 +0200)]
wasm: Outline iterator.

21 months ago scripts/jlib.py: system(): avoid encoding exceptions on Windows.
Julian Smith [Thu, 27 Jul 2023 15:53:16 +0000 (16:53 +0100)]
scripts/jlib.py: system(): avoid encoding exceptions on Windows.

If writing command output to a stream such as sys.stdout, and this
stream has a non-utf8 encoding such as cp1252 on Windows, we could get
encoding errors because some utf-8 characters cannot be represented in
cp1252.

So instead of system() using its `encoding` arg (default 'utf8') as the
encoding for all outputs, for outputs that have a `.write()` method we
assume there is an `.encoding` value, and use that. Other outputs, such
as plain callables, use the `encoding` arg.

So we now create zero or more incremental decoders, setting up each one
with the `errors` arg so that it will not raise.
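
A minimal sketch of the per-output decoder idea described above. This is
not the actual scripts/jlib.py code: the helper names `make_decoder` and
`feed` are hypothetical, and `errors='replace'` is an assumed default.

```python
import codecs


def make_decoder(out, encoding='utf8', errors='replace'):
    """Return an incremental decoder suited to `out` (hypothetical helper).

    Outputs with a .write() method use their own .encoding, falling back
    to `encoding` if it is missing or None. Plain callables use `encoding`.
    """
    if hasattr(out, 'write'):
        encoding = getattr(out, 'encoding', None) or encoding
    # Passing `errors` here means decode() will not raise on bad bytes.
    return codecs.getincrementaldecoder(encoding)(errors)


def feed(out, decoder, data, final=False):
    """Decode one chunk of bytes and pass the text to `out`."""
    text = decoder.decode(data, final)
    if hasattr(out, 'write'):
        out.write(text)
    else:
        out(text)
```

An incremental decoder buffers partial multi-byte sequences, so command
output can be fed in arbitrary chunks without splitting a character.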

21 months agoscripts/wrap/state.py: fix --test-cpp with debug Windows builds.
Julian Smith [Thu, 27 Jul 2023 21:25:36 +0000 (22:25 +0100)]
scripts/wrap/state.py: fix --test-cpp with debug Windows builds.

When compiling the test code, need to set extra flags to match DLLs,
e.g. `/MDd` activates debug versions of things like std::string, which
have a different size.

21 months agoscripts/wrap/__main__.py: fix windows x32 builds and improve --test-cpp.
Julian Smith [Sun, 23 Jul 2023 21:09:37 +0000 (22:09 +0100)]
scripts/wrap/__main__.py: fix windows x32 builds and improve --test-cpp.

Work around a problem where `pip install libclang` seems to fail to
install in a new 64-bit venv if we are in a 32-bit venv inside a 64-bit
venv created by pip itself.

--test-cpp now also checks mupdf::fz_lookup_metadata2().

21 months agoscripts/wrap/state.py: allow `-d build/shared-debug` to work on Windows.
Julian Smith [Thu, 27 Jul 2023 15:50:52 +0000 (16:50 +0100)]
scripts/wrap/state.py: allow `-d build/shared-debug` to work on Windows.

We automatically append current cpu and python version if not specified,
e.g. to get `build/shared-debug-x64-py3.11`.
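
A rough sketch of how such a suffix could be appended, not the actual
scripts/wrap/state.py logic. The function name `default_build_dir`, the
`x32`/`x64` cpu naming, and the "already specified" regex are all
assumptions.

```python
import re
import sys


def default_build_dir(build_dir, cpu=None, py_version=None):
    """Append cpu and python version to `build_dir` unless already there."""
    if cpu is None:
        # Assumption: name the cpu after the running Python's pointer size.
        cpu = 'x64' if sys.maxsize > 2**32 else 'x32'
    if py_version is None:
        py_version = f'{sys.version_info[0]}.{sys.version_info[1]}'
    # Leave the name alone if it already ends with e.g. '-x64-py3.11'.
    if re.search(r'-x(32|64)-py\d+\.\d+$', build_dir):
        return build_dir
    return f'{build_dir}-{cpu}-py{py_version}'
```

For example, `default_build_dir('build/shared-debug', 'x64', '3.11')`
gives `build/shared-debug-x64-py3.11`.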

21 months agoSupport -b ArtBox argument to mutool convert.
Tor Andersson [Tue, 25 Jul 2023 10:21:40 +0000 (12:21 +0200)]
Support -b ArtBox argument to mutool convert.

21 months agopdf: Initialize chapter field of fz_location when resolving links.
Tor Andersson [Wed, 26 Jul 2023 12:04:49 +0000 (14:04 +0200)]
pdf: Initialize chapter field of fz_location when resolving links.

21 months agoBug 706942: Handle indirect PDF null objects.
Sebastian Rasmussen [Wed, 26 Jul 2023 00:13:43 +0000 (02:13 +0200)]
Bug 706942: Handle indirect PDF null objects.

21 months agoCheck remaining data size before parsing bmp array header.
Sebastian Rasmussen [Sun, 23 Jul 2023 22:21:17 +0000 (00:21 +0200)]
Check remaining data size before parsing bmp array header.

This fixes oss-fuzz 59390.

21 months agoFix image leak in PSD parser.
Sebastian Rasmussen [Sun, 23 Jul 2023 21:45:57 +0000 (23:45 +0200)]
Fix image leak in PSD parser.

This fixes oss-fuzz 59841.

21 months agoSupport use of legacy annotation dash pattern.
Sebastian Rasmussen [Tue, 18 Jul 2023 09:57:32 +0000 (11:57 +0200)]
Support use of legacy annotation dash pattern.

21 months agoplatform/win32/mupdfcpp.vcxproj: fixed Windows debug builds of C++ bindings.
Julian Smith [Wed, 19 Jul 2023 19:51:53 +0000 (20:51 +0100)]
platform/win32/mupdfcpp.vcxproj: fixed Windows debug builds of C++ bindings.

The debug .dll and .lib files previously omitted MuPDF global data.

21 months agodocs/src/language-bindings.rst: improve and simplify build instructions.
Julian Smith [Wed, 19 Jul 2023 17:32:51 +0000 (18:32 +0100)]
docs/src/language-bindings.rst: improve and simplify build instructions.

It looks like pypi.org's swig can generate C# bindings and works
on OpenBSD, so no need for --swig-windows-auto or other special
instructions to use system package manager's swig.

`-b m` is now skipped on Windows (rather than giving an error) so we can
use same command as on Unix when building C++ bindings.

Added examples of setting LD_LIBRARY_PATH or PATH etc when using
bindings.

Clarify that C# on MacOS is not yet supported.

Fix bad highlighting in Windows shell example.

21 months agosetup.py scripts/wrap/: Simplified builds by always using pip's swig.
Julian Smith [Wed, 19 Jul 2023 17:29:42 +0000 (18:29 +0100)]
setup.py scripts/wrap/: Simplified builds by always using pip's swig.

It seems that we can now always use pypi.org's swig.

scripts/wrap/swig.py:

    Assume swig is available, even on windows. We now document that
    builds should be in a venv in which swig has been installed, so
    there's no need to recommend --swig-windows-auto.

scripts/wrap/__main__.py:

    Improve diagnostic if we don't know where mono is.

    Create default mupdf::FzDocument in --test-cpp.

    Avoid confusing behaviour if `--venv` is not first arg. We now show
    an error message if `--venv` is not first arg - it effectively
    discards any prior args.

setup.py:

    Don't use --swig-windows-auto - we now always require swig in venv.
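
The `--venv` ordering check mentioned for scripts/wrap/__main__.py above
could look roughly like the following. This is only a sketch: the
function name `check_venv_first` and the error text are hypothetical,
and the real script may handle its arguments differently.

```python
def check_venv_first(argv):
    """Raise if `--venv` appears anywhere except as the first argument."""
    for i, arg in enumerate(argv):
        if arg == '--venv' and i != 0:
            # Earlier args would effectively be discarded, so fail loudly.
            raise ValueError(
                f'--venv must be the first argument, but is at position {i}'
            )
    return argv
```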

21 months agoBug 706854: Avoid calling fz_advance_glyph for out of range gid.
Robin Watts [Wed, 19 Jul 2023 11:37:11 +0000 (12:37 +0100)]
Bug 706854: Avoid calling fz_advance_glyph for out of range gid.

We use gid == -1 to signify no glyph (for records where we have
multiple unicodes to a single glyph). Avoid calling fz_advance_glyph
in this case.
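
As an illustration only (the real fix is in MuPDF's C code), the guard
amounts to skipping the advance lookup for the "no glyph" marker. Here
`total_advance` and the `advance_glyph` callable are hypothetical
stand-ins, not MuPDF APIs.

```python
def total_advance(gids, advance_glyph):
    """Sum glyph advances, skipping gid == -1 (meaning "no glyph")."""
    total = 0.0
    for gid in gids:
        if gid == -1:
            # No glyph for this record: do not ask for an advance.
            continue
        total += advance_glyph(gid)
    return total
```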