Robin Watts [Tue, 22 Sep 2020 16:34:31 +0000 (17:34 +0100)]
Avoid warning in parse_declaration_list.
By initialising variables differently, we can avoid the "tail might be
used uninitialised" warning.
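The commit does not say how the initialisation was changed. As one possible illustration only, a list builder can keep a pointer to the link it will fill next, so that "tail" is assigned at its declaration and the compiler never sees a path where it is read unset. The names node, build_list and free_list are hypothetical and are not taken from parse_declaration_list.

```c
#include <stdlib.h>

typedef struct node {
    int value;
    struct node *next;
} node;

/* Build a singly linked list from an array, appending at the end. */
static node *build_list(const int *values, int n)
{
    node *head = NULL;
    node **tail = &head; /* initialised up front: the link to fill next */
    int i;

    for (i = 0; i < n; i++) {
        node *item = malloc(sizeof *item);
        if (!item)
            break; /* keep what was built so far */
        item->value = values[i];
        item->next = NULL;
        *tail = item;        /* hook the new node onto the end */
        tail = &item->next;  /* the next append goes after it */
    }
    return head;
}

/* Release every node of a list built by build_list(). */
static void free_list(node *head)
{
    while (head) {
        node *next = head->next;
        free(head);
        head = next;
    }
}
```

Because tail always points at a valid link, no special case is needed for the first element.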
Robin Watts [Tue, 22 Sep 2020 19:42:28 +0000 (20:42 +0100)]
MSVC: Add mujs source files to project.
Exclude them from the build as they are built as part of one.c,
but this way they can be searched from the IDE.
Robin Watts [Tue, 22 Sep 2020 19:04:05 +0000 (12:04 -0700)]
Fix warning from linux java build.
Robin Watts [Tue, 22 Sep 2020 18:49:18 +0000 (19:49 +0100)]
Fix a couple of warnings in Java build.
Robin Watts [Tue, 22 Sep 2020 17:59:30 +0000 (18:59 +0100)]
MSVC Solution: Add missing files to javaviewerlib project.
Makes no difference, except for search/replace.
Robin Watts [Tue, 22 Sep 2020 16:44:30 +0000 (17:44 +0100)]
Add java-clean target to the toplevel makefile.
Also make sure that java-clean actually cleans the java build.
Sebastian Rasmussen [Tue, 22 Sep 2020 15:21:08 +0000 (23:21 +0800)]
java: Enable pedantic compilation warnings.
Had this been enabled before, gcc would have complained about
functions returning void that return the void value returned by
jni_rethrow().
Robin Watts [Tue, 22 Sep 2020 18:40:23 +0000 (19:40 +0100)]
Wrap jni_rethrow() and family in macros.
Commit
c51cc4c3da519ffdf0fa43d680518ab995332e8b started to
return from JNI functions using comma expressions. Unfortunately
this also meant returning the void return value from jni_rethrow()
in JNI functions returning void. Neither gcc nor clang complained
about this (with our normal set of warnings), but MSVC caught it.
Accordingly, we move to using macros for jni_rethrow() and similar
jni_throwing "functions". These wrap up the rethrow and the return
into a single invocation. The values we return here are never
important because the caller should spot that we have raised an
exception before it takes notice of the value we pass back.
Thus, we can have jni_throw_whatever() always end with a return 0,
which will work, whether it's an int, a long, or a pointer expected
as the return value. To cope with throwing in void contexts, however,
we need to also have jni_throw_whatever_void();
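A minimal sketch of the macro approach described above, not the actual MuPDF JNI code. The names demo_rethrow, demo_rethrow_void, demo_rethrow_imp and pending_exception are hypothetical stand-ins, and the real helpers presumably also take a JNIEnv and a context.

```c
#include <stddef.h>

/* Hypothetical flag standing in for a pending Java exception. */
static int pending_exception = 0;

static void demo_rethrow_imp(void)
{
    pending_exception = 1; /* pretend the exception was raised */
}

/* Rethrow and return in a single invocation. The literal 0 converts to
 * int, long or any pointer type, so one macro serves every non-void
 * function. */
#define demo_rethrow() do { demo_rethrow_imp(); return 0; } while (0)

/* void functions cannot return a value, so they need their own variant. */
#define demo_rethrow_void() do { demo_rethrow_imp(); return; } while (0)

static int demo_count(int fail)
{
    if (fail)
        demo_rethrow();
    return 42;
}

static void *demo_lookup(int fail)
{
    static int object;
    if (fail)
        demo_rethrow();
    return &object;
}

static void demo_reset(int fail, int *state)
{
    if (fail)
        demo_rethrow_void();
    *state = 0;
}
```

The returned 0 is a placeholder: the caller is expected to check for the pending exception before looking at the value.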
Sebastian Rasmussen [Tue, 22 Sep 2020 14:07:11 +0000 (22:07 +0800)]
java: Renumber and add missing constants, fixing java build.
Commit
02027e762a8f42d71f82665ebd84af4857e3ff93 accidentally
forgot to update the corresponding java enumerations.
Paul Gardiner [Mon, 21 Sep 2020 11:14:37 +0000 (12:14 +0100)]
Allow extra compile flags specific to C++.
This introduces the XCCXXFLAGS define, which gets included as flags
to all C++ compilations. The current planned use is to add -std=c++11
to the ios build to make tesseract-related code compile.
Tor Andersson [Mon, 14 Sep 2020 14:02:31 +0000 (16:02 +0200)]
Bug 702899: Fix off-by-one error in accelerate_chapter.
Tor Andersson [Thu, 17 Sep 2020 12:10:17 +0000 (14:10 +0200)]
Preserve error message if PDF document fails to open.
fz_drop_document clobbers the previous error message, since it uses
fz_try internally to swallow possible cleanup errors.
Tor Andersson [Thu, 17 Sep 2020 12:32:55 +0000 (14:32 +0200)]
wasm: Always call freeDocument before opening a new file.
If fetch() throws an error, we don't want the old document lying around.
Tor Andersson [Thu, 17 Sep 2020 11:34:11 +0000 (13:34 +0200)]
wasm: Rename rethrow() to wasm_rethrow() for clarity.
Tor Andersson [Thu, 17 Sep 2020 09:18:05 +0000 (11:18 +0200)]
wasm: Remove unused drawPageAsSVG function.
Tor Andersson [Wed, 16 Sep 2020 09:56:34 +0000 (11:56 +0200)]
wasm: Call mupdf.oninit() callback instead of main().
Tor Andersson [Thu, 17 Sep 2020 13:01:44 +0000 (15:01 +0200)]
wasm: Search text in document.
Update search results on visible pages while typing.
Click the next/prev buttons to find the next page with a match.
Tor Andersson [Thu, 10 Sep 2020 13:18:55 +0000 (15:18 +0200)]
wasm: Clean up HTML and CSS layout.
Don't use CSS grid layout.
We need the page list to be part of the document scrolling for history
navigation to work automatically.
Use brighter more cheerful colors.
Tor Andersson [Tue, 11 Aug 2020 12:16:10 +0000 (14:16 +0200)]
wasm: Add error checking and convert mupdf to javascript exceptions.
Tor Andersson [Mon, 10 Aug 2020 15:40:56 +0000 (17:40 +0200)]
wasm: Use ArrayBuffer and Blob to set image data faster.
ArrayBuffers can transfer between Worker and main thread without copying,
and avoiding the BASE64 encoding and stringification step means we can
more or less get the raw PNG data directly to the browser engine with as
few intermediate steps as possible.
Tor Andersson [Thu, 10 Sep 2020 13:06:03 +0000 (15:06 +0200)]
wasm: Use structured text to create selectable invisible text layer.
Tor Andersson [Fri, 7 Aug 2020 16:10:26 +0000 (18:10 +0200)]
wasm: Use JSON to return page links.
Avoid returning hard-wired HTML strings from the C code.
Generate the HTML from the JSON in the viewer code for more flexibility.
Tor Andersson [Thu, 10 Sep 2020 13:30:55 +0000 (15:30 +0200)]
wasm: Add async mupdf library and improve viewer demo.
Tor Andersson [Thu, 6 Aug 2020 14:22:47 +0000 (16:22 +0200)]
wasm: Remove obsolete demos and update Makefile.
Tor Andersson [Thu, 10 Sep 2020 13:05:20 +0000 (15:05 +0200)]
stext: Add JSON stext output format and add preserve-spans option.
Robin Watts [Mon, 14 Sep 2020 11:03:44 +0000 (12:03 +0100)]
MSVC: Hide harfbuzz warnings in all configurations.
Robin Watts [Fri, 11 Sep 2020 11:38:25 +0000 (12:38 +0100)]
mupdf-gl: Fix warnings in MSVC.
Robin Watts [Fri, 11 Sep 2020 10:53:07 +0000 (11:53 +0100)]
Add fz_warp_pixmap function.
Robin Watts [Fri, 11 Sep 2020 13:24:49 +0000 (14:24 +0100)]
MSVC Solution tweaks.
Avoid "/Gm is deprecated" warnings.
Fix Memento x64 builds.
Robin Watts [Fri, 11 Sep 2020 11:41:29 +0000 (12:41 +0100)]
Various size_t fixes for MSVC.
Tor Andersson [Tue, 11 Aug 2020 15:35:25 +0000 (17:35 +0200)]
Update CMap resources.
Tor Andersson [Fri, 4 Sep 2020 10:37:51 +0000 (12:37 +0200)]
Add docs/ecosystem.html and diagram.
Tor Andersson [Thu, 3 Sep 2020 11:56:18 +0000 (13:56 +0200)]
Add "redact" mode side panel that opens with shift-R.
Tor Andersson [Fri, 24 Apr 2020 13:16:07 +0000 (15:16 +0200)]
html: Insert <br> as flow nodes.
This fixes several issues with <br> tags interacting
badly with css margins and selectors due to how we were
splitting blocks in order to insert the new-lines from <br>.
This fixes bugs 698351 and 702856.
Julian Smith [Mon, 31 Aug 2020 15:29:34 +0000 (16:29 +0100)]
Moved test code from scripts/mutool.py into scripts/mupdfwrap.py.
Tor Andersson [Fri, 4 Sep 2020 17:13:05 +0000 (19:13 +0200)]
Remove unused submodule from gumbo-parser submodule.
Robin Watts [Thu, 3 Sep 2020 13:22:46 +0000 (14:22 +0100)]
Bug 702537: Avoid use-after-realloc in pattern handling.
The gstates are held in an array. This array can be realloced
while processing nested patterns, invalidating the current pointer.
Avoid this by reloading the pointer each time around the loop.
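The hazard and the fix can be sketched as follows. This is an illustration under assumptions, not the MuPDF pattern code: gstate, processor, init_processor, push_gstate and process_nested are hypothetical names.

```c
#include <stdlib.h>

typedef struct {
    int depth;
} gstate;

typedef struct {
    gstate *gstates; /* growable array; realloc may move it */
    int len, cap;
} processor;

static int init_processor(processor *pr)
{
    pr->gstates = calloc(1, sizeof *pr->gstates);
    pr->len = 1;
    pr->cap = 1;
    return pr->gstates ? 0 : -1;
}

/* Push a copy of the top state; the array may be realloced here. */
static int push_gstate(processor *pr)
{
    if (pr->len == pr->cap) {
        int new_cap = pr->cap * 2;
        gstate *p = realloc(pr->gstates, (size_t)new_cap * sizeof *p);
        if (!p)
            return -1;
        pr->gstates = p;
        pr->cap = new_cap;
    }
    pr->gstates[pr->len] = pr->gstates[pr->len - 1];
    pr->len++;
    return 0;
}

/* Enter n nested levels; returns the final depth, or -1 on failure. */
static int process_nested(processor *pr, int n)
{
    int i;
    for (i = 0; i < n; i++) {
        gstate *gs;
        if (push_gstate(pr) < 0)
            return -1;
        /* Reload on every iteration: a pointer cached before the loop
         * could dangle once the array has been moved by realloc. */
        gs = &pr->gstates[pr->len - 1];
        gs->depth++;
    }
    return pr->gstates[pr->len - 1].depth;
}
```

Indexing from the array base each time costs one extra load per iteration, in exchange for never holding a pointer across a call that may grow the array.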
Robin Watts [Wed, 2 Sep 2020 18:46:27 +0000 (19:46 +0100)]
Bug 702527: Cope with nonseparable nonisolated blends of alpha only groups.
Robin Watts [Wed, 2 Sep 2020 18:16:17 +0000 (19:16 +0100)]
Bug 702524: Fix overread in nonseparable gray blending.
Don't read bp[n] if there is no alpha.
Robin Watts [Wed, 2 Sep 2020 16:29:09 +0000 (17:29 +0100)]
Bug 701180: Fix memory overwrite.
Templated function was overrunning buffer in the "no destination
alpha" case by stepping too many bytes.
Robin Watts [Tue, 1 Sep 2020 15:23:46 +0000 (16:23 +0100)]
mupdf-gl: Add "Mark for redaction" button.
Add a button to allow all the current search hits to be added
as redaction annotations.
Tor Andersson [Wed, 2 Sep 2020 10:55:27 +0000 (12:55 +0200)]
mupdf-gl: Add ui_*_aux functions.
These behave the same as ui_* functions, but allow for ui
elements to be greyed out.
Robin Watts [Tue, 1 Sep 2020 13:07:36 +0000 (14:07 +0100)]
mupdf-gl: Add 'Redact all Pages' button.
Robin Watts [Wed, 2 Sep 2020 09:01:33 +0000 (10:01 +0100)]
Update ui_slow_operation to give better feedback.
Ensure that we show the UI state before processing pages; otherwise
we end up showing page 1 when we are processing page 2 etc.
Robin Watts [Tue, 1 Sep 2020 11:47:40 +0000 (12:47 +0100)]
MuPDF-gl: Add 'High Security' saving option.
Paul Gardiner [Thu, 13 Aug 2020 09:32:31 +0000 (10:32 +0100)]
Add function to create signature form field.
Also update pdf_create_annotation_raw to place widget annotations
in the correct linked list.
Robin Watts [Tue, 1 Sep 2020 11:46:12 +0000 (12:46 +0100)]
Ensure ocr_init defaults to 'eng' if no language string passed.
api->Init will default NULL to 'eng', but not '' to 'eng', so fix
that here.
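The normalisation described above amounts to treating an empty string the same as NULL. A small sketch, where effective_ocr_language is a hypothetical helper name and not the function used in ocr_init:

```c
#include <stddef.h>

/* Return the language to use: fall back to "eng" for NULL or "". */
static const char *effective_ocr_language(const char *language)
{
    if (language == NULL || language[0] == '\0')
        return "eng";
    return language;
}
```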
Robin Watts [Tue, 1 Sep 2020 11:45:08 +0000 (12:45 +0100)]
VS solution: Build Tesseract variant of mupdf-gl
In addition, a few configurations were incorrectly failing to build
leptonica.
Robin Watts [Tue, 1 Sep 2020 11:37:11 +0000 (12:37 +0100)]
Add fz_write_document.
Add convenience function to allow an entire document to be fed to
a document writer.
Also, add some missing prototypes for the pdfocr writer.
Tor Andersson [Fri, 28 Aug 2020 09:25:29 +0000 (11:25 +0200)]
xmltext: Format floating point numbers with variable precision.
Tor Andersson [Fri, 28 Aug 2020 09:23:38 +0000 (11:23 +0200)]
xmltext: Simplify device implementation.
Let no-op functions in the device be NULL instead of adding stubs that do
nothing.
Tor Andersson [Mon, 24 Aug 2020 11:22:16 +0000 (13:22 +0200)]
Bug 702769: Use PDF 1.7 version number when creating new PDF files.
Tor Andersson [Mon, 24 Aug 2020 11:15:14 +0000 (13:15 +0200)]
Bug 702768: Add "pdf-extract-rich-media.js" example script.
Tor Andersson [Mon, 24 Aug 2020 11:04:01 +0000 (13:04 +0200)]
Fix "no string" test in pdf_to_date.
Tor Andersson [Mon, 24 Aug 2020 11:03:34 +0000 (13:03 +0200)]
Add RichMedia and Projection annotation types to enum.
Mathieu Malaterre [Thu, 20 Aug 2020 08:33:26 +0000 (10:33 +0200)]
mutool show: Remove short syntax from man page.
The abbreviated syntax was removed in the commit "Add selector
syntax to 'mutool show'", but the man page was not updated to
reflect this.
Mathieu Malaterre [Thu, 20 Aug 2020 08:46:43 +0000 (10:46 +0200)]
Add missing trailer/pages/form to mutool show help
Signed-off-by: Mathieu Malaterre
Tor Andersson [Mon, 24 Aug 2020 12:29:30 +0000 (14:29 +0200)]
Update MuJS to 1.0.8.
This includes a fix for bug 702774.
Julian Smith [Thu, 27 Aug 2020 18:06:43 +0000 (19:06 +0100)]
scripts/mupdfwrap.py: fix bug in ref-counting of Link class wrapper for fz_link.
Use fz_keep_link() when creating a Link from a fz_link*. Fixes heap errors on
some Linux systems when running -t to test things.
Julian Smith [Thu, 27 Aug 2020 18:01:05 +0000 (19:01 +0100)]
scripts/mupdfwrap.py: don't try to test .pdf file that is not in git.
Also allow specification of tests files on command line.
Julian Smith [Wed, 26 Aug 2020 14:50:09 +0000 (15:50 +0100)]
scripts/mupdfwrap*: moved test code into separate file.
Also made test run on all .pdf files in the mupdf repository.
Julian Smith [Wed, 26 Aug 2020 08:59:55 +0000 (09:59 +0100)]
Makerules: removed stray $warning.
Julian Smith [Mon, 24 Aug 2020 11:10:10 +0000 (12:10 +0100)]
source/tools/mudraw.c: fixed compile error if DISABLE_MUTHREADS defined.
Julian Smith [Fri, 21 Aug 2020 12:41:15 +0000 (13:41 +0100)]
scripts/mupdfwrap.py: fixed bad fd in --run-py.
Julian Smith [Thu, 20 Aug 2020 23:05:12 +0000 (00:05 +0100)]
scripts/mupdfwrap.py: improve wrapping of fz_lookup_metadata().
Provide extra function wrapper lookup_metadata() that returns a std::string by
value; makes things easier to use from C++, and also easier for SWIG/Python.
In generated python, replace the auto-generated Document.lookup_metadata()
method with a method that uses the new lookup_metadata() function to return a
python string directly, or None if lookup failed.
Julian Smith [Thu, 20 Aug 2020 13:56:06 +0000 (14:56 +0100)]
scripts/mupdfwrap.py: avoid unnecessary recompilation of swig-generated c++.
If generated .cpp file is identical to previously-existing file, use
the previously-existing file's mtime so that we don't cause unnecessary
recompilation.
Julian Smith [Tue, 18 Aug 2020 14:13:58 +0000 (15:13 +0100)]
scripts/mupdfwrap.py: explain omission of auto-generated methods.
Julian Smith [Tue, 18 Aug 2020 12:40:49 +0000 (13:40 +0100)]
scripts/mupdfwrap.py: use custom wrapper for fz_copy_rectangle() to return std::string.
Julian Smith [Tue, 18 Aug 2020 12:40:33 +0000 (13:40 +0100)]
scripts/mupdfwrap.py: fixed up hard-coded test file location.
Julian Smith [Tue, 18 Aug 2020 11:14:10 +0000 (12:14 +0100)]
scripts/mupdfwrap.py: insert extra newline before forward decl of iterators.
Julian Smith [Tue, 18 Aug 2020 09:06:18 +0000 (10:06 +0100)]
scripts/mutool_draw.py: fixed mupdf.stderr() => mupdf.stderr_().
Julian Smith [Thu, 23 Jul 2020 13:50:26 +0000 (14:50 +0100)]
scripts/mutool_draw.py: minor fix to if.
Julian Smith [Wed, 22 Jul 2020 13:19:45 +0000 (14:19 +0100)]
source/fitz/trace-device.c: added adv ('advance') attribute to <g> tags.
Julian Smith [Wed, 22 Jul 2020 13:15:27 +0000 (14:15 +0100)]
Makerules: always specify -pthread on OpenBSD.
Julian Smith [Mon, 27 Jul 2020 15:45:51 +0000 (16:45 +0100)]
source/fitz/memento.c: fixed out-of-bounds array access and fd corruption.
In squeeze():
Don't write to out-of-bounds stashed_map[-1] if dup() returns -1.
Close dup-licated fds in child process to avoid possibility of them getting
corrupted by incorrect code.
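The out-of-bounds write comes from using the result of dup() as an array index without checking for failure. A hedged sketch of the guarded form follows; stash_fd and MAX_STASHED are hypothetical, and only the stashed_map name comes from the message above.

```c
#include <unistd.h>

#define MAX_STASHED 64 /* hypothetical table size */

static int stashed_map[MAX_STASHED];

/* Duplicate fd and remember which original each duplicate came from.
 * Returns the duplicate, or -1 if nothing was stashed. */
static int stash_fd(int fd)
{
    int copy = dup(fd);
    if (copy < 0)
        return -1; /* dup failed: -1 must not be used as an index */
    if (copy >= MAX_STASHED) {
        close(copy); /* no room in the table; do not leak the fd */
        return -1;
    }
    stashed_map[copy] = fd;
    return copy;
}
```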
Julian Smith [Mon, 20 Jul 2020 15:49:42 +0000 (16:49 +0100)]
source/fitz/memento.c: set errno after alloc failure.
Also trivial fix to allow us to run on OpenBSD.
Julian Smith [Fri, 17 Jul 2020 16:38:09 +0000 (17:38 +0100)]
source/fitz/memento.c: optionally hide multiple reallocs of same block.
Output of multiple reallocs disabled if environment variable
MEMENTO_HIDE_MULTIPLE_REALLOCS is defined at runtime.
Reduces very verbose output if a buffer is resized many times.
Julian Smith [Fri, 17 Jul 2020 09:51:43 +0000 (10:51 +0100)]
Fixes for memento builds on OpenBSD.
Need to link with "-L /usr/local/lib -l execinfo" to get backtrace*(), and not
use -ldl.
Julian Smith [Wed, 15 Jul 2020 17:03:55 +0000 (18:03 +0100)]
scripts/mupdfwrap.py: specify HAVE_PTHREAD=yes when building mupdf.so.
Appears to fix recent breakage.
Julian Smith [Mon, 22 Jun 2020 14:04:09 +0000 (15:04 +0100)]
scripts/mutool_draw.py: added support for 'xmltext' device.
Also fixed creation of fz_infinite_rect.
Julian Smith [Thu, 23 Jul 2020 13:51:35 +0000 (14:51 +0100)]
source/tools/mudraw.c: added support for 'xmltext' device.
Julian Smith [Tue, 25 Aug 2020 15:00:40 +0000 (16:00 +0100)]
platform/win32/libmupdf.vcxproj*: added xmltext-device.c.
Julian Smith [Tue, 14 Jul 2020 18:57:41 +0000 (19:57 +0100)]
Added 'xmltext' device, outputs low-level pdf info as xml.
Julian Smith [Wed, 15 Jul 2020 12:15:26 +0000 (13:15 +0100)]
source/fitz/printf.c: added support for %i.
Julian Smith [Thu, 25 Jun 2020 13:31:01 +0000 (14:31 +0100)]
scripts/jlib.py: various minor improvements.
Improved system_raw() and system()'s handling of <out> arg. In system_raw() and
system(), we now copy subprocess's behaviour for None (inherit file handles)
and subprocess.DEVNULL.
In system_raw(), default for <out> is None, to inherit file handles.
In system(), default for <out> is sys.stdout, which allows <prefix> and
<verbose> to work as expected. Also don't indent verbose by default, don't use
print() because it generates spurious newlines, and special-case out=log to
avoid confusion about {...}.
Fixed bug in build() if <outfiles> is a string rather than list/tuple.
expand_nv(): fixed exception text after error.
Fixed LogPrefixScope.
Allow shell programme to be specified in system() etc. E.g. executable='bash'.
Julian Smith [Mon, 6 Jul 2020 10:31:58 +0000 (11:31 +0100)]
scripts/mupdfwrap.py: call exit(1) after exception.
Julian Smith [Tue, 25 Aug 2020 15:49:51 +0000 (16:49 +0100)]
scripts/mupdfwrap.py: fixes for OpenBSD.
ClangInfo: extended search for libclang.so - also look in /usr/lib and
/usr/local/lib.
Don't call generated functions 'stdout' or 'stderr' - looks like openbsd has a
macro for these names. We call them stdout_ and stderr_ instead.
On OpenBSD, run gmake instead of make.
When compiling+linking with libpython, use pkg-config to get compile/link
flags, instead of python3-config; much simpler and works on openbsd.
Use 'c++' as C++ compiler, not 'g++'.
Julian Smith [Mon, 22 Jun 2020 13:39:35 +0000 (14:39 +0100)]
scripts/mupdfwrap.py: fixed bug in fz_rect wrapper's constructor.
Used to always throw exception.
Also added default constructor for fz_stext_options wrapper.
Julian Smith [Thu, 20 Aug 2020 16:34:22 +0000 (17:34 +0100)]
Changed fz_lookup_metadata() to return required buffer size, not string length.
The returned buffer size is string length plus one.
Also patched up pdf_lookup_metadata() in same way.
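With this return convention a caller can size its buffer in two steps. The sketch below uses a hypothetical stand-in, demo_lookup_metadata, that mimics only the "string length plus one" return value described above; it is not the real fz_lookup_metadata() signature, and demo_lookup_alloc is likewise invented for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Copy as much of value as fits into buf (always terminated) and return
 * the buffer size needed for the whole string: length plus one. */
static int demo_lookup_metadata(const char *value, char *buf, int size)
{
    int needed = (int)strlen(value) + 1;
    if (buf && size > 0) {
        int n = needed < size ? needed : size;
        memcpy(buf, value, (size_t)(n - 1));
        buf[n - 1] = '\0';
    }
    return needed;
}

/* Ask for the size first, then allocate exactly that many bytes. */
static char *demo_lookup_alloc(const char *value)
{
    int needed = demo_lookup_metadata(value, NULL, 0);
    char *buf = malloc((size_t)needed);
    if (buf)
        demo_lookup_metadata(value, buf, needed);
    return buf;
}
```

Returning the buffer size rather than the string length means the result can be passed straight to malloc without a +1 adjustment.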
Julian Smith [Fri, 21 Aug 2020 22:13:16 +0000 (23:13 +0100)]
Fixed SEGV in SWIG Python on Linux.
We need to define fz_pdfocr_write_options_usage even in non-OCR builds.
Julian Smith [Wed, 15 Jul 2020 17:02:25 +0000 (18:02 +0100)]
Makethird: fix shared=yes build of thirdparty/gumbo-parser.
Need to use -fPIC from $(LIB_CFLAGS). (Thanks to Robin for this fix.)
Robin Watts [Tue, 25 Aug 2020 11:29:14 +0000 (12:29 +0100)]
Fix Windows non-tesseract builds.
libleptonica was set to build for all configurations, rather than
just the Tesseract enabled ones.
Tor Andersson [Wed, 19 Aug 2020 13:46:47 +0000 (15:46 +0200)]
ocr: Write compact PDF syntax.
Fix typo in CMap output.
Avoid unnecessary precision in text scaling and positioning.
Use Td instead of Tm for smaller output.
Tor Andersson [Wed, 19 Aug 2020 13:54:45 +0000 (15:54 +0200)]
ocr: Flip order of y1/y2 when converting to PDF coordinate space.
We want to end up using a positive font size, so y2 must be
larger than y1.
Tor Andersson [Wed, 19 Aug 2020 13:21:38 +0000 (15:21 +0200)]
ocr: Clean up enum names and mudraw usage message.
Robin Watts [Fri, 17 Jul 2020 15:47:22 +0000 (16:47 +0100)]
OCR device.
Robin Watts [Thu, 16 Jul 2020 14:21:08 +0000 (15:21 +0100)]
Tesseract integration.
(Includes fixes from both Tor and Sebastian).
Sebastian Rasmussen [Tue, 18 Aug 2020 17:11:05 +0000 (01:11 +0800)]
Bug 702728: Destroy mutexes after dropping base context.
Previously the mutexes were destroyed before dropping the base
context. When reference counted objects, e.g. document handler
contexts, were dropped mupdf would call fz_drop_imp() which tried to
take the allocation lock. The allocation lock is one of the destroyed
mutexes, causing an error or crash in pthreads.
Reversing the order such that the base context is dropped before
the mutexes are freed resolves this issue.
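The ordering constraint can be illustrated with plain pthreads. This is a sketch under assumptions, not the MuPDF shutdown code: demo_context, demo_drop and demo_shutdown are hypothetical names.

```c
#include <pthread.h>

typedef struct {
    pthread_mutex_t *alloc_lock; /* shared allocation lock */
    int refs;                    /* outstanding reference count */
} demo_context;

/* Dropping a reference takes the allocation lock, so the lock must
 * still be alive when this runs. */
static void demo_drop(demo_context *ctx)
{
    pthread_mutex_lock(ctx->alloc_lock);
    ctx->refs--;
    pthread_mutex_unlock(ctx->alloc_lock);
}

/* Correct teardown order; returns the pthread_mutex_destroy() result. */
static int demo_shutdown(demo_context *ctx)
{
    /* First release everything that may still take the lock... */
    demo_drop(ctx);
    /* ...and only then destroy the lock itself. */
    return pthread_mutex_destroy(ctx->alloc_lock);
}
```

Swapping the two steps in demo_shutdown() would lock a destroyed mutex, which POSIX leaves undefined.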
Robin Watts [Thu, 6 Aug 2020 15:41:41 +0000 (16:41 +0100)]
Fix windows build of mupdf-gl.
pid_t doesn't exist on WIN32.