Avoid stray objects in incremental xref after pdf_delete_object.
author Robin Watts <[email protected]>
Mon, 19 Jul 2021 17:53:08 +0000 (18:53 +0100)
committer Robin Watts <[email protected]>
Mon, 19 Jul 2021 18:04:43 +0000 (19:04 +0100)
When we create an object, it is put into the incremental xref.
If we then delete that object, its entry is overwritten with a
'free' object.

We intend to treat the presence of objects in the incremental
xref as meaning 'we have unsaved changes', and this is scuppered
by having needless 'free' objects in there.
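
As a rough illustration of why stray 'free' entries matter, here is a
minimal sketch of such a check. It is not MuPDF code: sketch_entry and
sketch_has_unsaved_changes are hypothetical names, and the incremental
xref is modelled as a flat array in which type 0 means 'no entry'.

```c
#include <stddef.h>

/* Hypothetical, simplified xref entry (not the MuPDF struct):
 * type 0 means "no entry here", 'f' means free, 'n' means in use. */
typedef struct {
    char type;
    int gen;
} sketch_entry;

/* Returns 1 if the incremental section holds any entry at all,
 * i.e. if the document would be considered to have unsaved changes. */
static int sketch_has_unsaved_changes(const sketch_entry *incremental, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        if (incremental[i].type != 0)
            return 1;
    return 0;
}
```

Under this model, a leftover 'f' entry for an object that was created
and deleted within the same version would make the check report
changes even though the document is back where it started.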

Objects that existed in previous versions of the document and have
now been removed certainly need 'free' object entries
(and these should count as changes). Objects that never
existed before this version should be deleted entirely, by being
given a 0 type (and hence should not count as changes).

Accordingly, in pdf_delete_object we check back through the
previous xrefs; if the most recent entry for this object is free
(or if there is no entry at all), then we know we can delete
it by removing the entry from the incremental xref entirely.
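
The lookup described above can be sketched on a simplified model. This
is an illustration only, not the patched function: sketch_section and
sketch_delete_object are hypothetical, each section is a flat table
rather than a list of subsections, and section 0 is assumed to be the
incremental one with older sections at higher indices (matching the
loop in the diff, which starts at 1).

```c
#include <stddef.h>

#define SKETCH_MAX_OBJECTS 8

/* Hypothetical, simplified model: one flat table per xref section.
 * Section 0 is the incremental section; higher indices are older. */
typedef struct {
    char type[SKETCH_MAX_OBJECTS]; /* 0 = no entry, 'f' = free, 'n' = in use */
} sketch_section;

static void sketch_delete_object(sketch_section *sections, int num_sections, int num)
{
    int j;

    /* Start by leaving a 'free' entry in the incremental section. */
    sections[0].type[num] = 'f';

    /* Walk back through the older sections, most recent first. */
    for (j = 1; j < num_sections; j++)
    {
        char t = sections[j].type[num];

        if (t == 0)
            continue; /* this section says nothing about num */

        /* Found the most recent earlier entry. If it was already
         * free, deleting changes nothing, so drop our entry. */
        if (t == 'f')
            sections[0].type[num] = 0;
        return;
    }

    /* The object never appeared before this version. */
    sections[0].type[num] = 0;
}
```

In this model, deleting an object that only ever lived in section 0
leaves no trace, while deleting one that is in use in an older section
leaves a 'free' entry that counts as a change.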

source/pdf/pdf-xref.c

index 991c0283fce9ff0dec7d58373b3415f5cb1d6451..6363dd4dddb5259d1147212d1eccc1146180aeac 100644
@@ -2450,6 +2450,8 @@ void
 pdf_delete_object(fz_context *ctx, pdf_document *doc, int num)
 {
        pdf_xref_entry *x;
+       pdf_xref *xref;
+       int j;
 
        if (doc->local_xref && doc->local_xref_nesting > 0)
        {
@@ -2475,6 +2477,47 @@ pdf_delete_object(fz_context *ctx, pdf_document *doc, int num)
        x->stm_ofs = 0;
        x->stm_buf = NULL;
        x->obj = NULL;
+
+       /* Currently we've left a 'free' object in the incremental
+        * section. This is enough to cause us to think that the
+        * document has changes. Check back in the non-incremental
+        * sections to see if the last instance of the object there
+        * was free (or if this object never appeared). If so, we
+        * can mark this object as non-existent in the incremental
+        * xref. This is important so we can 'undo' back to emptiness
+        * after we save/when we reload a snapshot. */
+       for (j = 1; j < doc->num_xref_sections; j++)
+       {
+               xref = &doc->xref_sections[j];
+
+               if (num < xref->num_objects)
+               {
+                       pdf_xref_subsec *sub;
+                       for (sub = xref->subsec; sub != NULL; sub = sub->next)
+                       {
+                               pdf_xref_entry *entry;
+
+                               if (num < sub->start || num >= sub->start + sub->len)
+                                       continue;
+
+                               entry = &sub->table[num - sub->start];
+                               if (entry->type)
+                               {
+                                       if (entry->type == 'f')
+                                       {
+                                               /* It was free already! */
+                                               x->type = 0;
+                                               x->gen = 0;
+                                       }
+                                       /* Otherwise it was a real object: keep the 'free' entry. */
+                                       return;
+                               }
+                       }
+               }
+       }
+       /* It never appeared before. */
+       x->type = 0;
+       x->gen = 0;
 }
 
 static void