Escaping double hyphen within :ref: role doesn't work and a dash is rendered. #11492

bgusach · 2023-07-19T09:54:29Z

Describe the bug

The role

:ref:`\--interface <some-ref>`

should be rendered as --interface (with two hyphens) but it is rendered as –interface (with one en-dash) instead.

Outside a :ref: the escaping works as expected.

The following funky alternatives

:ref:`\\--interface <some-ref>` 
:ref:`\\-\\-interface <some-ref>`

don't work either.

The only workaround I've found has been to define smartquotes = False in conf.py.
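For reference, the workaround amounts to the following conf.py fragment. The commented-out `smartquotes_action` line is an assumption that was not tried in this thread: it is meant to keep quote and ellipsis conversion while skipping dashes, so check it against the Sphinx configuration docs before relying on it.

```python
# conf.py

# Turn off the smart-quotes transform for the whole project,
# so "--" is never converted to an en-dash.
smartquotes = False

# Possibly narrower alternative (assumption, untested in this thread):
# keep quotes and ellipses, skip dash conversion.
# smartquotes_action = "qe"
```

Note that `smartquotes = False` also disables curly quotes and ellipsis conversion everywhere in the project, which is why it is a workaround rather than a fix.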

(Since other roles like :code: handle -- in a different way, I'm inclined to think that this is a Sphinx and not a Docutils problem, but I may very well be wrong.)

How to Reproduce

Create basic index.rst:

:ref:`--interface \--interface \-\-interface \\--interface \\-\\-interface  <some-ref>`

Run make html (ignore undefined label warning).
Visit index.html.
See –interface –interface –interface –interface –interface

Environment Information

sphinx==7.0.1

Sphinx extensions

No response

Additional context

No response


picnixz · 2023-07-24T09:51:57Z

TL;DR: It seems that this works:

:ref:`\\\\-\\\\-interface <label>`

This is because an unescape step somewhere gobbles one more level of escaping. More precisely, if you look at sphinx.util.docutils.ReferenceRole, the input \\\\-\\\\-interface <label> is internally parsed as

\x00\\x00\-\x00\\x00\-interface

and the NUL bytes before each \ are then removed, so self.title is stored as \\-\\-interface. The doctree now looks like:

<paragraph>
  <pending_xref refdoc="index" refdomain="std" refexplicit="True" reftarget="label" reftype="ref" refwarn="True">
    <inline classes="xref std std-ref">
      \\-\\-interface
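The escape/unescape round trip that produces this title can be sketched in plain Python. The two helpers below are simplified re-implementations written for illustration; they are modelled on the docutils functions of the same name but are not the library code, and they need neither docutils nor Sphinx.

```python
def escape2null(text: str) -> str:
    """Replace each backslash plus the following character with NUL plus that character."""
    parts = []
    start = 0
    while True:
        found = text.find("\\", start)
        if found == -1:
            parts.append(text[start:])
            return "".join(parts)
        parts.append(text[start:found])
        # The backslash becomes a NUL marker; the escaped character is kept.
        parts.append("\x00" + text[found + 1:found + 2])
        start = found + 2


def unescape(text: str) -> str:
    """Drop the NUL markers (and any whitespace they escaped)."""
    for sep in ("\x00 ", "\x00\n", "\x00"):
        text = text.replace(sep, "")
    return text


source = r"\\\\-\\\\-interface"   # what is typed inside the :ref: role
marked = escape2null(source)      # NUL-marked form seen by the role
title = unescape(marked)          # value that ends up in self.title

print(repr(marked))
print(title)
```

With a single level of escaping, `unescape(escape2null(r"\--interface"))` yields a plain `--interface`, so the title no longer carries a backslash by the time the smart-quotes transform runs.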

Then (it's not finished yet!), sphinx.transforms.SphinxSmartQuotes is applied. It inherits from SmartQuotes in docutils/transforms/universal.py, which is responsible for transforming double dashes into en-dashes.

AFAICT, this transformation finds all nodes.TextElement nodes and translates their content according to some predefined rules. However, when I debugged the flow, it appeared that my node is actually processed twice. The reason is that docutils (not Sphinx this time) finds all nodes.TextElement nodes in the document, and the <paragraph> node containing the <pending_xref> and the <inline> node (also contained in <pending_xref>) are treated as two distinct nodes.

In particular, when you process the <paragraph> node, you also process the internal <inline> and get

<paragraph>
  <pending_xref refdoc="index" refdomain="std" refexplicit="True" reftarget="label" reftype="ref" refwarn="True">
    <inline classes="xref std std-ref">
      \-\-interface

Then, you process the <inline> node and get (finally)

<paragraph>
  <pending_xref refdoc="index" refdomain="std" refexplicit="True" reftarget="label" reftype="ref" refwarn="True">
    <inline classes="xref std std-ref">
      --interface

If you only have one level of escape, processing <paragraph> already removes the backslashes, and you then get the unwanted en-dash.
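The double processing can be illustrated with a toy model. This is a sketch of the behaviour described in this comment, not the docutils implementation: `smart_pass`, `render`, and the placeholder strings are hypothetical names, and only the `\\`, `\-`, and `--` cases are handled.

```python
EN_DASH = "\u2013"

# Escape sequences protected during one pass (toy subset):
# an escaped backslash and an escaped hyphen.
_ESCAPES = (("\\\\", "&#92;"), ("\\-", "&#45;"))


def smart_pass(text: str) -> str:
    """One simplified smart-quotes pass over a piece of text."""
    # 1. Hide escaped characters behind placeholders.
    for raw, placeholder in _ESCAPES:
        text = text.replace(raw, placeholder)
    # 2. Turn any remaining literal double hyphen into an en-dash.
    text = text.replace("--", EN_DASH)
    # 3. Restore the escaped characters, minus their backslash.
    for raw, placeholder in _ESCAPES:
        text = text.replace(placeholder, raw[1])
    return text


def render(title: str) -> str:
    """Apply the pass twice: once via <paragraph>, once via <inline>."""
    return smart_pass(smart_pass(title))


for title in (r"\\-\\-interface", r"\-\-interface", "--interface"):
    print(title, "->", render(title))
```

In this model only the title with two backslashes per hyphen survives both passes as `--interface`; the other two end up with an en-dash, which matches what the reporter saw.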

I don't know whether this is a design flaw in docutils' SmartQuotes transformation, but I think we can work towards correcting it. It should be natural that \- actually escapes the -. In particular, I think we need to change the way XRefRole parses the title, and first transform occurrences of \-\-, \-\-\- and \.\.\. (namely all possible smart characters) into \x00\-\x00\- so that we never have to worry about the SmartQuotes transformation.

I think this logic should only apply to explicit titles. Currently, explicit titles are unescaped, but I think we can simply keep them as they are. Unescaping them means removing NUL bytes (which generally come from an escaping backslash \). So we should just change

sphinx/sphinx/util/docutils.py

Line 532 in 24b4d65

self.title = unescape(matched.group(1))

into self.title = matched.group(1).

bgusach · 2023-07-25T13:08:57Z

The \\\\-\\\\- trick works. Not a real solution, but I can live with that for a while. Thanks 👍️

AA-Turner added this to the some future version milestone Aug 11, 2023
