Menu

#110 Move from "optparse" to "argparse".

Default
pending
nobody
None
5
2025-04-28
2022-01-06
No

The optparse documentation says:

Deprecated since version 3.2: The optparse module is deprecated and will not be developed further; development will continue with the argparse module.

We are currently suppressing related deprecation warnings in the test suite.
After raising the Python dependency to >=3.7, now may be the right time to make the move.

Related

Feature Requests: #81
Feature Requests: #88

Discussion

1 2 > >> (Page 1 of 2)
  • Günter Milde

    Günter Milde - 2022-01-06

    Intended steps:

    • Fully backwards compatible change of "settings spec" processing.
    • Remove the DeprecationWarning filter in the test suite.
    • Consider moving to a new settings-spec format based on ordered dicts.
      • Converter class for the current format.
      • Use the new format in Docutils component.
      • Deprecate the old format and related helpers/hacks in "utils".
    • Switch to "argparse" based helpers for some of our ensure_*() functions in the option parsing ecosystem.

    Help welcome.

     

    Last edit: Günter Milde 2022-01-06
  • Chris Sewell

    Chris Sewell - 2022-01-06

    sounds good

    Consider moving to a new settings-spec format based on ordered dicts

    Minor note, since python 3.7 standard dicts are guaranteed to be ordered: https://docs.python.org/3/whatsnew/3.7.html

     
  • Adam  Turner

    Adam Turner - 2022-01-12

    Sorry for the radio silence!

    I've made some progress in this branch ( https://github.com/AA-Turner/docutils/tree/argparse ) -- currently fixing the last few failing tests and then will clean up the commit history and propose as a patch.

    The work does lead me to ask about how pace of deprecations / removal (partly to help my chances of getting the patch accepted, and partly for 'nicer' code).

    Can I remove the "old" config parsing logic? It was deprecated in 2003 in R1643 ( https://sourceforge.net/p/docutils/code/1643/ ).

    Equally, when deprecating features that would be obsoleted by my changes, would you suggest announcing removal in v0.20.0, v1.0.0, etc?

    (I have seen the proposed deprecation policy -- my personal feedback would be that it should go along with a versioning policy. Python itself is special in that they can't really bump the major version. If Docutils adopted semantic versioning, then removals would simply trigger a major version bump -- we could add additional language on at least X minor releases with deprecations before removal, for example).

    A

     
    • Günter Milde

      Günter Milde - 2022-01-12

      On 2022-01-12, Adam Turner wrote:

      Sorry for the radio silence!

      No need to apologise. Docutils development is slow and there is real life, too.

      I've made some progress in this branch

      Looks promising, thank you.

      I would not add as many leading underscores, though,
      to keep consistency with the existing code base.
      Exceptions are new additions that are bound to go/change in near future
      or minor auxiliary objects that are only locally used.

      Before the move to dict based arguments_spec, give us a pause to
      plan/discuss/document what the specs shall look like.

      The work does lead me to ask about how pace of deprecations / removal

      That is a tricky question. I like clean code but I got burned several times
      after removing obscure parts only to find out that this broke applications
      with indirect dependency on Docutils and maintainers not knowing about the
      project lest reading our RELEASE-NOTES :(

      Can I remove the "old" config parsing logic? It was deprecated in 2003
      in R1643 ( https://sourceforge.net/p/docutils/code/1643/ ).

      Not yet, I am afraid. I changed the "may be removed" to "will be removed"
      two months ago in [r8880]. Next step would be to add a removal version...

      Equally, when deprecating features that would be obsoleted by my
      changes, would you suggest announcing removal in v0.20.0, v1.0.0, etc?

      I don't know whether there will be v0.20 at all. The idea is to reconcile the
      reality (Docutils is used as if it were mature) and the version number (<1)
      once the API sufficiently defined a deprecation policy is agreed.
      Alternatively, 1.0.0 could be pushed off a while and be a clean start
      (including removals currently scheduled for 1.2). This would require
      re-wording the DeprecationWarnings to say 0.21 or later.

      (I have seen the proposed deprecation policy -- my personal feedback
      would be that it should go along with a versioning policy.

      This discussion deserves a separate thread/ticket.

      Thanks, Günter

       

      Related

      Commit: [r8880]

      • Adam  Turner

        Adam Turner - 2022-01-12

        leading underscores

        OK. My intention was to not expand the module's public API any further, on the underscored-names-are-private convention. However I do remember the discussion about more clearly defining Docutils' public API -- perhaps that work should happen first. (My suggestion would be a single document listing the modules/classes that form the public API, and then reinforcing that in the code with __all__ and use of underscores, etc.)

        Before the move to dict based arguments_spec

        Fair enough -- I have proposed the simplest possible change, which is just replicating kwargs of ArgumentParser.add_argument. I'd probably suggest sticking with this (similarly to how settings_spec delegates much to OptionParser.add_option), although of course other designs could be used.

        this broke applications

        :(

        Not yet, I am afraid

        Fair enough!

        once the API sufficiently defined a deprecation policy is agreed

        OK, seems this is a blocker to a few of my proposed changes. I can open a ticket to start a discussion if there's not one already? I think the API, versioning, and deprecation conversations do probably link to each other quite a bit.

        A

         

        Last edit: Adam Turner 2022-01-12
        • Günter Milde

          Günter Milde - 2022-01-14

          On 2022-01-12, Adam Turner wrote:

          ... move to dict based arguments_spec

          Fair enough -- I have proposed the simplest possible change, which is
          just replicating kwargs of ArgumentParser.add_argument.

          We need to keep compatibility the current "settings spec" specification
          as this is part of the "drop-in" extension API (used by, e.g.,
          "myst-docutils"). After deciding on a new interface (maybe just an
          add_argument kwargs dict), we can use this in Docutils and deprecate the
          old interface.

          ...

          once the API sufficiently defined a deprecation policy is agreed

          OK, seems this is a blocker to a few of my proposed changes. I can open
          a ticket to start a discussion if there's not one already?

          Yes, this may help.

          For the time beeing, we may follow the Python rules:
          incompatible changes require an advance warning at least 2 minor version before
          implentation.

          The change from 0.x to 1.x, although increasing the major number should
          signify a major changes in commitment rather than API.
          (cf. https://docutils.sourceforge.io/docs/dev/policies.html#version-identification)

          Thanks, Günter

           
  • Adam  Turner

    Adam Turner - 2022-01-15

    We need to keep compatibility the current "settings spec" specification

    Yes of course. I should have maybe explained a little -- I added a transparent conversion function from arguments_spec to settings_spec and vice versa. If you access .settings_spec on any component just with arguments_spec defined, you will get a valid settings_spec tuple returned (as if by magic!)

    Yes, this may help.

    OK, will do.

    A

     
  • Adam  Turner

    Adam Turner - 2022-01-17

    Hi, as promised please find the patch at https://github.com/AA-Turner/docutils/pull/8 // https://github.com/AA-Turner/docutils/pull/8.patch

    It will likely be easiest to review commit-by-commit.

    Note as discussed earlier, all code expecting a settings_spec tuple will can get one (see https://github.com/AA-Turner/docutils/pull/8/files#r785578670 ).

    Would appreciate any feedback or reviews!

    A

     
    • Günter Milde

      Günter Milde - 2022-01-19

      Thank you for the reworked patch set!

      It will likely be easiest to review commit-by-commit.

      Lets start with 1/4

      I do not see the benefit in the new helper functions outweighting the
      refactoring effort:

      Usage (after all 4 commits):

      ========================================= === ===== =========
      new function lib tests dev-tools
      ========================================= === ===== =========
      frontend.read_config_files 2
      frontend.config_files_settings 1 1
      frontend.get_default_settings 1 7 1
      frontend.get_settings_with_defaults 1
      frontend.parse_args 1
      Publisher.config_section_to_settings_spec 3
      ========================================= === ===== =========

      Making OptionParser.get_standard_config_files() a class method could be
      combined with the removal of the mod_python hack.
      The commit message or should note the reason
      (mod_python beeing no longer developed and too old)
      with a link to, e.g., https://en.wikipedia.org/wiki/Mod_python.

      I see that after all 4 commits, frontend.OptionParser is still present
      (is it also fully backwards compatible?)

      Would it be feasible to have a minimal change where only the Values,
      Option, and OptionParser classes were based on "argparse" instead of
      "optparse"

      Then we can start moving to frontend.ArgumentParser and using "argparse"
      features step by step.

       
  • Adam  Turner

    Adam Turner - 2022-01-20

    I do not see the benefit in the new helper functions outweighting the refactoring effort:

    The functions are of most use to downstream developers, I'd argue. It provides a higher level abstraction which means you don't need to know about the internals of how OptionParser &c work.

    get_settings_with_defaults is tenuous, but I was trying to move to a point where the OptionParser and ConfigParser classes are only used in docutils.frontend -- providing a cleaner seperation of concerns and making any future refactorings easier.

    could be combined with the removal of the mod_python hack.

    Done and pushed

    frontend.OptionParser is still present (is it also fully backwards compatible?)

    I didn't remove it or change the name as many downstream consumers rely on it. It is fully backwards compatible, yes -- mostly due to the settings_spec <--> arguments_spec converters. It is deprecated, though (through the module level __getattr__).

    Would it be feasible to have a minimal change where only the Values, Option, and OptionParser classes were based on "argparse" instead of "optparse"

    It would be very difficult. optparse makes fundamentally different assumptions than argparse, which took a while to sort out and convert. E.g. the parsing of config files, the "validator" behaviour -- there is a hack at the moment to intercept validators that downstream users have used and wrap them with an argparse Action.

    Broadly though, commit 2 ("Update to use argparse") can be made standalone if you want it to be -- I would be disapointed not to have the helper functions, but strictly they are not required.

    I think that is probably as minimal as it gets -- it was difficult enough to rewrite everything keeping 100% backwards compatability -- and I can't make any assumptions about which names are "safe" to remove -- e.g. I had to keep a no-op definition for Option as downstream users might use it.

    A

     
    • Günter Milde

      Günter Milde - 2022-01-31

      Sorry for the long delay. Here my update on patch 1/4.

      On 2022-01-20, Adam Turner wrote:

      The functions are of most use to downstream developers, I'd argue. It
      provides a higher level abstraction which means you don't need to know
      about the internals of how OptionParser &c work.

      You don't need to know these details, when using the intended entry points:

      • the framework provided in docutils.core for front-end tools and
        applications deploying docutils

      • the readers.Reader, parsers.Parser, writers.Writer, and
        transforms.Transforms base classes for drop-in components.

      For concrete use-cases where this interface is insufficient, we can work on
      enhancement and improvement, but I don't want to start a parallel,
      undocumented framework of "access functions".

      ... I was trying to move to a point where the OptionParser and
      ConfigParser classes are only used in docutils.frontend --
      providing a cleaner seperation of concerns and making any future
      refactorings easier.

      The frontend.ConfigParser is only used in frontend.OptionParser already.
      Use of frontend.OptionParser will drop significantly with just one helper
      function: frontend.get_default_settings(). Remaining cases can be handled
      later (before deprecating the old OptionParser).

      ...

      I think that is probably as minimal as it gets -- it was difficult
      enough to rewrite everything keeping 100% backwards compatability --
      and I can't make any assumptions about which names are "safe" to remove

      We can check for usage in Sphinx and myst, this gives at least an indication.

      I'll attach my proposal for a pre-conversion cleanup.

       

      Last edit: Günter Milde 2022-01-31
      • Adam  Turner

        Adam Turner - 2022-01-31

        For concrete use-cases where this interface is insufficient, we can work on enhancement and improvement, but I don't want to start a parallel, undocumented framework of "access functions".

        Fair enough. From the comment below "Use of frontend.OptionParser will drop significantly with just one helper function" are you happy to keep frontend.get_default_settings()?

        I'll review the patch you attached, thanks!

        A

         
        • Günter Milde

          Günter Milde - 2022-01-31

          On 2022-01-31, Adam Turner wrote:

          From the comment below "Use of frontend.OptionParser will drop
          significantly with just one helper function" are you happy to keep
          frontend.get_default_settings()?

          I am fine with this one. It needs a docstring and a mentioning in the module
          docstring, though.

          Thanks, Günter Milde

           
          • Adam  Turner

            Adam Turner - 2022-01-31

            Perfect, thanks!
            A

             
  • Adam  Turner

    Adam Turner - 2022-01-25

    From FR# 88

    I am just working on a way to disentangle frontend.OptionParser and frontend.ConfigParser but this is a topic for bug:#441

    Sounds good -- happy to review a proposed design.

    I'd propose making the --config flag simply store the path to the file, and then add it to the list of config files to be read.

    get_config_file_settings should also probably be removed from OptionParser -- instead a dict should be returned by ConfigParser and merged in.

    I didn't go into detail intentionally so as not to spark a debate about these parts, but I do think (eventually) simplifying these interactions can lead to a cleaner codebase.

    Docutils has an elaborated configuration framework which actually predates the "optparse" module. Later development of "Optik" into "optparse" and then "argparse" implemented some of the abstractions and enhancements offered by Docutils in a different way.

    Ideally, we would construct a single ArgumentParser object and add arguments "normally" through parser.add_argument. The problem here is that subclasses can filter settings_spec (through filter_settings_spec).

    I've never seen this implemented in the way Docutils does it before -- if settings_spec tuples were treated as immutable, then things would be much simpler.


    We could also move to a model (a la stdlib dataclasses where we simply specify the setting name, type and default within the Component itself. The front-end tools could then use that information (or a mapping of settings to command line options or similar) to create the CLI arguments.

    I think this would be cleanest, but this cannot be done overnight.


    Did you have a chance to review my comments above in the previous post? It would be good to get some of this merged, or know what is blocking it // needs changing.

    A

     
    • Günter Milde

      Günter Milde - 2022-01-31

      On 2022-01-25, Adam Turner wrote:

      I am just working on a way to disentangle frontend.OptionParser and frontend.ConfigParser but this is a topic for bug:#441

      Sounds good -- happy to review a proposed design.

      My major sticking point is the cross-referencing: OptionParser uses a
      ConfigParser instance and passes "self" to config_parser.read() because
      ConfigParser requires an instance of OptionParser!

      The reason is that some settings have side-effects (overwriting other
      settings) that need to be resolved for each config file before reading the
      next to keep the settings priority as expected. (cf. the attached
      documentation)

      My idea is to scale back frontend.ConfigParser to provide backwards
      compatibility and the custom optionxform() while moving the
      setting_validation() to frontend.OptionParser.

      The OptionParser would then loop over the config_files,
      read them one-by-one with config_parser(read(file)),
      select settings from the "active" config file sections, and
      "validate" them (including the overwriting of affected settings).

      If you agree that this is a step in the right direction, I would try to
      prepare a patch.

      I'd propose making the --config flag simply store the path to the
      file, and then add it to the list of config files to be read.

      Mind, that the list of config files to read is processed before
      parsing the command line.
      An option parser based on "argparse" can use the parse_known_args() method
      to pre-parse the command line, append additional config files and proceed.
      ...

      Ideally, we would construct a single ArgumentParser object and add
      arguments "normally" through parser.add_argument.
      The problem here is that subclasses can filter settings_spec (through filter_settings_spec).
      ...
      if settings_spec tuples were treated as immutable, then things would be much simpler.

      Actually, both SettingsSpec instances and SettingsSpec.settings_spec
      tuples do not "mutate" in Docutils.
      filter_settings_spec() is only used in creating a settings spec tuple
      from the settings specification of a base or "sister" class in the class definitions of the "xetex", "html4" and "html5" writers.
      Other components ('pep_html", "s5_html") overwrite just some default values with SettingsSpec.setting_default_overrides and keep the parent's setting_spec.)

      The Sphix "html" writer loops over the base class' settings_spec tuple
      in order to set a default value in writers/html.py:59
      https://github.com/sphinx-doc/sphinx/blob/4.x/sphinx/writers/html.py
      This should be changed (before retiring the old format)
      to use settings_default_overrides instead.

      We could also move to a model (a la stdlib dataclasses where we
      simply specify the setting name, type and default within the Component
      itself.

      This sounds like what I have in mind with a new data format to replace the
      tuple SettingsSpec.settings_spec (mind, that Component inherits
      SettingsSpec).

      The front-end tools could then use that information (or a
      mapping of settings to command line options or similar) to create the
      CLI arguments.

      I would keep the processing of settings specifications in the "frontend"
      module with front-ends deploying them via the core.Publisher class.

      I think this would be cleanest, but this cannot be done overnight.

      But, maybe we should discuss more about a clean end state before
      (or in parallel) with a step-by step review of the optparse-argparse switch.

       
  • Adam  Turner

    Adam Turner - 2022-01-27

    I just rebased the branch onto the latest master
    https://github.com/AA-Turner/docutils/pull/8

    A

     
    • Adam  Turner

      Adam Turner - 2022-01-27

      Re-rebased
      A

       
      • Adam  Turner

        Adam Turner - 2022-01-29

        Re-re-rebased.

        It is a not-insignficant amount of work each time to rebase the changes, so please may I ask for a review, or to know what is blocking a review? ( @milde @grubert )

        I would like to progress these changes -- I've been holding off on other potential suggestions so as not to have merge conflicts with myself.

        A

         
        • Günter Milde

          Günter Milde - 2022-02-03

          I am sorry for the delay and the additional work. The change turned out to be far more complicated than expected when I first proposed it.
          In order to have informed choices, I had to study the current state in detail first. (See [r8996] for a by-product.)
          I propose to start with clean up and deprecation steps to smoothe the actual OptionParser -> ArgumentParser change. (A proposal for the two first steps is attached.)

           

          Related

          Commit: [r8996]

          • Adam  Turner

            Adam Turner - 2022-02-05

            These seem fine, I'd go ahead and apply them. For reference I opened https://github.com/sphinx-doc/sphinx/pull/10162 to update sphinx.writers.html to use settings_default_overrides.

            A

             
            • Adam  Turner

              Adam Turner - 2022-02-05

              For reference "these" are the two patches 0001-Small... and 0002-Clean... -- I reviewed these locally. I assumed that these replaced the earlier "argparse-gm.patch"

              A

               
    • Günter Milde

      Günter Milde - 2022-02-08

      Attached a review and patch proposal.
      More "high-level" planning and proof-of-concept files under sandbox/enhancement-proposals/settings-API

       
      • Adam  Turner

        Adam Turner - 2022-02-20

        Replying to comments from the review file here:

        • Use single quotes for strings (""" for docstrings) (↗policies.txt).
        • Avoid line length > 79 chars.

        Yep, will fix

        Missing the "usage" [argument].

        Added back, I removed it as it overrides the more helpful message that argparse provides by default but yes there is a remote possibility someone might be using it

        Not needed. Note the "Filenames will be tilde-expanded later." in the docstring.

        I moved the tilde expansion to the class constructor for a minor performance boost. What I forgot to do was remove the docstring comment.

        Changed behaviour: Users may set $DOCUTILSCONFIG=~/.config/docutils.conf.

        I don't think it is, as the "DOCUTILSCONFIG" in environment path includes tilde expansion. The fast path is only when "DOCUTILSCONFIG" does not exist (likely the majority of cases).

        Place at end of file.

        Sure

        We have 3 settings sources: components, config files, command line: * Are settings after updating from config files still "default" settings?

        You could argue yes, the settings are the default for the particular environment. Happy to take better suggestions, though.

        I propose a more versatile variant that allows for "settings_overrides" and customization of the config file names.

        I see the idea, but I would very strongly advise against -- this is meant to be a simple convinience function. If the more complex functionality is needed later, we can always either add it to the same function, or add a new one.

        I propose a more general helper function ... Publisher.get_components()

        This looks good. I would document it as a private / provisional, method, but I think this could be used now.

        A

         
        • Adam  Turner

          Adam Turner - 2022-02-20

          I have pushed the changes from the review and Publisher.get_components into https://github.com/AA-Turner/docutils/pull/8

          A

           
1 2 > >> (Page 1 of 2)

Log in to post a comment.

Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.