Skip to content

Remove documentation build bugs #204

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Jan 31, 2024
Merged

Remove documentation build bugs #204

merged 11 commits into from
Jan 31, 2024

Conversation

kroenlein
Copy link
Collaborator

This PR primarily serves to remove warnings from the gemd-python documentation build pipeline. It also brings the gemd-python documentation more in line with the citrine-python docs. Specifically:

  • Moves the version into a separate file, allowing it to be added to the doc build
  • Updates stale links
  • Corrects formatting and spelling issues in some docstrings
  • Adds __all__ lists to packages people might want import from and reformats the existing lists to be strings.
  • Adds Citrine branding

Unfortunately, this does not yet resolve two warnings:

  • The automated documentation generation creates a root reference/gemd.rst that is not referenced elsewhere
  • Classes that are imported as top-level methods created warnings about ambiguity in cross-reference; however, all generated links correctly link back to the docstrings.

@kroenlein kroenlein requested a review from anoto-moniz January 8, 2024 22:50
@kroenlein kroenlein marked this pull request as ready for review January 9, 2024 00:03
@anoto-moniz
Copy link
Contributor

Unfortunately, this does not yet resolve two warnings:

  • The automated documentation generation creates a root reference/gemd.rst that is not referenced elsewhere
  • Classes that are imported as top-level methods created warnings about ambiguity in cross-reference; however, all generated links correctly link back to the docstrings.

Is there any reason not to fix those warnings? Referencing reference/gemd.rst in the main index page is a simple (and useful) fix to bullet one. And even if the generated links all work now, changing the underlying labels to get rid of the warnings seems beneficial, if tedious.

import os
import sys
sys.path.insert(0, os.path.abspath('../../gemd'))
sys.path.insert(0, os.path.abspath('../..'))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using apidoc causes that reference/gemd.rst file to be generated. By having it start outside of the gemd/ folder, you're causing its headline to be whatever folder contains ../... So I'd suggest leaving this as ../../gemd.

setup.py Outdated
this_directory = path.abspath(path.dirname(__file__))
about = {}
with open(path.join(this_directory, 'gemd', '__version__.py'), 'r') as f:
exec(f.read(), about)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we use exec here? Instead of int(f.read())?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. Let's change it, unless you have some objection. I'll do the same for the SDK.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think the new version hits your goals, but I'm kinda inclined to just have

from gemd.version import __version__

even though I know that's not wise.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just always wary of an unnecessary exec, although it's not the most harmful here, sure. You could use the line you put above. But my personal preference is opening the __version__ file as plaintext, grabbing only its first line, and regexing for the version string. That is, what you've put in, except I don't think we need to have __version__ as a variable in the version string file.

But it's not a huge deal either way, so I'll leave the final call for gemd up to you.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As weird as it is, as far as I can tell having __version__ in file is actually best practice.

https://stackoverflow.com/questions/458550/standard-way-to-embed-version-into-python-package

A regex extraction seems gross to me, except it also seems to be the best choice. We could also try

https://pypi.org/project/setuptools-scm/

if we want to try going down the auto-semantic version approach.

For today, I'll just make this approach work.

anoto-moniz
anoto-moniz previously approved these changes Jan 9, 2024
Copy link
Contributor

@anoto-moniz anoto-moniz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few little comments, but no blockers.

@kroenlein
Copy link
Collaborator Author

Unfortunately, this does not yet resolve two warnings:

  • The automated documentation generation creates a root reference/gemd.rst that is not referenced elsewhere
  • Classes that are imported as top-level methods created warnings about ambiguity in cross-reference; however, all generated links correctly link back to the docstrings.

Is there any reason not to fix those warnings? Referencing reference/gemd.rst in the main index page is a simple (and useful) fix to bullet one. And even if the generated links all work now, changing the underlying labels to get rid of the warnings seems beneficial, if tedious.

I'd looked at referencing the generated gemd.rst and it really doesn't add anything of quality to the docs landing page.

If you know how to the change the underlying labels, I'm all ears. I have been unable to find a solution on this. An example error is:

WARNING: duplicate object description of gemd.entity.object.ProcessSpec.ingredients, other instance in reference/gemd.entity.object, use :noindex: for one of them

This is because apidoc is crawling the library and saying the method imported into object/init.py from object/ProjectSpec.py is ambiguous with the object defined in object/ProjectSpec.py.

@anoto-moniz
Copy link
Contributor

anoto-moniz commented Jan 10, 2024

If you know how to the change the underlying labels, I'm all ears. I have been unable to find a solution on this. An example error is:

WARNING: duplicate object description of gemd.entity.object.ProcessSpec.ingredients, other instance in reference/gemd.entity.object, use :noindex: for one of them

This is because apidoc is crawling the library and saying the method imported into object/init.py from object/ProjectSpec.py is ambiguous with the object defined in object/ProjectSpec.py.

That's the other type of warning, where it's a class e.g. "WARNING: duplicate object description of gemd.entity.bounds.categorical_bounds.CategoricalBounds, other instance in reference/gemd, use :noindex: for one of them". The issue is that it's in multiple __all__ statements, which seems to be a bug in sphinx-apidoc: sphinx-doc/sphinx#11050.

But the one you quote, which is referring to a field, is because the field is given a docstring both in the class docstring and on the field/property itself. The solution is to move that description to the field/property directly, such as I did for citrine-python: https://github.com/CitrineInformatics/citrine-python/pull/910/files#diff-2d687c2d393b3a84762b6673dc4b134d1471c98a87c94bef1e7b004e43cfc4b5

I'd really like to find some solution to that first one, as it results in a long list of warnings, rendering the output of the build much less useful.

@kroenlein kroenlein requested a review from anoto-moniz January 17, 2024 02:51
@kroenlein
Copy link
Collaborator Author

Unfortunately, this does not yet resolve two warnings:

  • The automated documentation generation creates a root reference/gemd.rst that is not referenced elsewhere
  • Classes that are imported as top-level methods created warnings about ambiguity in cross-reference; however, all generated links correctly link back to the docstrings.

Resolves everything but the no-index bug.

@@ -80,7 +79,7 @@ def union(self,

Parameters
----------
others: Union[CategoricalBounds, CategoricalValue]
others: Union[CategoricalBounds, ~gemd.value.categorical_value.CategoricalValue]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick?
In some places in the docs, you use Union when a parameter can accept multiple types. In others, such as gemd/builders/impl.py above, you use "or". Is there an intended distinction?

@@ -24,11 +24,12 @@ def __init__(self, components=None):

@property
def components(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: missing a return type.

Other bounds or value objects to include.

Returns
-------
CategoricalBounds
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, these docs were just straight up wrong? Fun!

def _local_dependencies(
self
) -> Set[Union["gemd.entity.base_entity.BaseEntity", # noqa: F821
"gemd.entity.link_by_uid.LinkByUID"]]: # noqa: F821
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: Why does this list subtypes that can be returned? As the parent class, it doesn't know. It only knows (insofar as typing can) that each element will be a subclass of HasDependencies.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a subclass, it's a mixin / interface specifically designed to say what objects need to be included in a serialization of this object. It was added to support the register_all method for GEMD objects creating batches of self-supporting objects

https://github.com/CitrineInformatics/citrine-python/blob/54b9fb8a0658dfe0c98315260d1a7e1f972d2a53/src/citrine/_utils/batcher.py#L58

Might make sense to make it an abstract method on BaseEntity instead, except it's used by BaseObject & BaseTemplate & BaseAttribute objects, but not by other Entities. The object model is really not clean in its dependencies.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point still stands: a mixin also doesn't know which types implement it.

Copy link
Collaborator Author

@kroenlein kroenlein Jan 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A valid return value from _local_dependencies could be a PropertyTemplate, because something can depend on a PropertyTemplate. A PropertyTemplate is not a HasDependencies, because it can't depend on anything.

The return values from _local_dependencies are not necessarily HasDependencies objects.

@@ -17,7 +17,7 @@ class HasPropertyTemplates(HasDependencies):

Parameters
----------
properties: List[(PropertyTemplate, BaseBounds)]
properties: List[(~gemd.entity.template.property_template.PropertyTemplate, BaseBounds)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This type for properties is quite different from the type hints on the actual parameter.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An English language description of the type here is a list-like object, where each element of the list is one of:

  • A PropertyTemplate
  • A LinkByUID that maps to a PropertyTemplate (unknowable / unenforceable)
  • A Tuple that contains:
    • A PropertyTemplate or a LinkByUID
    • A Bounds object

The signature lines up with that, but that's a pretty complicated thing to put into documentation. Probably just needs an essay.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going to document the types, then we should be as accurate and precise as possible. Otherwise, there's no point.

If we can't describe the types we accept as English, then using a code like structure similar to the type hint is a reasonable move. Because while it's not as important that the documentation of inputs cover every case, that both type hints show up in the docs may cause some confusion.

But I'm more concerned with disagreements between the docstrings and type hints with properties and other values to be read. For example, parameters documents its return as List[Union[ParameterTemplate, LinkByUID]], which is quite different from the docstring's claim of List[(ParameterTemplate, BaseBounds)]. A list of elements which are either ParameterTemplate or LinkByUID objects is not the same as a list of tuples, where each tuple is one ParameterTemplate followed by a BaseBounds.

Although this raises the question of why we must include the type in the docstring and inline. It would seem to be redundant, as the generated Sphinx docs include both, and thus just introduces chances for conflict, like this.

anoto-moniz
anoto-moniz previously approved these changes Jan 19, 2024
Copy link
Contributor

@anoto-moniz anoto-moniz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've registered my disagreements on the way we're documenting types here. They're not accurate, and bound to be confusing to users should they look at the documentation.

However, this is just documentation, not a bug. Worst come to worst, they'll learn the discrepancies when they go to use it. More importantly, these are not discrepancies you introduced: they already existed, and you just happened to touch the lines bringing them to my attention.

As such, I'll step out of the way, since the PR doesn't make anything worse, and improves a bunch of stuff. In the future, I would like to see us be better about documenting types and keeping docstrings in line with type hints if we're to continue using them.

@kroenlein
Copy link
Collaborator Author

This should address everything but the inconsistencies in types for the has_X_templates family of modules. We can revisit, but I think part of the challenge is that the setters are very forgiving, and the Parameters fields line up with the returns (or at least I think they do) not with the acceptable incoming data.

sphinxcontrib-apidoc==0.5.0
sphinx==5.0.0
sphinx-rtd-theme==1.0.0
sphinxcontrib-apidoc==0.3.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason for reverting these requirements?

@kroenlein kroenlein merged commit 2af5020 into main Jan 31, 2024
@kroenlein kroenlein deleted the maintain/clean-doc-build branch January 31, 2024 21:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants