Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Variable self ref #13

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

johnpmcconnell
Copy link

@johnpmcconnell johnpmcconnell commented Feb 21, 2021

This set of changes implements a mechanism for specifying a replacement for "self" in the generated signatures (see #11). I had to make some fairly major changes to make this work, so please review them carefully. If there's better ways of solving any of these problems, I'll be happy to rework my changes.

The concerns I have:

  1. I decided to call the appearance of "self" in the output "signature" a "self reference." I'm open to reworking this with better naming.
  2. I chose to use a meta info field for specifying the new name. I chose this because I couldn't find any other simple way of providing an input value to the py:method domain directive; it seems to be effectively closed to extension. Is there a better approach to providing the name?
  3. Because I used a meta info field, the existing transform was occurring much too late in the process of generating documentation. By the time it was invoked, meta info fields had been filtered out of the docstring. To solve this, I had to convert the transform into an event handler and specify its priority to invoke it before the meta filtering event. Is there a better way to make the transform happen early enough in the process?
  4. The event I believed to be the most appropriate choice doesn't pass the method's full document node directly; it passes the docstring node, whose parent is the method's full node. This means I had to traverse one level above the node in the handler to replace the signature. Is doing so bad style, and can we avoid doing so?
  5. I added several tests to ensure I didn't break anything and to test my new changes: an autodoc test, a test on explicit use of py:method with a selfref field, and a test on autodoc with a selfref field. One thing I don't like about this is that there are now four separate files listing all the methods this project covers. To try to ensure that they remain in sync as the project covers more special methods, I gave the tests a shared expected output (with some filtering to simplify the autodoc cases). Is there a better way to maintain or enforce consistency between all the files? We could generate them automatically somehow, but it would make the project more complex.

@johnpmcconnell johnpmcconnell marked this pull request as draft February 21, 2021 21:45
@johnpmcconnell johnpmcconnell marked this pull request as ready for review February 21, 2021 22:02
Copy link
Collaborator

@eric-wieser eric-wieser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this, it looks very thorough. I'm not super happy with the name selfref, but also can't think of any better names.

:meta _: seems like the right way to go here.

@johnpmcconnell
Copy link
Author

johnpmcconnell commented Feb 22, 2021

@eric-wieser To be completely honest, I'm not satisfied with the name, either. I definitely do not mind rebasing to change it to a better name. I thought it was better than my original idea of selfname, though. We could go with something like objref or objname? But I'm not sure those are any better; they seem slightly less clear to me. Names based on self have the virtue of clearly referring to the object being documented. I think part of the problem is there's really no standard terminology for the different parts of the example code we're presenting here.

FYI, I'm also working on an implementation of with, but I'd like to get this merged before I submit a PR for it, since they'll conflict if I submit totally separate ones.

Let me know if you have any requests for changes. Thanks!

Comment on lines 218 to 219
# Priority must be lower than Python domain meta filter to ensure it runs
# before meta info fields are removed.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does a similar reason force us to not use SphinxTransform, or did you just switch to 'object-description-transform' for simplicity?

Copy link
Author

@johnpmcconnell johnpmcconnell Feb 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this was the reason I switched away from SphinxTransform. I couldn't find any mechanism for getting an instance of SphinxTransform to run before meta was filtered out. Even setting the default_priority didn't make it run soon enough. It is entirely possible I just don't know what I'm doing, but I was pretty thorough in looking. I did quite a bit of hooking up events and debugging to see what order things happened in, and I looked through a decent bit of source code trying to find something. (Frankly, the Sphinx source code was too confusing for me to find where transforms are actually applied. I figured out that it's implemented in docutils, which explains why I couldn't find it.) This event was always fired well before the transform was applied, and you can see in the Python domain that this event is where meta filtering happens. It did seem like a logical place to hook into as well: we're applying a transform on method objects after they get initially parsed.

Copy link
Author

@johnpmcconnell johnpmcconnell Feb 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've done some more research on this, and while I could be mistaken because it's difficult to piece together what is invoked where and in what order, my findings seem to confirm that there's not a better way to do this at present.

As noted before, meta info fields are cleaned on the object-description-transform event. This event occurs when an ObjectDescription directive is run.

The documentation on Sphinx's core events seems to suggest that SphinxTransforms added to the application object are run much later than these directives. Specifically, they're run in the same phase as the doctree-read, which occurs after the document is parsed and directives are executed.

Maybe I should look at implementing a custom info field instead of using meta?

Copy link
Author

@johnpmcconnell johnpmcconnell Feb 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eric-wieser Sorry for all the messages.

Somewhere along the line in reading more, I stumbled across the pending element. I figured out how to use this to implement a directive that leaves a sort of "marker" for the transform to process later. It looks like this in usage:

.. py:method:: __await__()

   .. selfref:: obj

I also implemented a custom info field (branch is now deleted), which looks like this:

.. method:: __add__(other)
    Docstring
    :selfref: obj

Both of these implementations preserve the usage of SphinxTransform. I'm not sure which is more in line with normal docutils/Sphinx patterns. Both feel like they break the mold to me:

  • pending was really designed to invoke its own transform, rather than act as a parameter for another one. While I could potentially implement a second transform that goes back and does even more to the signature after PrettifySpecialMethods does its work, I felt like that posed a much greater risk of some kind of mistake having an unintended side effect or some particular form of the signature ending up wrong. But I can go back and see about it if you really think I should.
  • The info fields aren't designed to be parameters for some other transform either. They're supposed to specify additional information about the object being described, except for meta which is really only used by autodoc for filtering purposes. I also don't like that the info field gets pre-processed and rendered into another form before our transform processes it. (That's why a call to lower is required.) I worry that changes to the code that acts on it could somehow break our transform in the future.

It doesn't seem like augmenting an existing directive with additional transforms and parameters is something the system is really designed for, unless I'm missing something.

Anyway, please take a look at both of those implementations and let me know what approach you feel is best going forward. I favor either the meta info field or the directive with pending, primarily because I don't like the custom info field getting processed by some other transform.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the thorough investigation - I agree, the meta version seems simplest right now, but feel free to keep looking for better approach if you want to.

Otherwise, I think modulo a little more thought about better words for selfref, this is good to merge.

Copy link
Author

@johnpmcconnell johnpmcconnell Feb 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eric-wieser The more I work with this and consider it, the more I think that a directive is a better choice than an info field, even a meta one. Using a directive lets us avoid hooking into events entirely, avoids potential conflicts with changes to Sphinx's Python domain, and is more semantically correct.

  • Directives are the standard mechanism for mark up, and this parameter represents a kind of mark up that changes how the final output is rendered. Info fields provide additional descriptive information about the item being documented, and this is not additional descriptive information.
  • The directive also lets us very cleanly segregate our code and elements from everyone else's, making it less likely that some other extension or core function will accidentally process ours incorrectly.
  • If we use a meta info field, we're absolutely forced to plug into an event and catch it before the built in event erases it. Even with a non-meta info field, we should use an event to catch it before it gets rewritten.

In a more ideal world, a new option to py:method would be the most semantically correct, but I don't believe that extending it is a good idea even if it's possible. As far as I can tell, it is not a supported approach, meaning even a working solution is going to be hacky, fragile, and likely to conflict with other extensions trying to do similar tasks.

I suppose the one argument I can think of in favor of an info field is that it looks like an option to the py:method directive, but I'm not sure that's necessarily a good thing. It requires the user to place whitespace before it if writing reST directly, which is confusing if we want users to think of it as an option.

Do you have any strong objection to using a directive instead?

@johnpmcconnell
Copy link
Author

I'm reverting this to draft. I have an idea that might be more in line with Sphinx's and docutil's intended usage.

@johnpmcconnell johnpmcconnell marked this pull request as draft February 23, 2021 04:57
README.rst Outdated

.. method:: __add__(other)
Docstring
:meta selfref: obj
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about :meta self-param: obj, which is remiscent of :param: obj? Then use self_param in the python code.

The source of inspect.py uses self_parameter, so I think self_param is a sensible choice.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eric-wieser I agree that "self param" is a better name. I'm using it in the revision I'm making.

@johnpmcconnell johnpmcconnell force-pushed the feature/variable-self-ref branch 2 times, most recently from 5f42ba5 to 5ee986c Compare February 27, 2021 03:22
@johnpmcconnell johnpmcconnell marked this pull request as ready for review February 28, 2021 00:34
@johnpmcconnell
Copy link
Author

johnpmcconnell commented Feb 28, 2021

I'm putting this back into ready state for you to merge if you want.

I still believe that the directive based implementation is a big improvement, but if you'd rather stick with the meta parameter, then that's your decision. Let me know if you want to pursue the directive instead.

@eric-wieser
Copy link
Collaborator

A directive seems like the wrong model to me - directives are usually used by a visible block of content in sphinx. I suppose a ..magicmethod:: directive that replaced ..method would still fit that model, but that feels like quite a big change that probably deserves its own PR.

@johnpmcconnell
Copy link
Author

johnpmcconnell commented Mar 2, 2021

I don't think a standalone ..magicmethod:: directive is a an approach that will succeed because autodoc doesn't seem to have good hooks for modifying what directive is emitted.

Ultimately, it feels like we're trying to pick from several less than ideal (maybe even bad) options because the functionality we really need isn't available/supported. We could think about trying to submit upstream patches to Sphinx to make them available... but I don't know how likely to succeed that is or what kind of timeframe it means.

Barring upstream modifications, I feel like the directive is the least bad.

  • It is true that directives usually output some visual element, but that's not strictly 100% the case. Directives like sectionnum modify other elements, for example, and I feel like this is moderately similar to what we're doing. There are other directives that also have very indirect effects on the output's appearance: default-domain and class (which Sphinx aliases as rst-class) are good examples.
  • The only precedence of using an info field to affect output visually is to tag the element for a global filtering mechanism, and even in that case, the field is essentially descriptive rather than specifying a direct modification. The actual specification of the filter to apply is done via options on a directive instead.
  • All other instances of info fields also create direct visual output, meaning that they also usually create a visual block of content just the same as directives. So we're somewhat violating that norm regardless of what implementation we choose. I think that norm is already violated more strongly by Sphinx and docutils themselves for directives than info fields.
  • The directive also gives a lot of useful features out of the box, particularly better enforcement that it's used properly. We get detecting misspellings (errors from the system if there's a typo), enforcing argument count, built in support for deprecation if better implementations come along, and easier to extend if we find out this feature is more complex than we thought (things like options and additional arguments).

I also think I should point out that this problem is going to get bigger with the with statement. There are many more things that users ought to be able to control with that output:

  • The target name (the variable that stores the return value of __enter__)
  • The expression should have options for a variable chosen by the user, a constructor call to the class (since that's such a common idiom), and some default if left unspecified
  • Whether to include a type annotation for __enter__'s return value (This can only be accomplished by placing the type declaration of the target variable on a preceding line, which some users may find too cluttered)

And these sorts of complexities may come up in other protocols yet to be implemented as well. Looking ahead, a custom directive seems like it will be a cleaner solution for this more complex case:

.. contextdisplay::
   :ctxvar: myctx
   :hidetype:
   :initexpr:

(Obviously, that's very rough and needs some naming work, but just to give you an idea.)

The usage enforcement features become even more important with this level of complexity, and if a directive is a better solution for this, then a directive for self would be no worse semantically and more consistent in the long run.

Sorry for dragging this out. If I had done a better job understanding Sphinx and docutils from the get go, we might not need to go through all this discussion. And while I know this doesn't seem like a big deal by itself, I'm also trying to look forward to ensuring this extension maintains a fairly consistent experience to make using it easier.

@johnpmcconnell johnpmcconnell marked this pull request as draft March 5, 2021 06:17
@johnpmcconnell johnpmcconnell marked this pull request as draft March 5, 2021 06:17
@johnpmcconnell johnpmcconnell force-pushed the feature/variable-self-ref branch from 5ee986c to 6451d43 Compare March 10, 2021 08:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants