Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrectly Failing Validation for Circular Reference (Array) #130

Closed
ndimares opened this issue Jun 19, 2023 · 18 comments
Closed

Incorrectly Failing Validation for Circular Reference (Array) #130

ndimares opened this issue Jun 19, 2023 · 18 comments
Assignees
Labels
enhancement New feature or request

Comments

@ndimares
Copy link
Contributor

@daveshanley I'm running into a circular reference issue that I think is actually valid syntax:

ProductCategory:
      type: "object"
      properties:
        id:
          type: "string"
          description: ""
        name:
          type: "string"
          description: "The name of the category"
        children:
          type: "array"
          items:
            $ref: "#/components/schemas/ProductCategory"
          description: "Array of sub-categories in the same format."
      required:
        - "id"
        - "name"
        - "children"

I think this gets flagged as an infinite circular reference because it is a recursive structure that has a required field. However, I feel like it should be valid, because an empty array seems like a valid way to break the circle.

@ndimares
Copy link
Contributor Author

Note, this is very similar to: #113

@daveshanley
Copy link
Member

daveshanley commented Jun 21, 2023

I have been thinking about this. If I am a compiler, this looks 100% cyclical to me. That's the point of the warning. Depending on how the spec is used, it may or may not create problems down the line.

What should I tell the parser here? If it MUST follow the reference to ensure validity (required) and the chain immediately loops back on itself - there is no way to know when to stop.

The flag is there for you to know that this is a problem. However, it's up to you to decide whether you care. The library will continue operating normally (that's why the SchemaProxy design exists, to stop run-away chaos).

@daveshanley daveshanley added the documentation Improvements or additions to documentation label Jun 21, 2023
@TristanSpeakEasy
Copy link
Contributor

Hey @daveshanley we are struggling to deal with this one, while we could just ignore the error and continue with the model I don't feel we get enough info back from libopenapi to be able to make the correct decision on a circular reference error created by this case. I was largely heading down a road of reimplementing a lot of the recursive walking code in libopenapi to be able to get enough context on things like this nodes parents etc and tracking recursion.

I would largely also consider it equivalent to the special case you have in for not treating non-required properties as circular reference errors, in that you ensure the parse doesn't infinitely recurse on them with some short circuiting and this case could be handled in the same way.

As the above is a legitimate way of designing a document and the json that matches it is perfectly valid I would expect there to be no errors returned from parsing this document (or any that have "valid" circular references vs "invalid" ones there are a few of these "valid" special cases, like non-required properties/arrays/oneOf etc)

@daveshanley
Copy link
Member

I am not sure how else to determine 'what is a circle' without there being some definitive set of rules on which we all agree and then potentially adding a config to those rules, so we can dial them down for a compatibility mode and then turn it up for hard checking.

@daveshanley daveshanley reopened this Aug 21, 2023
@daveshanley
Copy link
Member

There has been a good deal of work on rebuilding parallelization in #148, What kind of changes would you want to make to the existing code to improve the recursive walking? I am open to all suggestions.

@TristanSpeakEasy
Copy link
Contributor

TristanSpeakEasy commented Aug 21, 2023

I was looking at how the current short circuiting worked here https://github.com/speakeasy-api/libopenapi/blob/f28d681837625556b8e739b59025a4ef74caa444/resolver/resolver.go#L256 and started down the route of determining how to evolve this for this use case it would need to know about the parent of the current node it is checking to be able to see if it was an array type and then if so set IsInfiniteLoop to false.

But it was just a little bit too much refactoring for me to do with fearing breaking code I'm not too familiar with

@daveshanley
Copy link
Member

What do you consider to be valid use cases where a loop is permitted? This will guide the requirements for any modifications we need to make to make this code more intelligent.

@TristanSpeakEasy
Copy link
Contributor

So the list we have currently identified that we have actually seen used in commercial OpenAPI documents:

  • Non-Required Parameters
  • Arrays with items with no minimum number of items
  • Loops through oneOf/anyOf

Theoretical valid loops (haven't seen these used so could be de-prioritized) would be through:

  • additionalProperties
  • Arrays with contains with no minimum number of contains

@daveshanley
Copy link
Member

Do you see a need for a level concept for errors within these loops? e.g. fatal, warning, info.

Would you want to configure this, to loosen or tighten the rules about what is and what is not a loop?

@daveshanley daveshanley added enhancement New feature or request and removed documentation Improvements or additions to documentation labels Aug 25, 2023
@TristanSpeakEasy
Copy link
Contributor

Not really no.

from my perspective you have "valid loops" == "no errors" and then "invalid loops" == "error"

If useful I can try and dig up the discussion around this in the OpenAPI community but the concept of "valid loops" is actually part of the specification or at least it was official acknowledge by the OpenAPI team that its valid to define documents this way

@TristanSpeakEasy
Copy link
Contributor

Sorry @daveshanley here was the discussion OAI/OpenAPI-Specification#822 (comment) I did miss remember the outcome and it was less its an official part of the spec and more that is something not explicitly denied by the spec.

Basically OpenAPI is:

leaving it up to the user to ensure that schemas are satisfiable and logically consistent

So from the usage we have seen from companies that are defining commercial specifications in the wild and our own experience with this, there are a number of places where schemas defined certain ways can be satisfiable and logically consistent and we want to allow those users to define the specs this way.

Anyway just wanted to ask if you had done anymore thinking around this, would love to see a solution for this one way or the other, that would make it easy for us to support specifications like this.

@daveshanley
Copy link
Member

Thank you for this link, I am reading through it now and will start ramping up on this problem.

@daveshanley
Copy link
Member

daveshanley commented Sep 15, 2023

I have been looking at this today and thinking about the problem.

The general consensus is that leave it up to a user to decide. So libopenapi should facilitate that.

  1. The resolver has explicit logic to look at polymorphic references
    (https://github.com/pb33f/libopenapi/blob/main/resolver/resolver.go#L373)

This was a decision I made to be 'hard' on all circular references. I still think the logic is valid and I don't want to lose that fidelity, but I also want to make this more flexible. I want compiler knowledge level of the loops, polymorphic or not, but this has to be intelligent to know what came from where.

  1. There is a choice to not short circuit on circular references (https://github.com/pb33f/libopenapi/blob/main/document.go#L254)

So circular references don't break the model, they can be ignored - however this is still not going to resolve the issue of not knowing what to listen to.

How I think this should be solved.

  1. Update the CircularReferenceResult (https://github.com/pb33f/libopenapi/blob/main/index/circular_reference_result.go#L6) to support properties that expose if the result came from a polymorphic result (oneOf, allOf, anyOf) or it came from an array.

  2. Add a configuration option to SpecIndexConfig (https://github.com/pb33f/libopenapi/blob/main/index/index_model.go#L50)

Options to ignore polymorphic circular references and ignore arrays. This way the resolver will simply skip the 'hardcore' deep lookup for polymorphic/array references.

Thoughts?

@daveshanley
Copy link
Member

These levels could then be configured in downstream tools like vacuum, to be more 'friendly'.

@TristanSpeakEasy
Copy link
Contributor

So from an end user perspective they would get errors with more info by default but they could turn off the errors being reported altogether for those types of circular references?

@daveshanley
Copy link
Member

So from an end user perspective they would get errors with more info by default but they could turn off the errors being reported altogether for those types of circular references?

Yes, essentially "here is more data about why it was flagged (it's a loop through a polymorphic oneOf ancestor)", however, I don't care about polymorphic circular references, so ignore them and don't record the loop.

I need this approach because in order for me to build deeper analysis tools, I need to keep the graph in-tact so visually I can render out circles. However those circles as you have mentioned, may or may not be breaking for a users specific needs, so libopenapi can be told to just ignore them.

@daveshanley daveshanley self-assigned this Sep 18, 2023
@daveshanley
Copy link
Member

This is my next build.

@daveshanley
Copy link
Member

from v0.11.0 This can now be configured.

PR is here: #173
Docs are here: https://pb33f.io/libopenapi/circular-references/#circular-reference-results

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants