It turns out that the API client does not have a cache, so this theory is invalidated. We have added the verbose flag to the cf CLI, hoping for more details the next time it flakes.
It turned out that this flake is not related to caching, but to org deletion being slow when pods are in the Initializing state. We have decided to ignore this in the tests. For more info: #3061
We have seen the following flake in e2e periodic tests:
https://ci.korifi.cf-app.com/teams/main/pipelines/main/jobs/run-e2es-periodic/builds/11735
According to the test output, the test first creates an org `dorifi` and, after that succeeds, executes `cf target -o dorifi`. Targeting the org results in an `org not found` error.
We have analysed what the CLI does on targeting: it lists the orgs by name and, if the result is empty, returns the `not found` error.
The API waits for the org ready condition, therefore the theory that the API does not wait for it does not hold.
We believe that the problem might be that once the org namespace is created and the user role bindings are propagated into it, the cache of the API shim has not seen the role binding yet. Listing orgs therefore yields an unauthorised error, which the API masks by returning an empty list.
In order to address this, could we turn off the k8s client cache completely in the API shim? All API operations would then talk to the k8s database directly, which would probably eliminate caching issues. Furthermore, by not using the client cache, we could experiment with removing the retrying client (although this might cause flakes if there are multiple etcd instances).
According to the flake hunter, this flake is not likely to occur:
Therefore it might be hard to confirm whether we have fixed the issue.