Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unencrypted tokens and parser #822

Open
pkly opened this issue Feb 22, 2022 · 17 comments
Open

Unencrypted tokens and parser #822

pkly opened this issue Feb 22, 2022 · 17 comments

Comments

@pkly
Copy link

pkly commented Feb 22, 2022

Hi, first off, love the lib. It's even better in 4.x than it was in 3.x.

One thing bothers me a bit though, since the interfaces moved around and claims() got moved to the UnencryptedToken interface it left a weird gap in the Parser interface, since it's not possible to get a "proper" decoded token interface via the default implementation.
Yeah, it's returning an unencrypted plain token, but it's a bit weird that you can't simply force that.

I would like to propose a solution to this problem, just like there's a Token interface and UnencryptedToken interface, maybe let's have Parser and UencryptedParser interface? The default implementation could stay the same, the UnencryptedParser interface would simply add a new method - parseUnencrypted(string $jwt): UnencryptedToken or something like that.

There'd be no modification required on the Parser class, simply changing the interface and having both methods return the same object.

@Ocramius
Copy link
Collaborator

What I'm kinda missing is what we're trying to achieve: what is the use-case, at high level/architecture?

@pkly
Copy link
Author

pkly commented Feb 22, 2022

We have our own oauth2 server which uses the jwt tokens for access. Sometimes we need to decode it to check for a few things, including ids. This is only possible by using claims, which are not available on the Token interface, but were easier to access in the 3.x library.

@Ocramius
Copy link
Collaborator

Ah, so you mean that there should be a parser that guarantees that a non-encrypted token is returned.

Perhaps it would be a good idea to have an UnencryptToken#__invoke(Token|UnencryptedToken|EncryptedToken $token): UnencryptedToken, that you can funnel tokens through, and that always guarantees a cleartext token to be usable?

@pkly
Copy link
Author

pkly commented Feb 22, 2022

I don't think it can always be assumed that a theoretically encrypted token could be decrypted without the use of a service with some configuration

@Ocramius
Copy link
Collaborator

That would be a service :)

@pkly
Copy link
Author

pkly commented Feb 22, 2022

Oh, sorry, I misread.
Yeah I suppose that would also work.

@Ocramius
Copy link
Collaborator

@lcobucci WDYT? That would solve/remove the problem of casting a Token into an UnencryptedToken, which currently needs to be done via assert(...), in order to be type-safe, such as in:

jwt/src/JwtFacade.php

Lines 37 to 45 in 5f1fbfd

public function parse(
string $jwt,
SignedWith $signedWith,
ValidAt $validAt,
Constraint ...$constraints
): UnencryptedToken {
$token = (new Parser(new JoseEncoder()))->parse($jwt);
assert($token instanceof UnencryptedToken);

@SvenRtbg
Copy link
Contributor

There are some oddities around, IMHO. I'm just dealing with creating tokens, so after reading the docs I wondered: How can they suggest calling claims() without asserting what is returned from the builder? And looking at the Lcobucci\JWT\Builder interface, I was even more puzzled: public function getToken(Signer $signer, Key $key): Plain; - Plain is the implementation of UnencryptedToken, but why is an interface responding with an implementation, and not with an interface for that implementation?

So the only available builder is explicitly forced to return a Plain implementation, and the only available parser is forced to return the most generic Token interface, but internally also only returns Plain for now. I understand that maybe this is reserving room for encrypted tokens somehow, or any other type of token that are not exactly UnencryptedToken, but what can be in between the two? UnencryptedToken::claims() is returning any claims a token may have, while Token has plenty of specific validator methods that also simply access that claims dataset object in the only implementation - and technically they are simply put into the same location in the JSON string.

So there is this unexplained gap, and having a dedicated method that explicitly returns UnencryptedToken from the parser gets +1 from me.

@pkly
Copy link
Author

pkly commented Feb 22, 2022

Yes, that seems rather weird as well. I'd suggest just returning UnencryptedToken in both instances.

@Ocramius
Copy link
Collaborator

from the parser

It would be a separate object :)

@lcobucci
Copy link
Owner

lcobucci commented Feb 22, 2022

Folks, thanks for sharing your points!
It's the feedback I wanted to receive during the beta/rc releases of v4.0.

And looking at the Lcobucci\JWT\Builder interface, I was even more puzzled: public function getToken(Signer $signer, Key $key): Plain; - Plain is the implementation of UnencryptedToken, but why is an interface responding with an implementation, and not with an interface for that implementation?

The new interfaces were created in v4.0 and the token was designed to be just a value object - which is generally okay for an interface to depend on.
UnencryptedToken was only introduced in v4.1 - and we didn't want to change the signature of the interface (even though it's not technically a BC-break since types are still compatible).

One thing bothers me a bit though, since the interfaces moved around and claims() got moved to the UnencryptedToken interface it left a weird gap in the Parser interface, since it's not possible to get a "proper" decoded token interface via the default implementation.

This is similar to the discussion in #814.

I didn't have the time to finish that yet (which is why #814 is also lingering) but I'm documenting the JwtFacade and thinking a lot about the gap you're describing.

To me, the JwtFacade is exactly what we should use to solve it - and we get the benefits of a safer usage of tokens.
That API already provides the necessary guards to always return unencrypted tokens, so I wonder why do we need to introduce yet another one.

The effort of implementing that kind of parser is really minimal:

use Lcobucci\JWT\Encoding\JoseEncoder;
use Lcobucci\JWT\Token\Parser;
use Lcobucci\JWT\UnencryptedToken;

final class UnencryptedTokenParser implements \Lcobucci\JWT\Parser
{
    private function __construct(private readonly \Lcobucci\JWT\Parser $parser) {}

    public static function create(): self
    {
        return new self(new Parser(new JoseEncoder()));
    }

    public function parse(string $jwt): UnencryptedToken
    {
        $token = $this->parser->parse($jwt);

        if (! $token instanceof UnencryptedToken) {
            throw InvalidTokenStructure::tokenIsEncrypted();
        }

        return $token;
    }
}

My question is: is this really the direction we want to go?

I kept this forward compatibility layer with the hope of supporting encryption without having to break BC (#6 is open for years and years). However, what I'm getting from users is: we don't really need encryption.

If that's the case, we should aim at simplifying things (even if that would require a v5 soon - which is completely fine btw) instead of introducing more and more components.

@Ocramius
Copy link
Collaborator

I'm not suggesting a new Parser implementation. I am suggesting a new interface:

interface DecryptToken {
    public function __invoke(Token $token): UnencryptedToken;
}

This is kinda important due to parsing and decrypting being different planets 😁

@pkly
Copy link
Author

pkly commented Feb 23, 2022

If in 3 everything including claims was contained within the token itself I don't really see why the split was necessary between Token and UnencryptedToken (if there was one? unsure, as I didn't really delve that deep into it before converting our code to v4).

I don't really mind either solution, either merging interfaces or having a separate one.
If we go the DecryptToken interface route we could probably keep compatibility in 4 without worrying much about BC breaks, the default implementation on Parser could just check if the token is actually unencrypted, else throw a DecryptionFailedException or something, since obviously support at the moment is not necessary. If someone would need a decryption layer for their tokens they could simply implement the interface and use their class.

On that note, I wonder if using __invoke is the best idea, but that's just a minor issue.

@lcobucci I don't really like the idea of changing the return type on an implementation itself since if you'd use something like symfony you could create a service and simply bind the interface as a service - you'd still get the same issue we're having right now, since the implementation itself is not important, the interface is.

@SvenRtbg
Copy link
Contributor

I have one thought related to the possible future situation with this library.

You may remember when basically all JWT libraries had a serious security flaw because they unconditionally accepted "none" as the signature method announced in the header, so an attacker might just select that schema, omit the third part with the signature, and has full control over the data part and any claims.

So now imagine that every user of this library is accustomed to "the parser returns UnencryptedToken anyways", and suddenly this assumption is not true anymore because somehow an EncryptedToken is returned. Technically attackers would be able to pass about any string into the parser. Or maybe the other way around, code that might be expecting an encrypted token is now getting an unencrypted variant.

The key learning from that security incident for me is: Ask for things very specifically, state what your expectations are in the code to allow detecting deviations early. This means that even if both types of token would be inheriting from the same parent interface, I'd still expect the parser to be asked to work on the two different things by calling two distinct methods. Maybe even have different parser implementations. What I'd avoid is to have one generic "parse it all" method that then has to use third party information, guesses or black magic about what type of token may have been passed.

So if at all encrypted tokens would be implemented, they should be treated differently from the start, so any thoughts about maybe make them seem to just be Token might not be helpful for the use case anyway.

@Oxmoze
Copy link

Oxmoze commented Feb 23, 2022

My 2 cents.
UnencryptedToken, EncryptedToken... these terms are misused because obviously no encryption is done here.
Signed tokens are just encoded into base64 url safe but not encrypted.
It should simply be Token/SignedToken, ParsedToken or similar, but encrypted should not used at all unless JWE is supported

@Ocramius
Copy link
Collaborator

The point is differentiating JWE from the other tokens: no confusion between signing and encrypting.

@spawnia
Copy link

spawnia commented Feb 27, 2022

The library currently has (string $token): Token) and (string $probablyUnencryptedToken, SignedWith $signedWith, ValidAt $validAt, Constraint ...$constraints): UnencryptedToken. It is clear to me that there is a gap between those, and I think there is a use case for a simple function with the signature (string $probablyUnencryptedToken): UnencryptedToken.

No custom implementation, parse + decrypt, dependency injection, dummy implementation of validation rules, or any other complication should be necessary to do this. Higher level APIs that are either more flexible, more opinionated or different in any other way can coexist with this.

Ask for things very specifically, state what your expectations are in the code to allow detecting deviations early.

I agree with @SvenRtbg's argument, adding a specific parser would achieve that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants