Language injections for controlling syntax highlighting in string literals #3952
Replies: 25 comments 20 replies
-
This isn't specific to Python but to VS Code in general, so transferring.
-
@brettcannon This has been discussed in vscode before, with the conclusion that it should be done on a language-by-language basis. The idea is that you need to understand the Python grammar to know what to highlight and what not to. Do you agree with that assessment? That's why I posted it in the Python language repo.
-
I hadn't realized it was already rejected as a core feature, with the suggestion that each language support it somehow. I've moved the issue back.
-
Fully understandable, I should have included that information in the original issue. Anyways, it's an awesome feature, if at all possible to build! :)
-
Moving this to pylance for further investigation. See here for details on Embedded Programming Languages: microsoft/vscode-languageserver-node#1170 (comment)
-
@EmilStenstrom do you have any use cases where this would be useful? I mean other than just creating strings with syntax highlighting. I would think this would be something to add to LSP itself and not have it be language specific. Pylance shouldn't parse HTML tags and provide completions for HTML. You'd want the HTML language server to parse them. In Visual Studio, I believe this would be handled with projection buffers. Each language server would be responsible for just their portion. Something would have to indicate how to split the text buffer into its pieces. VS code has at least one issue that sounds similar:
-
Oh, it seems VS Code has a different way. That would mean Pylance would generate virtual documents and send each document off to the appropriate server.
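As a rough sketch of what generating a virtual document could look like: one possible approach is to blank out everything outside the embedded region while keeping newlines, so positions still line up with the original file. The helper name `make_virtual_document` and the sample source are assumptions for illustration, not Pylance or VS Code code.

```python
from typing import List, Tuple


def make_virtual_document(source: str, regions: List[Tuple[int, int]]) -> str:
    """Blank out everything outside the given (start, end) offsets.

    Newlines are kept so that line/column positions in the virtual
    document match positions in the original Python file.
    """
    keep = [False] * len(source)
    for start, end in regions:
        for i in range(start, end):
            keep[i] = True
    return "".join(
        ch if keep[i] or ch == "\n" else " "
        for i, ch in enumerate(source)
    )


# Hypothetical Python source with one embedded HTML string.
source = 'x = 1\npage = "<b>hi</b>"\n'
start = source.index("<b>")
end = source.index("</b>") + len("</b>")
virtual_html = make_virtual_document(source, [(start, end)])
```

The resulting text has the same length and line count as the Python file, so a diagnostic reported by an HTML server at some position would map straight back to the same position in the original.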
-
@rchiodo By use case, do you mean what the benefit would be of having the strings highlighted instead of being regular Python strings?
If by use case you mean whether this is code that actually exists in the wild, then yes, it very much does. I'm the author of a library called django-components, which provides a container for reusable web components consisting of some Python glue code, HTML, CSS and JavaScript. Currently users need to have four different files with the different formats, because otherwise there is no way to get syntax highlighting and autocompletion. Since most files are small, it would be a better experience for everyone if you could inline those small files as strings instead. This is possible today, but then authors lose all editor support for those languages...
-
This reminds me of cell magics in Jupyter. Something like so:
%%sql
SELECT * FROM FOO...
Which we special-case right now to just ignore everything. It would be much nicer for the user if the %%sql turned the rest of the document into a virtual SQL doc.
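A minimal sketch of how a cell could be split into a language name and a body, assuming the magic is always on the first line. The function name `split_cell_magic` is hypothetical and not part of Jupyter or Pylance.

```python
from typing import Optional, Tuple


def split_cell_magic(cell: str) -> Optional[Tuple[str, str]]:
    """Return (language, body) if the cell starts with a %%magic line."""
    first_line, _, body = cell.partition("\n")
    if not first_line.startswith("%%"):
        return None
    parts = first_line[2:].split()
    if not parts:
        return None
    # The first word after %% is treated as the language name.
    return parts[0], body


cell = "%%sql\nSELECT * FROM FOO"
result = split_cell_magic(cell)
```

The body could then be handed over as a virtual SQL document, while cells without a magic line stay with the Python tooling.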
-
I asked for use cases because this would be a lot of work to implement, so we'd have to justify it based on how many people it would affect. If we could use the same thing for Jupyter notebooks, that might increase the number of people affected.
-
Personally, I like your
-
Nice catch with the Jupyter case! I think there are many more. The general question is: do people write code in other languages inside Python? I have personally done this many times, especially for "glue scripts". The tagstr project (which is actively worked on) has a list of example tags in its repo, which includes the html, sh, and sql tags. I think these are likely the three most common languages that you write inside Python strings.
Object-relational mappers (ORMs): Almost all ORMs have a mode where you write raw SQL that gets sent to the database directly. Here's Django's documentation for this: https://docs.djangoproject.com/en/4.1/topics/db/sql/ - By marking those strings as SQL for VS Code, you could greatly reduce the risk of errors inside those brittle SQL strings.
Unix shell scripts: Conditionally calling different shell scripts is VERY common in Python. So common that the standard library has a utility called shlex which lets you deal with them. This means shell scripts in code are very common, and single-letter errors could potentially delete all your files. By getting syntax highlighting for those strings, you could avoid such errors! :)
HTML templating: Python is used a lot on the web, and not all sites are backed by large templating libraries. Instead, HTML snippets are strung together with Python, leaving a lot of room for errors when no highlighting is available for those strings. PyCharm uses HTML as their example for when language injections are needed.
Those three use cases should touch a LOT of codebases out there, and if we include all languages that VS Code supports, configuration scripts, deployment code, and templating languages, I think it would be hard to find ANY sizeable codebase that doesn't embed another language somewhere.
-
Just to test my point, I picked a Microsoft-related project I've recently worked with: the O365 bindings for Python. They use HTML strings in their tests: test_teams.py and test_message.py.
-
Not sure what the status of the extension client move is, but implementing this requires changes on the client side. So it would require that Pylance own the client side, or that the Python core extension provide some of the support here. See the example here:
-
@rchiodo I am still investigating the requirements from the Jupyter side for the client move. I will update you and the team when that is done. For now this should be here.
-
Do I understand things correctly that implementing this requires changes to "pyrx" which is a closed source repo linked above? Is this issue still in triage stage? Let me know if there's something else I can do to help out!
-
@EmilStenstrom this is in the looking for upvotes phase at the moment. It's why I asked for use cases too. Changes for this would (at the moment) be in the Python core extension, the Pyright code, and in the private pyrx repo.
-
@rchiodo Do you collect the upvotes in this issue? I got an e-mail notice that this was moved to a discussion, but I can't find it there.
-
For now, yes, the upvotes on the issue would be counted. I'm guessing we'll move it to a discussion at some point, once @karthiknadig comes back with the information on the client move. I think Jude was initially going to move this to a discussion, but we kept it here waiting for Karthik. That's likely the e-mail you got. Right now the 'client' side code for Pylance (Pylance exists in two parts, server/client) is created in the Python core extension. We're trying to determine if we can move this to the Pylance extension itself. I think in order to support the idea you proposed, we need to create virtual documents as outlined in VS Code's example. Creating those has to happen on the client. Hence the need to know where the client is going first.
-
There are a lot of people talking about support for this in python in the original thread in vscode main. Unfortunately, they have no idea that this is discussed here, and since that thread is locked, there is no way I can bring their attention here.
-
In my opinion, the In terms of user experience and readability, I would rank them
-
I do have to say that I like the comment option most, as I don't like injecting code only for the IDE to do something. I much prefer that any IDE-specific things be either a manual process or a comment that will be ignored (or even used the same way) in other IDEs. What's more, it's not just PyCharm that allows language injection; any JetBrains IDE allows it, and a comment is the most cross-language way to support it.
-
Moving this issue to discussion as an enhancement request for comments and upvotes.
-
See also this discussion: python/typing#1370
My suggestion is different from this issue, in that I am arguing for specifying string languages as types. The difference is that the definition can be moved to the function definition or type stubs, rather than every call site. Take the example of regexes. With comment markers, a call looks like this:
s2 = re.sub(
    # language=regex
    r'(\d+)foo',
    # language=regex replacement
    r'\1bar',
    s,
)
This is very verbose, requires tons of newlines, and again, it's at the call site but it's always the same! Logically it would be much better at the function definition:
def sub(pattern: Regex, replacement: RegexReplacement, string: str): ...
Now everyone who upgrades their type stubs or their Python version or whatever will get their code syntax highlighted for free.
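A minimal sketch of that idea, assuming hypothetical `Regex` and `RegexReplacement` marker types (neither exists in `typing` or typeshed today) and a wrapper named `sub` standing in for an updated stub of `re.sub`:

```python
import re
from typing import NewType

# Hypothetical marker types; an editor could key highlighting off them.
Regex = NewType("Regex", str)
RegexReplacement = NewType("RegexReplacement", str)


def sub(pattern: Regex, replacement: RegexReplacement, string: str) -> str:
    """Thin wrapper over re.sub whose signature carries the language info."""
    return re.sub(pattern, replacement, string)


s2 = sub(Regex(r"(\d+)foo"), RegexReplacement(r"\1bar"), "12foo")
```

`NewType` is only one way to spell this. A strict type checker would want the explicit `Regex(...)` wrapping at the call site, so a real design would probably need a form that still accepts plain string literals.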
-
Can this be made to work with SQLTools and Copilot? See #6847. Similar to how PyCharm implemented it; see Inject SQL & Injection Settings. IntelliSense now treats the code within the string as the specified language and realises the table reference is invalid. Now go back to the * and you get the option to expand to all the fields. Or, if I remove the *, I get to choose from the fields in the selected table:
-
I think it would be very useful to have syntax highlighting for other languages inside of a Python file, much like <style> tags with CSS inside an HTML file get the correct syntax. Example:
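A minimal illustration with made-up string contents (only the name `js_string` comes from the text); the other variable names and all three snippets are assumptions:

```python
html_string = "<div class='card'><h1>Hello</h1></div>"
css_string = ".card { color: red; }"
# The JavaScript below is missing the closing quote after 'hello,
# yet Python accepts the string without complaint.
js_string = "console.log('hello);"
```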
Since there's no highlighting of the strings, it's very easy to miss the missing quote in js_string.
WAYS OF SOLVING THIS
A magical comment
PyCharm has this built in, so you can either manually mark a block as another language, or use the magical "# language=html" comment to mark the next string as a foreign language.
I don't quite like the syntax of this, but it solves the problem, and it would keep the two editors in sync on this.
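As a hedged sketch of how a tool might locate such comments, the standard `tokenize` module can pair each `# language=...` comment with the next string literal. The function name `find_injections` and the exact marker pattern are assumptions, not PyCharm's or VS Code's implementation.

```python
import io
import re
import tokenize
from typing import List, Tuple

# Assumed marker format: "# language=<name>".
MARKER = re.compile(r"#\s*language=(\w+)")


def find_injections(source: str) -> List[Tuple[str, str]]:
    """Return (language, string_literal) pairs for marked strings."""
    found = []
    pending = None  # language named by the most recent marker comment
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            match = MARKER.match(tok.string)
            pending = match.group(1) if match else pending
        elif tok.type == tokenize.STRING and pending:
            found.append((pending, tok.string))
            pending = None
    return found


source = '# language=html\npage = "<b>hi</b>"\nname = "plain"\n'
injections = find_injections(source)
```

Only the string directly following a marker is picked up; the unmarked `name` string is left alone.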
Tag-strings
Tag-strings is an idea for a new python language feature that allows you to write:
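Roughly, as I understand the tagstr proposal, a tag name is written directly in front of the string literal. The sketch below shows that form in a comment, since it is not valid in current Python, plus a hypothetical `html_tag` helper that approximates the behaviour with an ordinary function call.

```python
# Proposed syntax (not valid in current Python, shown as a comment):
#     page = html"<h1>{title}</h1>"
#
# A rough stand-in in today's Python: a plain function named after the tag.
import html as html_lib  # stdlib module, used only for escaping


def html_tag(template: str, **values: str) -> str:
    """Fill the template, escaping each interpolated value."""
    escaped = {key: html_lib.escape(value) for key, value in values.items()}
    return template.format(**escaped)


page = html_tag("<h1>{title}</h1>", title="Tom & Jerry")
```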
This might sound crazy, but there is a working Python fork by Jim Baker and Guido that explores this. The rationale for this feature is not explicitly syntax highlighting, but this is such a nice API that I think it would fit very well with syntax highlighting too.
I opened a thread on this on python-ideas, and got feedback in the form of "this is something that the editors should do, not the python language".
Dummy methods
Another proposal (in the python-ideas thread above) is being able to mark some functions in VS Code as special, so that strings inside calls to those functions get highlighted. In this case, I've set up html(), css(), and js() to be highlighted in VS Code.
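A minimal sketch of such dummy functions, assuming they do nothing at runtime and exist only so an editor can recognise the call:

```python
def html(source: str) -> str:
    """Identity function; exists only so an editor can spot HTML strings."""
    return source


def css(source: str) -> str:
    """Identity function marking its argument as CSS."""
    return source


def js(source: str) -> str:
    """Identity function marking its argument as JavaScript."""
    return source


page = html("<h1>Hello</h1>")
style = css("h1 { color: red; }")
script = js("console.log('hello');")
```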
New types
Another way would be to use the typing system, and use the new type to highlight.
Use subclass of str
Which has the nice side effect that you could also do something with the strings.
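For instance, a hypothetical `Html` subclass (the class and its `escaped` method are assumptions for illustration) could both mark the language and carry a helper:

```python
import html as html_lib  # stdlib module, used only for escaping


class Html(str):
    """A str subclass that marks its content as HTML."""

    def escaped(self) -> str:
        """Return the content with HTML special characters escaped."""
        return html_lib.escape(self)


snippet = Html("<b>hi</b>")
```

The instance still behaves like a normal string everywhere, so existing code that expects `str` keeps working.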
Use Annotated type
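A minimal sketch using `typing.Annotated` (Python 3.9+), assuming a hypothetical `Language` marker class. Nothing here is an existing editor convention; it only shows that the metadata is recoverable from a function signature.

```python
from typing import Annotated, get_type_hints


class Language:
    """Hypothetical metadata marker naming the language of a string."""

    def __init__(self, name: str) -> None:
        self.name = name


HtmlStr = Annotated[str, Language("html")]


def render(template: HtmlStr) -> str:
    return template


# A tool could recover the language from the annotation metadata.
hints = get_type_hints(render, include_extras=True)
language = hints["template"].__metadata__[0].name
```

At runtime `HtmlStr` values are plain `str`, so this variant adds no wrapper objects.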
YOUR THOUGHTS
This is where I'm hoping for some feedback from you. Is this something you have wanted too? Do you think it's a useful addition to the python language extension? Is this at all doable in VSCode, or does there need to be upstream changes?