How To Implement String Interpolations in `LanguageInjectionPerformer`

I am looking into language injection support for the Nix plugin (current state). An important part of the complexity is that the Nix language supports string interpolations and removing common indentation.

rec {
  recipient = "World";
  message = ''
    Hello ${recipient}!
  '';
}

The plugin is currently targeting IntelliJ 2025.2.6.1. In general, I have to say that this topic turns out to be quite complex, partially due to seemingly overlapping responsibilities of different SPIs, without clear documentation about the relations between them. Anyway, I think I have figured out the basics and settled on the LanguageInjectionPerformer extension point (my current implementation). I am currently wondering about two specific questions regarding best practices related to interpolations:

  1. Which PSI element should implement the PsiLanguageInjectionHost interface? The element representing the string (NixString), or the element representing the text components between interpolations (NixStringText)? Semantically, I would assume it makes more sense to implement it on the level of the entire string. However, the API becomes rather cumbersome when I try to do that. Most notably, MultiHostRegistrar.addPlace doesn’t accept any element not implementing PsiLanguageInjectionHost. My only way to preserve the mapping from the “place” to its text component is by encoding it into the TextRange, which hasn’t worked reliably so far. I therefore tried to implement PsiLanguageInjectionHost on the individual text component, but that makes the action to open the fragment editor (QuickEditAction) unavailable at various places inside the string (i.e. next to the quotation marks, and just before string interpolations). There seems to be no good solution. The documentation seems to avoid this topic by only talking about string concatenations, ignoring the existence of string interpolations in many languages.

  2. Which placeholder text should I inject for interpolations? When there is a string interpolation between two text components, I inject a placeholder by providing it either as suffix to the previous “shred/place”, or as prefix to the next “shred/place” (source code). This seems to be handled nicely by IntelliJ. The value of the placeholder is never shown in the UI. But this leaves a question: what should be the value of the placeholder? Ideally, the placeholder should be a valid expression or identifier in the guest language, to avoid breaking its parser. However, I am not in control of the guest language. Otherwise, the text of the placeholder is irrelevant to me. Is there some convention about the value of the placeholder, or some way to query a good placeholder from the guest language?

I don’t have much time for a detailed response, but here are a few pointers.

In general, the String literal would be the injection host. It typically is a snippet of the guest language and the injected guest language should see the whole snippet and not each interpolation as a separate snippet.

You first find the interpolation ranges inside the host, then inject them using the registrar.
In general, it’s hard to get right, especially if you want the “Edit fragment” action to work.
I recommend adding tests.

I don’t think it’s possible to make all guest languages happy because they, as you said, get the placeholder and that’s not always valid.

Examples:

Semantically, I would also expect the injection host to be the whole string literal, not an individual text fragment. So in your case, NixString should implement PsiLanguageInjectionHost.

The mapping between a place passed to LanguageInjectionPerformer and the corresponding text fragment is expressed via TextRanges inside the host. In your case, NixStringText fragments already have well-defined ranges relative to the containing NixString, so those ranges should be used to describe which parts of the host participate in the injection.
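
To illustrate what "ranges inside the host" means here, below is a minimal stand-alone Java sketch. It deliberately does not use the IntelliJ API: FragmentRanges, Range and textRanges are hypothetical names, and the scanner only counts braces, which is a simplification of real Nix lexing. In a plugin, the ranges would presumably be derived from the NixStringText children of the NixString instead of being re-scanned.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-alone model (hypothetical names); not part of the IntelliJ API. */
final class FragmentRanges {
    /** Half-open [start, end) offsets relative to the start of the host text. */
    static final class Range {
        final int start;
        final int end;

        Range(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Returns the ranges of literal text between ${...} interpolations
     * within host[contentStart, contentEnd).
     * Simplification: only braces are counted inside an interpolation;
     * escapes and nested strings are not lexed.
     */
    static List<Range> textRanges(String host, int contentStart, int contentEnd) {
        List<Range> result = new ArrayList<>();
        int fragmentStart = contentStart;
        int i = contentStart;
        while (i < contentEnd) {
            if (i + 1 < contentEnd && host.startsWith("${", i)) {
                // Close the text fragment that precedes the interpolation.
                if (i > fragmentStart) {
                    result.add(new Range(fragmentStart, i));
                }
                // Skip to the matching closing brace.
                int depth = 1;
                i += 2;
                while (i < contentEnd && depth > 0) {
                    char c = host.charAt(i++);
                    if (c == '{') depth++;
                    else if (c == '}') depth--;
                }
                fragmentStart = i;
            } else {
                i++;
            }
        }
        if (contentEnd > fragmentStart) {
            result.add(new Range(fragmentStart, contentEnd));
        }
        return result;
    }
}
```

For the host text "Hello ${recipient}!" (including its quotation marks), this yields one range covering `Hello ` and one covering `!`. Each such range would then be passed, together with the host, to MultiHostRegistrar.addPlace.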

You can look at these implementations as examples:

The Kotlin implementation is a more complex example, but it is also useful for understanding how placeholders/interpolations are handled.

Regarding the placeholder text for interpolations: unfortunately, there is no universally good answer here. If the value of the interpolation can be computed statically, you can try to evaluate it and substitute that value into the injected text.

If it cannot be computed statically, then the appropriate approach is to call MultiHostRegistrar.frankensteinInjection. This tells the injected language processing that the host language could not provide a complete, semantically valid string, and that the injected text is assembled from incomplete or non-literal fragments.
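
The two cases above can be modelled without the platform API. The following Java sketch is an assumption-laden illustration, not the platform's implementation: GuestTextAssembler, Part, Result and the placeholder value are hypothetical. It substitutes the statically known value where there is one, falls back to a placeholder otherwise, and reports whether any placeholder was needed, which is the situation in which the performer would mark the injection via frankensteinInjection.

```java
import java.util.List;
import java.util.Optional;

/** Stand-alone model (hypothetical names); not part of the IntelliJ API. */
final class GuestTextAssembler {
    /** Either literal text or an interpolation whose value may be known statically. */
    static final class Part {
        final String literal;               // null for an interpolation
        final Optional<String> staticValue; // known value of an interpolation, if any

        private Part(String literal, Optional<String> staticValue) {
            this.literal = literal;
            this.staticValue = staticValue;
        }

        static Part text(String literal) {
            return new Part(literal, Optional.empty());
        }

        static Part interpolation(Optional<String> staticValue) {
            return new Part(null, staticValue);
        }
    }

    static final class Result {
        final String guestText;
        final boolean incomplete; // true if at least one placeholder was inserted

        Result(String guestText, boolean incomplete) {
            this.guestText = guestText;
            this.incomplete = incomplete;
        }
    }

    /** Builds the text the guest language would see. */
    static Result assemble(List<Part> parts, String placeholder) {
        StringBuilder text = new StringBuilder();
        boolean incomplete = false;
        for (Part part : parts) {
            if (part.literal != null) {
                text.append(part.literal);
            } else if (part.staticValue.isPresent()) {
                text.append(part.staticValue.get());
            } else {
                text.append(placeholder);
                incomplete = true;
            }
        }
        return new Result(text.toString(), incomplete);
    }
}
```

In an actual performer, the substituted value or placeholder would not be appended to one big string but supplied as the prefix or suffix of the neighbouring place, as described in the question.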

Thanks for your feedback and suggestions. I have now implemented the interface on the string itself, and it is mostly working now. :rocket: I made some assumptions I would be happy to cross-check with you.

  1. What about InjectorUtils.registerSupport(...) and LanguageInjectionSupport? I noticed that JetBrains’ implementations of LanguageInjectionPerformer usually call InjectorUtils.registerSupport(...). I only used MultiHostRegistrar.startInjecting to register the injection. I also did not implement LanguageInjectionSupport, which is part of the instructions in the documentation. I hope it is fine that I ignored these things so far. It seems to work. I suspect that these additional steps might be necessary for persistent injections configurable via the settings. Am I right?

  2. PsiLanguageInjectionHost.updateText ignored? My implementation is currently throwing an UnsupportedOperationException, and it seems to work. My best guess is that this method is a remnant from the past, as the current implementation seems to use the ElementManipulator. Do I have to implement this method?

  3. LiteralTextEscaper.getRelevantTextRange ignored? I could not figure out what it is used for, since the LanguageInjectionPerformer is already providing the ranges. I have not implemented it.

  4. LiteralTextEscaper.decode only receives ranges provided by the LanguageInjectionPerformer. Is that true? My current implementation assumes that the ranges always start and end at token boundaries. If I get ranges from other sources, they could violate this assumption and break my implementation. The Javadoc claims that I should return false in such cases, but I don’t understand how to run into such scenario and how returning false would help.

  5. LiteralTextEscaper.decode is only called once per instance. Or at least only with one range per instance. Is this assumption true?

  6. LiteralTextEscaper.getOffsetInHost always receives the same range as LiteralTextEscaper.decode. At least that was my impression if we ignore IJPL-244922. Is that correct? I feel like otherwise many implementations might be broken, not just mine.

  7. Are there other usages of ElementManipulator? While I need to implement this extension point for language injections, it doesn’t seem to belong to this feature directly. Are there other side effects by implementing it?

  8. Can I fail ElementManipulator.handleContentChange? I was wondering what I should do if the given range starts or ends within a string interpolation. After all, ElementManipulator.getRangeInElement cannot represent these gaps in the range.

  9. I also noticed that the performance isn’t great. But I have noticed the same issue before when using Java. I assume it is a problem of the language injection architecture, not with my specific implementation. Or is there something I should pay attention to? Without having profiled anything, I guess it might be caused by the fragment editor having to “commit” code changes once for each fragment (i.e. line) on every edit, as using ElementManipulator probably requires a valid PSI tree.

Btw, I guess these might also be useful clarifications for the Javadoc of the SPI, except the last point. Assuming I haven’t just missed it, and it is already there. :sweat_smile:

Regarding my current state of the implementation. There are still some TODOs and issues, but it almost seems to work reliably. I also believe that I found two bugs in the platform[1][2], but I was able to work around them. Anyway, my biggest concern is the amount of complexity. Despite having created almost 100 test cases (or over 150 depending on how you count them), I believe there are still uncovered edge cases in my implementation. :sweat_smile: Anyway, I guess that is just how it is.

If you are interested, I listed some factors below on how I believe the SPI design contributes to the complexity. I don’t want to say that JetBrains is to blame for the complexity, but I think the SPI could be better. The Nix language added some quirks which also contributed to the complexity. Anyway, it should not hurt to list my perspective, although actually changing the design would probably be difficult and a lot of effort. :smile:

How the current SPI contributes to the complexity
  • No direct mapping between PSI nodes and text fragments (aka. “places” or “shreds”). If fragments were linked to PSI nodes, I think handling decoding and updating the text would be much easier. You could just re-encode and replace the text after each update, no need to deal with any ranges.

  • Requirement to create separate fragments for each line. Due to IJPL-244525, but also to achieve consistent highlighting in the UI, the common indent (which gets trimmed by the language) must be excluded from the ranges reported by LanguageInjectionPerformer. Since ranges are continuous, this means I need to create a separate text fragment for each line. This increases complexity as fragments can now be created or removed while the user uses the fragment editor. It also makes establishing a proper mapping between PSI nodes and text fragments noticeably harder, as one fragment might be split into two. Separating the concerns for highlighting ranges and transferring text updates could reduce overall complexity.

  • You cannot rely on getting the range matching the line. While I have to create a separate range for each line (as mentioned in the previous point), you cannot rely on modifications on these ranges belonging to the line they represent. More specifically, if you add text to the beginning of a line, the ElementManipulator is called with the range of the previous line, and the new text appended behind the \n. As a result, I always have to strip the indent adjacent to the given range, and re-apply it afterward. (This is effectively a consequence of the first two points.)

  • No MultiHostRegistrar.addText available. Instead, I have to add Injection.getPrefix, Injection.getSuffix and placeholder strings as suffix and prefix to the individual fragments. Finding the correct prefix or suffix for each fragment adds noticeable complexity to the implementation of LanguageInjectionPerformer. If the static text fragments could be added via separate method calls, some complexity could be avoided. (There could also be an overload of MultiHostRegistrar.startInjecting, which expects the Injection instance and automatically applies the correct language, prefix and suffix. Or the object could just know these configurations.)

  • LiteralTextEscaper.getOffsetInHost as separate SPI method. Currently, each implementation of LiteralTextEscaper has to maintain their own state after processing LiteralTextEscaper.decode, so that it can later respond to getOffsetInHost. This is not only unintuitive, but also adds complexity for the state management. The SPI could instead use a single method which returns the decoded string together with a “source map”. The API could then provide the necessary tools to generate the result of the method. Like a builder with methods similar to append(String escapeSequence, String decodedText).
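
The per-line fragments from the second point of this list can be sketched in plain Java as follows. This is only a model under stated assumptions: IndentAwareRanges and its methods are hypothetical names, indentation is assumed to consist of spaces only, interpolations are ignored, and the special treatment Nix gives to the first and last line of an indented string is not reproduced. Empty ranges are skipped, in line with the report that a range resolving to an empty string triggers IJPL-244525.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-alone model (hypothetical names); not part of the IntelliJ API. */
final class IndentAwareRanges {
    /** Number of leading spaces in content[from, to). */
    static int leadingSpaces(String content, int from, int to) {
        int i = from;
        while (i < to && content.charAt(i) == ' ') i++;
        return i - from;
    }

    /** Smallest indentation among lines that contain something other than spaces. */
    static int commonIndent(String content) {
        int min = Integer.MAX_VALUE;
        int lineStart = 0;
        while (lineStart <= content.length()) {
            int nl = content.indexOf('\n', lineStart);
            int lineEnd = nl < 0 ? content.length() : nl;
            int spaces = leadingSpaces(content, lineStart, lineEnd);
            if (spaces < lineEnd - lineStart) {
                min = Math.min(min, spaces); // line has non-space content
            }
            if (nl < 0) break;
            lineStart = nl + 1;
        }
        return min == Integer.MAX_VALUE ? 0 : min;
    }

    /**
     * One half-open {start, end} pair per line, excluding the common indent
     * but keeping the trailing line break. Empty ranges are skipped.
     */
    static List<int[]> lineRanges(String content) {
        int indent = commonIndent(content);
        List<int[]> ranges = new ArrayList<>();
        int lineStart = 0;
        while (lineStart < content.length()) {
            int nl = content.indexOf('\n', lineStart);
            int textEnd = nl < 0 ? content.length() : nl;
            int lineEnd = nl < 0 ? content.length() : nl + 1; // include '\n'
            int skipped = Math.min(indent, leadingSpaces(content, lineStart, textEnd));
            int start = lineStart + skipped;
            if (start < lineEnd) {
                ranges.add(new int[] {start, lineEnd});
            }
            lineStart = lineEnd;
        }
        return ranges;
    }
}
```

Because the indent sits between two consecutive ranges, an edit at the beginning of a line ends up adjacent to the previous range, which matches the behaviour described in the third point.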

PS: I noticed that IntelliJ’s implementation of Java’s template strings (which were unfortunately discontinued by Java) seems to implement PsiLanguageInjectionHost on the fragment (PsiFragment), not on the string itself. But I will probably keep it on the string now.


  1. IJPL-244525: InjectionRegistrarImpl$PatchException when range resolves to an empty string ↩︎

  2. IJPL-244922: InjectedLanguageUtil.hostToInjectedUnescaped uses mismatched coordinate spaces in offset comparison ↩︎

I can only provide limited answers…

I’m not using it, either. The docs say the *Support is for pattern integration with the IntelliLang plugin.

There are still a few callers left in intellij-community. Have you tried to invoke “Edit fragment” on your injected content?
I’m usually implementing this method by delegating: return ElementManipulators.handleContentChange(this, text).

I don’t think it’s ignored, there are calls in intellij-community.
Usually, text escaping etc. is used by the PSI file editor opened by “Edit fragment”.

ElementManipulators are called for rename refactorings to update the names of identifier owners, iirc. It’s pretty much central functionality for PSI updates.

Good question :slight_smile:
If I’m not mistaken, an ElementManipulator is either called for things like a rename (no escaping involved) or for the PSI built on the unescaped injection and thus the ElementManipulator would again only get the ranges with complete escapes.
But that’s just by experience and I don’t know if that’s entirely correct.

I’ve given up on making injections work fully reliably.

I’ve also reported bugs a long time ago:

With the current design, injections can cover sub-ranges of PsiElements. I don’t think that fundamental changes to the API will happen anytime.

Have you noticed com.intellij.injected.editor.InjectionMeta already? The YAML injector is using it for block indents, for example. Still feels like a hack, though.

Great, thanks for the detailed follow-up. A few comments below.

What about InjectorUtils.registerSupport(...) and LanguageInjectionSupport?

Your assumption is mostly correct. LanguageInjectionSupport / IntelliLang support is needed when you want to integrate your language with IntelliLang-style configurable/persistent injections, Settings UI, XML injection configuration, patterns, etc.

For a programmatic injection implemented by your own LanguageInjectionPerformer, it is fine to call MultiHostRegistrar.startInjecting(...) / addPlace(...) directly. The SDK docs also separate these paths: IntelliLang support/configuration is one mechanism, while LanguageInjectionPerformer is the API intended for more complex cases such as concatenation or interpolation.

PsiLanguageInjectionHost.updateText ignored?

Mostly yes, in the sense that the actual content replacement is usually handled through an ElementManipulator. Still, I would not leave updateText throwing UnsupportedOperationException unless the host is intentionally read-only.

The safest implementation is probably to delegate it to the manipulator as well: ElementManipulators.handleContentChange(this, text).

LiteralTextEscaper.getRelevantTextRange ignored?

I would implement it. In the usual case it should return the range of the host text that is relevant for decoding, e.g. the string content without delimiters/markers. Even if the performer later passes more specific ranges to addPlace, the escaper should still describe the meaningful text range of the host.

decode only receives ranges provided by the LanguageInjectionPerformer?

I would not rely on this too strongly. In practice, you will often see the same ranges that were passed to the registrar, but the escaper is also used by offset-mapping/editor infrastructure. So I would make decode(...) defensive: if the given range cannot be decoded correctly by your implementation, return false rather than assuming token boundaries or throwing.

In particular, I would avoid making the implementation depend on the range always starting and ending at NixStringText boundaries unless you explicitly validate that first.
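
A defensive decode along these lines could look like the following plain-Java model. It is a sketch, not the LiteralTextEscaper API: BoundaryCheckedDecoder and Token are hypothetical names, tokens are assumed to be sorted and contiguous, and an interpolation is represented by a token without decoded text. The method validates the range against token boundaries first and only touches the output once the whole range has been decoded.

```java
import java.util.List;

/** Stand-alone model (hypothetical names); not part of the IntelliJ API. */
final class BoundaryCheckedDecoder {
    /** A lexer token of the host string with half-open [start, end) offsets. */
    static final class Token {
        final int start;
        final int end;
        final String decoded; // null marks an interpolation, which cannot be decoded

        Token(int start, int end, String decoded) {
            this.start = start;
            this.end = end;
            this.decoded = decoded;
        }
    }

    private final List<Token> tokens; // assumed sorted and contiguous

    BoundaryCheckedDecoder(List<Token> tokens) {
        this.tokens = tokens;
    }

    /**
     * Appends the decoded text of [start, end) to out and returns true,
     * or returns false and leaves out untouched if the range does not
     * align with token boundaries or covers an interpolation.
     */
    boolean decode(int start, int end, StringBuilder out) {
        StringBuilder buffer = new StringBuilder();
        int expected = start;
        for (Token token : tokens) {
            if (token.end <= start) continue; // entirely before the range
            if (token.start >= end) break;    // entirely after the range
            boolean aligned = token.start == expected && token.end <= end;
            if (!aligned || token.decoded == null) {
                return false;
            }
            buffer.append(token.decoded);
            expected = token.end;
        }
        if (expected != end) {
            return false; // range ends outside the known tokens
        }
        out.append(buffer);
        return true;
    }
}
```

Buffering before appending is a design choice of this sketch: a caller that receives false never sees a half-decoded prefix in its output.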

decode is only called once per instance?

I would not rely on this either. It is fine for the escaper to cache data derived from a particular decode(...) call if that data is only used for the corresponding offset mapping, but the implementation should not become incorrect if the platform creates/calls escapers differently.

getOffsetInHost always receives the same range as decode?

That is the intended relationship for the decoded text/range being mapped, but again I would keep the implementation defensive. The safest model is: decode(rangeInsideHost, outChars) builds a mapping from decoded offsets back to host offsets for that exact range, and getOffsetInHost(offsetInDecoded, rangeInsideHost) answers from that mapping. If the range is unsupported or inconsistent, clamp/fallback carefully rather than assuming too much.
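
The model described above, where decode builds the mapping and getOffsetInHost answers from it, can be sketched in plain Java. This is not the LiteralTextEscaper class itself: MappingEscaper is a hypothetical name, ranges are passed as two ints instead of a TextRange, and only simple backslash escapes are handled. Returning -1 for a range that was not the one decoded last is an assumption of this sketch, chosen as one possible careful fallback.

```java
import java.util.Arrays;

/** Stand-alone model (hypothetical names); not part of the IntelliJ API. */
final class MappingEscaper {
    private final String host;
    // hostOffsets[i] is the host offset of decoded character i;
    // the last entry is the end of the decoded range.
    private int[] hostOffsets = new int[0];
    private int rangeStart = -1;
    private int rangeEnd = -1;

    MappingEscaper(String host) {
        this.host = host;
    }

    private static char unescape(char c) {
        switch (c) {
            case 'n': return '\n';
            case 'r': return '\r';
            case 't': return '\t';
            default: return c; // e.g. \" or \\ or \$
        }
    }

    /** Decodes host[start, end) into out and records the offset mapping. */
    boolean decode(int start, int end, StringBuilder out) {
        int[] offsets = new int[end - start + 1];
        StringBuilder buffer = new StringBuilder();
        int i = start;
        while (i < end) {
            offsets[buffer.length()] = i;
            char c = host.charAt(i);
            if (c == '\\') {
                if (i + 1 >= end) {
                    return false; // the range cuts an escape sequence in half
                }
                buffer.append(unescape(host.charAt(i + 1)));
                i += 2;
            } else {
                buffer.append(c);
                i++;
            }
        }
        offsets[buffer.length()] = end;
        hostOffsets = Arrays.copyOf(offsets, buffer.length() + 1);
        rangeStart = start;
        rangeEnd = end;
        out.append(buffer);
        return true;
    }

    /** Maps a decoded offset back to the host, or returns -1 if it cannot be mapped. */
    int getOffsetInHost(int offsetInDecoded, int start, int end) {
        if (start != rangeStart || end != rangeEnd) {
            return -1; // assumption: only the most recently decoded range is supported
        }
        if (offsetInDecoded < 0 || offsetInDecoded >= hostOffsets.length) {
            return -1;
        }
        return hostOffsets[offsetInDecoded];
    }
}
```

The state is only replaced after a successful decode, so a rejected range does not invalidate the mapping of an earlier one.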

Are there other usages of ElementManipulator?

Yes, it is not specific to language injections. It is a generic PSI mechanism used by features that need to replace or edit the “content” part of an element without replacing the whole PSI element manually. Language injections are one important consumer, but not the only possible one. So implementing it can affect other editing/refactoring/intention-like paths that ask the platform to change the element content.

Can I fail ElementManipulator.handleContentChange?

Ideally, normal edits from the injected editor should not crash. If the requested change crosses an interpolation boundary or otherwise cannot be mapped back unambiguously, you basically have three options:

  1. normalize/expand the changed range to a range you can safely rewrite;

  2. make such parts effectively non-editable by the injected editor;

  3. fail gracefully with a clear limitation.

For interpolations that cannot be represented as a plain text replacement in the host, I would avoid pretending they are safely editable. This is also where frankensteinInjection is useful conceptually: it marks that the injected text is assembled from pieces that do not form a normal, fully reconstructable literal.
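
Options 1 and 3 from the list above can be illustrated with a small plain-Java model. ChangeRangeCheck and its methods are hypothetical names, and interpolations are given as half-open {start, end} pairs relative to the host. The real ElementManipulator.handleContentChange declares IncorrectOperationException; the sketch throws IllegalArgumentException only to stay free of platform dependencies.

```java
import java.util.List;

/** Stand-alone model (hypothetical names); not part of the IntelliJ API. */
final class ChangeRangeCheck {
    /** True if offset lies strictly inside one of the interpolation ranges. */
    static boolean insideInterpolation(int offset, List<int[]> interpolations) {
        for (int[] range : interpolations) {
            if (range[0] < offset && offset < range[1]) {
                return true;
            }
        }
        return false;
    }

    /** Option 1: widen the range so that neither end falls inside an interpolation. */
    static int[] expand(int start, int end, List<int[]> interpolations) {
        int newStart = start;
        int newEnd = end;
        for (int[] range : interpolations) {
            if (range[0] < start && start < range[1]) newStart = range[0];
            if (range[0] < end && end < range[1]) newEnd = range[1];
        }
        return new int[] {newStart, newEnd};
    }

    /** Option 3: refuse the change with a clear message instead of guessing. */
    static void requireRewritable(int start, int end, List<int[]> interpolations) {
        if (insideInterpolation(start, interpolations)
                || insideInterpolation(end, interpolations)) {
            throw new IllegalArgumentException(
                    "Range [" + start + ", " + end + ") starts or ends inside an interpolation");
        }
    }
}
```

Which of the two fits better depends on whether the replacement text can be meaningfully combined with the source of the interpolation; the sketch does not attempt to decide that.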

Performance

If you mean the Fragment Editor specifically, then yes, it is not very performant at the moment. It does quite a lot of work on EDT and performs commits very eagerly, so editing injected fragments can be noticeably expensive. Maybe this will be rewritten at some point.

For language injection in general, though, I would be more careful with conclusions. There are many things that can contribute to the slowdown: number of shreds, PSI traversal in the performer, offset mapping in the escaper, reparsing/commit frequency, inspections/highlighting in the injected file, etc. So the best next step is to profile the concrete scenario and identify the actual bottleneck.

Things worth checking:

  • whether performInjection() does broad PSI traversals on each update;

  • whether the escaper mapping is linear and cheap;

  • whether computed string-fragment/range mappings can be cached safely;

  • whether highlighting/inspections in the injected language dominate the cost.