I’d love to have something like an “indents.scm” that could be used across editors. Unfortunately, the built-in Emacs tree-sitter support does not use “indents.scm” files; it uses its own format, which I think is a double-edged sword. On one hand, it seems like reinventing the wheel if an indentation rules file already exists. On the other hand, it allows the Emacs modes to add their own processing, which isn’t constrained by the limitations of the “indents.scm” format. I haven’t looked at the “indents.scm” format in enough depth to know what those limitations are, since it was never available for Emacs, so it may well be very powerful. Maybe you can share what you’ve observed.
One of the big issues I encountered, and experimented with for a long time while working on indentation for gpr-ts-mode, was handling invalid syntax. The tree-sitter based indentation rules are fine when you have valid syntax, but when there is a syntax error near the indentation point, things become much more problematic. The tree-sitter library will create ERROR nodes in place of the node itself, its parents, or its siblings, so the syntax tree no longer looks the way you’d expect and your rules no longer match. Writing rules that include ERROR nodes can be very brittle. Furthermore, I ran into issues when the tree-sitter library inserted missing nodes into the syntax tree as part of its recovery algorithm: it wouldn’t let me access the next sibling after a node it had inserted itself. That was probably a bug in the library, but all of this led me to give up on using the syntax tree at all when there was any kind of ERROR node around the desired indentation location.
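To make the failure mode concrete, here is a toy sketch in Python. It is not Emacs Lisp and does not use the real tree-sitter API; the `Node` class, the `RULES` table, and the node type names are all made up for illustration. It only shows how a rule keyed on the parent's type silently stops matching once error recovery puts an ERROR node where the expected parent used to be.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """Minimal stand-in for a syntax tree node (hypothetical, not tree-sitter's)."""
    type: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


# Hypothetical rule table: parent node type -> indentation offset.
RULES = {"package_declaration": 3}


def rule_indent(node: Node) -> Optional[int]:
    """Return the offset for NODE based on its parent type, or None if no rule matches."""
    if node.parent is None:
        return None
    return RULES.get(node.parent.type)


# Valid syntax: the attribute sits under the parent the rule expects.
valid_parent = Node("package_declaration")
valid_attr = valid_parent.add(Node("attribute_declaration"))

# Invalid syntax: recovery wrapped the same attribute in an ERROR node,
# so the parent-type lookup finds nothing.
error_parent = Node("ERROR")
broken_attr = error_parent.add(Node("attribute_declaration"))

valid_result = rule_indent(valid_attr)
broken_result = rule_indent(broken_attr)
```

With no matching rule, the line either keeps whatever indentation it had or falls through to some unrelated catch-all rule, which is the kind of jumpiness I was seeing.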
Instead, when I detect ERROR nodes in the vicinity of the indentation point (the node itself, its parent, a sibling, etc.), I use a heuristic approach: I look for keywords, punctuation (e.g., comma, semicolon), and so on in the buffer near the node to help guide the indentation. It took a lot of trial and error to get something robust with gpr-ts-mode, but what I have now seems to work quite well. When indentation is requested, the mode seamlessly decides whether to use rules-based or heuristic-based indentation, depending on the presence of ERROR nodes near the indentation location. Furthermore, when a line is indented and there are no ERROR nodes around it (i.e., when I can use rules-based indentation), I expand the indentation region to the nearest enclosing declaration (although this can be disabled if desired), so that it corrects any indentation within the declaration left over from the heuristic approach.
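Roughly, the dispatch can be pictured like the following Python sketch. This is not the actual gpr-ts-mode code (which is Emacs Lisp) and the heuristic is far simpler than the real one; `Node`, `error_nearby`, `heuristic_indent`, `choose_indent`, the opener keywords, and the step of 3 are all assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(eq=False)
class Node:
    """Minimal stand-in for a syntax tree node (hypothetical, not tree-sitter's)."""
    type: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


def error_nearby(node: Node) -> bool:
    """True if the node, its parent, or any of its siblings is an ERROR node."""
    siblings = [] if node.parent is None else [
        c for c in node.parent.children if c is not node
    ]
    nearby = [node, node.parent, *siblings]
    return any(n is not None and n.type == "ERROR" for n in nearby)


def heuristic_indent(previous_line: str, step: int = 3) -> int:
    """Guess the indentation from the text of the previous non-blank line."""
    current = len(previous_line) - len(previous_line.lstrip())
    words = previous_line.split()
    last_word = words[-1].lower() if words else ""
    # Assumed openers: a line ending in one of these starts a nested construct.
    if last_word in ("is", "use"):
        return current + step
    # After ';' or ',' (or anything else) stay at the same level.
    return current


def choose_indent(node: Node, previous_line: str,
                  rule_indent: Callable[[Node], int]) -> int:
    """Use the text heuristic near ERROR nodes, the tree-based rules otherwise."""
    if error_nearby(node):
        return heuristic_indent(previous_line)
    return rule_indent(node)


# Clean tree: the rules-based function is used.
root = Node("project_declaration")
attr = root.add(Node("attribute_declaration"))
clean_result = choose_indent(attr, "project P is", lambda n: 3)

# Same node after an ERROR sibling appears: the text heuristic takes over.
root.add(Node("ERROR"))
error_result = choose_indent(attr, "   for Source_Dirs use", lambda n: 99)
```

The point of keeping the decision in one place is that the rules never need to mention ERROR nodes at all, so they stay simple and the heuristic only has to be good enough until the syntax is valid again.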
At this point, I’m pretty happy with the gpr-ts-mode indentation (and am attempting to use this approach for the ada-ts-mode indentation). I don’t know how you could describe that in an editor-independent fashion that could be shared. I’d love it if there were a way, but so far I have not seen anyone do this. I very seldom see indentation rules that account for ERROR nodes at all, and without that, indentation can annoyingly jump around while you’re typing. That is what I originally had, and it was not an enjoyable experience.