
> We are currently in the process of moving away from tree-sitter and instead using the parsers provided by the languages themselves where possible.

I imagine this means you're trying to abstract over those parsers somehow? How well is that going, and have you written about your approach?

(I wrote `resholve` to identify and rewrite references to external dependencies in bash/posixy Shell scripts to absolute paths. This is helpful in the Nix ecosystem to confirm the dependencies are known, specified, present, don't shift when run from a service with a different PATH, etc.

It builds on the mostly-bash-compatible OSH parser from the oilshell/oils-for-unix project for the same reasons you're citing.

It would be ~nice to eventually generalize out something that can handle scripts for other shell languages like fish, zsh, nushell, elvish, the ysh part of the oils-for-unix project, etc., but I suspect that'll be a diminishing-return sort of slog and haven't had any lightbulb-moments to make it feel tractable yet.

We also have some ~related needs here around identifying hardcoded or user-controlled exec...)



Our parsers simply return the concrete syntax trees in a JSON format. We do not unify all the different syntax constructs into a common AST if that is what you are looking for. The languages and file formats we support are too diverse for that.

The language-specific logic does not end with the parsers, though. The core of SemanticDiff also contains language-specific rules that are picked up by the matching and visualization steps. For example, the HTML module might add a rule that the order of attributes within a tag is irrelevant. So it all comes down to writing a generic rule system that makes it easy to add new languages.
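To make the rule idea concrete, here is a minimal sketch of how a language module could register a rule against a JSON-style concrete syntax tree. This is not SemanticDiff's actual code: the node shape (`type`, `text`, `children`), the node type names (`start_tag`, `tag_name`, `attribute`), and the `rule`, `normalize`, and `semantically_equal` helpers are all assumptions made up for illustration.

```python
import json
from typing import Callable, Dict, List

# Hypothetical JSON-like CST node: {"type": str, "text": str, "children": [...]}
Node = dict
Rule = Callable[[Node], Node]

# Rules are grouped per language so the generic core stays language-agnostic.
RULES: Dict[str, List[Rule]] = {}


def rule(language: str) -> Callable[[Rule], Rule]:
    """Decorator that registers a normalization rule for one language."""
    def register(fn: Rule) -> Rule:
        RULES.setdefault(language, []).append(fn)
        return fn
    return register


@rule("html")
def ignore_attribute_order(node: Node) -> Node:
    """Put the attributes of a start tag into a canonical (sorted) order."""
    if node.get("type") != "start_tag":
        return node
    children = node.get("children", [])
    ordered = iter(sorted(
        (c for c in children if c.get("type") == "attribute"),
        key=lambda c: json.dumps(c, sort_keys=True),
    ))
    # Non-attribute children keep their position; only attribute slots are refilled.
    return {**node, "children": [
        next(ordered) if c.get("type") == "attribute" else c for c in children
    ]}


def normalize(node: Node, language: str) -> Node:
    """Apply every rule registered for `language`, bottom-up."""
    result = {**node, "children": [normalize(c, language)
                                   for c in node.get("children", [])]}
    for fn in RULES.get(language, []):
        result = fn(result)
    return result


def semantically_equal(old: Node, new: Node, language: str) -> bool:
    """Two trees match if they are identical after normalization."""
    return normalize(old, language) == normalize(new, language)


def attr(name: str, value: str) -> Node:
    return {"type": "attribute", "text": f'{name}="{value}"'}


tag_name = {"type": "tag_name", "text": "a"}
old_tag = {"type": "start_tag", "children": [tag_name, attr("href", "/x"), attr("class", "btn")]}
new_tag = {"type": "start_tag", "children": [tag_name, attr("class", "btn"), attr("href", "/x")]}
changed = {"type": "start_tag", "children": [tag_name, attr("class", "btn"), attr("href", "/y")]}

print(semantically_equal(old_tag, new_tag, "html"))   # attributes only reordered
print(semantically_equal(old_tag, changed, "html"))   # an attribute value changed
```

In this sketch, adding a language only means registering more functions under a new key; the matching code never needs to know what an HTML attribute is. A language with no registered rules falls back to plain structural comparison.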



