Do you use Artificial Intelligence for Ada coding?

As I am still coding the TLALOC Ada 83 compiler alone, I tried to get help from Claude AI for coding in Ada 83. Does anyone use AI for Ada coding? Ada 83 is especially well suited to AI-assisted programming with its simple package structure: you define the specification, revise it with AI, and then let the AI code the body under your supervision and feedback. It works extremely well.
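
As a toy illustration of that workflow (a hypothetical stack package, not taken from TLALOC): the human writes and reviews a spec like the one below, then asks the AI to produce the matching body.

```ada
--  Hypothetical Ada 83 specification: the human owns this file,
--  the AI is asked to write the package body that implements it.
package Stacks is

   Max_Depth : constant := 100;

   type Stack is limited private;

   Overflow, Underflow : exception;

   procedure Push (S : in out Stack; Item : in Integer);
   procedure Pop  (S : in out Stack; Item : out Integer);
   function Depth (S : Stack) return Natural;

private

   type Int_Array is array (1 .. Max_Depth) of Integer;

   type Stack is
      record
         Data : Int_Array;
         Top  : Natural := 0;  --  empty by default
      end record;

end Stacks;
```

The spec fixes the whole contract (names, modes, exceptions), so reviewing the AI-written body reduces to checking it against this one file.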

Yo :waving_hand:

I have personally tried this for FOSS and personal projects.
I do not use it for industrial professional projects. Maybe in the future, if we can have a fine-tuned LLM well trained on Ada that runs on our own computers or our company servers, not a remote LLM from another company.

On my side, I tried the following LLMs:

  • ChatGPT 5.1 (high reasoning): Meh… it needs more iterations than Claude for the same tasks
  • Gemini 3: good but not the best
  • Claude Sonnet/Opus: I use it for coding since 3, now with 4.5 it’s awesome

For your purpose, you have to give it a try, because those LLMs are trained on public code and the ARM (Ada Reference Manual), so they will try to use the latest language version possible (when they do not generate malformed code).

Furthermore, I am exploring the use of multi-agent systems and spec-driven development.
When I first saw “spec-driven development” I laughed a lot, because this is what we have been doing with Ada… for decades :rofl:

Currently, I am experimenting with the use of BMAD-METHOD on Windsurf and Claude Code. It is too heavy for small projects but is OK for larger ones. IMHO, we have to add custom agents for Ada (and SPARK) coding, but the default agents do the job… well.

I also quickly tried spec-kit and openspec, but I was unable to do efficient things with them (the goal of an LLM is to save time, not waste it).

In any case, you will have to review the code it writes :slight_smile: because it can be too verbose, sub-optimal, or not do what you want (rare with a spec-driven framework).

Since search engines have become useless, I use AI to:

  • Search the web
  • Create summaries (give me an overview of this API with CRUD examples)
  • Get debug snippets on environments I don’t know (and don’t want to know)

In short, to save a lot of time on acquiring/managing “stupid knowledge.”

I also use it in “rubber duck” mode. I explain my problem to AI, and it comes up with a solution strategy. Sometimes it’s silly (often depending on my own prompt), sometimes it gives me ideas :wink:

The only time I consulted it for Ada, it hallucinated in a very convincing way :slight_smile:

In short, AI has become indispensable, like fire.

And, in case of misuse, with the same consequences.

2 Likes

I don’t use it to code anything (yet). A lot of what I do relies on mathematics, and my impression, as well as my experience, is that AIs are still terrible with mathematics. (They don’t have actual reasoning skills, if by “reasoning” we mean logical deduction.) Besides, I enjoy the process of devising the solution and writing the code; that’s one reason I do the Advent of Code problems.

Last year I worked with an entry-level programmer whose community college programming class apparently consisted of teaching students how to query an AI to write the program for them. (“apparently” is load-bearing; I have no idea whether that is what actually happened.) It was pretty impressive what the AI could do. But when it gave him invalid code, he was incapable of fixing it. I sat with him once to debug it, and it was a two-hour nightmare.

I am interested in using AI for code quality issues, and if AI-generated code came with comments decent enough to make debugging easier, my attitude would change a bit. But only a bit.

To follow up: I sometimes do something like this with a different task at work, and I find more often that it’s much quicker to do it myself than put up with the give and take of prompt engineering. How do you feel about the efficiency of prompt engineering for Ada 83 code?

(Some people say that AI’s will relieve us of the tedium of certain uninteresting tasks, but that’s only an improvement if the tedium of prompt engineering can be minimized.)

I occasionally use code generation features but I wouldn’t ever use Claude or AI with execution abilities. It can be made to delete all your files or worse by remote actors.

This is how you churn out people with no competence.

6 Likes

Hello! I just posted this in Releases but figured it’s relevant here. I fine-tuned a 14B model specifically on compiler-verified Ada 2022 code that runs locally. It’s exactly the kind of thing @Heziode was describing wanting. Details and links here: https://huggingface.co/the-clanker-lover/steelman-14b-ada-v0.1

3 Likes

My overall experience is that LLMs can significantly accelerate some things, but you can also fall into sunk cost fallacy and end up losing everything you’ve gained (“just one more prompt, I’m almost there!”). When I rely on them too heavily in unfamiliar fields, it’s similar to relying too much on one colleague - it’s giving a man a fish and not teaching a man to fish. It’s difficult to understand a generated project without the conceptual backing.

I’ve been in a lot of different codebases, and the code they generate has an eerie, off-kilter, “uncanny valley” feel that I can usually sense even when not told something is AI code. This isn’t to say they aren’t useful, and I’ve known people who have had very good results with them. I use them as a fuzzy search, a vanguard to build a syllabus in a new area, a peer reviewer of code and other text, or for throwaway prototypes. They’re also good at diagnosing compile and proof errors if Ada isn’t your primary language; e.g. some GNAT errors are a bit obtuse if you aren’t familiar with the Ada vernacular, and I used Gemini to help diagnose some proof errors in Anteforth.

The models used to struggle heavily with writing syntactically correct Ada, though the most recent models do considerably better. One of the reasons I was working in Ada again is that I believe strongly that all of the domain knowledge you can embed directly in source code might make it one of the best languages to use with AI.

4 Likes

“Coding” should be a slur.
There’s nothing smart or exciting about producing syntactically correct code.
This is especially true for the primary vibe languages such as Python or JavaScript.
The utility of language models that everyone should be after is exploratory analysis and design, high in the abstraction.
With domain awareness and careful engineering, writing code is an effortless formality.

1 Like

@pyj

I come at this from a pretty different angle. I’m an Electrical Engineer — mostly embedded and low level work. I’ve been trying to learn programming on and off for about four years but it never really clicked the way hardware and systems thinking does. I have the engineering and infrastructure knowledge, just shaky on the programming itself.

What changed was agentic engineering — using AI agents to actually build software, not just generate snippets. In the last two months I’ve been able to build things that would have been completely out of reach before. Multi-agent systems, training pipelines, evaluation frameworks. The AI handles the programming; I handle the architecture, the requirements, the systems thinking.

But that immediately creates a trust problem. The AI generates more code than I can realistically
review, and I don’t have the programming fluency to catch everything by reading it. I had an AI agent delete half a working codebase to make a test pass. That kind of thing. It will do radical, destructive things in a rush to fulfill a goal if you don’t have guardrails.

That’s what started my search for a language where the machine can verify itself. I found Lean4 first — formal verification is compelling, but the proofs are human-written claims, which puts me back in the same position. Then I found Ada and SPARK. Contracts, type invariants, preconditions, postconditions — the compiler and prover enforce correctness mechanically, without me needing to read every line.

Your point about domain knowledge in source code is exactly why I landed here. Ada lets you encode what the program should do directly in the source, and then the toolchain checks it. That’s the missing piece for agentic engineering — not better code style or fewer hallucinations, but a language where the machine’s output can be verified by another machine.

Edit:

To be clear — I’m not generating everything in Ada. I still use AI tooling directly for most day-to-day work. Ada and SPARK are specifically for the parts where I need the output to be machine-verifiable without my involvement. The goal is a system where the AI generates, the compiler verifies, and I only need to step in at the architectural level.

1 Like

It’s a part of what I see about the use of AI.

Another part is to have a workflow system that goes from functional spec (supporting DO-178, EN 50128, etc.) → .ads writing → tests (TDD) → .adb writing,
and loops until the code matches the functional spec (without the messy things LLMs do, like rewriting the tests to comply with the implementation instead of the functional spec…).
And all that with bidirectional traceability (functional spec ↔ code ↔ tests), allowing the generation of a traceability matrix and helping in the certification process.

Your work on fine-tuning could be a piece of that.

Ping @clanker-lover

1 Like

That’s exactly the direction we’re heading. The fine-tune is one piece — the other piece is an agentic pipeline we’re building around Steelman as the code generation engine. It’s still a WIP prototype, but the architecture is designed around the principles you’re describing.

The design: spec (.ads) first, then the generator produces the body (.adb), then the GNAT compiler verifies it with strict flags — warnings-as-errors, validity checks, style enforcement. If it fails, a patcher agent gets the structured compiler errors and produces a fix. The compiler is the oracle, not another LLM judging the output.
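
For concreteness, “strict flags” in a GNAT project file could look like the fragment below. The exact switch set Steelman uses isn’t published, so treat this as an illustrative assumption, though each option is a real GNAT flag.

```ada
--  Illustrative GPR fragment (the precise switch list is an assumption):
package Compiler is
   for Default_Switches ("Ada") use
     ("-gnatwa",   --  enable most warnings
      "-gnatwe",   --  treat warnings as errors
      "-gnatVa",   --  turn on all validity checks
      "-gnatyy");  --  enable the default style checks
end Compiler;
```

With warnings promoted to errors, the patcher agent receives every diagnostic as a hard failure it must fix, which is what makes the compiler usable as an oracle.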

The point you’re making about not rewriting tests to match a bad implementation is critical, and it’s a core design principle: the compiler and the spec are the authorities, the generated code conforms to them, never the other way around. That’s the whole reason we chose Ada — the language’s type system and contract mechanisms (Pre/Post/Contract_Cases/Type_Invariant) give you machine-checkable specs that an LLM can’t weasel around.
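
A small hypothetical example of the kind of machine-checkable spec meant here (the `Clamp` function is invented for illustration):

```ada
--  Hypothetical spec: the Pre/Post contract pins down the agreed
--  behaviour, so a generated body cannot quietly deviate from it.
function Clamp (Value, Low, High : Integer) return Integer
  with Pre  => Low <= High,
       Post => Clamp'Result in Low .. High
               and then (if Value in Low .. High
                         then Clamp'Result = Value);
```

Any body the LLM writes must satisfy this contract, checked at run time or proved statically, rather than being judged by another LLM.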

DO-178 / EN 50128 traceability is exactly the kind of application this is heading toward. A fine-tune that understands Ada idioms combined with an agentic pipeline that enforces correctness through the compiler — that’s how you get from “LLM that generates code” to something that could eventually participate in a certification workflow.

The model and all training data are public but the agentic system is still private. If you have thoughts on what a certification-oriented pipeline would need, I’m interested. Thanks!

Can I ask in which context you are doing all of this?

1 Like

Personal project. I’m interested in the intersection of AI code generation and formal verification —
Ada’s type system and SPARK contracts make it the natural fit. The fine-tune and the agentic pipeline are both hobby work, but I’m serious about where it’s going.

Why not use Lean or Rocq (previously called Coq) or any other formal verification language?

1 Like

I looked at Lean4 before landing on Ada. The reason I pivoted was practical: DeepSeek Prover is still young, and I wasn’t ready to fully commit to learning to write proofs manually until the automated proving tools matured enough to actually test against. I was going to come back to it, and still probably will, especially for very safety-critical things.

While researching alternatives I found Ada, and two things clicked. First, SPARK’s verification is
embedded in the language itself — Pre/Post contracts, type invariants, loop invariants — and gnatprove ships as part of the standard toolchain. You don’t need a separate prover ecosystem. Second, GNAT’s compiler output is structured enough that an LLM can read the errors and act on them autonomously.
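
For example, the kind of verification meant here looks like the sketch below (hypothetical code, not from the pipeline): a loop invariant guides gnatprove to discharge the postcondition with no human-written proof.

```ada
--  Hypothetical SPARK sketch (assumes it sits inside a package):
--  summing a bounded array, with a loop invariant that lets
--  gnatprove establish the postcondition automatically.
subtype Index is Integer range 1 .. 100;
type Small is range 0 .. 1000;
type Small_Array is array (Index) of Small;

function Sum (A : Small_Array) return Natural
  with Post => Sum'Result <= A'Length * 1000
is
   Total : Natural := 0;
begin
   for I in A'Range loop
      pragma Loop_Invariant (Total <= (I - A'First) * 1000);
      Total := Total + Natural (A (I));
   end loop;
   return Total;
end Sum;
```

The invariant bounds `Total` at each iteration, so the prover can chain it into the `Post` bound at loop exit.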

That’s what made the agentic pipeline idea possible — the compiler becomes the oracle in a generate → compile → fix loop, no human in the middle.

My original plan was actually to read all the generated code manually to overcome vibe coding’s quality problem. But once I realized the compiler and prover could do that verification automatically, the idea shifted to building an autonomous pipeline around it. That’s what I’m working on now. Lean4 is still on my radar for the long term — the two approaches aren’t mutually exclusive. But for an autonomous pipeline today, Ada/SPARK gives you verification built into a real systems language with a mature toolchain.

I use AI to help me out. But I use Grok. Grok works fine for Ada. Code quality is good, knowledge is extensive, and it can do Design by Contract and Ada 2022. I once asked Grok why it is so good with Ada 2022, and the answer was that Ada’s verbose and clean syntax is very suitable for LLMs and that it was trained on an extensive, high-quality code base.

1 Like

I think this is one of the most important selling points of Ada for the general formal verification community. In Ada, not only can you prove things (and quite advanced ones, too), but you also get an implementation from it! This is huuuuge. With Rocq or Lean you can create a proof and show that the algorithm works. But if you want a runnable piece of code that you can use in a program… well, now you have to redo all the work in another language!

This is my main concern with the wider formal verification community, I am really saddened that they don’t even know Ada and it could literally save them half the work!

Best regards,
Fer

3 Likes