Technical documentation in the AI era (Part 1)
Self-documenting code fallacy
Three people look at the same painting, Supper at Emmaus by Caravaggio.
The first is a small child without any knowledge of history or culture. The child will see a nice painting of some people eating and talking. A little dark, but generally pleasing.
The second, with more cultural background, might notice the biblical reference in the title and understand that the painting depicts the moment when the resurrected but incognito Jesus reveals himself to two of his disciples (presumed to be Luke and Cleopas) in the town of Emmaus.
The third, an art historian, might know the trompe-l'œil technique and Caravaggio's biography, and therefore appreciate the painting even more than the other two.
At its core the painting depicts four people: three sitting, one standing. There is a table with some food on it, and that is it. The appreciation of the painting depends on information that is outside of the picture. The viewer requires additional “documentation“ to fully understand it.
Code can't be self-documenting
We have an analogous situation with understanding a codebase.
If a non-programmer looks at it, they will be able to read it but will not understand much (just like the child with the painting).
A programmer without knowledge of the domain, patterns, and conventions will also have a hard time understanding what is going on.
Only a programmer with knowledge of all of the above will be able to fully understand and evaluate the codebase.
Like the painting, the code itself cannot contain this information, because at its core it is just classes, methods, packages… To understand what they mean, we need knowledge from outside the code. We need documentation.
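To make this concrete, here is a hypothetical snippet (the function, the rate, and the category are invented for illustration). It is perfectly readable as code, yet its meaning is invisible without outside knowledge:

```python
def adjust(amount: float, category: str) -> float:
    # Syntactically clear, semantically opaque: why 0.75? why "B2B"?
    # Is this a discount, a tax rule, a legacy workaround?
    # The answers live outside the code, in domain knowledge.
    if category == "B2B":
        return amount * 0.75
    return amount
```

Any programmer can trace what this function does; only someone with the domain “documentation“ can say what it is for, or whether 0.75 is still correct.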
AI agents need documentation
We do not think about this every day because, when new people join our teams, the knowledge is transferred mostly orally, and after a couple of weeks enough of it is loaded into the newcomer's head.
Today the same knowledge is needed for AI agents to be able to “understand“ our codebase. We can no longer transfer this information orally; we need to write it down somehow, so that it can be put into an LLM's context window.
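A minimal sketch of what “writing it down“ means in practice (the helper and section names are hypothetical, not any specific tool's API): the same knowledge a newcomer would absorb orally has to be serialized into text before the model ever sees the code.

```python
def build_context(domain_notes: str, conventions: str, code: str) -> str:
    # Assemble the out-of-code knowledge and the code fragment into
    # one prompt that fits an LLM context window.
    return "\n\n".join([
        "## Domain knowledge\n" + domain_notes,
        "## Team conventions\n" + conventions,
        "## Code under discussion\n" + code,
    ])
```

Whatever never gets written into such a context simply does not exist for the agent, no matter how obvious it is to the team.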
The traditional approach to documentation has three main problems:
Hard to link code with docs - it is difficult to know which parts of the documentation are relevant to a particular code fragment.
Knowledge duplication - textual descriptions often duplicate knowledge that is already embedded in the code, which leads us to the next problem.
Documentation becomes out of date - the lack of linkage and the duplication naturally lead to stale documentation. After changing something in the code, it is time-consuming to find the proper place in the docs and update it.
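The drift problem is easy to show with an invented example (function, threshold, and rate are all hypothetical): the rule was changed in the code, but the duplicated description next to it was not.

```python
def shipping_cost(weight_kg: float) -> float:
    """Free shipping for orders above 20 kg."""
    # The threshold was later raised to 30 kg in the code,
    # but the docstring above still describes the old rule.
    if weight_kg > 30:
        return 0.0
    return weight_kg * 2.5
```

Nothing forces the two to stay in sync, and a reader (human or AI agent) has no way to tell which one is authoritative.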
Extracting knowledge from code
Fortunately, we are not doomed: tools like Noesis can help solve all of the above issues by introducing:
Conventions as Code (link to part 2)
Behaviour level dependencies (link to part 3)
Domain knowledge embedded in code (link to part 4)
AI powered documentation (link to part 5)