Infraestruturas de Portugal AI agent procurement: bots, agents and open source
Infraestruturas de Portugal’s BIA AI procurement is criticized as overly ambitious and risky. The post highlights security concerns with AI agents and challenges the “open source” requirement, noting most AI models are only open weights, raising legal and technical issues.
Usually when commenting on procurement topics I steer clear of the substantive issues connected with what is being bought. After all, I'm much more interested in how we buy than in what we buy. However, the object of this contract is close enough to my wheelhouse that I actually have a lot to say about it. This first post on the substantive elements of the procurement will cover what is being bought and the open source requirements set by Infraestruturas de Portugal.
Is BIA a bot or an AI agent?
The other day I used the image of the Homermobile to illustrate what seems to be the approach of Infraestruturas de Portugal to the AI tool it is procuring here: it wants a turnkey solution that does a lot, just like the proverbial Homermobile had a lot of gadgets a car did not need.
I think that is the fundamental problem here: going for an all-encompassing AI tool to take part in multiple elements of a procurement process at this moment in time is a recipe for disaster. It is not that AI tools cannot be helpful, but they work better as discrete, specific solutions than as a one-size-fits-all approach. This is even more so bearing in mind that we are at the beginning of this particular revolution, so the time for solution consolidation and aggregation is yet to arrive. My fear is that at the proof of concept stage, the tools tested will fail on multiple fronts of what the technical specifications require.
While the B in the catchy BIA acronym stands for "Bot", the contracting authority mentions throughout the tender documents that it is looking for an 'Intelligent Agent' (tender specifications, section 1) or an 'agent' (Objectives 1 and 6). In the technical specifications, 'bot' is mentioned only once, in scenario 2 for the proof of concept.
The pipeline for BIA includes multiple steps, but from the documentation I am not sure whether the tool is expected to operate with a degree of autonomy. Some of the steps also happen outside the LLM itself, i.e. the formal validation of requirements and the retrieval-augmented generation (RAG) over the documents submitted.
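To make the point about steps happening outside the LLM more concrete, here is a minimal sketch of what such a pipeline could look like. Everything in it is an assumption of mine for illustration: the `Tender` type, the list of required documents and the toy keyword retrieval are hypothetical and are not taken from the tender documents.

```python
from dataclasses import dataclass


@dataclass
class Tender:
    bidder: str
    documents: dict[str, str]  # document name -> extracted text


# Hypothetical list of mandatory documents; the real list would come
# from the rules of the procedure, not from this sketch.
REQUIRED_DOCUMENTS = ("declaration", "price_proposal")


def validate_formal_requirements(tender: Tender) -> list[str]:
    """Deterministic check with no LLM involved: names of missing or empty documents."""
    return [
        name
        for name in REQUIRED_DOCUMENTS
        if not tender.documents.get(name, "").strip()
    ]


def retrieve_documents(tender: Tender, query: str, top_k: int = 2) -> list[str]:
    """Toy retrieval step: rank documents by the number of query words they contain."""
    words = set(query.lower().split())
    scored = []
    for name, text in tender.documents.items():
        score = len(words & set(text.lower().split()))
        if score > 0:
            scored.append((score, name))
    # Highest score first, document name as a stable tie-breaker.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_k]]


tender = Tender(
    bidder="Acme",
    documents={
        "declaration": "We declare compliance",
        "price_proposal": "Total price 1000 EUR",
        "annex": "Delivery in 30 days",
    },
)
missing = validate_formal_requirements(tender)
context = retrieve_documents(tender, "total price")
```

Only after both of these ordinary software steps would a model be asked anything, and only about the retrieved text. Whether the whole thing counts as a 'bot' or an 'agent' then depends on how much the model is allowed to decide what happens next.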
So it may be that, taking a holistic view, BIA is indeed an agent and not simply a bot. If that is the case, I am just going to leave here what happened recently to McKinsey's AI agent. AI agents are incredible security risks and I remain unconvinced that Infraestruturas de Portugal understands the risks arising from giving an AI agent access to the tenders submitted. AI agents are a new risk vector for organisations and we are a long way from achieving a level of operational security equivalent to that of traditional software.
I know BIA is not OpenClaw, but the principles under which both will operate are likely to be fairly similar. If my assessment of their similarities is correct, then the threat assessments from CrowdStrike and Microsoft should give pause to anyone wanting to let an AI agent loose. Having said that, it would be funny if BIA ended up being implemented on OpenClaw!
Finally, as mentioned in an earlier blog post, BIA is supposed to be used first for goods and services and, in the future, for works. The tender documents also express a desire for BIA to be used in the wider public administration, but without any real roadmap or consideration for the implications of such a move.
The open source requirement
One thing that is very evident throughout the tender documentation is that Infraestruturas de Portugal wants BIA's tech stack to be based only on 'open source' solutions. At face value, this seems sensible since it means they will not be beholden to the contractor who built the solution for them, ensuring control over its future. But there are fundamental problems with open source, both as a concept in general and in the AI space specifically.
The first of the general problems is the definition of open source. Since it is an umbrella term, what does it mean exactly? Usually it means that the code is freely available under a permissive license, allowing it to be used or modified at no cost. But specific open source licenses have different rules and obligations. Some may constrain some or all commercial deployments (and yet still be open source), while others only care about attribution or may require the subsequent source code to be made freely available as well. What open source flavour does Infraestruturas de Portugal want?
We can find a statement about licensing requirements in the technical specifications: "[t]he solution must be implemented using open-source software (Open Source), allowing free use by public entities as well as full auditing and customisation." In this instance I think I would have preferred a reference to specific licenses like Apache 2.0 or MIT (or equivalent!), but what we get instead is a wide and pervasive license requirement. That requirement might work for the code created by the contractor, but I am not sure it will work for the LLM(s) powering BIA.
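A reference to specific licenses would also have the advantage of being checkable. As a rough sketch, and assuming an accepted list made up of the two licenses I mentioned (by their SPDX identifiers), a check over the stack could be as simple as the one below. The component names and the LLM license label are hypothetical placeholders, not anything found in the tender documents.

```python
# Assumption: the contracting authority names the licenses it accepts,
# here by SPDX identifier. This list is illustrative only.
ACCEPTED_LICENSES = {"Apache-2.0", "MIT"}


def non_compliant_components(stack: dict[str, str]) -> dict[str, str]:
    """Return the components whose license is not on the accepted list."""
    return {
        name: license_id
        for name, license_id in stack.items()
        if license_id not in ACCEPTED_LICENSES
    }


# Hypothetical stack for a BIA-like solution.
stack = {
    "contractor_code": "Apache-2.0",
    "vector_store": "MIT",
    "llm_weights": "custom-community-license",  # placeholder, not a real SPDX id
}

problems = non_compliant_components(stack)
```

In this made-up example the contractor's own code passes and the model does not, which is the kind of outcome I would expect in practice.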
The second open source issue pertains to how the solution will be coded. If the solution they acquire is vibe-coded (i.e. coded by an AI), then the code itself is not covered by copyright protection (at least for now). The litmus test, I suppose, will be the degree of human involvement in the final code passed on from the contractor to Infraestruturas de Portugal, but looking at the tight timescales for contract performance I think it is fair to assume the contractor will have to rely at least partially on AI to write the code for BIA. If that is the case, the code will indeed be 'open source', but not in the way Infraestruturas de Portugal thinks.
The use of LLM(s) to power BIA also raises a third open source issue. There are no real open source LLMs out there. None of them actually has its code made freely available for checking and modification. What is available out there are open-weight models, which are not open source models despite Meta and others trying to brand them as such. This is a common misunderstanding when discussing open source and LLMs. Open weights is not open source!
Open weights means only the trained parameters (the weights) are disclosed, but not the training code, datasets or the internal architecture of the model. Open-weight models are useful because they usually come with liberal licenses and can be fine-tuned for specific purposes. So, if we go back to the maximalist licensing requirement from Infraestruturas de Portugal, you cannot really audit the code of an open-weight LLM and its code is not customisable, i.e. its architecture cannot be changed. These models can be built upon and fine-tuned, but that is a different thing. Here's a nice writeup about the difference between open source and open weights, although from a US IP perspective.
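The distinction drawn above can be written down as a small checklist. This is a deliberate simplification of my own: the three fields and the `classify` function are hypothetical and do not correspond to any official definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRelease:
    """What a model publisher actually discloses (simplified)."""
    weights: bool
    training_code: bool
    training_data: bool


def classify(release: ModelRelease) -> str:
    """Label a release using the distinction made in this post."""
    if release.weights and release.training_code and release.training_data:
        return "open source"
    if release.weights:
        return "open weights"
    return "closed"


# A release that ships only the weights.
label = classify(ModelRelease(weights=True, training_code=False, training_data=False))
```

Under this checklist, a model that ships only its weights can be fine-tuned and deployed, but it cannot satisfy a requirement of 'full auditing and customisation'.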
As for the benefits and drawbacks of open-weight models, we will be looking at them tomorrow.