Infraestruturas de Portugal AI Agent procurement: contract price, cost and award criteria
After the first entry and yesterday's one on the object of the Infraestruturas de Portugal attempt to procure an AI agent, it is time to look into the contract value and its cost. Those may look like synonyms, but they are not. Today's post also covers the award criteria.
Contract price and cost
The contract is valued at €1M, but this figure seems to cover only the solution itself and not the operational costs of deployment, maintenance and updating. It seems Infraestruturas de Portugal will have to figure those out after the fact, which looks like a lack of foresight.
I should add that the running costs are of particular importance here. Infraestruturas de Portugal is (correctly) asking for a solution that is not reliant on a specific infrastructure, meaning it should run on a public or private cloud as well as on premises, ensuring full control over its operational deployment. On this, I agree with the approach taken by Infraestruturas de Portugal, even though it comes with significant drawbacks for the types of models that can be used to power BIA, the AI tool.
What the contracting authority has no idea about is the cost of running the tool, i.e. the inference costs. These are not cheap and are based on the tokens required for the input, the chain-of-thought and the output. The number of tokens per word varies from language to language. In fact, OpenAI states that "[n]on-English text often produces a higher token-to-character ratio, which can affect costs and limits", using Spanish as an example. Both Spanish and Portuguese are known as low-density languages, that is, languages that use a lot of words to convey meaning. Therefore, it is safe to assume that the token consumption for a similar query in Portuguese is likely to be closer to the higher Spanish consumption than to the English one.
As for what this means in practice, Microsoft charges $0.5 (input) and $1.5 (output) per million tokens to run Mistral 3 on its US infrastructure. Looking at all the AI agent is supposed to be doing, it is safe to say it will burn through tokens, particularly as the contracting authority considers "vital" the use of chain-of-thought to help understand the outputs of the solution. Here's a good write-up on the running costs of self-hosting LLMs.
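To make the pricing above concrete, here is a minimal sketch of the per-token arithmetic in Python. The two prices are the ones quoted above; the function name, the `token_multiplier` parameter (a stand-in for the extra tokens Portuguese text may need) and the token volumes are hypothetical, and I am assuming chain-of-thought tokens are billed as output.

```python
# Prices quoted above, in USD per million tokens.
INPUT_PRICE_PER_MILLION = 0.5
OUTPUT_PRICE_PER_MILLION = 1.5


def inference_cost(input_tokens, output_tokens, token_multiplier=1.0):
    """Estimate the USD cost of a workload billed per token.

    Assumptions (not from the tender documents): chain-of-thought tokens
    are counted in output_tokens, and token_multiplier scales both counts
    to approximate a language that needs more tokens than English.
    """
    billed_input = input_tokens * token_multiplier
    billed_output = output_tokens * token_multiplier
    total = (billed_input * INPUT_PRICE_PER_MILLION
             + billed_output * OUTPUT_PRICE_PER_MILLION)
    return total / 1_000_000


# Hypothetical volumes, chosen only to show the unit prices at work.
baseline = inference_cost(1_000_000, 1_000_000)
with_overhead = inference_cost(1_000_000, 1_000_000, token_multiplier=1.5)
print(f"baseline: ${baseline:.2f}, with 1.5x token overhead: ${with_overhead:.2f}")
```

The actual token volumes per procedure are the unknown here, so these figures say nothing about what BIA would really cost to run.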
For running BIA, hardware must be provisioned either on a cloud or at Infraestruturas de Portugal's own premises. The minimum suggested hardware (NVIDIA A10/A100) also seems woefully inadequate, especially if the thinking is that the whole tool can run on a single GPU.
While it is true that the number of procedures per year is not enormous, it is still sizeable at 800, which may or may not be spread evenly throughout the year, meaning that there are likely to be peak utilization periods for the tool. Even with no downtime at all, without excluding August as a dead month, and with usage evenly distributed from day to day, we are talking about over 3 procedures per working day. Can you imagine the juries having to 'book' time in the tool to process the data as one books meeting rooms?
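The back-of-the-envelope figure above can be reproduced as follows. The 800 procedures are the figure discussed above; the 250 working days, the 8-hour day and the idea that the tool handles one procedure at a time are my assumptions for illustration.

```python
PROCEDURES_PER_YEAR = 800      # figure discussed above
WORKING_DAYS_PER_YEAR = 250    # assumption: roughly 52 weeks x 5 days minus holidays
WORKING_HOURS_PER_DAY = 8      # assumption


def procedures_per_working_day(procedures, working_days):
    """Average daily load if usage were spread perfectly evenly."""
    return procedures / working_days


def hours_available_per_procedure(daily_load, hours_per_day):
    """Time slot per procedure if the tool processes one at a time (assumption)."""
    return hours_per_day / daily_load


daily_load = procedures_per_working_day(PROCEDURES_PER_YEAR, WORKING_DAYS_PER_YEAR)
slot_hours = hours_available_per_procedure(daily_load, WORKING_HOURS_PER_DAY)
print(f"{daily_load:.1f} procedures per working day, {slot_hours:.1f} hours each")
```

Under those assumptions that is 3.2 procedures per working day, or a 2.5-hour slot each. Any peak period shrinks that slot further.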
On top of the running costs, one needs to consider the costs of maintaining the solution as well. There is a cautionary tale from the tech sector that one should focus not on the cost of buying a software solution but on the cost of maintaining it. This is clearly not reflected in the award criteria, so my impression is that Infraestruturas de Portugal is not really taking it into account.
On the value side it is also important to consider the counterfactual. This tool is supposed to do a significant part of the workload associated with public procurement. €1M carries a big opportunity cost: it would cover the full economic cost of a lot of salaries, for a good few years, for people who could be hired instead. And that is without even taking into consideration BIA's running and maintenance costs mentioned above.
Award criteria
Infraestruturas de Portugal established the following award criteria and percentage breakdown:
| Criterion | Percentage |
|---|---|
| Price | 40 |
| Quality | 30 |
| Proof of concept | 30 |
The quality part is then further subdivided into:
| Quality sub-criteria | Percentage |
|---|---|
| Architecture/design | 40 |
| Governance and transparency | 30 |
| Performance and SLA | 30 |
| Security and data protection | 10 |
| Team | 10 |
| Sustainability and knowledge transfer | 10 |
This all looks OK, as do the rubrics for each sub-criterion. I would say, however, that the first quality sub-criterion (architecture/design) will probably be double weighted in practice, since the proof of concept assessment is likely to overlap with it.
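To put numbers on the double-weighting concern, here is a rough sketch. It assumes the sub-criterion percentages apply within the 30% quality share and takes the listed figures at face value; the `overlap` parameter, meaning the fraction of the proof of concept score that in effect re-assesses architecture/design, is hypothetical.

```python
# Top-level weights from the award criteria table.
QUALITY_WEIGHT = 0.30
PROOF_OF_CONCEPT_WEIGHT = 0.30
# Architecture/design share within the quality criterion, as listed.
ARCHITECTURE_SHARE = 0.40


def effective_architecture_weight(overlap):
    """Overall weight of architecture/design given a PoC overlap fraction in [0, 1]."""
    if not 0.0 <= overlap <= 1.0:
        raise ValueError("overlap must be between 0 and 1")
    nominal = QUALITY_WEIGHT * ARCHITECTURE_SHARE
    return nominal + overlap * PROOF_OF_CONCEPT_WEIGHT


for overlap in (0.0, 0.5, 1.0):
    weight = effective_architecture_weight(overlap)
    print(f"overlap {overlap:.0%}: architecture/design weighs {weight:.0%} of the total score")
```

On its own the sub-criterion is worth 12% of the total score. In the extreme case of full overlap it would reach 42%, more than price.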
Speaking of the proof of concept, it is divided into three challenges, whose requirements are well described and thought out in the technical specifications. In particular, the first scenario, which simply checks and validates whether specific formal requirements are met, seems quite logical as it is based on data that is already digital anyway. There is more to say about scenarios 2 and 3, but it makes more sense to discuss them with the wider substantive issues.
What I found missing in the award criteria is any consideration of BIA's ongoing costs, as discussed above. Some of the quality elements are supposed to measure the ability of the contracting authority to develop the tool autonomously (liability klaxon), but other than that the contracting authority is not interested in calculating the ongoing costs or the trade-offs between a more performant model that is more expensive to run and one that is cheaper but not as good.
That looks very puzzling to me, since overall the award criteria seem well designed. I find it unlikely that someone who can design award criteria with this level of detail is completely oblivious to running costs. Honestly, it is almost as if whoever decided on them made a conscious decision not to include any ongoing costs in the equation.
Tomorrow we will start looking into the substantive issues of BIA itself instead of the procedural part of the procurement.