
EU AI development as a federated design problem



June 18, 2026 - 3 min read

The European argument about building a model at the frontier has shifted from how much compute the continent can muster to how the compute that already exists can be made to train one thing. The technology preview from the Joint European Forum for IPCEI (2024) names compute concentration in a few hyperscalers as the strategic bottleneck, and proposes a pre-competitive, federated infrastructure that European developers, SMEs included, could train on. The premise is that the FLOPs are reachable. The unresolved question is distribution, which carries most of the difficulty.

That difficulty begins with the law. A model at this scale would cross the systemic-risk line the AI Act draws for general-purpose models, presumed at training compute above 10^25 FLOP. Cornelia Kutterer (2024) traces how the Act shifted foundation models from a 'high' to a 'systemic' risk logic, with obligations weighted towards documentation, evaluation and traceability rather than performance alone. The model has to be legible, not merely capable, and to interface with the AI Office from the outset. The same regime presses hard on data. Where it comes from, on what basis it may be used, and whether it can be moved at all are all questions that live in an unstable landscape.
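To give a feel for where the 10^25 FLOP line sits, here is a rough back-of-the-envelope sketch in Python. It assumes the common rule of thumb of about 6 FLOP per parameter per training token for dense transformers. That is an approximation brought in for illustration, not the AI Act's own accounting method, and the model sizes and token counts are hypothetical.

```python
# Presumption threshold for systemic risk cited in the text: 10^25 FLOP.
SYSTEMIC_RISK_FLOP = 1e25


def estimate_training_flop(n_params: float, n_tokens: float) -> float:
    """Rule-of-thumb estimate: ~6 FLOP per parameter per training token.

    Assumption for illustration only; real accounting depends on the
    architecture and on how the regulator counts compute.
    """
    return 6.0 * n_params * n_tokens


def presumed_systemic_risk(n_params: float, n_tokens: float) -> bool:
    """True if the rough estimate exceeds the 10^25 FLOP presumption."""
    return estimate_training_flop(n_params, n_tokens) > SYSTEMIC_RISK_FLOP


if __name__ == "__main__":
    # Hypothetical configurations, not real models.
    configs = [
        ("7B params, 2T tokens", 7e9, 2e12),
        ("400B params, 15T tokens", 4e11, 1.5e13),
    ]
    for label, params, tokens in configs:
        flop = estimate_training_flop(params, tokens)
        print(f"{label}: ~{flop:.1e} FLOP, "
              f"above threshold: {presumed_systemic_risk(params, tokens)}")
```

Under this approximation the smaller configuration lands about two orders of magnitude below the line, while the larger one crosses it.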

That data pressure is what makes federated learning look less like a privacy trick and more like a structural requirement. In a NeurIPS 2025 position paper, Herbert Woisetschläger and colleagues (2025) argue that the distributed architecture of federated learning directly answers the Act's demands on data governance, consent-based processing and resource allocation, especially for high-risk domains such as health where data cannot be centralised. But making training cooperative does not settle who is accountable for it. Largely the same group, writing on liability (Woisetschläger et al., 2024), shows that clients and the central server then share legal responsibility, and that naive designs leave it unclear who answers for which part. Auditability and verifiability stop being features and become design parameters, the burden shifting towards the server operator.

The hardware turns out to be the easy part. DiLoCo, from Arthur Douillard and colleagues at DeepMind (2023), trains a model across scattered, poorly connected machines and reaches the same quality as standard training while communicating between them around 500 times less. It keeps working even when machines drop out and rejoin, which is how any setup spread across many institutions actually behaves. So the hard part is not the chips. Once the compute can be found, spread out and trained on, what is left is the part no algorithm solves: who has to govern the model, and who is on the hook when something goes wrong.
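The communication pattern behind that result can be sketched in a few lines of Python. This is a toy under stated assumptions, not the authors' implementation: each worker runs many local steps on its own data shard, and only the averaged difference between the shared weights and the locally trained weights is exchanged once per round. Plain gradient descent stands in for the inner optimizer, the outer update uses Nesterov-style momentum, the model is a two-parameter line fit, and all function names and hyperparameters are hypothetical.

```python
import random


def make_shard(rng, n, slope, intercept):
    """Synthetic data for one worker: y = slope * x + intercept + small noise."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, slope * x + intercept + rng.gauss(0.0, 0.01)) for x in xs]


def local_steps(w, shard, inner_steps, lr):
    """Run several gradient-descent steps on one shard with no communication."""
    a, b = w
    n = len(shard)
    for _ in range(inner_steps):
        grad_a = sum(2.0 * (a * x + b - y) * x for x, y in shard) / n
        grad_b = sum(2.0 * (a * x + b - y) for x, y in shard) / n
        a, b = a - lr * grad_a, b - lr * grad_b
    return [a, b]


def diloco_sketch(shards, rounds=20, inner_steps=50,
                  inner_lr=0.1, outer_lr=0.7, momentum=0.9):
    """Toy local-update training: one synchronization per round."""
    w = [0.0, 0.0]    # shared weights (slope, intercept)
    vel = [0.0, 0.0]  # outer momentum buffer
    for _ in range(rounds):
        # Each worker trains independently from the current shared weights.
        local = [local_steps(w, s, inner_steps, inner_lr) for s in shards]
        for i in range(len(w)):
            # "Outer gradient": average displacement produced by local training.
            outer_grad = sum(w[i] - lw[i] for lw in local) / len(local)
            # Nesterov-style momentum update on the shared weights.
            vel[i] = momentum * vel[i] + outer_grad
            w[i] -= outer_lr * (outer_grad + momentum * vel[i])
    return w


if __name__ == "__main__":
    rng = random.Random(0)
    shards = [make_shard(rng, 200, 3.0, -1.0) for _ in range(4)]
    print(diloco_sketch(shards))
```

With 50 inner steps per round, the toy workers synchronize 50 times less often than they would if they averaged after every step. Raising `inner_steps` reduces communication further, at the cost of letting workers drift apart between rounds.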




EU sovereign AI, Federated learning, AI Act systemic risk, Compute infrastructure, Frontier models