Scaling Systems, Governing AI, and Designing Data for Real-World Impact
Week 21 of the Data Innovation Summit XI Edition highlights the operational side of modern AI and data systems. As organizations push beyond experimentation, new questions are emerging around infrastructure scale, enforceable governance, production-ready datasets, and architectures capable of supporting autonomous AI-driven workflows. Across this week’s sessions, speakers explore what it actually takes to move from promising prototypes to robust systems that operate reliably at scale.
New Speakers and Sessions
One of the most technically focused sessions announced this week comes from Nouamane Tazi, ML Research Engineer at Hugging Face, with “The Ultra-Scale Talk: Scaling Training to Thousands of GPUs.” As foundation models continue to grow in size and complexity, the infrastructure required to train them is becoming a discipline of its own. This session explores the engineering strategies behind large-scale distributed training, including advanced parallelism techniques, network optimization, and fault-tolerance mechanisms required to keep massive GPU clusters running efficiently.
The challenge of bringing AI from isolated experiments into enterprise-wide production environments is addressed by Nick Jewell, Senior Sales Engineer at Dataiku, in “From PoC to Production – Scaling Enterprise AI for Measurable Value.” While many organizations have successfully built AI prototypes, far fewer have achieved consistent results at scale.
Data engineering architectures are also evolving to support more autonomous systems. In “Engineering for Autonomy: Low-ETL, Open Formats, and the Return of the Semantic Layer,” Will Martin, Evangelist EMEA at Dremio, explores how modern data platforms can reduce complexity while increasing flexibility.
Another critical dimension of operational AI is governance. Awadelrahman M. A. Ahmed, Data & AI Architect at REMA 1000, joins the program with “Designing AI Governance That Systems Can Enforce.” Rather than relying solely on policies and guidelines, the session examines how governance mechanisms can be embedded directly into AI systems themselves.
The importance of high-quality data foundations is addressed by Agnieszka Pruszek, Senior Project Manager at Samsung R&D Poland, in “Design and Production of Multimodal Datasets for Reliable Health and Wellness Applications.” The session presents an end-to-end framework for collecting and integrating complex multimodal data sources, including wearable sensors, voice signals, and eye-tracking data. A longitudinal study involving dementia patients illustrates the practical challenges of building reliable datasets for real-world health applications, along with the insights gained in the process.
Industry practitioners are also sharing how data teams evolve as organizations mature. In “Beyond the ‘Quick Data Pull’: Architecting Action in Circular Fashion,” Wouter Nijdam and Liubov Zevaeva from Otrium discuss how their data team transitioned from reactive reporting to building AI-enabled systems that support strategic decision-making.
Finally, this week introduces a Chief Data Officer Executive Round Table, moderated by Dr. Chris Hillman, Global AI Lead at Teradata. The discussion will bring together senior data leaders to examine how organizations design data ecosystems that encourage collaboration, innovation, and responsible AI development across departments and partners.
New Partners
Week 21 also brings several new organizations into the growing Data Innovation Summit ecosystem. We are pleased to welcome Giskard, SambaNova, and SelectZero as partners supporting this year’s edition and contributing to the ongoing dialogue around enterprise AI, data platforms, and advanced analytics.
Taken together, this week’s announcements emphasize a broader industry transition: from isolated AI initiatives to integrated, production-grade systems that reshape how organizations operate, collaborate, and generate value from data.
Stay tuned for more announcements next week.
The Data Innovation Team
