Open data, in practice: the first ATLAS Open Data tutorial at IdeaSquare

22-04-2026

For four days at the end of November 2025, IdeaSquare hosted around thirty learners from across the world working their way through open data from the ATLAS experiment. It was the first public ATLAS Open Data tutorial, organised by the ATLAS Open Data team. Tutorials like this show how Open Data is growing from a principle that CERN commits to in policy documents into a resource that teachers, students, and external researchers can actually pick up and use.

Where this sits in CERN’s open science landscape

CERN’s Open Science Policy commits the Organization to making its research outputs openly available by default, and open data is one of its central pillars, alongside open access publishing, open-source software, and open hardware. The LHC experiments have been releasing data through the CERN Open Data portal for more than a decade, and the volume is now measured in petabytes. The harder question, however, goes beyond the mere release: it is about reuse. Petabytes on a portal are only as valuable as the community that knows how to work with them, and that community does not build itself.

The ATLAS tutorial can be read as a direct response to that gap. Rather than assuming that external users will find their own way into the data, the ATLAS team developed tools and, to ensure their uptake, invited users to CERN for a week focused on precisely these tools. ATLAS is not the first LHC experiment to take this step: the CMS Collaboration has been running comparable open data workshops for different target groups at CERN and elsewhere for around six years.

“We wanted to run a tutorial to meet users and learn what motivates them to look at the open data,” Zach Marshall, member of the ATLAS Open Data team and one of the organisers, said. “We have lots of materials on the websites, but having users going through them with the team present can show us exactly what the pain-points are, and where we can improve. We even built some materials during the week of the tutorial to fill in gaps that were identified in questions in the first sessions.”

Two complementary parts

The tutorial was structured in two complementary parts. The first, on Monday and Tuesday, focused on “Open Data for Education and Outreach”, while the second, on Wednesday and Thursday, turned to “Open Data for Researchers” and the more demanding tools used in actual analyses. The full programme, including all slides and recordings, is available on Indico.

The education and outreach sessions were built around Jupyter notebooks using a new 2025 release of 13 TeV data from Run 2. Participants started by learning how to access metadata through atlasopenmagic, a newly developed Python library that makes it possible to explore the data with only a basic knowledge of Python. From there, the tutorial moved straight into physics: rediscovering the Higgs boson in the H → γγ channel.
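For readers curious what "rediscovering" the Higgs boson in this channel involves, the core step is computing the invariant mass of each photon pair and looking for a bump near 125 GeV. The snippet below is a minimal sketch of that calculation, not the tutorial's own notebook code: the function name and the photon values are made up for illustration, and it uses only the standard formula for two massless particles, m² = 2·pT1·pT2·(cosh Δη − cos Δφ).

```python
import math


def diphoton_mass(pt1, eta1, phi1, pt2, eta2, phi2):
    """Invariant mass of two massless particles from (pT, eta, phi).

    Hypothetical helper for illustration; pT values are in GeV.
    """
    m_squared = 2.0 * pt1 * pt2 * (
        math.cosh(eta1 - eta2) - math.cos(phi1 - phi2)
    )
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(m_squared, 0.0))


# Made-up example: two back-to-back photons with 62.5 GeV of
# transverse momentum each and the same pseudorapidity.
example_mass = diphoton_mass(62.5, 0.0, 0.0, 62.5, 0.0, math.pi)
print(f"Diphoton invariant mass: {example_mass:.1f} GeV")
```

In a real analysis this calculation is repeated over many selected events, and the resulting mass values are filled into a histogram in which the Higgs signal appears as a small excess over a smoothly falling background.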

After these foundations, the sessions turned to teaching concepts in high-energy physics with material built from the open data, including systematic uncertainties, fluctuations, statistics, and basic machine-learning.

The research part was aimed at participants who already have a physics background and covered the practical mechanics of working with ATLAS’s PHYSLITE analysis format, handling heavy-ion collision data, creating event displays, and working with open event generation data. It also introduced specific tools such as the PHYSLITE to OpenData Framework and EXPLORE, a platform that provides resources for anyone to explore open data irrespective of institutional affiliation.

It’s all about the community

A tutorial week can easily be described as a training event and left there. If it were just about the contents, this event could be a link (and we explicitly invite you to explore this link). Coming together and focusing on this topic for a full week, however, helps create a community of active Open Data users who can support each other when things break, when they need help, or simply when they are looking for inspiration.

The ATLAS team pointed participants firmly at the CERN Open Data Forum as the right place to bring those questions. This is not only a matter of convenience: public threads let the next person with the same problem find an answer without writing a single email, and they give the team concrete evidence of use, which matters when reporting to funders and reviewers. Asking a question on the forum, in that sense, is a small act of community building rather than a nuisance.

The tutorial also highlighted the growing list of community contributions already built on ATLAS Open Data: courses and notebooks from Melbourne, Berkeley, Thessaloniki, Manchester, and Geneva, all independently developed, all citable, all publicly linked from the Open Data portal.

What happens next

Antonio Costa, one of the organisers and part of the ATLAS Open Data team, drew a positive conclusion: “We learned a lot from attempting such a broad tutorial that served so many audiences at once, and we’re delighted that it was so well-received.”
The tutorial closed with two invitations from the ATLAS Open Data team that are worth repeating. First, a new Open Data release is in preparation, and specific requests for samples, notebooks, or apps are welcome, ideally with enough detail (which samples, for what purpose, how many events) to be actionable. Second, the team wants to hear about projects built on the data, whether in education, outreach, or research, and is happy to credit and link to them from the Open Data portal. For those whose work outgrows the public datasets, the tutorial also invited non-ATLAS scientists to work directly with the Collaboration, for example via short-term associations.

The ATLAS Open Data team already has plans for the future: “We’re hoping to run some focused tutorials for particular groups (teachers, students, researchers) in the coming months that can speak more directly to their needs. We’re also looking forward to releasing more open data and more learning materials that we’ll include in those tutorials, which should make them even more useful to each community.”

To explore the data, start at the ATLAS Open Data portal. To ask a question or share a project, the CERN Open Data Forum and Open Data team are both good points of entry.

The ATLAS Open Data tutorial was held at IdeaSquare, CERN, in November 2025. Th…