What should be included in a data science curriculum for schools?
Current artificial intelligence (AI) methods, especially machine learning (ML), rely heavily on data. To complement our work on AI literacy, we have been investigating what data science teaching resources and education research are currently available. Our goal is to work out what data science concepts should be taught in a data science curriculum for schools.

Read on to find out what resources and materials we have reviewed, and what concept themes we have identified.
What is data science? Why is teaching it important?
Data science is an interdisciplinary science of learning from large datasets, aided by modern computational tools and methods (Ow‑Yeong et al., 2023). We see data science skills as fundamental for using, creating, and thinking critically about:
- Insights from data, generally
- Data-driven computational tools and methods (such as machine learning) and their outputs and predictions, specifically

To navigate a world where decision making in many areas is influenced by data-driven insights and predictions, young people need to be taught about data science. Data science skills empower young people to become critical thinkers, discerning consumers, adaptable professionals, and informed citizens.
Worldwide, countries are taking a variety of approaches to introducing data science into their education systems, as highlighted in a 2024 report from the coalition Data Science 4 Everyone.

In some countries, such as India and Israel, data science education is an established school subject. It is taught as part of the curriculum in at least one of the primary, secondary, or post-16 age phases. Meanwhile, in other countries, such as Canada, Germany, and Poland, data science is a very new school subject, or there are still only recommendations to develop it into one.
While we are currently considering what a comprehensive data science curriculum should include, we already offer several resources to support your teaching about data science and data-driven technologies. You can find a list of these resources at the end of this blog post. Now, however, I’ll give you an overview of our recent work to identify concepts for a data science curriculum that fits with our approach to AI literacy.
Data science education: What should we teach?
To answer the question ‘What should we teach about data science to learners aged 5 to 19?’, we undertook a grey literature review of data science teaching materials. A grey literature review is structured like an academic literature review and conducted with the same rigour. The difference is that a grey literature review also considers publications that have not been peer-reviewed, including reports, white papers, curriculum materials, and similar resources.
To orient our work, we combined four frameworks for data science and AI/ML education:
- Data Science 4 Everyone’s Data Science Learning Progressions
- Two research papers from Viktoriya Olari and Ralf Romeike about data-related practices for AI education: Olari and Romeike (2024a) and Olari and Romeike (2024b)
- UNESCO’s AI Competency Framework for Students
- The SEAME framework we developed for categorising AI education resources
With these combined frameworks as our map, we reviewed 79 data science learning resources. The resources varied:
- In quality in terms of clarity and teaching approach
- In their focus, e.g. on maths, coding, or a specific field such as biology
- In their perspective on data science, with some prioritising theory and others real-world applications
From among the 79 resources, we chose 9 that included clear learning outcomes, and that together covered a wide field of concepts. We examined these 9 in detail to extract 181 explicit and implicit data science concepts. Next, we grouped the concepts into themes, and finally we refined these themes by comparing them against the four frameworks listed above.
The themes we have identified for a data science curriculum are:
- Fundamentals of data literacy: Key terms and definitions
- Understanding bias in data
- Ethical responsibility in data use
- Data creation, curation, and transformation
- Analysis and modelling: Maths and statistics fundamentals
- ML principles
- Deploying and maintaining ML applications
- Software tools and programming
- Data visualisation
- Presenting findings effectively
This set of themes fits with the frameworks by Olari and Romeike and by Data Science 4 Everyone. It also expands on them by covering ML principles and programming approaches, and by calling out data bias and ethics.
What’s next for this work?
Through our grey literature review on data science education, we’ve:
- Pinpointed a large set of candidate concepts that could be taught within a data science curriculum
- Created a set of clear themes to structure our work going forward
Our next step is to shape these candidate concepts into a progression framework to describe their relationships and establish which concepts could be taught at each age or phase of schooling.

The literature review also gave us an overview of the pedagogical approaches and tools used for teaching data science concepts. These findings will become useful once we start designing learning activities.
You’ll hear more about how this work is going here on our blog and on our social channels. In the meantime, comment below to let us know what you think about the themes, or to tell us what you’d like to see in a data science curriculum for the learners you work with.
Our resources related to data science
Classroom resources
You can read about our thinking behind the data science-related teaching resources we’ve created so far in our ‘Data and information within the computing curriculum’ report from 2019.
- The report lists the data-related units within The Computing Curriculum materials, which we no longer update but continue to offer as free downloads. Updated classroom materials are available as part of the Computing materials we created for Oak National Academy in the UK for ages 5–11 and ages 12–19.
- The Ada Computer Science platform offers learning materials on data and information, and on AI and ML, for ages 14–19.
You might also be interested in exploring the Experience AI programme, which offers everything teachers need to help students develop a foundational understanding of data-driven AI technologies, their social and ethical implications, and the role that AI can play in their lives.
Teacher training and development resources
Our free online course ‘Teach teens computing: Machine learning and AI’ helps teachers understand and explain the types of problems that ML can help to solve, discuss how AI is changing the world, and think about the ethics of collecting data to train an ML model.
Teaching young people to understand data-driven AI technologies means teaching them thinking skills that are different to those needed to understand rule-based computer systems. You can read about these Computational Thinking 2.0 skills in our Quick Read PDF.
Our current research seminar series focuses on teaching about AI and data science. Sign up for an upcoming seminar session (the next one is on 11 November) or catch up on past sessions to find out what the latest research findings are in this area. You can also revisit our 2021/22 series on the same topic to see how work in this area has developed. The Raspberry Pi Computing Education Research Centre also has ongoing projects in the area of AI education for you to explore.