Polymath

STEM · Full roadmap · ~105 min read · 33 steps

📊 Data Science (Aotearoa NZ)

Read NZ data, spot real patterns, respect the rules, and build a first model

Activities in this path

Match pairs · Sort into zones

Skill tree

Unit 1

Start here

Course overview

What data science actually is

Data science turns raw data into decisions

The workflow, start to finish

Six stages from question to communication

Why the question comes first

A vague question produces a useless answer

Types of data, numerical and categorical

Numbers you can do maths on, versus labels you cannot

Unit 2

Structured versus unstructured data

Tidy tables versus raw text, images, and audio

The middle of the data, mean, median, mode

Three ways to find a typical value

How spread out is the data, range, variance, standard deviation

A typical value is half the story, spread is the other half

Percentiles and the median's friends

Where a value sits in the lineup

The normal distribution

The bell curve and the 68-95-99.7 rule

Unit 3

Skew, when the bell tips over

Lopsided data and which way the tail points

Data cleaning is most of the job

Real data is dirty, and fixing it is the work

Missing values

What to do about the blanks

Outliers

The extreme points that demand a decision

Exploratory data analysis

Looking before you leap

Unit 4

Choosing the right chart

Five workhorse charts and when to reach for each

Reading a chart without being fooled

Axes, scales, and the tricks that mislead

Correlation versus causation

Moving together is not the same as one causing the other

Sampling and bias

Your data is a sample, and how you picked it decides what it can tell you

A little probability intuition

Chance, independence, and why streaks fool us

Unit 5

Hypotheses and p-values, honestly

A test of "could this just be chance?" and what it does not tell you

What machine learning is

Programs that learn patterns from examples instead of following written rules

Train, test, and overfitting

Learn on one slice, judge on another

Linear regression intuition

Drawing the best straight line through the dots

Classification by nearest neighbours

Guess the label by looking at who is closest

Unit 6

Judging a model, accuracy can lie

Why accuracy alone misleads, and what precision and recall add

NZ data sources to practise with

Where the real NZ data lives and how it is licensed

The Privacy Act 2020 and notifiable breaches

The law that governs personal data in NZ, and what you must do when it goes wrong

Māori data sovereignty and the CARE principles

Māori data is taonga, and Māori have rights over it

The Algorithm Charter and the IDI

Two NZ examples of using data and algorithms carefully

Unit 7

The NZ data and tech job scene

What the work looks like here and where to learn more

The tools, and your first project

The real toolkit and a project to practise on

Where to go next