Polymath

STEM · Full roadmap · ~100 min read · 28 steps

📊 Data science from zero

Read data, spot patterns, avoid traps, and build a first model with confidence

Activities in this path

Match pairs

Sort into zones

Skill tree

Unit 1

1

Start here

Course overview

2

What data science actually is

Data science turns raw data into decisions

3

The workflow, start to finish

Six stages from question to communication

4

Why the question comes first

A vague question produces a useless answer

5

Types of data: numerical and categorical

Numbers you can do math on, versus labels you cannot

Unit 2

Structured versus unstructured data

Tidy tables versus raw text, images, and audio

The middle of the data: mean, median, mode

Three ways to find a typical value

How spread out is the data: range, variance, standard deviation

A typical value is half the story; spread is the other half

Percentiles and the median's friends

Where a value sits in the lineup

The normal distribution

The bell curve and the 68-95-99.7 rule

Unit 3

Skew: when the bell tips over

Lopsided data and which way the tail points

Data cleaning is most of the job

Real data is dirty, and fixing it is the work

Missing values

What to do about the blanks

Outliers

The extreme points that demand a decision

Exploratory data analysis

Looking before you leap

Unit 4

Choosing the right chart

Five workhorse charts and when to reach for each

Reading a chart without being fooled

Axes, scales, and the tricks that mislead

Correlation versus causation

Moving together is not the same as one causing the other

Sampling and bias

Your data is a sample, and how you picked it decides what it can tell you

A little probability intuition

Chance, independence, and why streaks fool us

Unit 5

Hypotheses and p-values, honestly

A test of "could this just be chance?" and what it does not tell you

What machine learning is

Programs that learn patterns from examples instead of following written rules

Train, test, and overfitting

Learn on one slice, judge on another

Linear regression intuition

Drawing the best straight line through the dots

Classification by nearest neighbors

Guess the label by looking at who is closest

Unit 6

Judging a model: accuracy can lie

Why accuracy alone misleads, and what precision and recall add

The tools, and your first project

The real toolkit and a project to practice on

Where to go next
