STEM · Full roadmap · ~100 min read · 28 steps
Data science from zero
Read data, spot patterns, avoid traps, and build a first model with confidence
Activities in this path
Unit 1
Start here
Course overview
What data science actually is
Data science turns raw data into decisions
The workflow, start to finish
Six stages from question to communication
Why the question comes first
A vague question produces a useless answer
Types of data: numerical and categorical
Numbers you can do math on, versus labels you cannot
Unit 2
Structured versus unstructured data
Tidy tables versus raw text, images, and audio
The middle of the data: mean, median, mode
Three ways to find a typical value
How spread out is the data: range, variance, standard deviation
A typical value is half the story; spread is the other half
Percentiles and the median's friends
Where a value sits in the lineup
The normal distribution
The bell curve and the 68-95-99.7 rule
Unit 3
Skew: when the bell tips over
Lopsided data and which way the tail points
Data cleaning is most of the job
Real data is dirty, and fixing it is the work
Missing values
What to do about the blanks
Outliers
The extreme points that demand a decision
Exploratory data analysis
Looking before you leap
Unit 4
Choosing the right chart
Five workhorse charts and when to reach for each
Reading a chart without being fooled
Axes, scales, and the tricks that mislead
Correlation versus causation
Moving together is not the same as one causing the other
Sampling and bias
Your data is a sample, and how you picked it decides what it can tell you
A little probability intuition
Chance, independence, and why streaks fool us
Unit 5
Hypotheses and p-values, honestly
A test of "could this just be chance?" and what it does not tell you
What machine learning is
Programs that learn patterns from examples instead of following written rules
Train, test, and overfitting
Learn on one slice, judge on another
Linear regression intuition
Drawing the best straight line through the dots
Classification by nearest neighbors
Guess the label by looking at who is closest
Unit 6
Judging a model: accuracy can lie
Why accuracy alone misleads, and what precision and recall add
The tools, and your first project
The real toolkit and a project to practice on
Where to go next