Preparing the Right Ingredients and Dialing In the Model
In Lesson 1, you learned that classic ML requires manual feature engineering — humans must decide which inputs matter. But how exactly do you do that? And once you have good features, how do you configure the model for the best results?
This lesson walks through both — feature engineering and hyperparameter tuning — with a concrete example: predicting whether a student will pass or fail an exam.
Imagine you have data about students preparing for an exam:
| Student | Study Hours | Sleep Hours | Date | Attendance | Notes |
|---|---|---|---|---|---|
| A | 4.5 | 7 | 2026-02-15 | 18/20 sessions | "Reviewed chapters 1-5, did practice problems" |
| B | 1.0 | 4 | 2026-03-01 | 9/20 sessions | "Skimmed notes" |
Raw data is messy. "18/20 sessions" is text, dates need context, and notes are unstructured. A model can't use this directly — we need to engineer features from it.
Feature engineering is the process of creating better inputs for your model from raw data. Good features make the difference between a mediocre model and a great one.
| Engineered Feature | How | Why It Helps |
|---|---|---|
| attendance_rate | 18/20 = 0.90 | A percentage is more useful than raw counts |
| sleep_quality | Bin into "poor" (<5h), "ok" (5-7h), "good" (7h+) | Captures the non-linear effect of sleep |
| days_before_exam | exam_date - study_date = 14 days | Studying early vs. cramming matters |
| study_intensity | study_hours / days_before_exam | Captures pacing |
| note_length | Word count of notes | Proxy for engagement |
| did_practice_problems | "practice" in notes → 1, else 0 | Active recall is a strong predictor |
After feature engineering, the two students' rows become:

- Student A: `attendance=0.9, sleep=good, days_before=14, intensity=0.32, note_length=6, practice=1`
- Student B: `attendance=0.45, sleep=poor, days_before=1, intensity=1.0, note_length=2, practice=0`
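Here is a minimal sketch of these transformations in pandas. The column names, the exam date, and the exact bin edges are assumptions for illustration; any tabular raw data would work the same way.

```python
import pandas as pd

# Hypothetical raw rows matching the table above (column names are assumptions)
raw = pd.DataFrame({
    "student": ["A", "B"],
    "study_hours": [4.5, 1.0],
    "sleep_hours": [7, 4],
    "study_date": pd.to_datetime(["2026-02-15", "2026-03-01"]),
    "attendance": ["18/20", "9/20"],
    "notes": ["Reviewed chapters 1-5, did practice problems", "Skimmed notes"],
})

exam_date = pd.Timestamp("2026-03-02")  # assumed exam date

features = pd.DataFrame({"student": raw["student"]})

# attendance_rate: parse "18/20" into 18 / 20 = 0.90
parts = raw["attendance"].str.split("/", expand=True).astype(float)
features["attendance_rate"] = parts[0] / parts[1]

# sleep_quality: bin hours into poor (<5h), ok (5-7h), good (7h+)
features["sleep_quality"] = pd.cut(
    raw["sleep_hours"], bins=[0, 5, 7, 24], right=False,
    labels=["poor", "ok", "good"],
)

# days_before_exam and study_intensity (pacing)
features["days_before_exam"] = (exam_date - raw["study_date"]).dt.days
features["study_intensity"] = raw["study_hours"] / features["days_before_exam"]

# note_length and did_practice_problems from the free-text notes
features["note_length"] = raw["notes"].str.split().str.len()
features["did_practice_problems"] = (
    raw["notes"].str.contains("practice", case=False).astype(int)
)

print(features)
```

Each engineered column is a one-liner, but choosing *which* columns to create is the part that requires human judgment.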
Even a human can now guess who passes! That's the sign of well-engineered features — they make the patterns obvious, both for humans and for models.
Once you have good features, you need to configure your model. Hyperparameters are settings you choose before training — they control how the model learns.
Let's say you use a Gradient Boosting model (a popular classic ML algorithm). Here are the key knobs to turn:
| Hyperparameter | Too Low | Too High | Sweet Spot |
|---|---|---|---|
| n_estimators (# of trees) | 10 → underfits | 5000 → slow, overfits | ~200 |
| learning_rate | 0.001 → learns too slowly | 1.0 → overshoots | ~0.1 |
| max_depth | 1 → too simple | 20 → memorizes students | ~4 |
You try different combinations and compare results:
| learning_rate | max_depth | n_estimators | Accuracy |
|---|---|---|---|
| 0.1 | 4 | 200 | 82% |
| 0.05 | 6 | 300 | 86% |
| 0.01 | 10 | 500 | 79% (overfitting) |
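This trial-and-error loop can be automated with a grid search, which trains a model for every combination and keeps the best by cross-validated score. A sketch with scikit-learn's `GridSearchCV` (a deliberately reduced grid for speed; the sweep above would include more values):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the student data (an assumption)
X, y = make_classification(n_samples=300, n_features=6, random_state=0)

# Reduced grid for speed; 2 x 2 x 2 = 8 combinations, each scored with 3-fold CV
param_grid = {
    "learning_rate": [0.05, 0.1],
    "max_depth": [4, 6],
    "n_estimators": [100, 200],
}
search = GridSearchCV(GradientBoostingClassifier(random_state=0),
                      param_grid, cv=3)
search.fit(X, y)
print(search.best_params_, f"best CV accuracy: {search.best_score_:.2f}")
```

Grid search is exhaustive and gets expensive quickly; with larger grids, randomized or Bayesian search is the usual next step.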
Underfitting: Model is too simple — it misses patterns in the data (too low accuracy on everything).
Overfitting: Model memorizes the training data — it performs great on training data but poorly on new data.
The sweet spot is in between: a model that learns the real patterns without memorizing noise.
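You can see both failure modes by comparing training and test accuracy as model complexity grows. A sketch on noisy synthetic data (an assumption, chosen so memorization is possible but doesn't generalize):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# flip_y=0.2 adds label noise, so a memorizing model can't truly generalize
X, y = make_classification(n_samples=300, n_features=6, flip_y=0.2,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for depth in (1, 4, 10):
    m = GradientBoostingClassifier(max_depth=depth, n_estimators=200,
                                   random_state=0).fit(X_tr, y_tr)
    print(f"max_depth={depth:2d}  train={m.score(X_tr, y_tr):.2f}  "
          f"test={m.score(X_te, y_te):.2f}")
```

A large train-test gap at high depth is the signature of overfitting; near-identical but low scores at depth 1 signal underfitting.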
In exam terms: feature engineering is studying the right material for the exam (what you prepare), while hyperparameter tuning is figuring out the best study method (how you learn).
Remember how Lesson 1 showed that classic ML requires manual feature engineering while deep learning learns features automatically? You've now seen exactly what that manual process looks like. In the next lesson, you'll learn about the building blocks of neural networks — the technology that makes automatic feature learning possible.
You've seen how classic ML prepares data. Now discover the building blocks of neural networks — the technology that learns features automatically!