Let’s talk about evaluation—without the eye-rolls

Matt Bowker · Jan 5, 2025 · 8 min read

If we’re honest (and we like to be), evaluating teaching can feel like an annoying box to tick at the end of a hectic day. Cue the universal scramble for feedback forms while half the group is already edging towards the exit. But here’s the thing: it doesn’t have to be painful, and it definitely doesn’t have to be perfect.

I spent years making evaluation more complicated than it needed to be, convinced that everything had to be meticulously pristine. Spoiler: it doesn’t. The real knack is finding the sweet spot between rigour and reality. A slick evaluation tool that nobody uses is pointless, whereas a few scrappy but consistent data points can spark real improvements in your teaching.

So, in the spirit of making your life easier, this guide offers practical, no-fuss approaches to sim evaluation—no degrees in educational theory required. Whether you run weekly in-situ scenarios or the occasional workshop, you’ll find tips to help you collect meaningful data without drowning in paperwork. Sure, we’ll dip into some frameworks (because they genuinely help), but the focus is on what actually works in real life.

Pop the kettle on, forget your evaluation angst, and let’s get stuck in.

The Kirkpatrick Model: Oldie but Goldie

If evaluation frameworks were classic rock bands, Kirkpatrick would be The Rolling Stones—been around forever, a magnet for criticism, yet still remarkably useful. Yes, it’s not perfect, but it’s a handy starting point for figuring out what exactly you want to measure.

Level 1: Reaction

“Did they like it?” This is your classic feedback form, capturing satisfaction, engagement, and perceived relevance. It’s easy to dismiss as superficial, but if people hate your teaching, that’s a serious red flag for learning uptake. Plus, in an era of endless competing priorities, participant buy-in really does matter.

Level 2: Learning

“Did they get it?” Here, you measure whether participants actually learned something: knowledge, skills, or even changes in attitude. Think pre/post tests, OSCE-style assessments, confidence ratings. Match your measurement to your learning objectives—no point testing knowledge if your sim focused on team communication.

Level 3: Behaviour

“Are they doing it?” The million-pound question: do people actually use their new skills in clinical practice? Trickier to assess, but not impossible. Workplace observations, clinical data, or simply structured follow-up chats can shed some light.

Level 4: Results

“Did it make a difference?” The holy grail—can you prove it impacted patient care or outcomes? Randomised controlled trials are a big ask (for most of us, anyway). Still, there are ways to gather enough evidence to show your sim is making waves.

Big takeaway? You don’t have to measure everything, every time. Sometimes Level 1 is enough. Other times, dig deeper into behaviour change. Just be sure you know why you’re measuring what you’re measuring.

Image showing the four Kirkpatrick levels

Practical Tips: Making Levels 1 & 2 Less Painful

Let’s start where most of us do: reaction and learning data. These basics can be collected with minimal hassle and maximum insight if done right.

The Art of the Feedback Form

Keep it short: Overly long forms breed half-hearted responses. Aim for 5–7 questions, tops.
One possible format:
1. Three 5-point Likert scale questions on relevance, delivery, and perceived usefulness
2. One open-ended question: “What will you change in your practice after this session?”
3. A “Stop–Start–Continue” box for improvement suggestions

The “Stop–Start–Continue” box is a quick, structured way to gather focused feedback from your learners or colleagues. It breaks down like this:

Stop: What should we stop doing because it’s not working or adds no value?
Start: What new ideas or improvements should we consider adding?
Continue: Which aspects are working well and should definitely carry on?

Top tip: Digital forms (think Google Forms) make life easier. They collate data automatically and can be set to anonymous—no more deciphering hieroglyphics.
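If you export your form responses, a few lines of Python can turn a column of Likert ratings into something you can read at a glance. The sketch below is illustrative only: the function name `summarise_likert`, the sample ratings, and the choice to report the median plus the share of 4s and 5s are all assumptions, not a prescribed method.

```python
from collections import Counter
from statistics import median


def summarise_likert(responses):
    """Summarise a list of 1-5 Likert ratings, ignoring blanks or invalid entries."""
    valid = [r for r in responses if r in (1, 2, 3, 4, 5)]
    if not valid:
        return {"n": 0, "median": None, "percent_agree": None, "counts": {}}
    counts = Counter(valid)
    agree = sum(1 for r in valid if r >= 4)  # ratings of 4 or 5
    return {
        "n": len(valid),
        "median": median(valid),
        "percent_agree": round(100 * agree / len(valid), 1),
        "counts": {score: counts.get(score, 0) for score in range(1, 6)},
    }


# Hypothetical ratings for the "relevance" question; None is a skipped answer.
relevance = [5, 4, 4, 3, 5, 2, 4, None]
summary = summarise_likert(relevance)
print(summary)
```

Reporting the median and the full spread of counts, rather than a mean alone, keeps the one or two unhappy participants visible instead of averaging them away.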

Timing Is Key

Level 1 (Reaction): Immediately post-sim via quick “minute papers” or a Post-it Method (see below). Grab that gut reaction while it’s fresh.
Level 2 (Learning): Pre/post tests for knowledge. If you’re checking skills, wait at least 15 minutes post-sim—give people a chance to cool off from the adrenaline rush.
Follow-up: A short 2–4 week revisit to see what’s actually stuck (or slipped).

Rookie Mistakes to Sidestep

Don’t ask for data you won’t (or can’t) use.
Resist “just one more question” syndrome.
Avoid capturing feedback only from your super-keen participants.
Don’t lump “How good was the educator?” questions with “How useful was the session?” They’re different.

The Hawthorne Effect & Other Pitfalls to Dodge

A common trap when evaluating is the Hawthorne effect, which describes how people modify their behaviour simply because they know they’re being observed.[1] In simulation contexts, learners may perform better, or at least differently, if they feel their every move is under scrutiny, which can skew your data. Other potential stumbles include:

People-pleasing bias: Participants might give overly glowing feedback if they’re worried about offending you or the facilitators.
Selection bias: The keenest beans in your programme might be the only ones who respond to surveys, while less engaged learners slip under the radar, giving you a skewed sense of success.
Overzealous data collection: Gathering too many metrics can lead to analysis paralysis. More data doesn’t always mean better data; it usually means more time stuck in spreadsheets.

Image demonstrating the Hawthorne effect: a man sat under a spotlight and magnifying glass looking stressed

Mitigation tips:

Use anonymous feedback forms to reduce the pressure to “perform.”
Brief participants that honest feedback is key to programme improvements.
Keep your evaluation scope focused, collecting only data that’s truly relevant.
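One rough way to spot selection bias while keeping forms anonymous is to compare how many people attended with how many responded, group by group. This is a minimal sketch under stated assumptions: the group names, the head counts, and the 50% cut-off are all made up for illustration, and it assumes your form asks for a professional group but no names.

```python
def response_rates(attended, responded):
    """Return the response rate per group as a fraction between 0 and 1."""
    rates = {}
    for group, n_attended in attended.items():
        if n_attended == 0:
            continue  # nobody from this group attended, so no rate to report
        rates[group] = responded.get(group, 0) / n_attended
    return rates


# Hypothetical head counts from the sign-in sheet and the feedback forms.
attended = {"nurses": 12, "doctors": 8, "students": 10}
responded = {"nurses": 9, "doctors": 2, "students": 10}

rates = response_rates(attended, responded)

# Assumed threshold: flag any group where fewer than half responded.
under_represented = [group for group, rate in rates.items() if rate < 0.5]
print(rates)
print(under_represented)
```

If one group barely responds, treat glowing overall scores with caution and consider chasing that group with a quick follow-up.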

Hitting the Learning-Progress Sweet Spot

For knowledge, simple MCQs or case-based questions do the job. For skills, a structured observation tool (or video analysis if you’ve the kit) can work. Peer assessment can also provide valuable insights, but keep it well structured or it can devolve into a popularity contest.
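For pre/post MCQs, the simplest useful summary is the change in score for each person who completed both tests. The sketch below assumes participants use an anonymous self-generated code so their two tests can be matched; the codes, the scores, and the function name `pre_post_change` are hypothetical.

```python
from statistics import mean


def pre_post_change(pre, post):
    """Compare scores only for participants who completed both tests."""
    paired = [(pre[code], post[code]) for code in pre if code in post]
    if not paired:
        return None
    changes = [after - before for before, after in paired]
    return {
        "n_paired": len(paired),
        "mean_pre": mean(before for before, _ in paired),
        "mean_post": mean(after for _, after in paired),
        "mean_change": mean(changes),
        "n_improved": sum(1 for change in changes if change > 0),
    }


# Hypothetical scores out of 10, keyed by anonymous participant code.
pre_scores = {"A1": 4, "B2": 6, "C3": 5, "D4": 7, "E5": 3}
post_scores = {"A1": 7, "B2": 8, "C3": 5, "D4": 6, "F6": 9}

result = pre_post_change(pre_scores, post_scores)
print(result)
```

Pairing matters: comparing the average of everyone who sat the pre-test with the average of everyone who sat the post-test can mislead if different people turned up for each.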

Remember, gathering regular imperfect feedback beats a perfect but patchy approach every single time.

Beyond the Basics: Qualitative Approaches for Richer Feedback

So far, we’ve covered your bread-and-butter data collection: short surveys, pre/post tests, and the odd follow-up call. But it’s worth remembering that not all feedback is numbers-based—and some of the best insights come from simple conversations and creative tools. Here are a few favourites:

Quick Summary of Strengths & Suggestions

A lightning-quick way to gauge mid-course impressions. Ask learners to jot down:

One strength of the session so far
One thing that could be improved

Works brilliantly with paper forms or digital platforms like Mentimeter or Kahoot. If your participants are pressed for time, two short questions can speak volumes.

The Post-it Method

A personal go-to for those of us who love tangible, low-tech solutions. Hand everyone two different coloured Post-its:

Colour A: “What should stay the same?”
Colour B: “What could be better?”

Cluster them on a wall and talk them through in real time if the group isn’t too large. It gives quieter participants a voice and makes feedback feel democratic.

Panel Discussions

Perfect for deeper dives—especially if you’re evaluating bigger chunks of your curriculum (e.g. an entire simulation course or a year of integrated modules). A small group of learners plus a neutral moderator (student rep, programme committee member) can tease out recurring themes.

Keep it safe and non-judgemental
Ask open-ended questions about curriculum flow, tricky areas, perceived relevance
Consider a follow-up conversation with the sim coordinator or teaching lead to discuss next steps

Peer Observation and Feedback

Sometimes the best feedback comes from trusted colleagues who’ve been in the simulation hot seat themselves. Invite a colleague to sit in on your sim session, using a simple observation template or an agreed set of focus points (e.g., debrief style, learner engagement). Post-session, compare notes over a coffee (I prefer a flat white with a light roast bean). You’ll both pick up fresh ideas.

Measuring Behaviour Change: The Tricky Bit

Right, so we know that measuring behaviour post-sim is notoriously difficult—yet crucial. A fancy randomised controlled trial might be a pipe dream, so how about something more pragmatic?

The ‘Good Enough’ Approach

Pre-planned observations: Ask clinical supervisors to watch for newly taught behaviours.
Workplace assessments: Incorporate key skills into existing evaluation tools.
Stick to 2–3 key behaviours: Don’t try to measure everything—spread too thin and you’ll measure nothing well.

Mine Your Existing Data

Sometimes the data is already there if you look for it:

Incident reports or quality assurance logs
Audit data on procedure success rates
Equipment usage logs (e.g., track who’s signing out the ultrasound machine)

Structured Follow-up

Short phone calls 3–6 months later
Focus groups (or panel discussions) with clinical leads
Quick online polls zeroing in on one or two specific behaviours

Wrapping Up

Yes, it’s complicated. Yes, you’ll never reach flawless evaluation nirvana. But done wisely—and a bit creatively—evaluation can be manageable, meaningful, and surprisingly satisfying. Focus on the data that genuinely helps you improve, and don’t sweat perfection. A little good-quality feedback, gathered consistently, will serve your simulation programme better than any pristine but unused framework.

Have a go. Pick a Kirkpatrick level or two that make sense for you and refine from there. With some thoughtful planning—and a spirit of experimentation—you’ll soon have an evaluation process that helps your programme grow and thrive. Good luck!

Looking to sharpen your sim evaluation skills for CHSE exams?
Head over to PrepForCHSE.com for question banks, practical tips, and everything you need to ace your next simulation challenge. Because good evaluation doesn’t have to be painful, and neither does exam prep.

References

[1] McCarney, R., Warner, J., Iliffe, S. et al. The Hawthorne Effect: a randomised, controlled trial. BMC Med Res Methodol 7, 30 (2007). https://doi.org/10.1186/1471-2288-7-30

APA

Bowker, M. (2025). Let’s talk about evaluation—without the eye-rolls. https://prepforchse.com/blog/lets-talk-about-evaluationwithout-the-eye-rolls

MLA

Bowker, Matt. "Let’s talk about evaluation—without the eye-rolls." 05 Jan 2025, https://prepforchse.com/blog/lets-talk-about-evaluationwithout-the-eye-rolls

Written by Matt Bowker

Dr. Matt Bowker is a simulation educator with over a decade of experience in healthcare simulation across multiple continents and student groups.