P9 - Back to the Basics
An engineering expert is not necessarily a good engineering tutor. What is true for people is true for software too.
I’ve had the privilege of working with many excellent engineers. I learned a great deal from them. However, what works in a one-on-one professional setting doesn’t necessarily translate to teaching in a classroom. Being a competent and experienced engineer is not enough to be an effective teacher, especially when facing a room full of students.
I began my teaching career at the age of 43. Fortunately, my first few assignments involved relatively small classes of 20–30 students. Teaching small groups feels closest to one-on-one interaction—you can spend time with each student and provide individual support. But then I was assigned a first-year engineering course with 500 students. That experience taught me something fundamental: teaching a large class is an entirely different challenge.
In a small class, deep expertise in your field goes a long way. In a large class, however, domain expertise alone is not sufficient—and sometimes not even necessary. What matters more is the ability to design a learning experience for hundreds of students, most of whom you’ll never have the chance to meet individually. I initially thought that delivering good lectures and answering questions would be enough. It wasn’t.
Over time, I realized that being a good teacher in large classes meant creating pathways for students to engage with the material and learn at their own pace. Lectures were only the starting point; the design of assessments was just as—if not more—important.
One key lesson I learned is the importance of how we answer students’ questions. It’s easy to provide a comprehensive answer to an assignment question—but that’s rarely the most helpful approach. Assignments are meant to help students learn through effort and engagement. If they receive complete solutions whenever they get stuck, they miss out on the learning process.
This principle applies equally to AI tutors.
Designing AI Teaching: Back to the Basics
In earlier posts, I focused more on the technical aspects of working with large language models (LLMs)—understanding their capabilities and limitations—than on how an AI tutor should actually function. But I now believe it’s time to define a clearer framework for AI teaching.
Reference Materials Are Still Essential
Students need solid reference material—starting with a well-structured textbook and supplemented by other resources. I don’t believe it’s possible to learn a mechanical engineering design course (or many other disciplines) purely through conversation with an AI tutor. The AI is a guide, not a replacement for structured content.
Reimagining Assessment
When I taught machine element design to relatively large classes (200–300 students), the assessment had three main components:
Weekly Online Tests
These were delivered through our learning management system (Blackboard). Everyone answered the same questions, but numbers were randomized for numerical problems.
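Parameter randomization of this kind is straightforward to script. The sketch below is illustrative only: the shaft-power problem, the parameter ranges, and the seeding scheme are assumptions, not the actual Blackboard setup. Seeding the generator per student makes every variant reproducible for regrading.

```python
import random

def make_variant(seed: int) -> dict:
    """Generate one randomized instance of a numerical problem.

    Hypothetical example: power transmitted by a rotating shaft,
    P = T * omega, with torque T in N*m and speed omega in rad/s.
    """
    rng = random.Random(seed)          # seed (e.g., student ID) makes the variant reproducible
    torque = rng.randint(50, 200)      # N*m
    speed = rng.randint(100, 300)      # rad/s
    return {
        "question": f"A shaft transmits a torque of {torque} N*m at "
                    f"{speed} rad/s. What power (in W) does it transmit?",
        "answer": torque * speed,      # P = T * omega, in watts
    }

# The same seed always yields the same question and expected answer.
variant = make_variant(seed=12345)
```

Because the expected answer is computed alongside the question, marking each student's individually numbered problem is automatic.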
Two Major Design Projects
One was due mid-term, the other at the end. These were open-ended tasks—e.g., designing power transmission from a motor to a machine, or developing a mechanical structure under fluctuating loads. Due to class size, students worked in groups (e.g., 6 students per group for 300 students = 50 reports). Group work has educational benefits, but uneven participation was a frequent issue.
Even with 50 reports, grading was a major challenge. The open-ended nature of the problems meant each group made different assumptions and used different computations. We couldn’t rigorously verify all calculations. We ended up grading based on design rationale, assumptions, presentation, and plausibility. I once had a top student confess that their group didn’t do any calculations—just guessed plausible numbers and focused on presentation. They received high marks.
Final Examination
University policy required that at least 40% of the final grade come from individual, verifiable assessment. The only practical way to do this was via a final written exam.
A New Model for AI-Powered Assessment
With AI tutoring, the assessment structure can evolve:
Microcompetency Testing
Instead of weekly tests, we could define 8–10 core competencies for the course. The AI tutor could then assess each student individually through conversation, offering both formative and summative assessment throughout the term.
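One way to make this concrete is to track each student's progress per competency and mark a competency as demonstrated only after repeated success. The sketch below is a minimal bookkeeping layer, not a full assessment engine; the competency names and the pass rule (two consecutive correct responses) are assumptions for illustration.

```python
from dataclasses import dataclass, field

# Illustrative competency list; a real course would define its own 8-10.
COMPETENCIES = ["fatigue loading", "shaft design", "bolted joints"]

@dataclass
class CompetencyTracker:
    """Records per-student progress on each microcompetency."""
    streak_needed: int = 2                       # consecutive correct answers to pass
    streaks: dict = field(default_factory=dict)
    mastered: set = field(default_factory=set)

    def record(self, competency: str, correct: bool) -> None:
        """Log one conversational check as correct or incorrect."""
        if correct:
            self.streaks[competency] = self.streaks.get(competency, 0) + 1
            if self.streaks[competency] >= self.streak_needed:
                self.mastered.add(competency)
        else:
            self.streaks[competency] = 0         # a miss resets the streak

    def summary(self) -> dict:
        """Formative snapshot: which competencies are demonstrated so far."""
        return {c: (c in self.mastered) for c in COMPETENCIES}
```

The same record serves both purposes the text mentions: the running `summary` is the formative view, and the final `mastered` set feeds the summative grade.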
Individual Design Projects
If competencies are assessed separately, one project across the term may suffice. Each student could work on their own, with the AI tutor providing feedback and automated marking.
Final Exam
University policies may still require a final exam. In the future, AI could handle both scanning and marking of exam scripts.
Replacing Human Tutorial Support
In a traditional course with 320 students, you’d need weekly tutorials, dividing the cohort into 8 sessions, each supported by 3–4 tutors. I tried to attend as many as I could. Most student questions were about the weekly online tests. Tutors were trained to correct misconceptions and guide students without giving away full solutions. Some were better at this than others.
My goal is to replace these physical tutorials with AI-based sessions. But the challenge is the same: provide just enough assistance to nudge the student toward the answer without undermining their learning.
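The "just enough assistance" policy can be encoded as a hint ladder: each further attempt unlocks a more specific prompt, but the final answer is never released. The hint texts below are illustrative placeholders, not course material.

```python
# Tiered-hint policy: progressively more specific guidance, never the answer.
HINT_LADDER = [
    "Which failure criterion applies to fluctuating loads?",          # gentle nudge
    "Compare the mean and alternating stress components.",            # point a direction
    "Set up the Goodman relation; which terms do you already know?",  # help with setup
]

def next_hint(attempts: int) -> str:
    """Return the hint matching the student's current attempt count."""
    if attempts >= len(HINT_LADDER):
        # Even after many attempts, refer to a worked analogue rather
        # than revealing this problem's solution.
        return "Review the worked example in the textbook, then retry."
    return HINT_LADDER[attempts]
```

In an AI tutorial session, a rule like this would sit outside the model, capping how much the LLM is allowed to reveal regardless of how the student phrases the question.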
Building an AI Tutor
To train an AI tutor to behave like a skilled human tutor, we need to specialize the model—adjusting its internal weights to align with our content and pedagogical approach. This involves training on a dataset of example questions and answers.
How large should this dataset be? Suppose we define 10 microcompetencies. Given a 13-week semester (or even shorter terms in some universities), 10 is a reasonable number. For each area, we’d need about 50 question–answer pairs to train the model. That’s around 500 Q&A examples total—likely the bare minimum.
Fine-tuning the model is straightforward using tools like Google AI Studio for Gemini models. Upload the dataset, start the tuning job, and the platform handles the rest. There’s a cost involved, but for 500 examples, it shouldn’t be excessive.
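Preparing the Q&A pairs is the part that carries the pedagogy. One common interchange format is JSONL with input/output fields; the exact schema and file format depend on the tuning platform, so treat the field names below as an assumption. Note that the example output models tutoring behaviour (answering a question with a guiding question), not just correct content.

```python
import json

# Minimal sketch of preparing tuning examples. The "input"/"output" schema
# is an assumption; check the target platform's required format.
examples = [
    {
        "input": "A student asks: why use the Goodman line instead of "
                 "the yield line for fatigue?",
        "output": "Good question. Before I answer: what kind of loading "
                  "does the Goodman criterion describe, static or fluctuating?",
    },
    # ... roughly 50 pairs per microcompetency, ~500 in total
]

with open("tutor_tuning_data.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```

Writing the dataset as one JSON object per line keeps it easy to review, version, and extend competency by competency.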
The Risk of Hallucination
The main downside, besides cost, is the risk of hallucination. If the model encounters a question outside its training distribution, it may fabricate a plausible-sounding but incorrect answer. This is a well-known limitation of current LLMs. It may be guarded against by running a parallel truth check on the LLM’s answers, querying a knowledge base structured around ontologies, text, and functional tools.
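The parallel truth check could be sketched as follows. Here the knowledge base is a toy dictionary of vetted facts and retrieval is naive keyword matching; a real system would query an ontology-backed store and use proper retrieval. An empty result means no vetted support was found, so the answer is flagged rather than shown.

```python
# Toy knowledge base of vetted facts; an assumption for illustration only.
KNOWLEDGE_BASE = {
    "goodman": "The Goodman line relates mean and alternating stress to "
               "the ultimate strength for fatigue design.",
    "endurance limit": "Steels typically exhibit an endurance limit near "
                       "half the ultimate tensile strength.",
}

def truth_check(answer: str) -> list:
    """Return KB entries whose key terms appear in the model's answer.

    An empty list means the answer has no vetted support and should be
    routed to review instead of being shown to the student.
    """
    return [fact for term, fact in KNOWLEDGE_BASE.items()
            if term in answer.lower()]

support = truth_check("Use the Goodman criterion for fluctuating loads.")
```

The design choice worth noting is that the check runs beside the LLM rather than inside it: the tutor's fluency comes from the tuned model, while the veto power stays with curated content.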
Conclusion
An effective AI tutor is more than an oracle that provides the right answers to student queries. It must also be trained in the art of teaching.
Paid subscribers can download the source code from GitHub.