Remix clone Hacker News

somethingsome 4 days ago

Hey! Seems like a nice job. Do you mind if I ask which company, and whether you found any interesting references on the subject?

kamranjon 3 days ago

Hi! I can't really share the company, but I do love the space and am happy to discuss what I've been reading.

So the idea of Knowledge Tracing originated, from my understanding, with a paper in 1994: http://act-r.psy.cmu.edu/wordpress/wp-content/uploads/2012/1... This sort of introduced the idea that you could model and understand a student's learning as they progress through a set of materials.
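
To give a feel for what "tracing" means in practice, here is a minimal sketch of the update usually described as Bayesian Knowledge Tracing, a formulation commonly associated with this line of work. The function names and the parameter values (p_slip, p_guess, p_learn, p_init) are illustrative assumptions, not numbers from the paper.

```python
def bkt_update(p_known, correct, p_slip=0.1, p_guess=0.2, p_learn=0.15):
    """Return the updated probability that the student knows the skill.

    All default parameter values are made up for illustration.
    """
    if correct:
        # A correct answer comes from knowing and not slipping, or from guessing.
        evidence = p_known * (1 - p_slip)
        posterior = evidence / (evidence + (1 - p_known) * p_guess)
    else:
        # A wrong answer comes from knowing but slipping, or from a failed guess.
        evidence = p_known * p_slip
        posterior = evidence / (evidence + (1 - p_known) * (1 - p_guess))
    # The student may also learn the skill at this practice opportunity.
    return posterior + (1 - posterior) * p_learn


def trace(responses, p_init=0.3, **params):
    """Run the update over a sequence of True/False responses."""
    p = p_init
    history = []
    for correct in responses:
        p = bkt_update(p, correct, **params)
        history.append(p)
    return history


if __name__ == "__main__":
    for step, p in enumerate(trace([True, False, True, True]), start=1):
        print(f"after attempt {step}: P(known) = {p:.3f}")
```

The estimate goes up after a correct answer and down after a wrong one, and the p_learn term nudges it upward at every practice opportunity.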

The concept of Knowledge Components was started, I believe, at Carnegie Mellon and the University of Pittsburgh with LearnLab: https://learnlab.org/learnlab-research/ - in 2012 they authored a paper defining the KLI (Knowledge-Learning-Instruction) framework: https://pact.cs.cmu.edu/pubs/KLI-KoedingerCorbettPerfetti201... which provided the groundwork for the concept of Knowledge Components.

This sort of kicked things off with regard to really studying these things at a finer-grained level. They have a wiki which covers some key concepts: https://learnlab.org/wiki/index.php?title=Main_Page like the Knowledge Component: https://learnlab.org/wiki/index.php?title=Knowledge_componen...

Going forward a few years, you have a Stanford paper, Deep Knowledge Tracing (DKT): https://stanford.edu/~cpiech/bio/papers/deepKnowledgeTracing... which delves into using RNNs (recurrent neural networks) to aid in the task of modelling student knowledge over time.

Jumping really far forward to 2024, we have another paper from Carnegie Mellon & University of Pittsburgh, Automated Generation and Tagging of Knowledge Components from Multiple-Choice Questions: https://arxiv.org/pdf/2405.20526 and a very similar paper from Switzerland that I really enjoyed, Using Large Multimodal Models to Extract Knowledge Components for Knowledge Tracing from Multimedia Question Information: https://arxiv.org/pdf/2409.20167

Overall, the concept I've been sort of gathering is that if you can break down the skills involved into smaller and smaller tasks, you can make much more intelligent decisions about what is best for the student.

The other thing I've been gathering is that Skills Taxonomies are only useful insofar as they help you make decisions about students. If you build a very rigid Taxonomy that is unable to accommodate change, you can't easily adapt to new course material or make dynamic decisions about students. So the idea of a rigid Taxonomy is quickly becoming outdated. Large language models are being used to generate fine-grained skills (Knowledge Components) from existing course material, to help model a student's development based on performance in a way that can be easily updated when materials change.

I have worked through and replicated some of the findings in these later papers using local models, for example using the open Gemma 2 27B model from Google to generate Knowledge Components, then using sentence embedding models and K-means clustering to group related Knowledge Components together. It's been a really fun project and I've been learning quite a bit.
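
The clustering step looks roughly like this. This is only a sketch and not the exact pipeline: the Knowledge Component labels are invented, the hand-picked 2-D vectors stand in for real sentence embeddings (which would come from an embedding model and have hundreds of dimensions), and the K-means here is a small pure-Python version rather than a library call.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain K-means; returns one cluster index per input point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct input points
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments


def group_components(labels, vectors, k):
    """Group Knowledge Component labels by the cluster of their vector."""
    groups = {}
    for label, cluster in zip(labels, kmeans(vectors, k)):
        groups.setdefault(cluster, []).append(label)
    return groups


# Hypothetical Knowledge Components with toy stand-in "embeddings".
LABELS = [
    "add fractions with like denominators",
    "add fractions with unlike denominators",
    "simplify a fraction",
    "identify the subject of a sentence",
    "identify the verb of a sentence",
    "identify the object of a sentence",
]
VECTORS = [
    (0.90, 0.10), (0.80, 0.20), (0.85, 0.05),
    (0.10, 0.90), (0.20, 0.80), (0.15, 0.95),
]

if __name__ == "__main__":
    for cluster, members in group_components(LABELS, VECTORS, k=2).items():
        print(cluster, members)
```

With real embeddings the number of clusters is something you have to tune, and the groups are only as good as the generated Knowledge Component descriptions.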

	somethingsome 3 days ago
		Thank you! I've had a similar idea for a long time and I'm interested in developing it, but I never found the time to dig deeper. With those references I can get a head start on the subject and refine it. It's nice to know I'm not the only one thinking about this. The trick for me is that each student follows a path in a graph, so even if some component is weaker for one student, they can fill the gap by taking another route. A good framework would be resilient: it would find many possible paths to the same result rather than forcing a single one. But then, teaching this way is more difficult.
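
A rough sketch of that "many paths" idea, assuming the Knowledge Components form a directed graph with no cycles. The graph, the mastery scores, and the 0.5 threshold are all made up for illustration.

```python
def all_paths(graph, start, goal, path=None):
    """Enumerate every route from start to goal in a directed graph."""
    path = (path or []) + [start]
    if start == goal:
        return [path]
    paths = []
    for nxt in graph.get(start, []):
        if nxt not in path:  # guard against revisiting a node
            paths.extend(all_paths(graph, nxt, goal, path))
    return paths


def viable_paths(graph, start, goal, mastery, threshold=0.5):
    """Keep only routes whose intermediate components the student is strong in.

    Components missing from `mastery` are treated as not mastered (0.0).
    """
    return [
        p
        for p in all_paths(graph, start, goal)
        if all(mastery.get(kc, 0.0) >= threshold for kc in p[1:-1])
    ]


# Hypothetical component graph: two different routes lead to "proportions".
GRAPH = {
    "start": ["fractions", "ratios"],
    "fractions": ["proportions"],
    "ratios": ["proportions"],
    "proportions": [],
}
# Hypothetical mastery estimates for one student.
MASTERY = {"fractions": 0.2, "ratios": 0.8}

if __name__ == "__main__":
    print(all_paths(GRAPH, "start", "proportions"))
    print(viable_paths(GRAPH, "start", "proportions", MASTERY))
```

A student who is weak in one component still has a route to the goal as long as at least one path survives the filter, which is the resilience property described above.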