niemandhier 15 hours ago
Is there explainability research for this type of model application, e.g. with a sparse autoencoder or something similar but more modern? I would love to know which concepts are active in the deeper layers of the model while it generates the solution. Is there a concept of “epsilon” or “delta”? What are their projections onto each other?