This is probably late to answer this. I am also not sure if it would work, but what if you try inserting a manual cross-entropy function inside the forward pass… soft loss= -softlabel * log(hard label)
Glasp is a social web highlighter that people can highlight and organize quotes and thoughts from the web, and access other like-minded people’s learning.