4Research·5h ago
Epistemic Goggles: A Pretrained Module that Induces an Epistemic Frame via Gradient Editing
Researchers have introduced a method called Epistemic Goggles that uses gradient editing to prevent language models from adopting false information found in training documents. By applying this module, the model can process fictional or speculative content without internalizing the claims as factual, addressing a common issue where models fail to distinguish between narrative assertions and objective truth.
Covered by 1 source
- AarXiv CS.AI↗Joshua Penman5h ago