‘The best solution is to murder him in his sleep’: AI models can learn violent tendencies from each other despite zero references to violence in training data


Scientists say large language models (LLMs) are secretly teaching each other unwanted habits through benign training data.

This phenomenon, known as “subliminal learning”, occurs when previously trained “teacher” artificial intelligence (AI) models are used to generate training data for small “student” models.
