Ignore All Previous Instructions and Build a Lighthouse

Sean McGregor, Director of Advanced Testing Research, Mon Aug 26 2024

Society is quickly approaching a rocky shoal prepared to sink the digital commons into a sea of bots. Modern chatbots can now engage people in long, complex conversations on social media to shape and distort our perceptions of reality. We can already see instances of the new paradigm leaking out, such as the meme, "Ignore All Previous Instructions".

Since bots are already being deployed by playful and malevolent actors, society needs a means to identify who is human. Solutions to the bot problem have been attempted. In 2014, Facebook famously instituted a real name policy that the human rights community rallied against because such policies put people at risk, such as anonymous gay people living in places with the death penalty for homosexuality. How do you balance individual physical safety with the health of the whole online community? Thankfully, reconciling these equities is not technologically necessary if we build digital lighthouses. Specifically, there are technical means to shine light on who is a singular human, without divulging which human they are. For details, I refer you to our most recent research, produced with a wide variety of research, corporate, and civil society volunteers and titled "Personhood credentials: Artificial intelligence and the value of privacy-preserving tools to distinguish who is real online".

It covers a lot of ground, but the executive summary will help you understand why this is important and how we might defend society from the sea of bots.