For a long time, online survey research has been regarded as a foundational tool of modern social science, public policy analysis, and opinion polling. However, a new study published in the Proceedings of the National Academy of Sciences (PNAS) warns that this widely relied-upon method is now facing unprecedented challenges amid the rapid advancement of artificial intelligence—and may even be fundamentally undermined.

Led by Sean J. Westwood, Associate Professor of Government at Dartmouth College, the study finds that large language models (LLMs) have already acquired the ability to complete online surveys at scale while effectively “impersonating” human respondents, largely without detection. The authors argue that if misused, this capability constitutes a “potential existential threat” to online survey research, threatening not only the reliability of academic studies but also the foundations of opinion polling, commercial decision-making, and democratic governance.

When the “respondent” is no longer human: how AI passes every check

The effectiveness of online surveys as a pillar of research and policymaking rests on a seemingly simple but crucial assumption: each valid response corresponds to a real human individual. Westwood and his colleagues begin by directly challenging this assumption.

The research team developed an “autonomous synthetic respondent” based on large language models. Rather than merely generating text, the system was designed to function as a full-fledged survey participant. It can understand survey questions, respond according to specified demographic profiles, and produce answers that are logically consistent across items. At the same time, it simulates human-like response behavior—such as reading time and mouse movement patterns—allowing it to avoid detection as an automated program.

In a series of large-scale tests, the AI system was deployed in online survey environments commonly used in real research. It was subjected to multiple layers of screening, including attention checks, consistency tests, and so-called “anti-bot” or “reverse shibboleth” questions designed specifically to exclude machine-generated responses. The results were striking: across more than 6,000 tests, the system passed these checks at a rate of 99.8 percent. In practical terms, mainstream online survey platforms were almost entirely unable to distinguish AI-generated responses from those provided by human participants.

The researchers further note that these AI-generated answers closely resemble human samples in both statistical distribution and substantive content. Whether responding to attitude questions, value judgments, or more complex open-ended prompts, the AI produced answers that appeared “natural” and individualized, even exhibiting variation consistent with different social backgrounds. This suggests that traditional quality-control methods based on data cleaning and anomaly detection are rapidly losing effectiveness in the face of advanced AI systems.

From academic research to opinion polling: the growing risk of manipulation

The significance of this study lies not only in its demonstration of a new technical capability, but also in the real-world risks it highlights. Online surveys have long extended beyond academia, playing a central role in opinion polling, public policy evaluation, and market research. Once such data are systematically contaminated, the consequences go far beyond flawed academic conclusions.

Simulation results in the study show that in highly competitive polling scenarios—where margins are already narrow—only a small number of AI-generated responses are sufficient to alter overall outcomes. For example, in several nationwide political surveys examined by the researchers, the addition of just a few dozen AI responses with a clear directional bias was enough to shift results in a meaningful way. Because these AI-generated responses perform well according to standard statistical criteria, researchers have little reliable means of identifying or removing them.
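To make the arithmetic behind this concrete, here is a minimal Python sketch of the effect. The poll size and vote shares below are invented for illustration and are not figures from the study; the point is only that when a race is close, a few dozen uniformly biased responses are enough to change which side appears to lead.

```python
def poll_share(support_a: int, support_b: int) -> float:
    """Return candidate A's share of the two-candidate vote, in percent."""
    return 100.0 * support_a / (support_a + support_b)

# Hypothetical poll: 1,000 genuine respondents, candidate A narrowly
# ahead 508 to 492 (50.8% vs. 49.2%). These numbers are illustrative only.
genuine_a, genuine_b = 508, 492
print(f"Genuine sample only: A = {poll_share(genuine_a, genuine_b):.1f}%")

# Inject small blocks of synthetic responses, all favoring candidate B.
# A few dozen are enough to reverse the reported lead.
for fake_b in (10, 20, 40, 60):
    share_a = poll_share(genuine_a, genuine_b + fake_b)
    leader = "A" if share_a > 50.0 else "B"
    print(f"+{fake_b:>2} synthetic responses for B: "
          f"A = {share_a:.1f}%  -> reported leader: {leader}")
```

In this toy example the reported leader flips somewhere between 10 and 20 injected responses, well below one percent of the sample, which is why responses that pass standard quality checks individually can still distort the aggregate result.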

Even more concerning, such manipulation requires neither a large budget nor specialized computing resources. The study emphasizes that individuals or organizations with moderate technical expertise could already use existing language models to generate survey responses at scale. If deployed for political messaging, opinion manipulation, or commercial competition, online surveys would no longer be merely “imprecise,” but could become tools of deliberate distortion.

In the academic domain, the risks are equally serious. Fields such as psychology, sociology, and political science have become increasingly reliant on online samples, with some areas treating them as the default method of data collection. Once researchers can no longer verify that their samples consist of real human respondents, the credibility of the entire empirical research enterprise is called into question.

A crisis of data trust and methodological reflection in the AI era

Westwood emphasizes that the study does not seek to deny the historical value of online survey research. Rather, it serves as a warning that the technological and social conditions underpinning this method have fundamentally changed. For decades, researchers implicitly assumed that machines could not understand complex questions or generate coherent, human-like responses. That assumption no longer holds.

Scholars in research methodology and data science argue that this finding forces a broader reconsideration of data authenticity and research trust. If researchers cannot reliably determine who—or what—is answering their questions, even the most sophisticated statistical analyses may rest on unstable foundations. Relying solely on more elaborate screening questions or stricter logic checks may no longer be sufficient.

The study calls for solutions that bridge technology and the social sciences, including the introduction of new verification mechanisms based on behavioral patterns or biometric signals, a rethinking of survey design itself, and—in some high-risk contexts—a return to costlier but more trustworthy data collection methods. At the same time, the ethical and regulatory implications of AI-generated data urgently require broader public discussion and governance.
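As one illustration of what a behavioral-pattern check might look like in practice, the hypothetical Python sketch below flags responses whose per-item completion times are implausibly fast or implausibly uniform for a human reader. The thresholds and function name are invented for illustration, and the study's own findings suggest that timing heuristics alone are unlikely to suffice, precisely because synthetic respondents can simulate plausible reading behavior.

```python
from statistics import mean, pstdev

def flag_suspicious_timing(item_times_s, min_mean_s=2.0, min_cv=0.15):
    """Return reasons a response's per-item timings look non-human.

    item_times_s : seconds spent on each survey item
    min_mean_s   : minimum plausible average time per item (assumed threshold)
    min_cv       : minimum coefficient of variation; human timing varies widely
    """
    avg = mean(item_times_s)
    cv = pstdev(item_times_s) / avg if avg > 0 else 0.0
    reasons = []
    if avg < min_mean_s:
        reasons.append(f"mean item time {avg:.2f}s below {min_mean_s}s")
    if cv < min_cv:
        reasons.append(f"timing too uniform (CV {cv:.2f} < {min_cv})")
    return reasons  # empty list -> nothing flagged

# A burst of near-identical one-second answers is flagged;
# a human-like mix of fast and slow items is not.
print(flag_suspicious_timing([1.1, 1.0, 1.2, 1.1, 1.0]))
print(flag_suspicious_timing([4.2, 9.8, 3.1, 15.6, 5.0]))
```

Checks of this kind are cheap to add, but the study's results imply they would need to be combined with stronger verification (for example, identity- or provenance-based mechanisms) rather than relied on in isolation.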

As artificial intelligence becomes deeply embedded in social systems, this PNAS study raises a question that can no longer be avoided: when answers that look human may no longer come from humans, how can we continue to trust data, understand society, and make decisions on that basis? The answer to this question may profoundly shape the future of scientific research and public governance.

Reference:
S. J. Westwood, The potential existential threat of large language models to online survey research, Proceedings of the National Academy of Sciences of the United States of America, 122(47): e2518075122 (2025). https://doi.org/10.1073/pnas.2518075122
