[Image: A group of people in white coats looking at a computer screen]

One principle seeks disclosure of use of AI. This image was created with Designer, with provenance shared via a cryptographically signed manifest, in accordance with the Content Credentials standard (C2PA).


Protecting Scientific Integrity in an Age of Generative AI



May 21, 2024  |  Eric Horvitz - Chief Scientific Officer, Microsoft


I enjoyed collaborating with a diverse team of scientists on a set of aspirational principles aimed at “Protecting Scientific Integrity in an Age of Generative AI,” now published in the Proceedings of the National Academy of Sciences. The principles are jointly issued by experts from various fields, focusing on human accountability and responsibility when using AI for scientific research. The guidance was formulated in a set of convenings co-sponsored by the National Academy of Sciences and the Annenberg Foundation Trust at Sunnylands. Our goal was to outline steps forward for maintaining the norms and expectations of scientific integrity while embracing AI's transformative potential.

Our recommendations include (1) transparent disclosure of uses of generative AI and accurate attribution of human and AI sources of information and ideas, (2) verification of AI-generated content and analyses, (3) documentation of AI-generated data and imagery, (4) attention to ethics and equity, and (5) continuous oversight and public engagement.

On continuous oversight and engagement, we propose the creation of a Strategic Council on the Responsible Use of AI in Science, hosted by the National Academies of Sciences, Engineering, and Medicine. This council would work with the scientific community to identify and respond to potential threats to scientific norms and rising ethical and societal concerns.

One of the principles emphasizes the necessity of labeling and disseminating information about the origins of data generated by AI systems. This is especially critical given AI's growing capability to produce synthetic data of diverse types and qualities. Clearly annotating and propagating the provenance of data, and differentiating AI-synthesized data and imagery from real-world observations, are increasingly important. Misinterpreting high-fidelity synthetic data as real-world observations can significantly compromise research integrity. Thus, clear documentation and transparent disclosure are essential to uphold the integrity and replicability of scientific work, protecting against the misuse or misinterpretation of AI-generated data.

We envision these principles as providing long-lasting, foundational guidance for the responsible use of AI in science. Here's the editorial. We invite feedback and discussion.
