OpenAI will take risks into account at the model design stage.

On December 18, 2023, OpenAI unveiled the beta version of its Preparedness Framework, a program aimed at improving the safety of its advanced AI models. In particular, it provides for risk prevention as early as the model design stage.

In concrete terms, OpenAI will regularly assess its models under development against four risk categories: cybersecurity, model autonomy, persuasion, and CBRN (chemical, biological, radiological and nuclear).

OpenAI will only roll out models that score “low” or “medium” on all four criteria. A “critical” rating will trigger an immediate halt to development. In the event of a “high” rating, developers will have to rework the model to reduce the risk before rollout.
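The decision rule boils down to a threshold check on the worst category score. The Python sketch below is purely illustrative: the Risk enum, the category keys and the evaluate function are this article's own assumptions, not OpenAI's actual tooling.

```python
from enum import IntEnum

class Risk(IntEnum):
    """Risk levels ordered from least to most severe."""
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3

# The four categories tracked by the Preparedness Framework.
CATEGORIES = ("cybersecurity", "model_autonomy", "persuasion", "cbrn")

def evaluate(scores: dict[str, Risk]) -> str:
    """Map one score per category to the framework's deployment decision."""
    worst = max(scores[c] for c in CATEGORIES)
    if worst is Risk.CRITICAL:
        return "halt development immediately"
    if worst is Risk.HIGH:
        return "rework model to reduce risk before rollout"
    return "eligible for rollout"  # all scores are LOW or MEDIUM

# Example: one "high" score is enough to block deployment.
scores = {"cybersecurity": Risk.MEDIUM, "model_autonomy": Risk.LOW,
          "persuasion": Risk.HIGH, "cbrn": Risk.LOW}
print(evaluate(scores))  # -> rework model to reduce risk before rollout
```

Note that the gate is driven by the single worst score: a model excelling in three categories is still blocked if the fourth is rated “high” or “critical”.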

At the same time, OpenAI will restructure its internal safety decision-making. An advisory group will review the reports of the safety assessment team before passing them on to leadership and the board. Leadership will still make operational decisions, but the board will now have the authority to overturn them if necessary.

The Preparedness Framework also provides for:

  • the development of a protocol to improve model safety;
  • a partnership between OpenAI teams and independent researchers to assess “misuses” of generative AI “in the real world”;
  • the ongoing study of risks, and of how they evolve, throughout the rollout of AI models.