The widespread adoption of generative AI is making fake news more likely. To help verify images shared online, a data storage and tracking model is under study. InCyber News explains how it works and the challenges involved in applying it.

While the effect of images, and shocking photos in particular, on people’s opinions is well known, the impact of generative artificial intelligence is raising concerns. How can we make sure that artificial images are not used in a campaign to manipulate public opinion? More generally, how true is an image? Doubt about the authenticity of photos weakens trust in the media.

Generative AI is giving rise to new threats. Deepfakes, some of which are incredibly realistic, remind us how urgent it is to protect the reputations of those whose images may be used in this way. Another concern is the use of copyrighted images to train the algorithms behind these AI models, to the detriment of their creators.

Research is underway to provide solutions. One NGO, the Content Authenticity Initiative (CAI), founded by software publisher Adobe, created an "end-to-end" traceability model in 2020. In February 2021, the CAI partnered with the Project Origin Alliance, a coalition of companies created by the BBC to combat disinformation, to found a non-profit called the Coalition for Content Provenance and Authenticity (C2PA).

The C2PA aims to bring together experts from member organizations to set technical standards for hardware manufacturers, software publishers and the press. The standard was first published in 2021, and an updated version, 1.4, was released in late November 2023.

Keeping track of the source of content

C2PA offers a model that stores and encrypts information about an image’s source, such as the dates when it was created, processed and uploaded online. This information allows fact checkers to learn a piece of content’s origin and context. It also lets them distinguish raw footage from images that have been processed ("derived assets") or from photomontages ("composed assets") made from other pictures.

Retracing the steps of an image’s creation and modification allows us to compare the original version of a photo with the version touched up before publication. It also allows us to separate photos of an actual event from images conjured out of the AI ether.

Each photo will have an "assertion", a set of data with its creation date and location (obtained via satellite), its author, the format used and the date of any changes made. Assertions can be added by the device used to capture the image, by editing software such as Photoshop, Adobe Lightroom and Firefly, or by the content management system (CMS) used to upload it. In each of these three cases, the data will specify any changes that may have been made. The author then electronically signs these assertions in a record called a "claim". This signature must be validated for the document to be considered trustworthy.
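To make the idea concrete, here is a minimal sketch of what an assertion set and a signed claim could look like. It is only an illustration: the field names and JSON layout are assumptions, and the Ed25519 key stands in for the author’s signing credential; the actual C2PA serialization is different.

```python
# Illustrative sketch only: field names and layout are assumptions,
# not the official C2PA serialization.
import json
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Assertions recorded by the capture device, the editing software and the CMS.
assertions = [
    {"label": "capture", "when": "2024-03-01T09:12:00Z",
     "where": "48.8566,2.3522", "author": "Jane Doe", "format": "JPEG"},
    {"label": "edit", "tool": "Adobe Lightroom",
     "actions": ["crop", "exposure_adjustment"]},
    {"label": "publish", "tool": "newsroom CMS", "when": "2024-03-01T11:40:00Z"},
]

# The "claim" gathers the assertions; the author signs it so that any later
# tampering with the provenance data invalidates the signature.
claim_bytes = json.dumps({"assertions": assertions}, sort_keys=True).encode()
signing_key = Ed25519PrivateKey.generate()   # stands in for the author's credential
signature = signing_key.sign(claim_bytes)

# A verifier re-checks the signature with the author's public key;
# verify() raises an exception if the claim was altered.
signing_key.public_key().verify(signature, claim_bytes)
print("claim signature verified")
```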

In the merchant navy, a manifest was a document listing the contents and condition of a cargo at each port of call. The C2PA uses this term to refer to the register in which all the information about an asset’s origin is available. The manifest is then incorporated into the image, whatever its format (JPEG, PDF, etc.), inside a manifest store.

This is a container file used to store a variety of data and metadata. The file is protected against falsification with an encryption key. A visitor who wants to see the information in the manifest can click on the "Content Credentials" icon in the top right corner of the image.
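As a rough illustration of the container idea, the sketch below models a manifest store as a simple dictionary holding one or more manifests and a pointer to the active one, which is what a "Content Credentials" panel would surface. The structure and names are assumptions for illustration; the standard defines its own binary container format embedded in the image file.

```python
# Hypothetical, simplified picture of a manifest store; the real standard
# uses a binary container embedded in the image file, not plain JSON.
manifest_store = {
    "manifests": {
        "manifest.1": {   # recorded when the photo was captured and edited
            "claim": {"assertions": ["capture", "edit"]},
            "signature": "<author signature over the claim>",
        },
        "manifest.2": {   # added when a derived asset is produced and published
            "claim": {"assertions": ["publish"], "parent": "manifest.1"},
            "signature": "<publisher signature over the claim>",
        },
    },
    "active_manifest": "manifest.2",
}

def show_content_credentials(store: dict) -> dict:
    """Return the manifest a viewer would display when the icon is clicked."""
    return store["manifests"][store["active_manifest"]]

print(show_content_credentials(manifest_store)["claim"])
```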

The C2PA model also relies on content binding. A cryptographic algorithm processes the image’s pixels, metadata and C2PA manifest to produce a unique fingerprint. This fingerprint shows which version of a document was published and confirms that no changes have been made since.
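Here is a minimal sketch of that binding step, assuming a SHA-256 hash over the asset bytes and the manifest serves as the fingerprint; the real specification defines its own hashing scope and exclusion ranges.

```python
# Minimal sketch of a content binding check; SHA-256 and the function name
# are assumptions, not the exact mechanism defined by the specification.
import hashlib

def content_binding(asset_bytes: bytes, manifest_bytes: bytes) -> str:
    """Derive a fingerprint tying the pixels/metadata to their manifest."""
    digest = hashlib.sha256()
    digest.update(asset_bytes)
    digest.update(manifest_bytes)
    return digest.hexdigest()

# At publication time, the fingerprint is recorded alongside the manifest.
recorded = content_binding(b"<image bytes>", b"<manifest bytes>")

# A fact checker recomputes it on the published file: any change to the pixels,
# the metadata or the manifest yields a different value.
assert content_binding(b"<image bytes>", b"<manifest bytes>") == recorded
assert content_binding(b"<altered image bytes>", b"<manifest bytes>") != recorded
print("binding intact:", recorded[:16], "...")
```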

The other challenge of C2PA: applying it

C2PA’s work is not just technical. The organization works with NGOs like Witness, which aims to use video and technology to protect and defend human rights. The goal is to anticipate the negative externalities or challenges that might be caused by applying this model.

What should be done if a photograph reveals the identity of interpreters or "fixers" who helped journalists in a war zone? Could this model unintentionally exclude organizations without the means to adopt it? An analysis framework, available on the C2PA website, anticipates these situations.

These updated specifications come as companies specializing in AI systems announced, at the request of US President Joe Biden in July 2023, voluntary commitments to make this technology "safe and trustworthy". Among these commitments is the development of mechanisms to indicate whether an image was generated by AI.

C2PA seems ideally suited to this objective. One demonstration of this model’s relevance is that, since the beginning of 2023, CAI’s membership has increased by half to reach 1,500 members, including news agencies such as Reuters and AFP, electronics manufacturers like Canon and Nikon, and AI specialists such as stability.ai and smartly.io.

To be effective, the C2PA’s specifications must be adopted across the board. The players involved in all stages of image creation and editing must be included in this model. The first CAI model, developed in 2022, was made possible by a collaboration between Adobe, US chipmaker Qualcomm and digital image verification specialist Truepic. A number of companies are committed to making these standards widespread. Camera manufacturers Nikon, Leica and Sony have announced that they will implement them in the cameras used by photojournalists.

In May 2023, Microsoft Chairman Satya Nadella announced that these specifications would be used to track creations generated by the Bing Image Creator software. Not all players are on board yet, however, and some important names are missing. While Google tracks the metadata from AI-created images, it only does so with the metadata used by the International Press Telecommunication Council. Elon Musk’s arrival at the helm of Twitter (renamed X) put an end to the social network’s involvement in the project.

More than a lack of adoption by the major platforms, this model’s main risk of failure is a misunderstanding on the part of the public. C2PA is not a cure-all for determining whether an image has been "fabricated". It simply indicates whether all the information about a piece of content’s origin and use is available at a given moment.
