The ODI’s latest white paper, ‘Building a better future with data and AI’, is based on research carried out by the Institute in the first half of 2024. It identifies significant weaknesses in the UK’s tech infrastructure that threaten the predicted potential gains – for people, society, and the economy – from the AI boom. It also outlines the ODI’s recommendations for creating diverse, fair data-centric AI.
Based on its research, the ODI is calling for the new government to take five actions that will allow the UK to benefit from the opportunities presented by AI while mitigating potential harms:
1. Ensure broad access to high-quality, well-governed public and private sector data to foster a diverse, competitive AI market;
2. Enforce data protection and labour rights in the data supply chain;
3. Empower people to have more of a say in the sharing and use of data for AI;
4. Update our intellectual property regime to ensure AI models are trained in ways that prioritise trust and empowerment of stakeholders;
5. Increase transparency around the data used to train high-risk AI models.
The ODI’s white paper argues that emerging AI technologies show great promise to transform industries such as diagnostics and personalised education. Yet significant challenges and risks are attached to widespread adoption, including – in the case of generative AI – reliance on a handful of machine learning datasets that ODI research has shown lack robust governance frameworks. This poses significant risks to both adoption and deployment, as inadequate data governance can lead to biases and unethical practices, undermining the trust and reliability of AI applications in critical areas such as healthcare, finance, and public services. These risks are exacerbated by a lack of transparency that is hampering efforts to address biases, remove harmful content, and ensure compliance with legal standards. To provide a clearer picture of how data transparency varies across different types of system providers, the ODI is developing a new ‘AI data transparency index’.
Sir Nigel Shadbolt, Executive Chair & Co-founder of the ODI, said, “If the UK is to benefit from the extraordinary opportunities presented by AI, the government must look beyond the hype and attend to the fundamentals of a robust data ecosystem built on sound governance and ethical foundations. We must build a trustworthy data infrastructure for AI because the feedstock of high-quality AI is high-quality data. The UK has the opportunity to build better data governance systems for AI that ensure we are best placed to take advantage of technological innovations and create economic and social value whilst guarding against potential risks.”
Before the General Election, Labour’s Manifesto outlined plans for a National Data Library to bring together existing research programmes and help deliver data-enabled public services. However, the ODI says that first, we need to ensure the data is AI-ready. As well as being accessible and trustworthy, data must meet agreed standards, which requires a data assurance and quality assessment infrastructure. The ODI’s recent research has found that currently – with a few exceptions – AI training datasets typically lack robust governance measures throughout the AI life cycle, posing safety, security, trust, and ethical challenges related to data protection and fair labour practices. These are issues that need to be addressed if the government is to make good on its plans.
Other insights from the ODI’s research include:
- The public needs safeguarding against the risk of personal data being used illegally to train AI models. Steps must be taken to address the ongoing risk of generative AI models inadvertently leaking personal data through clever prompting by users. Solid and other privacy-enhancing technologies have great potential to help protect people’s rights and privacy as AI systems become more prevalent.
- Key transparency information, such as data sources, copyright status, and the inclusion of personal information, is rarely provided for systems flagged in the Partnership on AI’s AI Incident Database.
- Intellectual property law must be urgently updated to protect the UK’s creative industries from unethical AI model training practices.
- Legislation safeguarding labour rights will be vital to the UK’s AI Safety agenda.
- The rising price of high-quality AI training data risks excluding potential innovators such as small businesses and academic researchers.
FAQs
What specific measures can the UK government implement to ensure broad access to high-quality, well-governed public and private sector data for AI development?
The UK government can implement several measures to ensure broad access to high-quality, well-governed data. Firstly, establishing national data standards and a robust data assurance framework can help ensure the quality and reliability of data. Secondly, creating a centralised data repository or a National Data Library, as proposed by Labour’s manifesto, can make data more accessible to researchers, businesses, and innovators. Thirdly, incentivising data sharing through grants or tax benefits for organisations that contribute high-quality data can foster a diverse and competitive AI market. Lastly, public-private partnerships can be encouraged to bridge gaps in data accessibility and governance, ensuring that data from various sectors is available for AI development.
How can individuals be empowered to have a greater say in the sharing and use of their data for AI purposes, and what tools or frameworks are necessary to facilitate this?
Individuals can be empowered through several approaches. Implementing stringent data protection regulations, such as the General Data Protection Regulation (GDPR), ensures individuals have control over their data. Developing user-friendly consent management platforms where individuals can easily grant or withdraw consent for their data use can also be effective. Additionally, introducing data trusts—legal structures that manage data on behalf of individuals—can provide a transparent and accountable mechanism for data sharing. Public awareness campaigns and education on data rights and privacy can further empower individuals to make informed decisions about their data. Tools that allow individuals to track how their data is used and by whom can also foster transparency and trust.
What are the potential challenges and solutions associated with updating the intellectual property regime to support ethical AI model training and protect stakeholders’ trust and rights?
Updating the intellectual property (IP) regime to support ethical AI model training presents several challenges. One major challenge is balancing the protection of creative works with the need for AI training datasets. To address this, the government could introduce specific provisions in IP law that allow for the use of copyrighted material in AI training under fair use or fair dealing exceptions, provided it is done ethically and transparently. Another challenge is ensuring that AI models do not infringe on existing IP rights. This can be mitigated by developing guidelines for the ethical use of data in AI training, including requirements for data provenance and the consent of data owners. Furthermore, establishing a regulatory body to oversee AI model training and enforce compliance with updated IP laws can ensure that stakeholders’ trust and rights are protected. Collaboration with international bodies can also help harmonise IP laws across borders, addressing the global nature of AI development.
