The BBC is blocking OpenAI data scraping but is open to AI-powered journalism

The BBC, the UK’s largest news organization, laid out principles it plans to follow as it evaluates the use of generative AI, including for journalism research and production, archiving, and “personalized experiences.”

In a blog post, BBC director of nations Rhodri Talfan Davies said the broadcaster believes the technology provides opportunities to deliver “more value to our audiences and society.” 

The three guiding principles are that the BBC will always act in the public’s best interests, prioritize talent and creativity by respecting the rights of artists, and be open and transparent about AI-made output.

The BBC said it will work with tech companies, other media organizations and regulators to safely develop generative AI and focus on maintaining trust in the news industry. 

“In the next few months, we will start a number of projects that explore the use of Gen AI in both what we make and how we work – taking a targeted approach in order to better understand both the opportunities and risks,” Davies said in the post. “These projects will assess how Gen AI could potentially support, complement or even transform BBC activity across a range of fields, including journalism research and production, content discovery and archive, and personalized experiences.”

In an email to The Verge, the company did not specify what these projects will involve.

But as the BBC works out how best to use generative AI, it has also blocked web crawlers from OpenAI and Common Crawl from accessing BBC websites. It joins CNN, The New York Times, Reuters, and other news organizations in preventing web crawlers from accessing their copyrighted material. Davies said the move is meant to “safeguard the interests of license fee payers” and that training AI models with BBC data without its permission is not in the public interest.
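For context (the BBC’s announcement doesn’t describe the mechanism): publishers that opt out of AI crawlers typically do so via the Robots Exclusion Protocol, disallowing the bots’ user agents, GPTBot for OpenAI and CCBot for Common Crawl, in their robots.txt file. The sketch below, using Python’s standard urllib.robotparser, simply checks whether those user agents are permitted to fetch a BBC page; the specific URL and the assumption that the block is expressed in robots.txt are for illustration only.

```python
# Minimal sketch: query a site's robots.txt to see whether AI crawlers are allowed.
# GPTBot is OpenAI's crawler user agent; CCBot is Common Crawl's.
import urllib.robotparser

rp = urllib.robotparser.RobotFileParser()
rp.set_url("https://www.bbc.co.uk/robots.txt")  # assumed location for illustration
rp.read()

for agent in ("GPTBot", "CCBot"):
    allowed = rp.can_fetch(agent, "https://www.bbc.co.uk/news")
    print(f"{agent} allowed: {allowed}")
```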
