This article is part of Tortoise’s pre-read content for the upcoming Tortoise AI Summit on Friday. It’s open to all.
More than one in ten coronavirus studies published since the pandemic began reference AI technology and techniques, Tortoise Intelligence has found, with academics in China and Hong Kong writing the most influential AI-based papers internationally.
Our analysis of 44,000 academic papers on the coronavirus family of viruses sheds new light on the ways artificial intelligence is being deployed by academics to tackle Covid-19. We’ve found modern data science techniques – from data mining to machine learning – are playing a far more prominent role in the research response to the current pandemic than in previous bouts of coronavirus, such as SARS in the 2000s.
Some studies in the CORD-19 dataset – compiled by the Allen Institute for AI to spur academic collaboration on coronavirus – only make passing reference to AI in their abstract or body, while most see no mention at all. But a growing minority are clearly making the technology core to their work. AI’s ability to automate processes, make predictions and recognise patterns in complex data is helping some researchers understand and respond to the virus more quickly and effectively than before.
This includes using AI to:
- Decode the structure of the Covid-19 virus and understand how it hijacks our cells
- Discover drugs to alleviate or cure the disease
- Identify patients most at risk – known technically as “triaging”
- Predict the location of the next outbreak
- Allocate scarce resources in health systems overwhelmed by the pandemic
Can AI help to find a cure?
We found that the share of coronavirus papers mentioning keywords related to machine learning, or to more general big data methods such as data mining, has more than tripled, from around three per cent in the early 2000s – around the time of the SARS outbreak – to around ten per cent now, with nearly 700 such papers published in the first few months of 2020.
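To make that measure concrete, here is a minimal sketch of how a keyword share of this kind could be computed. It is an illustration only, not Tortoise Intelligence's actual method: the keyword list, the function names and the toy records are all assumptions, since the article does not publish the exact terms or code used.

```python
import re
from collections import defaultdict

# Hypothetical keyword list: the exact terms used in the analysis are not published.
AI_KEYWORDS = [
    "machine learning",
    "deep learning",
    "neural network",
    "data mining",
    "artificial intelligence",
]
PATTERN = re.compile("|".join(re.escape(k) for k in AI_KEYWORDS), re.IGNORECASE)


def mentions_ai(text):
    """Return True if the text contains any of the assumed AI keywords."""
    return PATTERN.search(text) is not None


def ai_share_by_year(papers):
    """papers: iterable of (year, text) pairs. Returns {year: share mentioning AI}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for year, text in papers:
        totals[year] += 1
        if mentions_ai(text):
            hits[year] += 1
    return {year: hits[year] / totals[year] for year in totals}


# Invented toy records, not real CORD-19 data.
toy_papers = [
    (2003, "Clinical features of SARS patients"),
    (2003, "Data mining of SARS genome sequences"),
    (2003, "Transmission dynamics of SARS"),
    (2003, "Serological survey of a coronavirus"),
    (2020, "A neural network forecast of Covid-19 spread"),
    (2020, "Machine learning for patient triage"),
    (2020, "Hospital capacity during the pandemic"),
    (2020, "Clinical course of Covid-19"),
]

shares = ai_share_by_year(toy_papers)
```

A real analysis would also need to decide whether to search abstracts only or full text, and how to treat ambiguous terms, choices the article does not describe.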
US- and China-based academics have been the most prolific publishers of AI-related coronavirus research during the pandemic, publishing at least 261 papers since the beginning of this year. Together, the two superpowers account for over a third of the AI-related coronavirus papers we identified.
Though the US has now published the most papers on the topic since January, we found that China’s have been referenced more than twice as often by the other studies in the CORD-19 dataset. The nation’s academics, along with counterparts from Hong Kong, were behind the AI-related paper that has been cited most around the world: a study published on 28 January that used a neural network to help establish similarities between the coronavirus discovered in Wuhan and the one that infects bats.
These academics were relying on a school of techniques known as “bioinformatics”, which is used to analyse and understand complex biological data like genetic codes. AI’s integration into the field has accelerated progress – Chinese researchers were able to recreate the genome sequence of Covid-19 in a month using AI, whereas scientists using more traditional methods took several months to do the same for the SARS virus in 2003. Meanwhile, in the UK, bioinformatics is being deployed to reveal the structure of the Covid-19 virus by Google-owned AI company DeepMind, whose founder, Demis Hassabis, sits on the government’s scientific advisory board.
Our analysis of the CORD-19 dataset of coronavirus papers also reveals the extent to which academics around the world are collaborating. By identifying the university affiliations of the authors behind each study, Tortoise Intelligence has been able to map cross-border collaboration on AI and coronavirus research, revealing how China and the US sit at the heart of a global network of academics.
How the world is using AI to fight coronavirus
Countries are linked if their academics have jointly published a paper on coronavirus that mentions AI techniques. They’re sized by the total number of papers they’ve published.
AI papers whose authors span two or more countries represent 16 per cent of the total. American academics have been particularly prolific collaborators: we estimate they have worked with overseas counterparts from 24 different countries since the beginning of 2020. Academics in China have already worked with counterparts from 14 different countries. Over 60 per cent of the cross-border papers have at least one author from the US or China.
Despite their country’s smaller size relative to the superpowers, British academics are proving highly collaborative, matching China’s level of collaboration, with 14 countries. The UK produced the fourth highest number of AI papers in our data – joint with Italy and below India – and has one of the lowest “clustering coefficients” in our network analysis of the papers, suggesting the nation is firmly embedded in the international community researching AI and coronavirus.
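For readers curious about the network measures mentioned above, the sketch below shows one way they could be derived. It assumes each paper has been reduced to the set of countries its authors are affiliated with, and it uses the standard local clustering coefficient for an unweighted, undirected graph. The article does not say which definition or software was used, so the function names and the toy data here are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(papers):
    """Link two countries if they share at least one paper.

    papers: iterable of sets of country codes, one set per paper.
    Returns {country: set of collaborating countries}.
    """
    neighbours = defaultdict(set)
    for countries in papers:
        for a, b in combinations(sorted(set(countries)), 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    return neighbours


def clustering_coefficient(neighbours, node):
    """Share of a country's partner pairs that also collaborate with each other."""
    nbrs = neighbours.get(node, set())
    k = len(nbrs)
    if k < 2:
        return 0.0  # undefined for fewer than two partners; treated as 0 here
    links = sum(1 for a, b in combinations(nbrs, 2) if b in neighbours[a])
    return 2 * links / (k * (k - 1))


# Invented toy data, not real CORD-19 affiliations.
toy_papers = [
    {"US", "CN"},
    {"US", "UK"},
    {"US", "IN"},
    {"CN", "UK"},
    {"IT"},  # single-country paper: adds no links
]

graph = build_graph(toy_papers)
partner_count = {country: len(partners) for country, partners in graph.items()}
cross_border_share = sum(1 for p in toy_papers if len(p) >= 2) / len(toy_papers)
us_clustering = clustering_coefficient(graph, "US")
```

In this toy network the US has three partners, only one pair of which (China and the UK) also work together directly, so its coefficient is one third. A low value means a country's partners seldom collaborate with one another directly.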
Other nations are deploying AI to a much higher degree in their research than these key players. While only 11 per cent of American papers, 9 per cent of Chinese papers and 8 per cent of British papers published since the pandemic began reference AI, the technology appears in 18 per cent of papers by academics in India and 20 per cent of papers by those in Taiwan. Researchers from the latter country, which has a dedicated minister for AI, have even used bioinformatics to decode the forms of coronavirus that infect cats.
What’s holding AI back?
But our findings also suggest that, in general, the technology remains on the periphery of the field – with around 90 per cent of papers not mentioning AI explicitly. We’ve also found that, on average, papers mentioning AI receive fewer citations than those that don’t.
For some academics, a lack of data is what’s holding AI back – which perhaps explains the tendency for researchers to collaborate so extensively.
“AI models learn by example: the more data we can provide to them, the better they learn,” says Enzo Tartaglione, a researcher at the University of Turin in northern Italy. Last month, he authored a study on the use of AI to detect Covid-19 in chest X-rays and found that, despite accurate results in the lab, the models lacked enough training images to be generalisable.
“There are encouraging results showing that the more we have to train an AI model, the better it becomes,” he continues.
“However, the data currently publicly available is, in general, not sufficient to properly train a model.
“Sharing data in this crisis will be the key to moving forward. We still have a long way to move on, but it is already there, we just need to work all together.”
But Mihaela van der Schaar, a professor of machine learning, artificial intelligence and medicine at the University of Cambridge, stresses that the technology’s power may mean it can overcome these challenges.
“Technology is not a limiting factor here. There are many state-of-the-art machine learning methods that can be applied to great effect to help respond to crises such as the current pandemic,” she says.
“On the face of it, data availability might appear to be a limiting factor, since Covid-19 is a disease we still know little about and we do not have historical datasets to work with. Additionally, data availability, collection methods and digitisation vary from country to country.
“But we need to bear in mind that, firstly, many countries do have high-quality, centralised health record databases and, secondly, models trained with even the relatively limited amount of available data can make highly accurate predictions compared to existing statistical methodologies.”
This ability of AI to work with smaller datasets was demonstrated by academics in China, Spain and India last month. The researchers ran a simulation to show how a neural network could accurately forecast the spread of coronavirus by using only the small number of data points available right at the beginning of the pandemic – a blessing, given how vital a quick response can be.
Such advancements in so-called “outbreak science” are also what allowed an AI platform developed by Canadian health monitoring firm BlueDot to pick up on a cluster of “unusual pneumonia” cases happening around a market in Wuhan on New Year’s Eve. The AI spotted the anomaly while trawling through international news reports and alerted the platform’s clients, before going on to correctly predict, from airline ticketing data, that the virus would jump from Wuhan to Bangkok, Seoul, Taipei and Tokyo in the days following its initial appearance.
Does AI work in practice?
Professor van der Schaar went on to say that a chief challenge facing AI researchers is helping doctors and nurses to understand and use their models. Her team at Cambridge is working with Public Health England and NHS Digital on a project to use machine learning to help hospitals with capacity planning. Their predictive system, Cambridge Adjutorium, provides hospital managers with forward guidance about usage of scarce resources such as ventilators and ICU beds, in the hope that it can help the NHS to withstand any potential increase in Covid-19 cases.
“In my area of work, the real limitations stem from the difficulty in making machine learning models that are truly useful to healthcare professionals,” she says.
“This is a particular area of focus for my group, since we work extensively with clinicians and are guided by their input and advice. Too few AI and machine learning models are able to provide ‘interpretable’ and actionable information that tells users not only that a certain prediction was made, but why and with what degree of certainty.
“This is something we’ve been able to do with Adjutorium, but it’s unfortunately not a common consideration within the machine learning community, and it’s not something that will change overnight, even with the catalytic effect of Covid-19.
“Since the machine learning community in general is not fully able to provide healthcare professionals with models that are user-friendly, interpretable and work out of the box, it’s not surprising that healthcare professionals will default to more familiar statistical models, even if these are less accurate and prone to assumptions.”
These concerns were echoed by Dr Tartaglione.
He says: “We are still very far from having a fully-automated and trusted AI which takes care of the full diagnostic procedure.
“It can be very useful in the right hands, but it has limits which can not be simply overcome by using more computational power.”