When Kyle Van Houtan, the chief scientist at the Monterey Bay Aquarium, first began publishing research in the 1990s, he had to physically mail a typed manuscript in an envelope to a scientific journal, with a copy for each reviewer. It was a slower process, for a much smaller research world.
Now, with the proliferation of scientific journals, some of which are exclusively online, and most of which are behind paywalls, it has become much more difficult to access and analyze research. By mid-2018, there were over 33,000 active, peer-reviewed, English-language academic journals, and over 9,000 non-English-language journals, publishing over 3 million articles a year.
“Science works kind of like the legal profession, in that we build upon case evidence,” Van Houtan said. “To do that, we first have to understand case law, we have to understand the published evidence in front of us, and we have to understand the body of work in our discipline. And that’s nearly impossible now with the volume of work that’s coming out.”
This is especially important in low-margin-for-error fields like conservation biology, and for reintroduction programs for endangered species like giant pandas, rhinos, elephants, or in Van Houtan’s case, sea otters.
In the past two decades, the Aquarium has worked with hundreds of stranded otters, and between 2002 and 2016, it released 37 sea otter pups back into Elkhorn Slough, an estuarine reserve on Monterey Bay. These otters and their wild offspring now account for more than half of Elkhorn Slough’s otter population growth over the past 15 years.
Van Houtan knew there was a lot of information that could and should be mined to evaluate the program. But given the enormous number of potentially relevant articles, what was the most efficient and effective way to do it? How could they use evidence from species reintroductions worldwide and through history to make the fewest errors on this specific project, and have the best impact on the species as well as on coastal environments in California?
That was when the researchers turned to artificial intelligence as an efficient and thorough way to parse the data.
In a March article in Patterns, a peer-reviewed data science journal, Van Houtan and several colleagues wrote about using a machine learning process called sentiment analysis to make sense of this mountain of information, and derive best practices from previous research.
Sentiment analysis is a tool used by businesses to gauge customers’ reactions to products. The idea is fairly simple – assign a positive or negative value to words, and to the emotions expressed through them – then let a computer find these words to gauge how people feel about a question. Those surveys that ask about your experience, and whether it was fantastic, OK, bad, or terrible? That’s the beginning of sentiment analysis.
By assigning a numerical value to words or strings of words, the data suddenly becomes structured, and can now be interpreted more quickly. “What’s really powerful is when humans and artificial intelligence merge and we can curate and direct the machine learning models to answer questions that we know are important,” Van Houtan said.
For this proof of concept analysis, researchers looked at more than a million words of abstracts taken from 4,313 species reintroduction studies published between 1987 and 2016. Some of the words associated with positive sentiment in the study were success, protect, growth, support, recommend, strong, help, and benefit; words associated with failing included threaten, loss, risk, restrict, problem, kill, and conflict.
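The scoring idea can be sketched as a minimal lexicon-based scorer. The positive and negative word lists below come from the study as reported here; the scorer itself is an illustration, not one of the models the team actually used:

```python
# Minimal lexicon-based sentiment scorer. Word lists are from the article;
# the scoring scheme (positive count minus negative count) is illustrative.
import re

POSITIVE = {"success", "protect", "growth", "support",
            "recommend", "strong", "help", "benefit"}
NEGATIVE = {"threaten", "loss", "risk", "restrict",
            "problem", "kill", "conflict"}

def sentiment_score(abstract: str) -> int:
    """Count positive minus negative lexicon words in an abstract."""
    words = re.findall(r"[a-z]+", abstract.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

print(sentiment_score("Strong growth will help protect the population."))   # 4
print(sentiment_score("Conflict and habitat loss threaten the program."))   # -3
```

Real systems weight words rather than counting them equally and handle negation ("not successful"), which is part of what the off-the-shelf models add.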
The study then used five off-the-shelf models for sentiment analysis.
“While it may seem outlandish or far-fetched that words used for restaurant reviews or film reviews would work for us, our discipline of conservation biology uses a lot of terms which transfer over to other domains,” Van Houtan said. “’Success’ for someone walking down the street is the same thing as ‘success’ for someone operating a sea otter reintroduction program. That means your program worked as you intended.”
The sentiment analysis of the abstracts showed that we are getting better at reintroducing species to the wild, the researchers said. In the 1980s, the sentiment attached to reintroduction was in negative territory; today it is firmly positive, and the trend rises steadily over time.
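The trend claim amounts to averaging per-study scores by publication year and checking the direction of the curve. A sketch, using hypothetical (year, score) pairs rather than the study's actual data:

```python
# Average sentiment per publication year to see the overall drift.
# The (year, score) pairs below are hypothetical, for illustration only.
from collections import defaultdict
from statistics import mean

scored_abstracts = [(1988, -1.2), (1988, -0.4), (1995, 0.1),
                    (2005, 0.8), (2016, 1.5), (2016, 1.1)]

by_year = defaultdict(list)
for year, score in scored_abstracts:
    by_year[year].append(score)

yearly_means = {year: mean(scores) for year, scores in sorted(by_year.items())}
print(yearly_means)  # sentiment climbs from negative toward positive
```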
It also shows that certain choices for tracking and monitoring progress are more effective than others. For example, adaptive management – an evidence-based, iterative process of reviewing incoming data and adjusting a program’s design while the study is still under way – consistently showed a positive signal, meaning a reintroduction program that used this method was more likely to be successful.
“What this has immediately shown us is that we need to review all our otter reintroduction data from the past 20 years in as many different ways as possible,” Van Houtan said, “to see what we can learn from it and how we can course correct our program so that it can be the most successful.”
Another concern when working with endangered species is genetics. To avoid inbreeding (which reduces chances of survival in the wild), scientists have to track genetic diversity. In this study, when two ways of tracking genetic diversity were compared, a newer marker type called single nucleotide polymorphisms showed very positive results, while an older method called microsatellites was less successful.
Of course, conservationists often already know this about management methods and tools. But artificial intelligence is helping to confirm it on a much larger scale, and for many more species.
The analysis also showed the researchers a blind spot they might have when it comes to species and ecosystems they aren’t working on. “We may not be paying attention to the successes with the giant panda in China because we have our noses down here in Central California and sea otters and we are really focused on that,” Van Houtan said. “But we could perhaps transfer lessons learned there to improve our success here, and our study identified this.”
Van Houtan said he thinks one of the most promising parts of the AI approach is to use it to learn from other people’s mistakes. Trial and error is an essential part of the scientific method, but the mistakes don’t always have to be your own. “This means money can go where it needs to go, and more time can be spent where it needs to be spent,” he said.