Field Reports

Can Artificial Intelligence Identify Species from Sound Alone? A North Bay Group is Trying

April 4, 2022

It is now a given that the health of an ecosystem can be measured by the abundance and diversity of the native organisms able to survive and thrive there – i.e., its level of biodiversity. The concept has even worked its way into official California state policy with the 2018 California Biodiversity Initiative.

But how do we actually measure and assess the biodiversity of a given region? There are a number of tools that we can turn to for answers—Christmas bird counts, bioblitzes, occasional surveys, and so on. But they are all rather labor-intensive, often requiring the time of experts already stretched thin. And they are episodic, a few hours here or there, or maybe once a year. What if there were a much less labor- and time-intensive way to assess the presence of species on a more regular basis, so we could accurately and efficiently track the evolution of an area’s biodiversity over time? And what if we could relate that to a variety of landscape factors that would enable us to predict the presence of species in an area or raise concerns about the absence of a species? And adjust our conservation priorities accordingly?


These are a few of the big questions addressed by the Soundscapes to Landscapes (S2L) initiative in Sonoma County, a joint project of Sonoma State University, Point Blue Conservation Science, Audubon California, Pepperwood Preserve, Sonoma Ag & Open Space, UC Merced, and Northern Arizona University, with the collaboration of many public and private landowners. The project involves an innovative combination of bioacoustics, citizen science, remote sensing, and artificial intelligence, with large helpings of ornithological expertise, creativity, and persistence. The crux of the project is the collection of natural sounds—specifically, bird vocalizations—from a wide range of sites across Sonoma County, utilizing community volunteers armed with innovative but inexpensive technology, and then processing the collected data through a series of sophisticated programs trained to recognize the patterns of individual species’ vocalizations. 

The principal investigator for the Soundscapes to Landscapes initiative is Matthew Clark, a professor of geography at Sonoma State University. He came up with the idea for S2L when he saw a NASA request for proposals for projects combining the use of remote sensing technologies and citizen science. While NASA is best known for studies of outer space, many of its projects include satellite-based instruments that study environmental conditions on our own planet, and the agency is keen to promote the use of these assets to improve those conditions. The funding for S2L comes from NASA’s Citizen Science for Earth Systems Program.

As Clark explains, “The main motivation behind the project is the biodiversity crisis we’re in.” He points to studies showing a 30 percent decline in bird abundance in the United States since the mid-1970s, and says, “We need lots more ground-level data if we’re going to scale up our understanding of biodiversity at the regional and broader levels,” and respond effectively. He had worked with NASA remote sensing data in the past and knew it could be a powerful tool when combined with citizen science-powered data collection on the ground. NASA approved the funding in 2016 and the project launched in 2017. Now, with the NASA-funded portion of the project coming to a close, Clark and project colleagues Rose Snyder, Leo Salas, and David Leland of Point Blue Conservation Science (the latter in a volunteer capacity) presented the team’s work and preliminary results in a recent series of virtual presentations.

An “AudioMoth” acoustic logger collects data for the Soundscapes to Landscapes program. (Photo by Kaitlin Magoon, Soundscapes to Landscapes)

What’s Out There? Collecting the Data

The tech-based starting point for the process—punching way above its weight—is a 2” by 2.5” green-clad circuit board called an AudioMoth, invented by professor Alex Rogers at Oxford University in England. Officially an “acoustic logger,” this small circuit board with a microphone runs on three AA batteries and can be set to record at predetermined intervals over a desired period of time. The whole kit costs less than $100 per unit, allowing S2L to deploy units to interested public and private landowners and managers across the county. To get the units into the field, the S2L team mobilized dozens of community volunteers.

In the first two “proof of concept” pilot years, 2017 and 2018, about 100 sites were accessed for recording, with many sites repeated so as to capture changes resulting from the major wildfires in the fall of 2017. The results were promising, so the project team ramped up to 400 sites in 2019. Unfortunately, the COVID lockdown hit just as the season was beginning in 2020, so it was only possible to access 200 sites. But the project picked up steam again in 2021, with some 500 sites covered. The devices were distributed to ensure coverage of a wide range of habitats, from urban backyards to conifer forests and riparian corridors, grasslands to oak woodlands and agricultural fields. The loggers were set to record one minute of sound every ten minutes, throughout the day and night, for four days during the breeding season — when bird vocalizations are at their peak. 
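
That sampling schedule adds up quickly. The tally below is a back-of-the-envelope sketch, not the team's own accounting; the 1,200-site figure is the project's approximate total across all seasons:

```python
# Back-of-the-envelope tally of the sampling schedule described above:
# a 1-minute clip every 10 minutes, day and night, for 4 days per site.
CLIPS_PER_HOUR = 60 // 10        # one clip per 10-minute interval
MINUTES_PER_SITE = 1 * CLIPS_PER_HOUR * 24 * 4

print(MINUTES_PER_SITE)          # 576 minutes of audio per deployment

# Multiplied across roughly 1,200 deployments, the total lands near the
# ~750,000 minutes of recordings the project ultimately processed.
TOTAL_MINUTES = MINUTES_PER_SITE * 1200
print(TOTAL_MINUTES)             # 691200
```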

The recorders were then collected and the recorded sounds uploaded to a newly developed program called Arbimon (developed by Mitch Aide at the University of Puerto Rico), which converted the sound files into spectrograms, images that are essentially visual representations of sound frequencies over time. Then the really hard work began. 
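
In outline, a spectrogram is just a stack of short-time Fourier transforms: the audio is cut into brief overlapping frames, and each frame's frequency content becomes one column of the image. The sketch below (plain NumPy, not Arbimon's actual pipeline) runs a synthetic 2 kHz test tone through that process and confirms the tone shows up in the right frequency bin:

```python
import numpy as np

SAMPLE_RATE = 48_000   # Hz; a common bioacoustics sampling rate
FRAME, HOP = 1024, 512

# One second of synthetic audio: a pure 2 kHz tone standing in for a bird call.
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
audio = np.sin(2 * np.pi * 2000 * t)

# Slice the signal into overlapping windowed frames and FFT each one;
# stacking the magnitudes column by column gives the spectrogram image.
frames = [audio[i:i + FRAME] * np.hanning(FRAME)
          for i in range(0, len(audio) - FRAME, HOP)]
spec = np.abs(np.fft.rfft(frames, axis=1)).T    # shape: (freq bins, time frames)

freqs = np.fft.rfftfreq(FRAME, d=1 / SAMPLE_RATE)
peak_freq = freqs[spec.mean(axis=1).argmax()]
print(f"loudest frequency bin: {peak_freq:.0f} Hz")   # lands near 2000 Hz
```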

Training Machines to “See” Bird Songs

A handful of birding experts familiar with local bird songs were recruited to listen to some 500 randomly selected minutes, in order to identify the vocalizations of the target bird species, which were then digitally marked and labeled. Initially, the team selected 30 species that they knew were relatively common in the region during the breeding season and covered a range of habitats. Eventually they added another 24 species that appeared frequently in the recordings and that helped fill in habitat and geographic “holes.” Out of this painstaking and time-consuming process, a “template” spectrogram focused on a distinct call or song for each targeted species was selected. 

Then the team ran the labeled template clips, ranging in time from half a second to four seconds, through a rudimentary pattern-matching algorithm to search for other regions of spectrograms that had the target species’ vocalization. This produced a very rough cut, and so another round was needed to further validate potential matches. For this, the team recruited more than 100 community volunteers for a series of “Bird Blitzes,” in which the results returned by the pattern-matching algorithm (and displayed in a user-friendly interface developed by the team specifically for this task) were carefully reviewed (both visually and aurally), with reviewers marking the clips that were true matches and rejecting those that weren’t. 
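
The idea behind that first-pass matching can be shown in miniature: slide a labeled template across a longer spectrogram and score each offset by how well the two correlate. This is a toy illustration with random arrays, not the matcher the project actually used, which is considerably more involved:

```python
import numpy as np

rng = np.random.default_rng(0)

# Fake data: a distinctive 4x6 "call" template, hidden in a longer
# low-level-noise "spectrogram" starting at column 20.
template = rng.random((4, 6))
scene = rng.random((4, 40)) * 0.1
scene[:, 20:26] += template

def match_scores(scene, template):
    """Normalized correlation between the template and every same-width
    patch of the scene; high scores flag candidate detections."""
    w = template.shape[1]
    tnorm = (template - template.mean()) / template.std()
    scores = []
    for col in range(scene.shape[1] - w + 1):
        patch = scene[:, col:col + w]
        pnorm = (patch - patch.mean()) / (patch.std() + 1e-9)
        scores.append(float((tnorm * pnorm).mean()))
    return scores

scores = match_scores(scene, template)
best = int(np.argmax(scores))
print(best)   # 20: the offset where the template was embedded
```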

These “blitzes” were initially conducted in group sessions at the Point Blue headquarters in Petaluma and were lively social occasions as well as work sessions. But once COVID hit, the work transitioned to a strictly remote and online process. Eventually the results were collated, allowing the team to gather at least 500 validated examples of vocalizations from each target bird species, which were then used to inform the more sophisticated level of the artificial intelligence process. 

For this phase the team used a much more highly developed deep-learning image-recognition model (a “convolutional neural network”) of the kind large tech companies use to train systems such as autonomous vehicles. Eventually, all 750,000 minutes of collected recordings were passed through this sophisticated model, producing geolocated species detections for the 1,200-plus collection sites around the county.
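
At the heart of such a network is nothing more exotic than a small filter slid across the image; stacking many learned filters lets the network build up from local patterns (a note's abrupt onset, a frequency sweep) to whole-song detectors. Here is a minimal, hypothetical illustration of that core sliding-filter operation, not the project's actual model:

```python
import numpy as np

def slide_filter(image, kernel):
    """Slide a small filter over an image and record its response at each
    position -- the core operation of a convolutional layer."""
    kh, kw = kernel.shape
    rows = image.shape[0] - kh + 1
    cols = image.shape[1] - kw + 1
    out = np.empty((rows, cols))
    for i in range(rows):
        for j in range(cols):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A toy "spectrogram" that is silent, then suddenly loud from column 4 on --
# like the abrupt onset of a chip note.
image = np.zeros((5, 8))
image[:, 4:] = 1.0

onset_filter = np.array([[-1.0, 1.0]])   # fires on quiet-to-loud transitions
response = slide_filter(image, onset_filter)
print(response[0])   # peaks exactly at the silence-to-sound boundary
```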

(For readers interested in the complex maneuvers undertaken by the S2L team to generate and then validate the data output, the scientific papers being written by team members are available on the Soundscapes to Landscapes website as they’re published.)

A spectrogram from the Soundscapes to Landscapes program. (Image courtesy of Soundscapes to Landscapes)

Just Add Layers

The next step in the project, currently underway, is to combine the AI-generated data showing the presence of bird species at specific locations with layers of remote sensing data of Sonoma County to produce species distribution modeling maps for each targeted species. For this, the project is using some cutting-edge sensing data provided by NASA as well as layers culled from other existing sources. There is the laser-based instrument aboard the International Space Station called the Global Ecosystem Dynamics Investigation (GEDI … lasers … get it?), first deployed in 2019, which measures the structure of vegetation as it passes over an area. It is similar to airplane-based LIDAR but is much cheaper to obtain (free from NASA vs. the approximately $1 million cost of a LIDAR flight to cover all of the county), and is frequently updated.

Next up will be NASA’s soon-to-be-launched Surface Biology and Geology mission, which will measure reflected light—at many wavelengths beyond what is visible to the human eye—to detect and analyze the chemistry of the earth’s surface. It will allow researchers to measure, for example, phytoplankton levels in a coastal area, the presence of certain minerals in the soil, or chlorophyll levels in a forest canopy.

Putting together these space-based sources and several other existing remote sensing layers allows researchers to characterize habitats where targeted bird species have been found by S2L’s field recordings and then interpolate likely presence (or absence) in other areas of the county that haven’t been monitored. Clark calls these “probability of occurrence” maps that could be very useful for conservation efforts, showing where there is potentially high biodiversity but low levels of protection. And he points out that the maps are dynamic, meaning they can be updated as more remote-sensing and/or acoustic data comes in, or as conditions on the ground change (due to fires, drought, etc.). 
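
The interpolation step can be sketched in miniature. The toy model below assumes, purely for illustration, a single habitat covariate (canopy height, of the sort GEDI measures) and fits a simple logistic curve to synthetic presence/absence detections; the team's actual species distribution models draw on many remote-sensing layers and far more sophisticated methods:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic "training" data: detections at 200 monitored sites for an
# imaginary species that favors taller canopy.
canopy_m = rng.uniform(0, 40, 200)
true_prob = 1 / (1 + np.exp(-(canopy_m - 20) / 5))
detected = (rng.random(200) < true_prob).astype(float)

def fit_logistic(x, y, lr=0.1, steps=3000):
    """Fit p(occurrence) = sigmoid(w*x + b) by gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        p = 1 / (1 + np.exp(-(w * x + b)))
        w -= lr * float(np.mean((p - y) * x))
        b -= lr * float(np.mean(p - y))
    return w, b

x = (canopy_m - 20.0) / 10.0            # center and scale the covariate
w, b = fit_logistic(x, detected)

def occurrence_prob(height_m):
    """Predict occurrence probability for an unmonitored grid cell."""
    return 1 / (1 + np.exp(-(w * (height_m - 20.0) / 10.0 + b)))

# Interpolate to two unmonitored cells: sparse canopy vs. tall forest.
print(f"5 m canopy:  {occurrence_prob(5.0):.2f}")
print(f"35 m canopy: {occurrence_prob(35.0):.2f}")
```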

In the meantime, team members are finishing the species distribution maps for each of the 54 species, delivering reports detailing results at each of the properties that hosted recorders, writing up articles describing the methodologies employed—and in some cases invented—by the project, and making presentations to the public.

And while the primary value of S2L will be in its impact on both regional conservation efforts and advancing biodiversity monitoring in general, there are some intriguing findings for those simply interested in local birds. One example cited by David Leland was the surprising number of lazuli buntings, a gorgeous songbird with a bright blue head, detected in areas impacted by the 2017 wildfires. “This is a bird that in my many years birding in the Sonoma Valley is pretty unusual; I had never heard or seen it in Sonoma Valley Regional Park prior to the 2017 Nuns Fire. But then in spring of 2018, all of a sudden, there were dozens of them. This is a clear and dramatic sign of a change in the birdscape as a result of wildfire, and I’m sure there will be other instances as we go through the data.”

Of course, the goal is for the methodologies pioneered by Soundscapes to Landscapes to be useful in areas beyond Sonoma County. NASA has asked the team to consult on a similar NASA-funded effort in the Greater Cape Floristic Province of South Africa—like Northern California, a biodiversity hotspot—where researchers are looking at a variety of ways to assess wildlife diversity in conjunction with cutting-edge remote sensing. The role of the S2L team there is to further advance our understanding of bioacoustics as a cost-effective, scalable, and transferable method of assessing changes in biodiversity in habitats around the globe.

And beyond the technology is the intention—indeed, the urgency—to bring together multiple stakeholders in such conservation science initiatives. According to Point Blue’s Rose Snyder, the logistical lead for S2L, “This colossal effort to scale biodiversity monitoring to a regional scale has been made possible by innovative technologies and by the incredible cross-sector collaboration of universities, public agencies, nonprofit organizations, citizen scientist volunteers, and scientific researchers across a wide range of disciplines.” 

To learn more about Soundscapes to Landscapes, and to see one of the team’s virtual presentations, go to

About the Author

David Loeb was the co-founder and Executive Director of the Bay Nature Institute and the publisher of Bay Nature magazine. Now retired, he continues to roam the trails and waterways of the Bay Area and points beyond and contributes occasional articles to