Peabody specimens thrive online
“Start Here” instructs a blue Post-It note stuck on a rack of wooden specimen drawers in the Yale Peabody Museum of Natural History’s entomology collection. Each drawer contains rows of butterfly specimens of various sizes and colors carefully pinned under glass.
A student worker will pick up that first drawer, move the Post-It to the next one, and remove the butterflies. The data on the labels attached to the pins, along with digital images of certain specimens, will be entered into the Peabody’s collections management database, where researchers worldwide can access them with a few keystrokes and mouse clicks.
The butterflies are among more than 13 million specimens and other objects housed at the Peabody Museum, which is marking its 150th anniversary this year. For more than 125 years, the museum’s collections were cataloged in handwritten and typed ledgers, and searching the collections meant visiting the museum.
Today, about two-thirds of the labels on the museum’s specimens are digitized. The Peabody has one of the best-developed and most expansive digitization programs among natural history institutions in the United States, according to Integrated Digitized Biocollections (iDigBio), a collaborative effort funded by the National Science Foundation to coordinate digitization efforts among hundreds of natural history institutions in the United States.
Among the more than 700 institutions that have provided records to iDigBio’s online data portal, the Peabody ranks in the top 10 with more than 1.2 million records contributed, according to Gil Nelson, iDigBio’s digitization specialist.
“This ranking along with [Peabody] staff’s continued willingness and enthusiasm for offering input and coordination in community activities clearly demonstrates Yale’s leadership in North American biodiversity specimen digitization best practices and programs,” Nelson wrote in a Feb. 14, 2016, letter to Tim White, the Peabody’s director of collections and operations.
The digital records can include a wide range of data — called metadata — such as where, when, and how a specimen was collected; the geographic coordinates of a collection site; a digital image when possible; links to archival information like field notes or published articles; and even genetic information.
Making this information available online enables a vast range of scientific study, including research into pressing environmental and ecological issues such as global climate change and the decline of honeybee populations. Digitization also promotes teaching and learning using the collections. Multiple programs are being developed at the Peabody to provide elementary, middle, and high school teachers with lesson plans drawn straight from digitized collections.
“It’s just spawned a revolution in how the metadata are used,” said Larry Gall, the Peabody’s head of computer systems and collections manager for its Division of Entomology, who has spearheaded the museum’s digitization efforts for 25 years.
Leather-bound ledgers to digital databases
Eric Lazo-Wasem, senior collections manager in the Division of Invertebrate Zoology, points at an old Royal typewriter on a shelf above his desk.
“When I started here in 1983, we typed all the specimen data on index cards,” he said, removing a stack of index cards from a cabinet in his office. “We’d do two sets of these for each specimen. I was essentially a typist back then.”
The Peabody’s first personal computer, an IBM with two floppy disk drives, arrived in about 1985, Lazo-Wasem said.
By 1991, the museum had established a computer systems office, and Gall began creating electronic catalog records for all the Peabody’s specimens and artifacts, dating to the museum’s founding in 1866.
“We had these wonderful leather-bound ledgers dating to the 1860s,” Gall said. “We covered them with Post-It notes and mapped the biological information in them, to figure out how to enter them most efficiently into the database.”
The old records presented challenges. For example, the famed Yale College fossil hunting expeditions of the 1870s led by paleontologist Othniel Charles (O.C.) Marsh shipped train cars full of specimens from the American West to New Haven. The contents of the boxcars were divided among various departments at the museum, each with its own cataloging methods.
“We spent the first decade translating those ledgers and the central accession logs that documented everything that came into the museum,” Gall said. “Individual departments attacked their own ledgers, field notes, index cards, or whatever they had used. My job was to steer the boat and make sure that people had comparable funding and that we were consistent in developing best practices with our databases.”
With the museum’s metadata largely in hand, the emphasis has begun to shift to creating digital images of as many specimens as possible.
“Digital cameras are now ubiquitous in every department,” said Gall. “All of them are doing at least some imaging. We’ve tricked out various workflows for that.”
The Division of Invertebrate Paleontology has received federal grants to image tens of thousands of fossils — brachiopods, corals, trilobites, and more. Staff members wear a headset and read aloud label data from each specimen into voice recognition software that incorporates the information into a spreadsheet. Gall developed software that “cooks” the spreadsheet and imports it into the collection database, generating hundreds of images and records at once.
Some collections lend themselves to digital imaging more readily than others, notes Gall. Plant specimens are pressed onto standard-sized paper, which makes it easy to image them in large batches. The Peabody is part of a consortium of New England institutions that received grant funding to build a conveyor belt system to digitize collections of vascular plants. The conveyor system is being used at Harvard University and is scheduled to come to Yale’s West Campus next year.
“Wet” specimens, often preserved in alcohol-filled glass jars and vials, are much more challenging to photograph.
Lazo-Wasem said it’s often best to photograph specimens before they are submerged in alcohol. He said he stayed up into the early morning hours during collecting expeditions in Antarctica photographing fresh specimens.
“The digital world has made me a halfway decent photographer,” he said.
The Invertebrate Zoology Division has federal funding to image 60,000 microscope slides of specimen material from the collection. Once the imaging is completed, researchers will have access to high-resolution images and the capability to zoom in on certain slides.
“I’m seeing things here that I can’t see through a microscope,” Lazo-Wasem said, demonstrating the zoom function on one of the high-resolution images.
Michael Donoghue, Sterling Professor of Ecology and Evolutionary Biology, set two leaf specimens on a table. One had serrated edges. The other’s edges were smooth.
“It turns out there’s a real pattern to this around the globe, and it relates to climate,” he said. “Leaves from cold climates — maples, birches, oaks — have teeth. Leaves from the tropics tend to have smooth edges.”
Donoghue, the Peabody’s curator of botany, and his students are documenting the latitudinal pattern of the shapes of leaves and investigating why the shapes differ depending on climate. The work, which involves measuring leaf specimens from all over the globe, is greatly facilitated by digitizing specimens, he said.
“If we had images of these specimens from all different latitudes and species, we could score them for the number of teeth they have and put all that information into a database that we’d have at our fingertips,” said Donoghue, who directed the Peabody from 2003 to 2008. “Right now we can’t do that.”
Donoghue, who was instrumental in coordinating national efforts to digitize natural history collections, said the Peabody has been a leader in enabling research through digitization.
“It’s going to foster a huge amount of research,” he said.
Patrick Sweeney, senior collections manager for the Division of Botany, has prioritized getting images and data of all Yale Herbarium specimens online.
“This enables all kinds of research,” Sweeney said. “It supports the Peabody’s and Yale’s mission in making the collections accessible globally any time and anywhere.”
Much of the Peabody’s imaging work is funded through federal grants as part of the iDigBio initiative, often in arrangements called Thematic Collection Networks (TCNs), in which groups of institutions, often museums and universities, collaborate on developing strategy for digitizing collections to address a specific research theme, such as the effects of climate change.
Sweeney is leading a TCN to digitize about 1.3 million vascular plant specimens at Yale and six other New England universities to create a dataset to help researchers study the effects of climate change. (The conveyor belt was built through this project.)
The dataset, which is available online, includes information on phenology, or how and when plants bore fruit, produced their leaves, or flowered. Researchers are studying changes in phenology to gauge the effects of global warming on trees and other plants.
Digitization has also enabled research into the decline in populations of honeybees and other pollinators. From 2009 to 2012, the Peabody digitized the label data of about 20,000 bee specimens in the Entomology Division. The metadata has been cited in several dozen peer-reviewed articles tying the decline of bee populations to factors like climate change or the use of neonicotinoids in insecticides, Gall said.
Teaching with digital collections
Digitization enhances the Peabody’s ability to fulfill its educational mission. A collaborative project to digitize images and data of 500,000 insect fossils at several institutions will equip teachers from kindergarten through high school with lesson plans using fossilized dragonflies, tsetse flies, and other insects. The project, another TCN, will also enable the study of insect response to environmental change and patterns of insect biodiversity through time.
“Fossil insects look a lot like modern insects, so kids can recognize them, whereas some of the other things you work on in paleontology aren’t quite so accessible,” said Susan Butts, senior collections manager for the Division of Invertebrate Paleontology.
Digital specimens remove logistical hurdles to teaching students to work with museum collections, said Christopher Norris, senior collections manager for the Division of Vertebrate Paleontology.
“Teaching kids how to use museum collections for research is also tricky if you have to bring them into the museum, bring them behind the scenes into the collections, and give them space to work,” Norris said. “If you can put the collections onto a laptop, then you have the potential to teach kids to work with collections in their classrooms in the same way that we work with collections.”
The project’s digital portal, called iDigPaleo, will provide tools to allow students to map, measure, magnify, and annotate specimens. The portal will consolidate the digital information from the partnering institutions and will allow teachers to log on and access datasets of fossil records for their lesson plans. They can assess students’ work through the portal as well.
“It offers all of the things you could do in a classroom activity, focused on real data accessed via the web,” Norris said.
The portal is scheduled to launch in the fall of 2016. Once established, it will be expanded to include other types of fossil collections.
Digitizing the Peabody’s specimens is an endless task as the collections grow and researchers’ needs and focus change.
Gall said the museum has done some three-dimensional imaging and CT-scanning, and he expects that those technologies will reach scale during this decade.
The collections management database is continually augmented with new types of information. For example, the museum has begun digitizing its archival collections, including field notes, journals, and correspondence.
“That’s the real hidden golden stuff,” said Gall. “We’re able to take the original field notebook, or even a page in it, and tie that to the specimen or an expedition and have digital proxies available.”
The museum is using georeferencing mechanisms to assign latitudinal and longitudinal coordinates to where a specimen was collected. For instance, a label indicating that a plant specimen was collected “three miles from Hamden center” can be assigned coordinates along with a measure of their precision, Gall said.
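One common way to express a locality like that is a point plus an uncertainty radius. The sketch below illustrates the idea under stated assumptions and is not the Peabody's software: the function and class names are hypothetical, and the coordinates used for Hamden center are rough, illustrative values. Because the label gives a distance but no direction, the sketch keeps the named place as the point and folds the full distance into the radius.

```python
from dataclasses import dataclass

METERS_PER_MILE = 1609.344  # international mile


@dataclass(frozen=True)
class Georeference:
    """A point estimate with an uncertainty radius (hypothetical record type)."""
    latitude: float
    longitude: float
    uncertainty_m: float


def georeference_offset(place_lat, place_lon, offset_miles, place_extent_m=0.0):
    """Georeference a label of the form 'N miles from <place>' with no direction.

    With no heading given, the specimen could come from anywhere on a circle
    around the place, so the place itself is used as the point and the offset
    distance (plus the extent of the place) becomes the uncertainty radius.
    """
    if offset_miles < 0 or place_extent_m < 0:
        raise ValueError("distances must be non-negative")
    radius_m = offset_miles * METERS_PER_MILE + place_extent_m
    return Georeference(place_lat, place_lon, radius_m)


# Illustrative only: approximate coordinates assumed for Hamden center.
record = georeference_offset(41.40, -72.90, offset_miles=3)
print(record)
```

A record produced this way tells a researcher not only where the specimen was probably collected but also how far off that point could be, which matters when localities are compared across many collections.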
When specimens are sampled for DNA analysis, the genetic information can be added to a record’s digital metadata.
“That’s full-bore information on a specimen’s history, its genetic history, everything,” Gall said.
The Peabody’s undergraduate and graduate student workers are crucial to managing the endless digitization workload, according to Gall.
“They’re smart, fast, and they know computers,” he said. “We get the work done at a high quality, and they love to work in the museum. We can empower all of these digital pipelines with the students. It’s a huge win-win all around.”
Charlotte Newell, a senior who lives in Trumbull College, has worked with Gall digitizing specimens in the entomology collection since her freshman year. Newell, who is majoring in global affairs, has spent many quiet hours entering label data into the database.
“I actually find it really relaxing to go work with all the cool and beautiful butterflies,” she said. “It’s interesting work but it doesn’t require intense problem-solving. It’s nice to work with all of this interesting material without stressing about it. And Larry is a great boss.”