Dung beetle (Photo: Alex Wild)
Natural history collections hold material going back centuries, and the digital revolution means their holdings can now be opened to everyone, once they have been digitized. Properly digitizing specimens consumes enormous resources, particularly the one we all have so little of: time.
But the Entomology Collection in the Biodiversity Center recently reached a milestone: about 1 percent of the collection is now online. The digitized portion leans toward ants and a few other groups and represents efforts that began in 2016.
Here’s what’s actually involved in digitizing a specimen. The focus is on transcribing printed or handwritten specimen labels, like the label on the dung beetle in the photo. The experience level of the person entering the data determines the steps involved. If a less experienced volunteer is entering the data, it goes into a spreadsheet for vetting. If a curator or more experienced volunteer is at the helm, the data goes directly into the database. Each data-captured item is then given a label bearing a unique number. The local database is hosted at TACC (Texas Advanced Computing Center), where staff built a portal for releasing the curated data to the public and out to GBIF (link: https://www.gbif.org/dataset/ba9984d8-d982-4fe6-b81c-a7585790034a). All specimens are databased in this manner, and the database also records objects associated with a specimen, such as wasp nests, plant galls, sound recordings, photos, and DNA isolates.
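For readers who think in code, the routing step described above can be sketched in a few lines of Python. This is only an illustration, not the collection's actual software: the class name Intake, the record fields, and the "SPEC-" numbering scheme are all hypothetical, and the sketch assumes the unique number is assigned at the moment a label is transcribed.

```python
from itertools import count


class Intake:
    """Toy model of the data-entry routing (hypothetical, for illustration)."""

    def __init__(self):
        self.database = []        # records entered by curators or experienced volunteers
        self.vetting_sheet = []   # records from newer volunteers, awaiting review
        self._numbers = count(1)  # assumed numbering scheme: 1, 2, 3, ...

    def capture(self, label_text, experienced):
        """Transcribe one label, give it a unique number, and route it."""
        record = {
            "number": f"SPEC-{next(self._numbers):07d}",  # hypothetical format
            "verbatim_label": label_text,
        }
        target = self.database if experienced else self.vetting_sheet
        target.append(record)
        return record


intake = Intake()
first = intake.capture("Texas: Travis Co., Austin", experienced=False)
second = intake.capture("Texas: Travis Co., BFL", experienced=True)
print(first["number"], len(intake.vetting_sheet), len(intake.database))
```

Records in the vetting sheet would move into the database only after someone has checked them, a step this sketch leaves out.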
This sounds streamlined and straightforward, but Dr. Alex Wild, Curator in the Entomology Collection, says the process is not without big challenges. He puts it like this: “First, there is the enormity of the project. People have been adding to the collection for decades. The oldest specimens are from the 1870s.” This means there is a massive backlog of uncatalogued material, and a lot of work is needed to enter the data properly. “It would probably cost about $4 million in minimum wage salary to get all the way through our existing 2 million specimens,” Wild continues. “No granting agency gives out that kind of money. National Science Foundation grants for these types of projects are usually in the $500,000 range.”
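The figures in that quote can be worked through as a quick back-of-the-envelope calculation. The dollar amounts and specimen count come from Wild's estimate; the hourly wage of $7.25 (the US federal minimum) is an assumption added here to turn cost into time, so the minutes-per-specimen figure is only a rough guide.

```python
# Figures quoted in the article
TOTAL_COST_USD = 4_000_000
SPECIMENS = 2_000_000
TYPICAL_GRANT_USD = 500_000

# Assumption (not from the article): US federal minimum wage
ASSUMED_HOURLY_WAGE_USD = 7.25

cost_per_specimen = TOTAL_COST_USD / SPECIMENS             # dollars per specimen
grants_needed = TOTAL_COST_USD / TYPICAL_GRANT_USD         # typical grants to cover it all
specimens_per_grant = TYPICAL_GRANT_USD / cost_per_specimen
minutes_per_specimen = cost_per_specimen / ASSUMED_HOURLY_WAGE_USD * 60

print(f"${cost_per_specimen:.2f} per specimen")
print(f"{grants_needed:.0f} typical grants for the whole collection")
print(f"{specimens_per_grant:,.0f} specimens per typical grant")
print(f"about {minutes_per_specimen:.1f} minutes of labor per specimen")
```

In other words, the estimate works out to about two dollars per specimen, and a single typical grant would cover roughly one eighth of the collection.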
The other problem is deciphering old labels. With specimens dating back to the 1870s, the world has changed a lot in the roughly 150 years since. Names of places where specimens were found are different, borders of states and countries have moved, and some countries don’t even exist anymore. Handwritten labels can also be messy with age and wear, or written in nearly illegible handwriting. Wild elaborates: “I once spent a day figuring out from historical postcards and archives that a scrawled note under several of our specimens saying ‘EKALELA’ referred to a girls’ horse-riding camp in Colorado in the 1920s.”
Luckily for curators, someone figured out that labeling specimens with obscure names of kids’ camps isn’t the best practice, and in the 1990s, when GPS units became available, entomologists began putting latitude and longitude coordinates on their labels. This does add an extra step of research when inputting data, but nothing on the scale of locating defunct places of human activity.
Consistency is another factor, since errors and redundancies have to be managed. “The same collector might be listed differently on different labels,” Wild explains. “For example, one label might say ‘J. Abbott’ or ‘J.C. Abbott’ even though they are the same person. This is the same for locations, like ‘BFL,’ ‘Brack. Trac.’ or ‘Brackenridge Field Lab.’ So, a lot of effort goes into trying to enforce singular names for species, places and people.”
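One common way to enforce singular names is a hand-curated lookup table that maps every known variant to a single preferred form. The Python sketch below uses the variants from Wild's examples; the table layout, the function name normalize, and the choice of "J.C. Abbott" as the preferred spelling are assumptions made for illustration, not a description of the collection's database.

```python
# Hand-curated variants -> preferred form (keys are lowercase, single-spaced).
# The preferred forms chosen here are assumptions for the example.
ALIASES = {
    "collector": {
        "j. abbott": "J.C. Abbott",
        "j.c. abbott": "J.C. Abbott",
    },
    "locality": {
        "bfl": "Brackenridge Field Lab",
        "brack. trac.": "Brackenridge Field Lab",
        "brackenridge field lab": "Brackenridge Field Lab",
    },
}


def normalize(field, raw):
    """Return the preferred name for a label value, or the tidied value if unknown."""
    tidy = " ".join(raw.split())  # trim and collapse stray whitespace
    return ALIASES[field].get(tidy.lower(), tidy)


print(normalize("locality", " BFL "))
print(normalize("locality", "Brack.  Trac."))
print(normalize("collector", "J. Abbott"))
print(normalize("collector", "A. Wild"))  # not in the table, returned as written
```

A curated table is slower to build than automatic fuzzy matching, but it avoids silently merging two different people who happen to share an initial and a surname.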
All this aside, digitizing is worth the effort. Digitized collections open up the specimens for research by institutions and individuals around the world, which matters especially now, when a pandemic has forced many to work from home. Digital archiving also gives specimens a degree of longevity. Specimens and their tags are not immune to decay and other adverse forces of nature.
Thanks to Alex Wild, Curator in the Entomology Collection, for his input on this piece.