Crowd-sourced project to build Yale theater history database

Yale’s Digital Humanities Lab is working with the Robert B. Haas Family Arts Library to make data from hundreds of Yale theater programs easily accessible to anyone with an internet connection.
Through the Ensemble @ Yale project, community members can choose digitized programs for Yale dramatic productions dating as far back as 1925 and then mark and transcribe the information in them.

While a student at the Yale School of Drama in 2010, Oscar-winning actress Lupita Nyong’o performed in a production of Anton Chekhov’s “Uncle Vanya.”

Who shared the stage with her? Who directed? Who designed the sets and costumes? What other productions has Nyong’o appeared in at Yale? Who directed those?

Answering these questions currently requires a labor-intensive dive into the archives, but Yale’s Digital Humanities Lab is working with the Robert B. Haas Family Arts Library to make data from hundreds of Yale theater programs easily accessible to anyone with an internet connection. They have enlisted the public’s help in the effort.  

Ensemble @ Yale is a crowd-sourced project to create a database of Yale theater history. Modeled on the New York Public Library’s (NYPL) crowd-sourced endeavor to transcribe data from its collections of historical theater programs, the project invites people to browse through digital images of Yale programs and mark and transcribe various data, such as the titles of plays, production dates, the directors, and the names of cast and crew members.

The digitized programs, which span more than 90 years, are from the Yale School of Drama and Yale Repertory Theatre Ephemera Collection, which is housed at the Arts Library.

“There is a lot of research interest in the collection because so many notable people and interesting plays are associated with the Yale School of Drama and Yale Rep, but there isn’t an easy way into it,” said Lindsay King, the Arts Library’s associate director for access and research services and one of the project’s founders. “This project aims to correct that.”

12,000 pages of theater programs

Among the project’s initial steps was scanning about 12,000 pages of theater programs — a task performed over the last spring semester by a team of undergraduate employees, with funding from Yale Repertory Theatre in honor of its 50th-anniversary season.

Douglas Duhaime, developer at the Digital Humanities Lab, adapted “Scribe” data transcription software to suit Yale’s project. Monica Ong Reed, the lab’s user experience designer, created a user-friendly website where visitors can easily navigate the digitized programs and choose ones to mark and transcribe. Staff at the Arts Library and Yale Rep tested the original interface and provided feedback, identifying any difficulties they encountered.

“We wanted to make this fun for people,” King said. “We want the experience of contributing to be a satisfying one. The DH Lab made a beautiful and engaging site.”

Contributors can log in to the website through Google, Facebook, Twitter, or with a Yale ID. (Anyone can contribute. Yale affiliation is not required.)

The digitized programs are categorized by era. The first group covers the Yale School of Drama from 1925, the year the school was founded, to 1955. The second consists of programs from 1955 to Yale Rep’s founding in 1966. Each of the remaining four categories covers productions during the tenure of a dean of the Yale School of Drama: Robert Brustein, Lloyd Richards, Stanley Wojewodski Jr., and current dean James Bundy.

Contributors can select an era and scroll through the programs, which are arranged chronologically. The site can also serve up programs at random to mark and transcribe. Marking the programs involves drawing boxes around the targeted data, such as the title, production date, or director. (The programs contain other information that is not relevant to this particular project, such as advertisements.) The marked information is then transcribed into a series of textboxes.

Enabling dynamic research projects

The transcribed data will be cleaned up and compiled into a database that will enable a wide variety of dynamic research projects, King said.

“You could identify all of the actors who have played the role of Hamlet at Yale and create a visualization charting the different productions of the play over time,” she said. “You could create network visualizations showing connections between various actors and directors who have studied and worked at Yale together.”

The idea for this crowd-sourced project predates the Digital Humanities Lab, which opened in the fall of 2015. It was inspired by the NYPL’s original version of Ensemble, which asked users to transcribe vaudeville programs, as well as another successful NYPL project that enlisted the public to transcribe collections of historical restaurant menus.

“We figured that if random historical menus caught people’s attention, then so would our collection of theater programs,” King said. “It is a colorful and fascinating collection, and we already have a community of interest.”

Once transcriptions for the School of Drama and Yale Rep programs are completed, data from other Yale theater-related collections, such as the Arts Library’s Yale Summer Cabaret archive or the Yale Dramatic Association and Yale Cabaret archives in Manuscripts and Archives at Sterling Library, could be the next additions.

The crowdsourcing concept is already in use by other types of collections, such as the University of California-Davis’ collection of wine labels and the Sloan Digital Sky Survey’s three-dimensional maps of the universe.

To begin marking and transcribing, visit


Media Contact

Mike Cummings, 203-432-9548