Introducing DS2 — the future of data science at Yale

Yale University today announces a major expansion in teaching and research in data science. The rapidly increasing availability of data and tools for its analysis has led to an explosion of new insights that are transforming our understanding of everything from human behavior to the structure of the universe.

Yale has transformed its Department of Statistics into a Department of Statistics and Data Science — called, informally, DS squared or DS2 — making it one of the first institutions of higher learning to have a department of this kind. The department officially changed its name on Jan. 1, and on March 2, the Yale College Faculty approved a new undergraduate major in statistics and data science.

The field of data science encompasses the entire lifecycle of data, from its specification, generation, gathering, and cleaning, through its management and analysis, to its use in making decisions and setting policy. Yale’s DS2 department and its associated undergraduate and graduate courses will reflect this range of topics, with departments from anthropology to astronomy offering courses in the major.

“The basic ideas of statistics and data science are becoming almost a core competency for citizenship in this century,” said Alan Gerber, dean of social sciences for Yale’s Faculty of Arts and Sciences (FAS), and the Charles C. & Dorathea S. Dilley Professor of Political Science. “These are things everyone will find useful in their life.”

Over the next several years, Yale’s FAS will hire as many as nine new faculty and scholars as part of the DS2 initiative. Up to three new faculty members will have full DS2 appointments and help form the core of the expanded department; up to six scholars will hold joint appointments between DS2 and another department within FAS, including departments in the natural sciences and engineering, social sciences, and humanities. Yale will give particular emphasis to identifying scholars who are at the forefront of research in the application of data science to a particular discipline.

“We are building bridges between many disciplines and departments,” said Harrison Zhou, who has been chair of the Department of Statistics since 2012 and now is chair of the new department. “Data science is about more than just big data. It is about collaboration, analysis, and policy.”

Zhou said the move also sends a clear signal that Yale wants to work with the best students and faculty involved in data science. The faculty hired for these roles will be at the top of their respective disciplines, he said.

Data science brings together traditional work in statistics with advances in machine learning, data mining, and high-performance computing. The field is informed by its applications in the sciences, social sciences, humanities, medical sciences, and the arts.

“It’s popping up all over the sciences, social sciences, and humanities,” said Daniel Spielman, Yale’s Henry Ford II Professor of Computer Science and Statistics and Data Science. “Data science has the potential to transform science.”

Spielman noted there is a heightened demand for data science expertise and collaboration on campus, recalling an informal meeting at Yale two years ago to talk about data science. “It was at eight in the morning, and 50 faculty members showed up,” Spielman said.

Spielman, Zhou, and associate professor of electrical engineering and statistics Sekhar Tatikonda guided the effort to broaden Yale’s approach to data science. They also acknowledged the campus-wide commitment necessary to elevate data science to a named department, including support from Dean of the Faculty of Arts and Sciences Tamar Gendler, Dean of the School of Engineering and Applied Sciences T. Kyle Vanderlick, and Gerber.

The DS2 initiative comes at a time when Yale is deepening its commitment to data science on a number of fronts:

• The Yale Center for Research Computing, established in 2015, moved into expanded facilities on Science Hill in December, increasing its capacity to help faculty and staff address the complex challenges of storing and analyzing huge volumes of data.

• The Yale School of Medicine has several data science-related projects, including the Yale Center for Outcomes Research and Evaluation (CORE), the new Center for Biomedical Data Science, and the Yale Open Data Access (YODA) Project.

• Yale recently signed an agreement with Amazon Web Services that will open the door for Yale faculty to do cloud-based research using HIPAA-protected information.

• The Digital Humanities Lab in the Yale University Library functions as a campus hub for digital humanities research and teaching.

• Yale continues to see an increase in the amount of data flowing into the university on a daily basis. Advanced technology such as the university’s new Krios cryo-electron microscope and facilities such as the Yale Center for Genome Analysis have the capacity to produce vast quantities of data.

“This is just the beginning,” Spielman said. “We don’t exactly know where data science is going to go, but we know we want to be out in front of it, leading the way.”
