Celebrating open data | MIT News

The inaugural MIT Prize for Open Data, which included a cash prize of $2,500, was recently awarded to 10 individual and team research projects. Presented jointly by the School of Science and the MIT Libraries, the prize recognizes MIT-affiliated researchers who make their data openly accessible and reusable by others. The prize winners and 16 honorable mention recipients were honored at the Open Data @ MIT event held on October 28 at Hayden Library.

“By making the data open, researchers create opportunities for new uses of their data and to glean new insights,” says Chris Bourg, director of libraries at MIT. “Open data accelerates academic discovery and progress, promotes equity in academic participation, and increases transparency, replicability, and trust in science.”

Recognizing shared values

Led by Bourg and Rebecca Saxe, associate dean of the School of Science and John W. Jarve (1978) Professor of Brain and Cognitive Sciences, the MIT Prize for Open Data was launched to highlight the value of open data at MIT and to encourage the next generation of researchers. Applications were solicited from across the Institute, with particular attention to trainees: research technicians, undergraduate or graduate students, or postdocs.

“By launching an MIT-wide award and event, we aimed to create exposure for the scholars creating, using and championing open data,” Saxe says. “Highlighting this research and creating opportunities for networking would also help open data advocates across campus find each other.”

Recognizing researchers who share data was also one of the recommendations of the MIT Ad Hoc Task Force on Open Access to MIT's Research, which Bourg co-chaired with Class of 1922 Professor Hal Abelson of the Department of Electrical Engineering and Computer Science. An annual award was one of the strategies proposed by the task force to further the Institute’s mission of disseminating the fruits of its research and scholarship as widely as possible.

Fierce competition

The winners and honorable mentions were chosen from more than 70 nominees, representing all five schools, the MIT Schwarzman College of Computing, and several MIT research centers. A committee composed of faculty, staff, and one graduate student made the selections:

  • Yunsie Chung, a graduate student in the Department of Chemical Engineering, won for SolProp, the largest open-source dataset with temperature-dependent solubility values of organic compounds.
  • Matthew Groh, a graduate student at the MIT Media Lab, accepted on behalf of the team behind the Fitzpatrick 17k dataset, an open dataset consisting of nearly 17,000 skin disease images along with skin disease and skin tone annotations.
  • Tom Pollard, a researcher at the Institute for Medical Engineering and Science, accepted on behalf of the PhysioNet team. This data sharing platform enables thousands of clinical research and machine learning studies each year and allows researchers to share sensitive resources that could not be shared through typical data sharing platforms.
  • Joseph Replogle, a graduate student at the Whitehead Institute for Biomedical Research, was recognized for the Genome-wide Perturb-seq dataset, the largest publicly available single-cell transcriptional dataset collected to date.
  • Pedro Reynolds-Cuéllar, a graduate student in the MIT Media Lab and Art, Culture, and Technology, and Diana Duarte, co-founder of Diversa, won for Retos, an open data platform for detailed documentation and sharing of local innovations from under-resourced contexts.
  • Maanas Sharma, an undergraduate student, led States of Emergency, a national project that analyzes and classifies the responses of prison systems to Covid-19 using data extracted from public databases and manually collected data.
  • Djuna von Maydell, a graduate student in the Department of Brain and Cognitive Sciences, created the first publicly available data set of single-cell gene expression from postmortem human brain tissue of patients carrying APOE4, the major risk gene for Alzheimer’s disease.
  • Raechel Walker, a graduate researcher at the MIT Media Lab, and her collaborators created a Data Activism Curriculum for high school students through the Mayor’s Youth Employment Summer Program in Cambridge, Massachusetts. Students learned how to use data science to recognize, mitigate, and advocate for people who are disproportionately affected by systemic inequality.
  • Suyeol Yun, a graduate student in the Department of Political Science, was recognized for DeepWTO, a project that creates open data for use in legal natural language processing research using cases from the World Trade Organization.
  • Jonathan Zheng, a graduate student in the Department of Chemical Engineering, won for an open IUPAC dataset for acid dissociation constants, or “pKas,” physicochemical properties that govern the acidity of a chemical in a solution.

The complete list of winners and honorable mentions is available on the Open Data @ MIT website.

A campus-wide celebration

The awards were presented at a celebratory event held at the Nexus in Hayden Library during International Open Access Week. School of Science Dean Nergis Mavalvala kicked off the program by describing the long and proud history of open scholarship at MIT, citing the faculty open access policy and the launch of the DSpace open source digital repository. “When I was a graduate student, we were trying to figure out how to share our theses during the days of the nascent Internet,” she said. “With DSpace, MIT was figuring that out for us.”

The centerpiece of the program was a series of five-minute presentations by the award winners about their research. Speakers detailed the ways they have created, used or championed open data and the value that openness brings to their respective fields. Winner Djuna von Maydell, a graduate student in Professor Li-Huei Tsai’s lab that studies the genetic causes of neurodegeneration, highlighted why it is important to share data, especially data obtained from postmortem human brains.

“This is data generated from human brains, so each data point comes from a living, breathing human, who presumably made this donation in hopes that we would use it to advance knowledge and uncover the truth,” von Maydell said. “To maximize the likelihood of that happening, we need to make it available to the scientific community.”

Members of the MIT community who want to learn more about making their research data open can consult the MIT Libraries Data Services team.
