Five College Consortium

Five College Statistics Program


5 College Statistics student wins the Lorna M. Peterson Prize

Hampshire College student Brooke Fitzgerald won the 2017 Lorna M. Peterson Prize. Brooke was awarded the prize for her research projects at Smith and Hampshire and for activities such as DataFest that foster collaboration in the data sciences among students and faculty members from across the Five Colleges. The $500 prize celebrates Lorna’s long commitment to collaboration as a means of advancing understanding and expanding opportunity and can be used to support such activities as scholarly work, performance, research or arts projects, conference attendance or work-related travel.

Recent 5 College graduates profiled in an ASA article

Several students who recently graduated from the 5 Colleges are profiled in an article describing their careers in Statistics. The article can be read here:

5 College Statistics Prize 2016

The Five College Statistics prize is awarded to one student at each of the Five Colleges, at their discretion. The award may be presented to a student satisfying any one of the following criteria (based on faculty vote from that institution):

∗ outstanding independent research, thesis, or capstone course project in statistics 
∗ outstanding service to statistics on their campus (or across campuses)
∗ outstanding use of 5College statistics resources
∗ outstanding non-senior pursuing further study in statistics
The award winners are: 

UMass Math & Stats: Jacob Reiser '17 (outstanding non-senior pursuing further study in statistics)

Jacob is working in R on an advanced class project that predicts basketball outcomes from various predictors. He wants to get a graduate degree in statistics.


UMass Biostatistics: Anusha Kothapalli '16 (outstanding independent research, thesis, or capstone course project in statistics)

Anusha is a senior in the Commonwealth Honors College at UMass-Amherst, majoring in an Independent Concentration in Biostatistics. Anusha has been actively involved as a research assistant and completed an outstanding honors thesis entitled 'Modeling mother to child transmission of HIV'.

Hampshire: Brooke Fitzgerald '18 (outstanding service to statistics on their campus (or across campuses))

Brooke did great work in Introduction to Statistical Learning in the fall of 2015, particularly on her final project, in which she applied a large range of supervised and unsupervised machine learning methods to analyze astrometric data on the Hyades star cluster. Additionally, Brooke did outstanding work in the spring of 2016 in an independent study on data visualization, in which she redesigned the Hampshire Institutional Dashboard to better convey information to the Hampshire Board of Trustees on how much progress is being made toward Hampshire's Strategic Goals.


Amherst: Christina Wang (outstanding independent research, thesis, or capstone course project in statistics)

Christina did outstanding work on the statistics comprehensive evaluation, in which she modeled rental prices of apartments in New York City based on convenience and proximity to public transit. Christina is a double major in economics and statistics. Her economics honors thesis used sophisticated statistical techniques to examine the effects of pollution on education outcomes of children in China.


Smith: Yiwen Zhu '16 and Emma Beauchamp '16 (outstanding independent research, thesis, or capstone course project in statistics)

Yiwen (a psychology-statistics double major) and Emma (a sociology major and applied statistics minor) both completed outstanding senior theses in their respective fields. Yiwen studied client emotional response during psychotherapy sessions, while Emma studied the role of stigma in sexual health behavior in adolescent girls. Both applied a wide range of statistical techniques and demonstrated exceptional statistical maturity in their work. 


Ningyue Wang receives the 2016 Boston Chapter of the American Statistical Association (BCASA) Mu Sigma Rho Award

Ningyue (Christina) Wang '16 has been chosen as the recipient of the 2016 Boston Chapter of the American Statistical Association (BCASA) Mu Sigma Rho Award.  Christina was selected as the inaugural winner of the award based on her outstanding achievements in statistics (she had been inducted into Mu Sigma Rho in 2015).

Christina is a double major in Economics and Statistics.  Her senior honors thesis project is titled "The impact of ambient pollution on children’s educational attainment in China".  Christina has worked as a Statistics Fellow in the Department of Mathematics and Statistics at Amherst since the spring of 2015, received the Hamilton Prize in 2013, and serves as Editor-in-Chief of Olio (The Amherst College Yearbook).

This annual award recognizes one outstanding statistics undergraduate per year in the BCASA region (Rhode Island, Massachusetts, Maine, New Hampshire, and Vermont).  Mu Sigma Rho is the national statistics honor society.  The American Statistical Association is the world's largest community of statisticians and the Boston Chapter is one of its largest and most active chapters.

In addition to Christina's award, eleven Amherst College students were inducted into Mu Sigma Rho.  Congratulations to Jonathan Che, Stephany Flores-Ramos, Paul Gramieri, Connor Haley, Azka Javaid, Rishi Kowalski, Levi Lee, Amanda Rosenbaum, Muling Si, Sarah Teichman, and Alex Titelbaum for their academic achievements and distinction.

Rob Kass discusses Statistics and his visit to the 5 colleges on WRSI

Rob Kass, who visited the 5 colleges on April 11th and 12th, discussed 'Where Statistics & Big Data & Your Brain Meet' with Monte Belmonte on WRSI. The full interview can be heard at: 

Five College DataFest 2016

The Five College DataFest was held from April 1st to the 3rd at the University of Massachusetts Amherst. DataFest is a nationally coordinated competition that challenges undergraduates working in teams of up to five to extract meaningful insights from a rich and complex data set. This year's DataFest had 158 registered students compete on 22 teams from the 5 colleges and was the biggest Five College DataFest to date. 

Amherst and Smith students win prizes in the USPROC competition

Amherst students Johannes Ferstad and Thomas Savage won first place in the USPROC competition for their project on Large Differences in County-Level Mortality Rates Related to Race and Economic Advantage. Smith College student Sara Stoudt won second place in the competition for her project on Geostatistical Models for the Spatial Distribution of Uranium in the Continental United States. Sara will give a talk about her project on October 2nd at the First Annual Electronic Undergraduate Statistics Research Conference.

Rising above the odds, on and off the court

Amherst student Megan Robertson led Amherst College to a basketball championship and triple majored in mathematics, statistics and history. You can read about Megan's accomplishments in the Amherst student newspaper.

5College Statistics Prize

In 2015 we added a prize to our program to celebrate our students and the wonderful happenings in the Valley in statistics. For our inaugural prize, the 5College Statistics prize was awarded to four students chosen from the Five Colleges. The award may be presented to a student satisfying any one of the following criteria:

  • outstanding independent research, thesis, or capstone course project in statistics
  • outstanding service to statistics on their campus (or across campuses)
  • outstanding use of 5College statistics resources
  • outstanding non-senior pursuing further study in statistics

These criteria allow us to celebrate students at all levels of statistics. In 2015 the prize was a book award. The winners received a set of Edward Tufte's books on data visualization.  

The 2015 winners are:

Alexander Bogdan – UMass-Amherst – Awarded for outstanding research and service. Alex has been actively involved as a teaching and research assistant, and his work included the development of statistical models that use nutrition status to predict bone mineral density in college-aged women.

Nicole DelRosso – Hampshire College – Awarded for outstanding work in introductory statistics and service for TA-ing the statistical analysis of neural data class. Nicole plans to pursue a Div-III project next year on DNA computing. 

Jarvis Sill ’15 – Amherst College – Awarded for outstanding work on the statistics comprehensive evaluation, where he applied advanced graphics and models to golf data. Jarvis is a triple major and one of the first statistics majors to graduate from Amherst College.

Weijia (Vega) Zhang ’18 – Smith College – Awarded for outstanding work for a non-senior pursuing further study in statistics. Vega has challenged herself with several statistics courses as a first-year student and participated on the “Best in Show” DataFest team.

Five College DataFest 2015

The Five College DataFest, sponsored in part by the Five College Statistics Program, was held the weekend of March 27 to 29 at the University of Massachusetts Amherst. DataFest is a nationally coordinated competition that challenges undergraduates working in teams of up to five to extract meaningful insights from a rich and complex data set.

ESPN publishes an article by Ben Baumer about the use of statistics in baseball

Ben Baumer of Smith College published an article for ESPN about which baseball teams are using sabermetrics (statistical analysis of baseball data) to improve their performance. He was also interviewed on ESPN, where he discussed this work.

HackEbola at UMass aids fight against West African epidemic

An article from the Daily Hampshire Gazette:

From across the world, young mathematicians, biologists and other scholars at the University of Massachusetts Amherst crunched numbers this weekend in hopes of aiding the fight against the Ebola epidemic in West Africa.

“It has to start somewhere,” said Andrew Smith, 26, a fourth-year doctoral candidate in organismic and evolutionary biology at UMass.

Smith was among more than 50 students from the Five Colleges who took part in a HackEbola event in the John W. Lederle Research Center at UMass planned from Friday evening to Sunday afternoon.

UMass searching for 2nd position in Biostatistics

The Biostatistics Program at the University of Massachusetts/Amherst is seeking talented applicants qualified for an assistant or associate professor position.  Under exceptional circumstances, highly qualified candidates at other ranks may receive consideration. Individuals with demonstrated potential for or experience in developing an extramurally funded research program in biostatistical methods, with application to areas such as big data, bioinformatics, clinical trials, survey research or epidemiology/public health are encouraged to apply. We will consider applications from individuals without a degree in biostatistics or statistics if they have research experience in the areas mentioned above.

Bray pens article on DataFest in Amstat News

American Statistical Association (ASA) President Nat Schenker writes: “A June Amstat News article by Robert Gould, Benjamin Baumer, Mine Çetinkaya-Rundel, and Andrew Bray described DataFest, an annual Big Data analysis competition for college students. The ASA board, at its April meeting, approved a proposal from the DataFest organizers to make the ASA the national headquarters for DataFest. For this month’s President’s Corner, I invited Andrew Bray, postdoctoral research associate at the University of Massachusetts, Amherst, to write about student perspectives on DataFest. As a UCLA graduate student, Andrew helped Rob Gould, DataFest’s founder, organize the first few UCLA events. This year, he and Ben Baumer organized the inaugural Five College DataFest.”

Five College Guide to R and RStudio available

Students in a number of Five College institutions are using R and RStudio, and a guide to using this powerful (and free) system is now available at

Meetup brings useR happenings to Pioneer Valley

The most recent meeting of the Western Mass Data Science, Stats, and R Meetup brought the latest and greatest from the useR! conference to statisticians of all stripes in the Pioneer Valley. Over drinks at the Amherst Brewing Company, Nick Reich of UMass described the latest iteration of Hadley Wickham's dplyr package, and Andrew Bray of Mount Holyoke College presented ggvis, RStudio's attempt to bring dynamic data visualizations to R.

Stoudt SC’15 wins first prize in statistics in sports undergraduate research competition

Sara Stoudt of Smith College was awarded first prize in the 2014 Statistics in Sports Undergraduate Research Competition held at the Joint Statistical Meetings in Boston in August. The competition was sponsored by the Statistics in Sports section of the American Statistical Association and was open to all undergraduate students doing research in sports analytics. Stoudt received a $250 cash prize for her entry: The Perfect Bracket: Machine Learning in NCAA Basketball.

UMass School of Public Health searching for tenure-track biostatistician

The Biostatistics Program seeks a tenure track faculty (open rank) with demonstrated experience in developing an extramurally funded research program in biostatistical methods, with application to areas such as clinical trials, survey research, epidemiology/public health, and addressing analysis needs for application in big data including medical informatics, neuroimaging, bioinformatics and genomics.

Five College students recognized in Amstat News for DataFest

Several Five College students were featured in a recent article about DataFest published in the Amstat News. The inaugural Five College DataFest took place at UMass in late March. The article describes how DataFest has spread to multiple locations across the country in 2014. 

Yeazel co-authors paper on pollen seasonality

As part of her work following her Junior Year Abroad program in Cordoba, Spain, Linnea Yeazel (SC ’13) undertook the statistical analysis of pollen seasonality for olive trees in southern Spain. Her collaborators at the University of Cordoba, in conjunction with Smith College professor Esteban Montserrat, found that the olive reproductive cycle is changing considerably, likely due to climate change. The paper is now published in Science of the Total Environment.

Amherst College hosts Sports Analytics Conference

The Sports Analytics Conference, held Sunday, April 13, brought together experts in the field of sports analytics, professors and students to learn and discuss the increasing role analytics plays in the world of athletics. The conference included presentations by guest speakers Chris Anderson and David Sally, authors of The Numbers Game, and UMass statistician Gregory Matthews. Ben Baumer of Smith College also participated. Amherst College students presented their personal research in the field of sports analytics.

Horton co-authors alcohol intervention paper published in JAMA

A new paper published in the Journal of the American Medical Association was co-authored by Five College statistician Nick Horton. The paper describes the results of a randomized trial conducted in New Zealand designed to measure the effectiveness of web-based screening and intervention for alcohol use. A webcast is now available.

Balasubramanian paper published in Royal Statistical Society Journal

Five College statistician Raji Balasubramanian’s paper on variable importance in matched case–control studies in settings of high dimensional data was recently published in Journal of the Royal Statistical Society: Series C (Applied Statistics). 

Abstract: We propose a method for assessing variable importance in matched case–control investigations and other highly stratified studies characterized by high dimensional data (p>>n). In simulated and real data sets, we show that the algorithm proposed performs better than a conventional univariate method (conditional logistic regression) and a popular multivariable algorithm (random forests) that does not take the matching into account. The methods are applicable to wide ranging, high impact clinical studies including metabolomic, proteomic studies and neuroimaging analyses, such as those assessing stroke and Alzheimer's disease. The methods proposed have been implemented in a freely available R library (

New major in statistics at Amherst College

To meet the educational needs of their students and address the growing demand and interest in the area of statistics, Amherst College has created a new major in Statistics as part of a newly renamed Department of Mathematics and Statistics. The new major will help students develop the capacity to turn data into information that can be used to guide decision-making. The new major consists of foundational courses in mathematics and computer science, along with a series of introductory, intermediate and advanced courses in statistics, culminating in a capstone course (Advanced Data Analysis) and a comprehensive evaluation of a project. More information can be found at the department's website or in this article in the student newspaper.

Pioneers in civic data: Breaking into the Open(Data) and other lessons on approaching a new frontier

Announcing the Five College DataFest

The Five College DataFest, sponsored in part by the Five College Statistics Program, was held the weekend of March 29 and 30 at the University of Massachusetts Amherst. DataFest is a nationally coordinated competition that challenges undergraduates working in teams of up to five to extract meaningful insights from a rich and complex data set. A number of prizes were awarded, including "Best in Show," "Best Visualization" and "Best Use of External Data."

Inferring causation without randomization: A matched design to assess the number of embryos to transfer during in vitro fertilization 

  • Cassandra Pattanayak, Wellesley College
  • Monday, September 23, talk at 4:00 p.m. with refreshments at 3:30 p.m.
  • Amherst College, Seeley Mudd room 206

Transferring one rather than two embryos during in vitro fertilization has been endorsed as a way to reduce multiple birth rates, but no large-scale randomized trial has evaluated the impact of the number of embryos transferred on birth outcomes. This presentation describes the design of a non-randomized study that parallels a hypothetical randomized experiment to examine the effect of single versus double embryo transfer. Using national surveillance data from the Centers for Disease Control and Prevention, single and double embryo cycles were paired on estimated propensity scores to create matched treated and control groups that are as similar on the observed background covariates as if the number of embryos transferred had been randomly assigned. This example illustrates a general framework for drawing causal rather than associative inferences from non-randomized studies, and the crucial role of checking balance between treatment and control groups on key background covariates is emphasized. 

Big Data: A perspective on their current uses and potential future uses by the Federal Statistical System

  • Mike Horrigan, Associate Commissioner for Prices and Living Conditions, Bureau of Labor Statistics, Washington, DC
  • Monday, September 30, talk at 4:30 p.m. with tea at 4:00 p.m.
  • Smith College Ford Hall room 240

This talk will explore the world of big data in terms of how they are currently used by the Federal Statistical system and explore possible ways in which big data sources may be leveraged in the future. In an era of declining real budgets for the Federal Statistical agencies, big data are often seen as an efficient and economical way to replace or supplement existing data collection programs. However, the blending of existing Federal data series collected using established statistical survey practices with big data sources that are not necessarily representative samples of a larger universe frame poses some significant challenges to the Federal statistical system, especially in terms of the quality tradeoffs we may be making. We also face challenges in maintaining our goal of methodological transparency when the potential biases of some big data sources are not always well understood. The talk begins with an attempt to define big data. I then present the results (to date) of an environmental scan we are conducting on the uses of big data across Federal statistical agencies as well as a scan of big data uses in academia and private business. The remainder of the talk addresses the issue of the potential future uses of big data, developing a perspective based on existing frameworks for judging the quality of economic statistics as well as looking at the statistical issues associated with blending survey based data with big data sources. I am particularly interested in your thoughts on the statistical issues and potential solutions to such issues posed by blended statistics. I end the seminar with concluding thoughts on the future of big data in the Federal Statistical system based on my role as a Director of several major statistical survey programs.

This talk reflects the current status of a project being sponsored by the American Economic Association Data Subcommittee on Big Data, for which I am a co-chair along with Ana Aizcorbe of the Bureau of Economic Analysis. The goal of the project is to report on the current and potential uses of big data across the Federal Statistical system.

The talk is part of the activities of the International Year of Statistics and is sponsored by the Departments of Mathematics and Statistics as well as Economics at Smith College and co-sponsored by the Five College Statistics Program and the Boston Chapter of the American Statistical Association.

For more information about the talk, contact Katherine Halvorsen (khalvors@smith.edu, 413-585-3874). More information about Mike Horrigan can be found here:

Hack for Western Mass

On the weekend of June 1, 2013, web and software developers, designers, community organizers, and other folks from all over Western Mass gathered to tackle local challenges with technology at Hack for Western Mass. The event was sponsored in part by the Five College Statistics Program.

International Year of Statistics Talk

Mark Hansen, Columbia University, Tuesday, March 12, 2013, Ford Hall 240 Smith College.