Five College Statistics Program


Upcoming Events

Statistical Analysis of Big Genetics and Genomics Data

Xihong Lin, Harvard T.H. Chan School of Public Health

Monday, November 16, 4:30 p.m., Amherst College, Seeley Mudd 206 -- refreshments to precede talk at 4 p.m. in Seeley Mudd 208.

RSVP by Tuesday, November 10, 2015, to

Abstract: The Human Genome Project, in conjunction with the rapid advance of high-throughput technology, has transformed the landscape of health science research. The genetic and genomic era provides an unprecedented promise of understanding the genetic underpinnings of complex diseases or traits, studying gene-environment interactions, predicting disease risk, improving prevention and intervention, and advancing precision medicine. A large number of genome-wide association studies conducted in the last ten years have identified over 1,000 common genetic variants that are associated with many complex diseases and traits. Massive targeted, whole-exome and whole-genome sequencing data, as well as different types of -omics data, have rapidly become available in the last few years. These massive genetic and genomic data present many exciting opportunities as well as challenges in data analysis and result interpretation. They also call for more interdisciplinary knowledge and research, e.g., in statistics, machine learning, data curation, molecular biology, genetic epidemiology and clinical science. In this talk, I will discuss analysis strategies for some of these challenges, including rare variant analysis of whole-genome sequencing association studies, analysis of multiple phenotypes (pleiotropy), and integrative analysis of different types of genetic and genomic data.

Data Visualization at The New York Times

Amanda Cox, graphics editor at The New York Times

Friday, December 4, 4:30 p.m., Smith College, Ford Hall Room 240, 100 Green Street, Northampton, MA

Abstract: Amanda Cox is a graphics editor at The New York Times, where she makes charts and maps for the paper and its website. A 2002 graduate of St. Olaf College, she worked at the Federal Reserve Board and earned a master's degree in statistics from the University of Washington before joining the Times in 2005. She was part of a team that won a National Design Award in 2009 and received the Excellence in Statistical Reporting Award from the American Statistical Association in 2012.

Recurring Events

Past Events

Statistics and Policy Making: The Case of Greece

Andreas Georgiou, Former Head of Greek Statistics Office

Wednesday, November 11, 8:00 p.m., Amherst College, Converse Hall, Cole Assembly Room

Abstract: The talk will focus on the importance of statistics in effective policy making and good governance and will draw on Georgiou's five years as head of the Greek Statistical Office. In 2010, Georgiou was appointed head of the office after serving as an economist for 21 years at the IMF. His discoveries regarding the Greek deficit put him at the center of a political firestorm that included personal felony charges that could have led to life in prison and enormous fines.


Data Visualization with D3

Dana Udwin and Deirdre Fitzpatrick

Thursday, November 12, 7:00 p.m., Smith College, McConnell 103

Abstract: Data visualizations empower individuals of all backgrounds to explore complex, multivariate data, an area where traditional statistical analyses can fall short.

Dana Udwin and Deirdre Fitzpatrick will present a motivating use case from their work at MassMutual Data Labs in Amherst and walk through creating data visualizations from a beginner's perspective. In the process, they will introduce web coding basics and useful JavaScript libraries, including D3, C3, dc, and Crossfilter, that make such visualizations simple to build and host online.

Colloquium on "Causal inference: identifying subgroups by their response to treatment"

Sarah Anoke, Department of Biostatistics, Harvard T.H. Chan School of Public Health  

Thursday, October 22, 4:30 p.m., Amherst College, Seeley Mudd 206

Abstract: Causal inference is a field of statistics focused on measuring a particular type of relationship between two variables. Referring to these two variables as the "treatment" and the "outcome", we consider the value that an individual's outcome would take if the treatment were present, and the value that the individual's outcome would take if the treatment were absent. The difference between these two potential outcomes is the treatment effect. Every individual has their own individual treatment effect (ITE). But because only one of these two potential outcomes is observable, ITEs cannot be estimated from observed data. To overcome this problem, the average outcome among a group of individuals unexposed to treatment is subtracted from the average outcome among a group of individuals exposed to treatment, yielding an average treatment effect (ATE). It is of interest to identify subgroups for which the subgroup-specific ATE is very different from the overall ATE. Knowing only the overall ATE is arguably misleading; for example, we would prefer to know that a drug has no effect in women but a dramatic effect in men. How, then, can the data tell us which subgroups respond particularly well or poorly to treatment, without advance knowledge of these subgroups?
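
The difference-in-means calculation described in the abstract can be sketched in a few lines of Python. This is only an illustration, not the speaker's method: the data are made up, the function names `average_treatment_effect` and `subgroup_effects` are hypothetical, and the subgroup variable is specified in advance, whereas the talk concerns finding subgroups without that knowledge. The simple difference in means is a reasonable ATE estimate only when treatment is assigned as if at random.

```python
from statistics import mean


def average_treatment_effect(records):
    """Mean outcome among the treated minus mean outcome among the untreated."""
    treated = [r["outcome"] for r in records if r["treated"]]
    control = [r["outcome"] for r in records if not r["treated"]]
    return mean(treated) - mean(control)


def subgroup_effects(records, key):
    """ATE computed separately within each level of a pre-specified variable."""
    groups = {}
    for r in records:
        groups.setdefault(r[key], []).append(r)
    return {level: average_treatment_effect(rows) for level, rows in groups.items()}


# Made-up data for illustration only (not from any real study).
records = [
    {"sex": "M", "treated": True, "outcome": 10.0},
    {"sex": "M", "treated": True, "outcome": 8.0},
    {"sex": "M", "treated": False, "outcome": 2.0},
    {"sex": "M", "treated": False, "outcome": 4.0},
    {"sex": "F", "treated": True, "outcome": 3.0},
    {"sex": "F", "treated": True, "outcome": 5.0},
    {"sex": "F", "treated": False, "outcome": 4.0},
    {"sex": "F", "treated": False, "outcome": 4.0},
]

overall = average_treatment_effect(records)
by_sex = subgroup_effects(records, "sex")
print(overall)  # 3.0
print(by_sex)   # {'M': 6.0, 'F': 0.0}
```

In this toy data the overall ATE of 3.0 hides a subgroup with a large effect (6.0) and one with no effect (0.0), which is the situation the abstract describes.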

Colloquium on "Statistical tools and challenges for monitoring migratory birds"

Emily Silverman, Statistician, Division of Migratory Bird Management, U.S. Fish & Wildlife Service

Friday, October 30, 2015, 2:00 p.m., Amherst College, Frost Library 211

Abstract: Federal management of migratory birds began 100 years ago, when the United States signed the 1916 Convention for the Protection of Migratory Birds with Great Britain (for Canada). These protections were codified in the Migratory Bird Treaty Act (MBTA) of 1918, which now covers over 800 species of birds and stands as one of the earliest U.S. environmental laws. The evolution of management approaches since the MBTA has led to the development of monitoring programs and quantitative methods in wildlife science. I will present the history of bird monitoring and statistical methods for population assessment and will discuss new approaches, challenges, and how a solid understanding of statistical concepts is essential for informed management. Drawing on examples from my own work, I will highlight the interdisciplinary skills needed to operate effectively as a scientist and statistician in a resource management agency. As our ability to collect information about the natural world expands in an increasingly digital world, the need for innovative, technically adept wildlife scientists is growing.

  • On Sunday, February 15 from 1 PM – 4 PM, Amherst College will be hosting a Sports Analytics Forum in the Cole Assembly Room on campus. The day will consist of several guest speakers who work and/or do research in the field of Sports Analytics, in addition to a handful of research presentations by students. This event is free and open to the public. If you would like an in-depth look into the day, please visit: Sports Analytics Forum Details.
    • Sunday, February 15 from 1 PM – 4 PM, Amherst College, Cole Assembly Room in Converse Hall, Amherst, MA.

  • Taylor Arnold, PhD, of AT&T Labs will talk on "Oh the Places You'll Go: The Surprising Complexity of Statistics' Most Basic Model." Abstract: A Bernoulli distribution describes a process which has only two outcomes. Despite its simplicity, it is used in a wide array of applications including a simple coin toss, the outcome of a sporting event, weather models, and the winner of an election. A careful analysis of the Bernoulli distribution also raises a number of theoretical questions leading to topics such as Bayesian inference, minimaxity and estimator theory. Fortunately, the relatively uncomplicated nature of the Bernoulli model allows such questions to be studied without the advanced mathematical machinery required for a more general treatment. This talk will explore these practical and theoretical considerations, with a focus on the "big questions" which continue to guide modern statistical research. Two real datasets, from baseball and medicine, will be used throughout to guide the discussion.
    • Monday, February 16, 4:30PM, Amherst College, Seeley Mudd building Rm. 206. 

  • DataFest 2015: DataFest is a nationally-coordinated undergraduate competition in which teams of up to 5 students work over a weekend to extract insight from a rich and complex data set. The mission of DataFest is to expose undergraduate students to challenging questions with immediate real-world significance that can be addressed through data analysis. Apart from developing data analysis and team building skills, students can win cash prizes, fame, glory, or some combination thereof… and will get a free t-shirt!
    • Friday, March 27, 6:30PM, to Sunday, March 29, 4pm. University of Massachusetts, Amherst, MA.

  • Five College DataFest
  • New England Statistics Symposium
  • HackEbola: Sponsored by Graduate Researchers in Data (GRiD), HackEbola will take place at UMass over the weekend before Thanksgiving, Nov 21-23, 2014. The event will bring students, faculty, and other professionals together to work on learning more about Ebola by analyzing real-time data on the outbreak and response. For more information, check out this website:
  • 2014 Alice Ambrose Lazerowitz-Thomas Tymoczko Dinner and Lecture: "Predictive Accuracy and the Bayesian Approach to Inductive Inference," James M. Joyce, Cooper Harold Langford Collegiate Professor of Philosophy and Statistics, University of Michigan. Thursday, December 4, 2014, Smith College at 5:30pm.

  • Ben Alamar, Sports Analytics Consultant and Researcher, currently works for ESPN. He is the author of "Sports Analytics: A Guide for Coaches, Managers, and Other Decision Makers" and was the founding editor of the Journal of Quantitative Analysis in Sports. Wednesday, November 5, 2014, 7:30PM, Amherst College, Lewis-Sebring Commons (desserts will be served)

  • Graduate study in Biostatistics: a panel discussion with faculty and current students from the Center for Statistical Sciences at Brown University. 
    Come hear about opportunities for graduate study in biostats (both MA and PhD) from Prof. Christopher Schmid and two current students at the Center for Statistical Sciences. Tuesday, October 21st, 2014, 4:30 pm, Amherst College, Seeley Mudd 206
  • "A Crash Course in Shiny via ShinyHelper," Jay (John) Emerson from Yale has offered to give a crash course in Shiny, a straightforward means to create interactive web applications in R and RStudio. Wednesday, Sept. 18, 4 pm, Seeley Mudd Room 207, Amherst College
  • "Inferring Causation without Randomization: A matched design to assess the number of embryos to transfer during in vitro fertilization," a talk by Cassandra Pattanayak, Wellesley College. Monday, Sept. 23, 4 pm, Seeley Mudd 206, Amherst College
  • "A beginner's guide to using SQLite with R: database usage for fun and profit," Nick Horton will be giving a beginner's guide to using SQLite with R, in the context of the Data Expo 2009 airline delays dataset. This includes 150,000,000 rows corresponding to every commercial flight in the US from 1987 to 2012. Thursday, Sept. 26, 7 pm, National Priorities Project, 243 King Street, Northampton, MA
  • "Big Data:  A perspective on their current uses and potential future uses by the Federal Statistical System," a talk by Mike Horrigan, Associate Commissioner for Prices and Living Conditions, Bureau of Labor Statistics. Monday, Sept. 30, 4:30 pm, Ford Hall 240, Smith College
  • "Turning a group of programming newbies into R users," Andy Smith. Wednesday, Oct. 2, 7 pm, Room 222, Morrill Science Center, UMass
  • "openWAR: An Open Source System for Overall Player Performance in Major League Baseball," with Ben Baumer and Greg Matthews.  Thursday, Oct. 17, 4:30 pm, Room TBD, UMass
  • "Mapping Spatial Dynamics of Yellowtail Flounder on the Northeast Shelf with R," a talk by Megan O'Connor and Carl Dunham. 
    • Tuesday, Oct. 22, 7 pm, McConnell Room B05, Smith College
  • "Business Analytics Research at IBM," a talk by Bonnie Ray of the IBM T.J. Watson Research Center.
    • Thursday, November 14, 4 pm, Seeley Mudd 206, Amherst College, refreshments at 3:30 pm. 
  • "TBA," a talk by Brianna Heggeseth, Williams College
    • Monday, November 18, 4 pm, Lederle Graduate Research Tower 1634, UMass, tea at 3:45 pm.
  • "Taking a Passion for Statistics to the Classroom, into the MOOC World and Back Again," by Lisa Dierker, Wesleyan University. 
    • Monday, March 24, 4:30 pm, Amherst College, Seeley Mudd 206 -- refreshments to precede talk at 4 pm in Seeley Mudd 208
  • "STATS4STEM: Online resources to help students (and instructors!) learn stats," Eric Simoneau (Boston Latin) and Neil Heffernan (Worcester Polytechnic Institute) will be talking about their successful ASSISTments and STATS4STEM projects.
    • Monday, September 22, 4:30 pm, Amherst College, Seeley Mudd 206 -- refreshments to precede talk at 4 pm in Seeley Mudd 208