Acquiring Thesis Data

Students should select a thesis data set before beginning the third semester. This can be done during the practicum process by choosing a practicum site that offers data, by securing permission to work with Mailman faculty data, or by choosing one of the many suitable publicly available data sets.

Practicum Associated Data

Students planning to use practicum-acquired data should negotiate the practicum agreement form carefully and make sure it states this intent. The agreement should clarify whether the data can leave the practicum site or must be used on an on-site computer. Students should also investigate whether separate IRB approval is necessary to use the data for their thesis.

Mailman Faculty Data

Students completing a practicum at the Mailman School frequently work under the supervision of faculty who have grants and data and who welcome student analysis. Many faculty also welcome student use of their data even if the student received practicum-equivalent experience or completed a practicum elsewhere. Mailman Faculty Research Interests lists all Department of Epidemiology faculty, their research interests, and their e-mail addresses. In addition, Dr. Renee Goodwin has a collection of more than 20 data sets from Department of Epidemiology faculty projects covering a variety of interest areas, including infectious disease, cancer, asthma, mental health, genetics, and risk behaviors. These data sets are available to students for thesis work. Interested students should contact Dr. Goodwin directly.

Public Use Data Sets

The range and scope of public use data sets for epidemiologic analysis are not fully appreciated. Many were collected using scientifically appropriate sampling methods and have sample sizes large enough to support analyses that cannot be meaningfully accomplished with smaller, locally collected data. Most of these data sets are readily available electronically from data warehouses. They are generally well documented, and most are available free of charge.

Tips for Acquiring and Using Public Use Data Sets