All session titles are tentative and may change. All classes start at 4:25 PM Eastern; most will end by 6 PM Eastern.
Final class grades are computed from the lab grades – there is no final exam this year. All dates are for 2017, all times are Eastern (EST or EDT as appropriate). You can use the “Categories” to filter sessions.
Margo Anderson (University of Wisconsin – Milwaukee) presents on the history of the federal statistical system (flipped classroom). She will be present to discuss the lecture.
Readings and other information
- Anderson, Margo. The American Census: A Social History, Second Edition. Yale University Press, 2015.
- Anderson, Margo J., and Seltzer, William. “Federal Statistical Confidentiality and Business Data: Twentieth Century Challenges and Continuing Issues.” Journal of Privacy and Confidentiality 1.1 (2009): 7-52, 55-58.
About the Guest Lecturer
Margo Anderson, University of Wisconsin – Milwaukee
This class coincides with the FSRDC system’s annual conference. There will be no in-classroom activity at most sites on this day (please check with your local coordinator). The content of this section will be discussed on Sept 21, 2017, so students should take the time to view the materials on edX during this week.
Health statistics, energy statistics, agricultural statistics, others. Register-based statistics, organic data.
Erica Groshen, Cornell University, will take part in the discussion.
Brent Hueth, University of Wisconsin-Madison, will be discussing topics related to agricultural statistics.
- Health statistics (Lecture Notes: INFO7470-S7-Parker, Jennifer Parker (NCHS))
- Agricultural statistics (Lecture Notes: INFO7470-S7-DunnHueth, additional materials, INFO7470-S7-Migrant Farm Labor in the Census of Agriculture, Richard Dunn (University of Connecticut) and Brent Hueth (University of Wisconsin-Madison))
- EIA presentation: INFO7470-S9-EIA-Background-2016 (Jacob Bournazian (EIA))
- Register-based statistics: INFO7470-S9-Register-data
- Alternate data sources: INFO7470-S9-Organic-data
- Updates by Erica Groshen on working with BLS data: INFO7470 2017 Groshen BLS
This will be a “flipped classroom” session on Geographic Information Systems (GIS) – basic geocoding, geographic concepts, and other topics.
Michael Ratcliffe, U.S. Census Bureau
- Geography: INFO7470-S8-Census Geography Concepts
Flipped classroom about access to restricted-access data. Students will be introduced to the research proposal mechanism of the Federal Statistical Research Data Center, including data from the Census Bureau, NCHS, and BLS.
Discussion will focus on how to access various restricted access data sets. Guest presenters may be present live in the videoconference classroom.
The presentation on replicable science is moved to a later date.
The class is both flipped classroom and live presentation.
We discuss the need for and the requirements of replicable science (in general, and in restricted-access environments). This part is a live lecture by Lars Vilhuber.
Introduction to record linking
- What is record linking, what is it not, what is the theory?
- Record linking: applications and examples – How do you do it, what do you need, what are the possible complications?
- Examples of record linking
- INFO7470-S10-Primer_for_Programs (PDF) or (Powerpoint)
- Large-scale Data Linkage from Multiple Sources: Methodology and Research Challenges
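To make the record-linking topics above a little more concrete, here is a minimal sketch of linking two small files. It is not taken from the course materials: the field names (`id`, `name`, `birth_year`), the 0.85 threshold, and the toy records are all assumptions made for illustration. It blocks on an exact birth-year match and then compares names with the standard library's `difflib.SequenceMatcher`, which is a simple string-similarity heuristic and not a formal probabilistic linkage model.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two names, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def link_records(file_a, file_b, threshold=0.85):
    """Return (id_a, id_b, score) for the best match of each record in file_a.

    Blocking: only pairs with the same birth_year are compared.
    The threshold is an illustrative assumption, not a recommended value.
    """
    links = []
    for rec_a in file_a:
        best, best_score = None, 0.0
        for rec_b in file_b:
            if rec_a["birth_year"] != rec_b["birth_year"]:
                continue  # outside the block, never compared
            score = name_similarity(rec_a["name"], rec_b["name"])
            if score > best_score:
                best, best_score = rec_b, score
        if best is not None and best_score >= threshold:
            links.append((rec_a["id"], best["id"], round(best_score, 3)))
    return links


# Hypothetical toy files
file_a = [
    {"id": "A1", "name": "Jonathan Smith", "birth_year": 1970},
    {"id": "A2", "name": "Maria Garcia", "birth_year": 1985},
]
file_b = [
    {"id": "B1", "name": "Jonathon Smith", "birth_year": 1970},
    {"id": "B2", "name": "Maria Garcia", "birth_year": 1990},
]

links = link_records(file_a, file_b)
print(links)
```

In this toy data, A1 links to B1 despite the spelling difference, while A2 stays unlinked because the birth years disagree. That hints at one of the complications mentioned above: an error in a blocking variable silently removes a true match from consideration.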
John M. Abowd, U.S. Census Bureau and Cornell University, will lead the discussion.
- Formal models of edits and imputations
- Missing data overview
- Missing records – Frame or census – Survey
- Missing items
- Overview of different products
- Overview of methods
- Formal multiple imputation methods
- INFO7470 S11 -Statistical Tools Edit and Imputation (Powerpoint)
- INFO7470 S11 -Statistical Tools Edit and Imputation Examples
The lab (an edit and imputation exercise) is posted on the INFO7470x edX site. You will need to create a program (in the language of your choice) and upload it to edX. A toy example is illustrated in a video on the edX site; you can download the spreadsheet toy-example-imputation.xlsx here.
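For readers new to the topic, the following is a minimal sketch of what an edit step followed by an imputation step can look like. It is not the lab solution and does not reproduce the edX toy example: the variable names (`age`, `group`), the range rule (0 to 120), and the choice of class-mean imputation are assumptions chosen only for illustration.

```python
from statistics import mean


def edit_and_impute(records, lo=0, hi=120):
    """Apply a range edit to 'age', then impute missing ages by group mean.

    Edit rule (assumed): an age outside [lo, hi] is treated as missing.
    Imputation (assumed): the mean of observed ages within the same 'group'.
    """
    edited = []
    for r in records:
        age = r["age"]
        if age is not None and not (lo <= age <= hi):
            age = None  # value fails the edit, so it is blanked
        edited.append({**r, "age": age, "imputed": False})

    # Collect observed (post-edit) ages by imputation class
    observed = {}
    for r in edited:
        if r["age"] is not None:
            observed.setdefault(r["group"], []).append(r["age"])

    # Fill missing values and flag them so users can tell them apart
    for r in edited:
        if r["age"] is None and r["group"] in observed:
            r["age"] = mean(observed[r["group"]])
            r["imputed"] = True
    return edited


# Hypothetical toy data
records = [
    {"id": 1, "group": "a", "age": 30},
    {"id": 2, "group": "a", "age": 40},
    {"id": 3, "group": "a", "age": None},  # item missing
    {"id": 4, "group": "b", "age": 25},
    {"id": 5, "group": "b", "age": 999},   # fails the range edit
]

result = edit_and_impute(records)
for row in result:
    print(row)
```

Single mean imputation like this understates variability; the formal multiple imputation methods listed above address that by drawing several plausible values for each missing item.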
- Why must users of restricted-access data learn about confidentiality protection?
- What is statistical disclosure limitation?
- What are privacy-preserving data mining and differential privacy?
- Basic methods for disclosure avoidance (SDL)
- Rules and methods for model-based SDL
- SDL-based noise methods
- Synthetic data
- Differential privacy methods
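As a small illustration of the differential privacy topic above, the sketch below applies the Laplace mechanism to a counting query. It is an assumption-laden toy, not course material: the function names, the seed, and the example count are hypothetical. A count changes by at most 1 when one person is added or removed, so the noise scale is 1/epsilon. The Laplace draw is built as the difference of two exponential draws from the standard library.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw from Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Release a count with Laplace noise calibrated to sensitivity 1."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    sensitivity = 1.0  # one person changes a count by at most 1
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(2017)  # fixed seed only so the sketch is repeatable
released = noisy_count(42, epsilon=0.5, rng=rng)
print(released)
```

Smaller values of epsilon give stronger protection and noisier output; larger values do the opposite. A production release would also need a cryptographically appropriate noise source and careful handling of floating-point issues, which this sketch ignores.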
- Part A: Spatial Analysis (Nicholas Nagle of University of Tennessee – Knoxville)
- Part B: Network Analysis (John Abowd, Cornell University)
Part A: Spatial Analysis
- Basic Geocoding
- Tools for Geocoding
- Analysis Methods
- Tools for Geographic Analysis
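One small building block behind many spatial analysis methods is the distance between two geocoded points. The sketch below computes great-circle distance with the haversine formula. It is an illustration only and not drawn from the lecture: the function name and the spherical-Earth radius of 6371 km are assumptions, and real GIS tools typically use an ellipsoidal model and projected coordinates.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# One degree of longitude along the equator, as a sanity check
one_degree = haversine_km(0.0, 0.0, 0.0, 1.0)
print(round(one_degree, 1))
```

Once addresses have been geocoded to latitude and longitude, a function like this can feed simple analyses such as nearest-neighbor searches or distance-based weights.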
Previous versions of the course can be found in our archives.