Utrecht Summer School

Veronica Kostenko shared her experience of participating in the Utrecht Summer School 

Veronica Kostenko, a research fellow at LCSR, participated in the Utrecht Summer School in late August 2014. The school runs for the whole summer and covers a wide range of topics, with a special focus on statistics. Veronica studied advanced methods of handling missing data.

The course, called “Advanced Methods of Handling the Missing Data. Survey Research, Statistical Analysis and Estimation”, is headed by Edith de Leeuw (http://www.uu.nl/staff/EDdeLeeuw/0), a prominent methodologist and an expert in non-response. The instructors are her coauthor J. Hox and several methodologists from “Statistics Netherlands”. A thorough theoretical part was combined with intensive daily hands-on classes in the R statistical software to give us a shortcut to dealing with missing data in just one week. We studied design tools that minimize missingness when collecting our own data. We also covered the theory of the three missingness mechanisms: Missing Completely at Random (MCAR), Missing at Random (MAR) and Missing Not at Random (NMAR). The main message for the students was that listwise deletion is the worst possible way of dealing with this problem, because missingness is very likely to be non-random. Deleting incomplete cases then distorts the structure of the sample and, consequently, the results. We also studied several effective methods for assessing whether missingness is random.
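
The point about listwise deletion can be illustrated with a small simulation. The sketch below is not taken from the course materials (the classes used R); it is a minimal Python example on synthetic data, and the sample size, the 30% and 60% missingness rates and the variable names are all assumptions chosen for illustration. It compares the mean of y after listwise deletion when values are missing completely at random and when missingness depends on an observed variable x.

```python
import random
import statistics


def simulate(n=20000, seed=0):
    """Return the mean of y for complete data, MCAR deletion and MAR deletion."""
    rng = random.Random(seed)
    # x is always observed; y depends on x and will contain gaps.
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [xi + rng.gauss(0, 1) for xi in x]

    # MCAR: every y value has the same 30% chance of being missing,
    # so the retained cases are a random subsample.
    y_mcar = [yi for yi in y if rng.random() > 0.3]

    # Missingness depends on observed x (assumed rule for illustration):
    # y is missing 60% of the time when x > 0, and never otherwise.
    y_mar = [yi for xi, yi in zip(x, y)
             if not (xi > 0 and rng.random() < 0.6)]

    return statistics.mean(y), statistics.mean(y_mcar), statistics.mean(y_mar)


full_mean, mcar_mean, mar_mean = simulate()
print(f"complete data:            {full_mean:.3f}")
print(f"listwise, MCAR gaps:      {mcar_mean:.3f}")
print(f"listwise, x-related gaps: {mar_mean:.3f}")
```

Under these assumptions the MCAR estimate stays close to the complete-data mean, while the estimate from the second mechanism is pulled downward, because cases with high x (and therefore high y) are deleted more often.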

After that part, the students studied the two most popular contemporary ways of handling missing data: Full Information Maximum Likelihood (FIML) and Multiple Imputation. FIML uses all the available information despite the missing values. Multiple Imputation fills in the missing data, either based on regression estimation (package “Amelia II”) or by looking for the most similar cases (package “mice”). The second method, the lecturers argued, is somewhat more practical, as it does not assume normality of the data and is very effective even with non-normally distributed and discrete variables.
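
The idea of imputing from the most similar cases and then combining several imputed datasets can be sketched as follows. This is a simplified Python illustration on synthetic data, not the algorithm implemented in “mice” or “Amelia II”: the helper names `fit_line`, `pmm_impute` and `pool` are hypothetical, the donor pool size k = 5 and the number of imputations m = 5 are assumptions, and the regression coefficients are not redrawn between imputations, so the between-imputation variance is understated compared with a proper implementation.

```python
import heapq
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((xi - mx) ** 2 for xi in xs)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def pmm_impute(x, y, rng, k=5):
    """Fill None entries of y with the observed y of a similar case.

    Similarity is measured on the regression prediction; one of the
    k closest observed cases is picked at random as the donor.
    """
    obs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    a, b = fit_line([o[0] for o in obs], [o[1] for o in obs])
    donors = [(a + b * xi, yi) for xi, yi in obs]  # (predicted, observed)
    filled = []
    for xi, yi in zip(x, y):
        if yi is None:
            pred = a + b * xi
            nearest = heapq.nsmallest(
                k, donors, key=lambda d: abs(d[0] - pred))
            yi = rng.choice(nearest)[1]
        filled.append(yi)
    return filled


def pool(estimates, variances):
    """Rubin's rules: pooled estimate and its total variance."""
    m = len(estimates)
    within = statistics.mean(variances)
    between = statistics.variance(estimates)
    return statistics.mean(estimates), within + (1 + 1 / m) * between


rng = random.Random(1)
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [xi + rng.gauss(0, 1) for xi in x]
# Gaps depend on the observed x: y is missing 60% of the time when x > 0.
y_gappy = [None if (xi > 0 and rng.random() < 0.6) else yi
           for xi, yi in zip(x, y)]

listwise_mean = statistics.mean([v for v in y_gappy if v is not None])

m = 5
estimates, variances = [], []
for _ in range(m):
    filled = pmm_impute(x, y_gappy, rng)
    estimates.append(statistics.mean(filled))
    variances.append(statistics.variance(filled) / n)

pooled_mean, total_var = pool(estimates, variances)
print(f"listwise mean: {listwise_mean:.3f}")
print(f"pooled mean:   {pooled_mean:.3f} (variance {total_var:.5f})")
```

Because donors supply real observed values, this kind of imputation makes no normality assumption about y, and pooling over several imputed datasets lets the final variance reflect the uncertainty about the filled-in values.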

The course was really intensive: classes started at 9 a.m. (and one was not supposed to be late) with a three-hour lecture. After lunch there was a practical hands-on session in R that lasted another 4 to 5 hours. After this long day, some reading was still required.

I found the course very practical; however, the order of the lectures was not always clear.