Recoding variables with R and Anatella
Variable recoding can be a real pain in the neck. Although there are functionalities to do this in R and Anatella, doing it right is not always easy.
For example, when working with Latent Class Analysis, we want to reduce every variable to a few categories, and plain quantiles do not really do the job. Often, a variable has a lot of repeated values (0 or 1, for instance), so several quantile boundaries land on the same value and there is no way to cut properly.
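To make the problem concrete, here is a minimal R sketch. The variable x is made up for illustration (mostly zeros, some ones, a few larger values); it is not data from the original workflow.

```r
# Hypothetical variable: 70 zeros, 20 ones, then the values 2 to 11.
x <- c(rep(0, 70), rep(1, 20), 2:11)

# Quintile boundaries: 6 break points for 5 groups.
breaks <- quantile(x, probs = seq(0, 1, by = 0.2))
print(breaks)

# Count how many of the 6 boundaries are actually distinct.
n_distinct <- length(unique(breaks))
print(n_distinct)

# cut() refuses duplicated breaks; capture the error message instead of stopping.
result <- tryCatch(
  cut(x, breaks = breaks, include.lowest = TRUE),
  error = function(e) conditionMessage(e)
)
print(result)
```

Because most of the boundaries fall on 0, cut() stops with an error about non-unique breaks instead of returning five groups.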
In Anatella, we can do it using the Quantile Recode action and the interval Join action, but this quickly becomes cumbersome. We can also, of course, code it in JavaScript, which is definitely not a task for me.
This is where R comes in handy! In the following:
The previous graph shows very few operations: open a file, set the variables to numeric (R is a bit particular about this), and send the data from Anatella to the R code for processing.
Where is Anatella good? Quantile calculations. R, for its part, has a predefined function for recoding: cut(x, breaks, labels). What is a bit annoying is the data preparation it needs to work properly: checking that the breaks are unique, and making sure the cuts fall where they are supposed to.
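The preparation described above could look roughly like this in plain R. This is only a sketch: recode_quantile is a hypothetical helper name, the "G1", "G2", ... labels are arbitrary, and the breaks come from R's own quantile() rather than from Anatella.

```r
# Hypothetical helper: recode a numeric vector into at most n_groups classes.
recode_quantile <- function(x, n_groups = 5) {
  # Candidate boundaries from equally spaced quantiles.
  breaks <- quantile(x, probs = seq(0, 1, length.out = n_groups + 1),
                     na.rm = TRUE, names = FALSE)
  # Drop repeated boundaries so that cut() accepts them.
  breaks <- unique(breaks)
  # A constant variable leaves a single boundary: put everything in one class.
  if (length(breaks) < 2) {
    return(factor(ifelse(is.na(x), NA_character_, "G1")))
  }
  # One label per interval (arbitrary naming, chosen for this example).
  labels <- paste0("G", seq_len(length(breaks) - 1))
  cut(x, breaks = breaks, labels = labels, include.lowest = TRUE)
}

# Made-up variable with many repeated values.
x <- c(rep(0, 70), rep(1, 20), 2:11)
recoded <- recode_quantile(x, n_groups = 5)
print(table(recoded))
```

Removing the duplicated breaks makes cut() work, but on this made-up variable only two groups survive out of the five requested, and the zeros and ones end up in the same one. That is exactly why the choice of break points matters more than the call to cut() itself.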
The Quantile operation in Anatella is pretty neat:
In this case, we select the “Clever” quantile, which does not make equal-sized groups, but makes sure we don’t cut “in-between” categories, as we would with quartiles or deciles. Just set the number of groups, and you’re set. The data is ready for Latent Class Analysis with 5 groups per variable! Neat!