I have a large data frame (n = 107,251) that I wish to split into two relatively equal halves (~53,625 each). However, the split should be done such that three variables are kept in equal proportion in the two sets: gender, age category (6 levels), and region (5 levels).

I can generate the proportions for the variables independently (e.g., via `prop.table(xtabs(~dat$gender))`) or in combination (e.g., via `prop.table(xtabs(~dat$gender + dat$region + dat$age))`), but I'm not sure how to utilise this information for sampling.

Sample dataset:

    set.seed(42)
    gender <- sample(c("m", "f"), 1000, replace = TRUE)
    region <- sample(c("1","2","3","4","5"), 1000, replace = TRUE)
    age <- sample(c("1","2","3","4","5","6"), 1000, replace = TRUE)
    x1 <- rnorm(1000)
    dat <- data.frame(gender, region, age, x1)

Probabilities:

    round(prop.table(xtabs(~dat$gender)), 3) # 48.5% female; 51.5% male
    round(prop.table(xtabs(~dat$age)), 3)...
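
One possible way to use the joint proportions is to treat every gender × region × age combination as a stratum and split each stratum roughly in half. The following is a minimal base-R sketch under that assumption, not a definitive solution. The names `strata`, `half`, `set1` and `set2` are illustrative and do not come from the question, and the sample data is rebuilt inline so the block runs on its own.

```r
set.seed(42)
n <- 1000
dat <- data.frame(
  gender = sample(c("m", "f"), n, replace = TRUE),
  region = sample(as.character(1:5), n, replace = TRUE),
  age    = sample(as.character(1:6), n, replace = TRUE),
  x1     = rnorm(n)
)

# One stratum per observed gender x region x age combination
# (drop = TRUE discards combinations that never occur).
strata <- interaction(dat$gender, dat$region, dat$age, drop = TRUE)

# Within each stratum, build a balanced vector of 1s and 2s and shuffle it.
# sample(1:2) randomises which half receives the extra row when the
# stratum size is odd, so neither half is systematically larger.
half <- ave(seq_len(nrow(dat)), strata, FUN = function(idx) {
  k <- length(idx)
  rep_len(sample(1:2), k)[sample.int(k)]
})

set1 <- dat[half == 1, ]
set2 <- dat[half == 2, ]

# Compare sizes and marginal proportions of the two halves
table(half)
round(prop.table(table(set1$gender)), 3)
round(prop.table(table(set2$gender)), 3)
```

Because each stratum is split to within one row, the joint proportions (and therefore the marginal ones) in `set1` and `set2` should match closely, and the two halves should differ in size by at most the number of odd-sized strata. The same code should apply unchanged to the full data frame, assuming its columns are named `gender`, `region` and `age`.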