r - Spread with data.frame/tibble with duplicate identifiers
The documentation for tidyr suggests that gather and spread are transitive, i.e. that one undoes the other, but the following example with the "iris" data shows they are not, and it is not clear why. Clarification would be appreciated.
iris.df = as.data.frame(iris)
long.iris.df = iris.df %>% gather(key = feature.measure, value = size, -Species)
w.iris.df = long.iris.df %>% spread(key = feature.measure, value = size, -Species)
I expected the data frame "w.iris.df" to be the same as "iris.df", but received the following error instead:
"Error: Duplicate identifiers for rows (1, 2, 3, 4, 5, 6, 7, 8, 9..."
My general question is how to reverse the application of "gather" on this sort of dataset.
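Hadley's intervention was, unsurprisingly, perfect... I ended up mucking with the syntax a bit after that... The point is to add a row identifier before gathering, so that each combination of identifier columns and variable name points to exactly one value when spreading. For what it's worth, I'll post the operational code (sorry, the syntax is a bit different from above):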
library(tidyr)
library(dplyr)

wide <- iris %>%
  mutate(row = row_number()) %>%
  gather(vars, val, -Species, -row) %>%
  spread(vars, val)

head(wide)
#   Species row Petal.Length Petal.Width Sepal.Length Sepal.Width
# 1  setosa   1          1.4         0.2          5.1         3.5
# 2  setosa   2          1.4         0.2          4.9         3.0
# 3  setosa   3          1.3         0.2          4.7         3.2
# 4  setosa   4          1.5         0.2          4.6         3.1
# 5  setosa   5          1.4         0.2          5.0         3.6
# 6  setosa   6          1.7         0.4          5.4         3.9

head(iris)
#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# 1          5.1         3.5          1.4         0.2  setosa
# 2          4.9         3.0          1.4         0.2  setosa
# 3          4.7         3.2          1.3         0.2  setosa
# 4          4.6         3.1          1.5         0.2  setosa
# 5          5.0         3.6          1.4         0.2  setosa
# 6          5.4         3.9          1.7         0.4  setosa
They hold the same values... you just need to reorder the columns if you feel like it:
wide <- wide[, c(5, 6, 3, 4, 1)]  ## reorder to match iris and remove the "row" column
And done.
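For anyone who wants to see where the duplicates come from, here is a minimal sketch, assuming tidyr and dplyr are installed. The names long, dup_counts and round_trip are just illustrative. It counts how many values share each Species/variable pair when no row id is present, then repeats the round trip with the row id and compares the result against iris.

```r
library(tidyr)
library(dplyr)

# Without a row id, the only identifiers left after gather() are
# Species and the variable name, and many rows share each pair.
long <- iris %>% gather(vars, val, -Species)
dup_counts <- long %>% count(Species, vars)
head(dup_counts, 3)  # every Species/vars pair covers 50 rows of iris

# With a row id, each (row, vars) pair is unique, so spread() can
# rebuild one wide row per original observation.
round_trip <- iris %>%
  mutate(row = row_number()) %>%
  gather(vars, val, -Species, -row) %>%
  spread(vars, val) %>%
  arrange(row)

# Drop "row" and restore the original column order by name.
round_trip <- round_trip[, names(iris)]

# Compare the rebuilt data frame with the original.
all.equal(round_trip, iris)
```

Selecting by names(iris) rather than by position avoids having to work out the column numbers by hand.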