R - Don't use apply on dataframes / ddply by rows and columns

Thursday, 11 August 2016


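library(plyr)   # provides ddply
library(dplyr)  # provides summarise_each and funs; load it after plyr

# stringsAsFactors = TRUE reproduces the default of R before 4.0, so z is a factor
df <- data.frame(c(1,2,3), c(4,5,6), c("A","B","C"), stringsAsFactors = TRUE)
names(df) <- c("x","y","z")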

print("Classes Before Apply:")
print(class(df[,1]))
print(class(df[,2]))
print(class(df[,3]))
print("Classes During Apply:")
df.apply <- apply(df, MARGIN=1, function(x) print(class(x)))
print("Classes After Apply:")
print(class(df.apply[1]))
print(class(df.apply[2]))
print(class(df.apply[3]))

print("Classes Before ddply (by row):")
print(class(df[,1]))
print(class(df[,2]))
print(class(df[,3]))
print("Classes During ddply (by row):")
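# Grouping by every column makes each distinct row its own group;
# print(x) shows the one-row data frame and returns it unchanged.
df.apply <- ddply(df, names(df), function(x) print(x))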
print("Classes After ddply (by row):")
print(class(df.apply[,1]))
print(class(df.apply[,2]))
print(class(df.apply[,3]))

From this you can see that apply casts the variables of df to character, whereas ddply keeps them as they are. This happens because apply first converts the data frame to a matrix, and a matrix can only hold one type. However, I have not found a way to iterate through columns with ddply, so I find the best way is to use summarise_each from dplyr.
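To see the cast happen outside of apply, here is a minimal base R sketch. It assumes a data frame shaped like df above; the names df_demo, m, row_classes and col_classes are made up for illustration. It also shows one possible way to walk rows and columns without losing the types, which is not necessarily the best way for large data.

```r
# Illustrative data frame with the same shape as df.
# stringsAsFactors = TRUE mimics the default of R versions before 4.0.
df_demo <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6), z = c("A", "B", "C"),
                      stringsAsFactors = TRUE)

# apply() works on matrices, so the data frame is converted first.
# A matrix holds a single type, so the numbers become strings.
m <- as.matrix(df_demo)
print(typeof(m))

# Iterating over row indices keeps each row as a one-row data frame,
# so the column classes survive.
row_classes <- lapply(seq_len(nrow(df_demo)),
                      function(i) sapply(df_demo[i, ], class))
print(row_classes[[1]])

# A data frame is a list of columns, so sapply() visits one column
# at a time without any conversion.
col_classes <- sapply(df_demo, class)
print(col_classes)
```

Indexing rows one by one is slow on big data frames, so treat this as a way to inspect types rather than a general replacement for apply.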

The simplest way to demonstrate summarise_each is:

summarise_each(df, funs(mean))

This applies the function 'mean' to each column. However, one of the columns is of type 'factor', so mean returns NA for it and produces a warning. We can write a custom function with an if/else to stop this warning and make the output look cleaner:

summarise_fn <- function(x) { if (is.numeric(x)) return(mean(x)) else return(NA) }
summarise_each(df, funs(summarise_fn))
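Later dplyr releases deprecated summarise_each and funs, so the calls above may print deprecation warnings on a current install. Below is a rough sketch of two alternatives: a base R version, and a dplyr version that assumes dplyr 1.0.0 or newer (where across is available). The names df_demo, numeric_cols, base_means and dplyr_means are made up for illustration. Unlike summarise_fn, both versions drop the non-numeric column instead of returning NA for it.

```r
# Illustrative data frame with the same shape as df.
df_demo <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6), z = c("A", "B", "C"),
                      stringsAsFactors = TRUE)

# Base R: keep only the numeric columns, then average each one.
numeric_cols <- df_demo[sapply(df_demo, is.numeric)]
base_means <- sapply(numeric_cols, mean)
print(base_means)

# dplyr: only runs if dplyr 1.0.0 or newer is installed (an assumption).
if (requireNamespace("dplyr", quietly = TRUE) &&
    packageVersion("dplyr") >= "1.0.0") {
  suppressPackageStartupMessages(library(dplyr))
  # across() applies mean to every column selected by where(is.numeric).
  dplyr_means <- summarise(df_demo, across(where(is.numeric), mean))
  print(dplyr_means)
}
```

The base R version returns a named numeric vector, while the dplyr version returns a one-row data frame, so pick whichever shape suits the rest of your code.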
