Using R for Cross-Cultural Research


Author(s): Doug White for James W. Dow

Tutorials

Orientation

2003 James W. Dow. Using R for Cross-Cultural Research. World Cultures 14(2):144-154. http://eclectic.ss.uci.edu/~drwhite/worldcul/14-2Dow.pdf

http://eclectic.ss.uci.edu/~drwhite/courses/sccs.RData SCCS database with open source R code
http://eclectic.ss.uci.edu/~drwhite/courses/SCCCodes.htm SCCS codebook

Following the James Dow article

Install the foreign package, which provides read.spss() for reading SPSS (*.sav) files.

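library(foreign)   # provides read.spss()
help.search("read.spss")   # finds: Read an SPSS Data File
load("sccs.RData")   # loads the sccs data and the helper functions
#sccs <- read.spss("sccs.sav")   # alternative: read the entire SCCS data set into a data frame called sccs
attach(sccs)   # needed by find.cname() and scrc()
find.cname(V1648, "Warfare seems to occur constantly at any time of the year")
scxt("SUMFAM", "SUMGOV")   # cross tab of SUMFAM against SUMGOV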

The sccs data frame expresses the variable values as string labels. These can be coerced into their original numeric format with the function as.numeric().
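One caveat is worth illustrating: on a factor, as.numeric() returns the integer level codes, which equal the original SPSS codes only when those codes run consecutively from 1 in level order. A small sketch with made-up labels (the variable x and its values are illustrative, not SCCS data):

```r
# Hypothetical three-level variable; the labels are illustrative, not SCCS data.
x <- factor(c("Low", "High", "Medium", "Low"),
            levels = c("Low", "Medium", "High"))

as.numeric(x)      # integer level codes: 1 3 2 1
as.character(x)    # the string labels themselves
```

If the original codes were, say, 0, 1 and 9, the level codes 1, 2 and 3 would not match them, so it is worth checking levels(x) against the codebook before treating the result as the coded value.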

Repeat

  • Data
sccs - A data frame with the entire Standard Cross-Cultural Sample, with 1848 variables and 186 cultures.
  • Functions
Functions that do not require attachment
scpl("v1", "v2") #- makes a table plot between variables v1 and v2.
scxt("v1", "v2") #- makes a cross tab between variable v1 and variable v2. It also gives you the variable labels, chi-squared, and Cramer's V.
vlabel("var") #- prints the full label of the variable var. Several names can be retrieved at one time with vlabel(c("v1", "v2", ...))
Functions that require that sccs be first attached with the command attach(sccs).
find.cname(var, "val") #- finds the cultures for which variable var takes on value "val". Example: find.cname(V1648, "Warfare seems to occur constantly at any time of the year")
scrc(v1, v2) #- computes Kendall's rank correlation between two ordinal variables v1 and v2.
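Kendall's rank correlation can be tried without the SCCS data by calling cor.test() directly with method = "kendall". A minimal sketch on two made-up ordinal vectors (a and b are illustrative values only):

```r
# Two hypothetical ordinal variables coded 1-4; not SCCS data.
a <- c(1, 2, 2, 3, 4, 4, 1, 3)
b <- c(1, 1, 2, 3, 3, 4, 2, 4)

# Kendall's rank correlation. exact = FALSE requests the normal
# approximation, which avoids the warning R gives when ties are present.
res <- cor.test(a, b, method = "kendall", exact = FALSE)

res$estimate   # Kendall's tau
res$p.value    # two-sided p-value
```

Ordinal codes in cross-cultural data usually contain many ties, so the approximate p-value is normally the one that applies.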

sccs.RData

All of the functions are in sccs.RData. You can see the data frame and the functions by typing 'ls()'.

[1] "find.cname" "sccs"       "scpl"       "scrc"       "scxt"      "vlabel" 

You can see the functions by typing their names without the parentheses.

  • scpl
function(v1,v2){
# Plots a crosstab for two variables in the sccs SPSS data list.
# The variables must be in quotes because they are names.
varlab <- attr(sccs, "variable.labels")
v1nam <- varlab[v1]
v2nam <- varlab[v2]
print(v1nam)
print(v2nam)
tab <- table(c(sccs[v1], sccs[v2]))
plot(tab)   # dispatches to the plot method for tables
}
  • scrc
function(v1,v2){
# Computes Kendall's rank correlation for two variables 
cor.test(v1, v2, method="kendall")
}
  • scxt
function(v1,v2){
varlab <- attr(sccs, "variable.labels")
# Prints a crosstab for two variables in the sccs SPSS data list.
# The variables must be in quotes because they are names.
v1nam <- varlab[v1]
v2nam <- varlab[v2]
print(v1nam)
print(v2nam)
tab <- table(c(sccs[v1], sccs[v2]))
print(tab)
suma <- summary(tab)
print(suma)
# Cramer's V
n <- as.numeric(suma["n.cases"])
X2 <- as.numeric(suma["statistic"])
k <- min(dim(tab))
V <- sqrt(X2/(n*(k-1)))
# format() applies the digits; print() ignores digits once V is coerced to character
cat("Cramer's V is", format(V, digits = 4), "\n")
}
  • vlabel
function(vn){
vlabels <- attr(sccs, "variable.labels")
vlabels[vn]
}
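The label lookup and the Cramer's V calculation used in these functions can be checked without sccs.RData. The sketch below builds a small stand-in list (toy, with made-up variables V1 and V2 and invented labels) that mimics the "variable.labels" attribute the functions rely on, then repeats the same steps:

```r
# Toy stand-in for sccs: a list of two factors plus a "variable.labels"
# attribute. All names and values here are made up for illustration.
toy <- list(
  V1 = factor(c("a", "a", "b", "b", "a", "b", "a", "b")),
  V2 = factor(c("x", "x", "y", "y", "x", "y", "y", "x"))
)
attr(toy, "variable.labels") <- c(V1 = "First toy variable",
                                  V2 = "Second toy variable")

# Label lookup, the same indexing vlabel() performs.
attr(toy, "variable.labels")[c("V1", "V2")]

# Cross tab and chi-squared summary, following the steps in scxt().
tab  <- table(c(toy["V1"], toy["V2"]))
suma <- summary(tab)

# Cramer's V = sqrt(X2 / (n * (k - 1))), with k the smaller table dimension.
n  <- suma$n.cases
X2 <- suma$statistic
k  <- min(dim(tab))
V  <- sqrt(X2 / (n * (k - 1)))
V   # 0.5 for this toy table
```

Here the table has cell counts 3, 1, 1, 3, so chi-squared is 2 on 8 cases and V works out to sqrt(2/8) = 0.5.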

James Dow notes

I have put comments at the beginning of each function that explain what it does. To use a function you have to add the parentheses with the variables you want.

R has been upgraded to 2.4.1, so I had to change scpl slightly. I have also added a new function, scvar, that prints out the long name of the variable and the frequency of its different values in sccs. All of this is in the new version of sccs.RData which I have attached.
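The source of scvar is not reproduced on this page. As a rough idea of what such a function might look like, here is a hypothetical sketch (scvar_sketch, its data argument, and the toy list are all assumptions, not Dow's code), written so that it runs without sccs.RData:

```r
# Hypothetical sketch; not the scvar shipped in sccs.RData.
scvar_sketch <- function(vn, data) {
  # Print the long label stored in the "variable.labels" attribute.
  print(attr(data, "variable.labels")[vn])
  # Return the frequency of each value, counting missing values too.
  table(data[[vn]], useNA = "ifany")
}

# Toy data standing in for sccs; the values are made up.
toy <- list(V1 = factor(c("a", "b", "a", NA, "a")))
attr(toy, "variable.labels") <- c(V1 = "A toy variable")

freq <- scvar_sketch("V1", toy)
freq
```

Passing the data explicitly keeps the sketch independent of a global sccs object; the real function presumably reads sccs directly, as the other helpers do.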

R now has a nice HTML help system with search capability, available at /usr/lib64/R/doc/html/index.html on my Linux system. I noted that R can also open an object in an external editor such as kate with the command edit(prog, editor="kate"). http://en.wikipedia.org/wiki/Kate_(text_editor)

Let me know if there is anything more you need, and let me know if you find a useful GUI for R. I haven't had the fortitude to investigate the ongoing projects, which seem to be in confusion. -- James W. Dow -- 16:02, 28 February 2008 (PST)

Factor analysis

factanal(x, factors, data = NULL, covmat = NULL, n.obs = NA,
        subset, na.action,
        start = NULL, scores = c("none", "regression", "Bartlett"),
        rotation = "varimax", control = NULL, ...)

Arguments

x 	A formula or a numeric matrix or an object that can be coerced to a numeric matrix.
factors 	The number of factors to be fitted.
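For a quick feel for the call, here is a minimal sketch using ability.cov, a covariance matrix that ships with R's datasets package. It is not SCCS data and serves only as a stand-in; the object name fit is arbitrary:

```r
# ability.cov is a built-in list with components cov, center and n.obs,
# so it can be passed directly as the covmat argument.
fit <- factanal(factors = 1, covmat = ability.cov)

fit$loadings       # factor loadings for the six ability tests
fit$uniquenesses   # variance in each test not explained by the factor
```

With SCCS variables, the string labels would first have to be converted to numeric codes, since factanal() needs a numeric matrix or a covariance matrix.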