R CrossTab, gmodels package

From InterSciWiki

A better crosstab GRAPHIC

HexBin working example

R CrossTable() function, gmodels package on CRAN

http://cran.r-project.org/web/packages/gmodels/index.html (the gmodels manual in PDF is linked there; print pages 2-4). Project page: http://sourceforge.net/projects/r-gregmisc

# If R is already set up, open the HTML help with
help.start()
# Install gmodels (needed only once)
install.packages("gmodels")
library(foreign)  # provides read.dta(); the input file is in Stata format, exported from SPSS as Stata v. 8.0
getwd()           # show your working directory name; you might want to set it, e.g.
# setwd("C:/Program Files/R/R-2.6.2/")
# See: http://web.csb.ias.edu/library/foreign/html/read.dta.html
# sccs <- read.dta("SCCSvar1-2008NoMap.dta")  # if this doesn't work, then:
# http://intersci.ss.uci.edu/wiki/pub/SCCSvar1-2008NoMap.dta  (right-click, save to your R working directory, then repeat the command above)
# sccs <- read.dta("http://intersci.ss.uci.edu/wiki/pub/SCCSvar1-2008NoMap.dta")
# Read the Stata 8 file directly from the URL ...
sccs <- read.dta("http://intersci.ss.uci.edu/wiki/pub/SCCSvar1-2008NoMapStata8.dta")
# ... or, the first time, right-click the URL, save the file to your working directory, and read the local copy
sccs <- read.dta("SCCSvar1-2008NoMapStata8.dta")
attach(sccs)      # makes v891, v893, ... available by name
library(gmodels)
# Delete any of the options below that you don't want
CrossTable(v891, v893, expected=TRUE, prop.chisq=TRUE, fisher=TRUE, dnn=c("v891 Int War","v893 External War:Attacked"))
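
Because attach() places every column of sccs on the search path, a name such as v891 can be masked by another object with the same name in the workspace. A minimal sketch of an alternative, assuming a small made-up data frame (toy, with invented values, not the SCCS data), passes the columns explicitly through with(). The base table() call is a stand-in so the example runs without gmodels; the CrossTable() call is guarded for the same reason.

```r
# Hypothetical toy data; the values are invented for illustration only
toy <- data.frame(
  v891 = factor(c("Continual", "Frequent", "Infrequent",
                  "Frequent", "Infrequent", "Infrequent")),
  v893 = factor(c("Frequent", "Frequent", "Infrequent",
                  "Continual", "Infrequent", "Frequent"))
)

# with() evaluates the expression inside the data frame, so no attach() is needed
tab <- with(toy, table(v891, v893,
                       dnn = c("v891 Int War", "v893 External War:Attacked")))
print(tab)

# The same pattern works for CrossTable() when gmodels is installed
if (requireNamespace("gmodels", quietly = TRUE)) {
  with(toy, gmodels::CrossTable(v891, v893, prop.chisq = FALSE,
                                dnn = c("v891 Int War", "v893 External War:Attacked")))
}
```

With the real data this would read with(sccs, CrossTable(v891, v893, ...)), which leaves the search path untouched.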

Odd Stuff with MacBook Pro Tiger (after Leopard): try right-click/download

# http://intersci.ss.uci.edu/wiki/pub/SCCSvar1-2008NoMapStata8.dta
# Right-click the URL above, save the file to your R working directory, then run:
sccs <- read.dta("SCCSvar1-2008NoMapStata8.dta")
attach(sccs)
names(sccs)       # variable names
length(sccs)      # number of variables
help(CrossTable)  # to see how it works; help is available for most functions
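
The right-click-and-save step can also be done from within R. The sketch below assumes a hypothetical helper, fetch_if_missing(), which is not part of foreign or gmodels; it downloads the file only when no local copy exists. mode = "wb" is used because .dta is a binary format. The actual download is left commented out so the block runs offline.

```r
# Hypothetical helper (not from any package):
# download a file only if it is not already present locally
fetch_if_missing <- function(url, destfile = basename(url)) {
  if (!file.exists(destfile)) {
    # mode = "wb" keeps binary files such as .dta intact on Windows
    download.file(url, destfile, mode = "wb")
  }
  destfile
}

# Usage, assuming the URL given above is still reachable:
# library(foreign)
# path <- fetch_if_missing("http://intersci.ss.uci.edu/wiki/pub/SCCSvar1-2008NoMapStata8.dta")
# sccs <- read.dta(path)
```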

SUCCESS!!! see: UCLA R class notes: Managing data

A289 required readings and exercises

Output for the Example

Total Observations in Table:  143  

            | v893 External War:Attacked 
v891 Int War|  Continual |   Frequent | Infrequent |  Row Total | 
------------|------------|------------|------------|------------|
  Continual |          4 |          8 |          4 |         16 | 
   expected |      2.573 |      7.049 |      6.378 |            | 
ChiSq contr.|      0.791 |      0.128 |      0.886 |            | 
   prob.row |      0.250 |      0.500 |      0.250 |      0.112 | 
   prob.col |      0.174 |      0.127 |      0.070 | prob.row   | 
   prop.tot |      0.028 |      0.056 |      0.028 | sums to 1  | 
------------|------------|------------|------------|------------|
   Frequent |          3 |         27 |         15 |         45 | 
            |      7.238 |     19.825 |     17.937 |            | 
            |      2.481 |      2.597 |      0.481 |            | 
            |      0.067 |      0.600 |      0.333 |      0.315 | 
            |      0.130 |      0.429 |      0.263 |            | 
            |      0.021 |      0.189 |      0.105 |            | 
------------|------------|------------|------------|------------|
 Infrequent |         16 |         28 |         38 |         82 | 
            |     13.189 |     36.126 |     32.685 |            | 
            |      0.599 |      1.828 |      0.864 |            | 
            |      0.195 |      0.341 |      0.463 |      0.573 | 
            |      0.696 |      0.444 |      0.667 |            | 
            |      0.112 |      0.196 |      0.266 |            | 
------------|------------|------------|------------|------------|
Column Total|         23 |         63 |         57 |        143 | 
            |      0.161 |      0.441 |      0.399 |            | 
------------|------------|------------|------------|------------|
Statistics for All Table Factors
Pearson's Chi-squared test 
------------------------------------------------------------
Chi^2 =  10.65545     d.f. =  4     p =  0.03072181 
Fisher's Exact Test for Count Data
------------------------------------------------------------
Alternative hypothesis: two.sided
p =  0.02407590
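
The expected counts, chi-square contributions, and test statistics in this output can be checked with base R alone. The sketch below retypes the nine observed counts from the table (the object names lev, obs, and chi are arbitrary) and uses chisq.test() and fisher.test() from the stats package.

```r
# Observed counts copied from the printed table (rows: v891, columns: v893)
lev <- c("Continual", "Frequent", "Infrequent")
obs <- matrix(c( 4,  8,  4,
                 3, 27, 15,
                16, 28, 38),
              nrow = 3, byrow = TRUE,
              dimnames = list("v891 Int War" = lev,
                              "v893 External War:Attacked" = lev))

# R warns here because one expected count is below 5; the warning is suppressed
chi <- suppressWarnings(chisq.test(obs))

# Expected count = row total * column total / grand total
round(chi$expected, 3)

# Each cell's contribution to chi-square = (observed - expected)^2 / expected
round((obs - chi$expected)^2 / chi$expected, 3)

chi$statistic   # Pearson chi-square
chi$parameter   # degrees of freedom: (3 - 1) * (3 - 1) = 4
chi$p.value

# Fisher's exact test (two-sided) on the same counts
fisher.test(obs)$p.value
```

For example, the expected count for the first cell is 16 * 23 / 143 = 2.573, which matches the second line of that cell in the output.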

For a cleaner table try

CrossTable(v891,v893,prop.c=FALSE,prop.t=FALSE,prop.r=FALSE,expected=TRUE,prop.chisq=TRUE,fisher=TRUE,dnn=c("v891 Int War","v893 External War:Attacked")) # delete any options you don't want

or

CrossTable(v891,v893,prop.chisq=FALSE,expected=TRUE,fisher=TRUE,dnn=c("v891 Int War","v893 External War:Attacked")) 
CrossTable(v891,v893,prop.chisq=FALSE,fisher=TRUE,dnn=c("v891 Int War","v893 External War:Attacked"))
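
The prop.r, prop.c, and prop.t options switch the row, column, and table proportions in each cell on or off. As a sketch of what those numbers are, the block below recomputes them with base prop.table() from the observed counts in the example output. The counts are retyped here so the block stands alone, and the object names are arbitrary.

```r
# Observed counts from the example output (rows: v891, columns: v893)
lev <- c("Continual", "Frequent", "Infrequent")
obs <- matrix(c(4, 8, 4, 3, 27, 15, 16, 28, 38), nrow = 3, byrow = TRUE,
              dimnames = list(v891 = lev, v893 = lev))

row_prop <- prop.table(obs, margin = 1)  # each row sums to 1    (prop.r)
col_prop <- prop.table(obs, margin = 2)  # each column sums to 1 (prop.c)
tot_prop <- prop.table(obs)              # whole table sums to 1 (prop.t)

round(row_prop, 3)
round(col_prop, 3)
round(tot_prop, 3)
```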

Conversion with SPSS 15

From SPSS to R

The following steps seem to work with 'Date' and 'String' variables.

   1) Save the SPSS file as Stata (.dta) file (I've used "Stata Version 8 SE") 
   2) Read it into R with the following code: 
Read.dta <- function ( ... ) {
   # Read.dta reads Stata files using 'read.dta' in 'library(foreign)'.
   # This appears to be an ideal way of importing SPSS files in order
   # to keep full variable names. Direct use of 'read.spss' on an SPSS
   # '.sav' file abbreviates variable names to 8 characters.
   # Note: missing.type = TRUE produces warnings.
         library(foreign)
         # remove trailing blanks from each string
         trim <- function( x ) sub(" +$", "", x )
         #  dd <- read.dta(... , missing.type = TRUE)  # Note: missing.type = TRUE produces warnings.
         dd <- read.dta(...)
         cls <- sapply(dd, class)
         ch.nams <- names(dd) [ cls == "character" ]
         # convert each character column to a factor of the trimmed values
         for ( nn in ch.nams ) dd[[nn]] <- factor(trim(dd[[nn]]) )
         dd
 }
   Note that by default 'read.dta' turns strings into long character strings with trailing blanks. 
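
To see what the trimming step does, the sketch below applies the same idea to a small made-up data frame (the column names and values are invented): character columns have their trailing blanks removed and are then converted to factors, while other columns are left alone.

```r
# Made-up data mimicking padded strings as read.dta might return them
dd <- data.frame(region = c("Africa   ", "Asia ", "Africa   "),
                 pop    = c(10, 20, 30),
                 stringsAsFactors = FALSE)

trim <- function(x) sub(" +$", "", x)   # drop trailing blanks only

# Find the character columns and replace each with a factor of trimmed values
is_chr <- vapply(dd, is.character, logical(1))
dd[is_chr] <- lapply(dd[is_chr], function(col) factor(trim(col)))

levels(dd$region)   # "Africa" "Asia"
str(dd)
```

Without the trimming, "Africa   " and "Africa" would be treated as two different factor levels.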
Reading the '.sav' file directly with 'read.spss' gives warnings such as:

 SCCSvar1-2008NoMap.sav: File-indicated character representation code (1252) looks like a Windows codepage
2: In read.spss(file, use.value.labels = use.value.labels, to.data.frame = to.data.frame,  :
 SCCSvar1-2008NoMap.sav: Unrecognized record type 7, subtype 16 encountered in system file