Manual for Dataset Construction and Causal Analysis
RData and RDa read - SIL language classification
- Manual_for_Dataset_Construction_and_Causal_Analysis#Create_Binford_data
- Manual_for_Dataset_Construction_and_Causal_Analysis#NewMac_Code_2012, sort LRBinford.Rdata
- Manual_for_Dataset_Construction_and_Causal_Analysis#LRBsortPlusWvars.csv_ENVIRONMENT_PCA_BEST_MODEL_lowess
- Manual_for_Dataset_Construction_and_Causal_Analysis#Create_the_language_distances_matrix_with_the_spdep_package
- Manual for Dataset Construction and Causal Analysis in R - R2sls Manual -- SCCS R package
- http://finzi.psych.upenn.edu/search.html at http://www.r-project.org/search.html SEARCH R for FUNCTIONS
- Foragers under construction
- have distBin339.Rdata, Binford47vars339.RData, need continent/country and eventually: Language. That requires two steps:
- Import Language data from ETHNOATLAS for 196 societies
- Use closest neighbors (distance matrix) to rank order likely societies with same neighbors
- Pick the top rank guess.
- EduFor1 -test run for binford data
Chapter 1: Create the elements of the database
- Eff, E. Anthon, and Malcolm Dow. 2009. How to Deal with Missing Data and Galton's Problem in Cross-Cultural Survey Research: A Primer for R. Structure and Dynamics: eJournal of Anthropological and Related Sciences 3#3 art 1.
- Rectangular database: cases in rows, variables in columns
- Distance matrix
- Phylogenetic matrix
- Auxiliary database needed for imputing missing data
Create the rectangular database
- Begin with a rectangular database in Excel
- Save as *.csv that can be read in R (comma separated variables)
- Make a codebook, match simple variable naming, e.g., v1, v2 with columns.
- Copy *.csv, rename, and replace column lables with v1, v2 ... and other short labels.
- Calling variables, e.g., sccs$v49, facilitated by short variable names
- The EduMod lab at UCI allows writing only to setwd("C:/My Documents/(subdirectories for databases)
- This is where on-line projects can be run off the mediawiki site
- Hence for compatability, created a "C:/My Documents/(subdirectory)
- Copy the two *.cvs files and codebook to that subdirectory
- Create the R file
- setwd("C:/My Documents/YourSubDir")
- YourRfile <- read.csv("YourDataFile.csv")
- save(YourRfile,file="YourRfile.RData")
- Restructure the distance matrix==
- Format is 0 on diagonal, close pairs have high values (proximities), row normalized.
Create the great-circle distances matrix with the spdep package
- Dow, Malcolm M. 2007. Galton's Problem as Multiple Network Autocorrelation Effects. Cross-Cultural Research 41(4):336-363.
- p340: "The simplest function in this case is an exponentially decaying distance function such as dij–a, where dij is the distance from society i to society j and a is a suitable exponent."
- p340: "White et al. (1981) and Dow (1984) measured linguistic distance using a count of the number of steps between societies on a hypothesized phylogenetic tree."
- Dow, Malcolm M. 2008. Network Autocorrelation Regression with Binary and Ordinal Dependent Variables: Galton’s Problem. Cross-Cultural Research 42(4):394-419.
- Dow, Malcolm M, and E. Anthon Eff. 2008. Global, Regional, and Local Network Autocorrelation in the Standard Cross-Cultural Sample. Cross-Cultural Research 42(2):148-171.
- Extract the lat, long columns and put in a separate Excel file, convert to *.csv format
Courtesy of Anthon Eff: Row Normalized distance matrix
setwd("C:/My Documents/Binford")
setwd('/Users/drwhite/Documents/3sls/0Bin/')
#Courtesy of Anthon Eff:
library(spdep)
ww<-read.csv("2colBin.csv")
ww<-SpatialPoints(ww[,c(2,1)])
ww<-spDists(coordinates(ww),longlat=TRUE)
ww<-1/(ww+.000001)^2
diag(ww)<-0
ww<-ww/rowSums(ww)
distBin<-ww
###save(distBin,file="distBin339.Rdata")
write.dta(distBin,file="distBin339.dta") ##not quite right
#or if you want a smaller file (3 decimals)
ww <- round(ww[1:339,1:339],3) distBin<-ww save(distBin,file="distBin2.Rdata")
Whenever you want to use the matrix, you can call it in this way:
load("distBin.Rdata",.GlobalEnv)
work on ecological W matrix
Goal: Binford LRB$et and LRB$pet measure amount of energy in sunlight through local temperature and precipitation, which are highly correlated. Networks of more similar energetic localities (etDiff) at closer distances (xdd) may be expected to be an important Instrumental variable for controlling autocorrelation effects. Problem is that I get a multiplication error in the second for loop.
setwd('/Users/drwhite/Documents/3sls/0Bin/') #Rdata
load("xdd.Rdata",.GlobalEnv) #the autocorrelation matrix for distances
load("LRB.Rdata",.GlobalEnv)
et<-LRB$et #the Effective temperature variable
m <- 339; n <- 339
etDiff <- array(list(),c(m,n))
for (i in 1:339) {
etDiff [i,i]<-17
for (j in 1:339) {
etDiff [i,j]<- (17 - abs (et[i]-et[j] ))/17 #these are similarities, i.e., large when differences are small, with higher autocorrelation effects.
etDiff [j,i]<- (17 - abs (et[i]-et[j] ))/17 #these are similarities, i.e., large when differences are small, with higher autocorrelation effects.
}
}
diag(etDiff)<-0
etDiff[23:30,23:30]
xdd[23:30,23:30]
etDist <- array(list(),c(m,n))
for (i in 1:339) {
for (j in 1:339) {
etDist [i,j]<- xdd[i,j] * etDiff[i,j] #these are similarities, i.e., large when differences are small, with higher autocorrelation effects.
etDist [j,i]<- xdd[j,i] * etDiff[j,i] #these are similarities, i.e., large when differences are small, with higher autocorrelation effects.
}
}
etDiff[23:30,23:30]
Error in xdd[i, j] * etDiff[i, j] : non-numeric argument to binary operator
#class(etDiff) & class(xdd) are class:matrix
plots: right and wrong
THESE TWO PLOTS HAVE THE RIGHT lati variables
- THOSE TWO BELOW HAVE THE WRONG lati variables
old code
#2colBin.csv has only 2 cols: lat, lon, and headers
setwd('/Users/drwhite/Documents/3sls/0Bin/') #Rdata
load("vaux.Rdata",.GlobalEnv)
write.csv(vaux,"vaux2.csv")
vaux2<-read.csv("vaux2.csv", header=TRUE)
save(vaux2,file='vaux2.Rdata')
library(spdep)
ww<-read.csv("2colBin.csv")
ww<-read.csv("2colBin.csv") #latitude, longitude, i.e, header=TRUE)
xx<-ww
ww<-SpatialPoints(ww[,c(2,1)])
ww<-spDists(coordinates(ww),longlat=TRUE)
yy<-ww
ww<-1/(ww+.000001)^2 #these are proximities, i.e., large when close, small when distances are large
diag(ww)<-0
ww<-ww/rowSums(ww)
ww[1:3,1:3]
ww[1:3,337:339]
ww[337:339,337:339]
ww<-ww/rowSums(ww) #rowSums(ww) all 1
load("LRB.Rdata",.GlobalEnv)
et<-LRB$et
m <- 339; n <- 339
etDiff <- array(list(),c(m,n))
for (i in 1:339) {
for (j in 1:339) {
etDiff [i,j]=.0000001+(.0000001+1/(.000001+(abs(et[i]-et[j])))) * (.0000001+ww[i,j]) #these are proximities, i.e., large when differences are small, small when differences are large
etDiff [j,i]=.0000001+(.0000001+1/(.000001+(abs(et[i]-et[j])))) * (.0000001+ww[i,j])
}
}
etD<-etDiff
diag(etD)<-0
dim(etD) # ok to here #[1] 339 339
class(etD)
etD<-etD/rowSums(etD) #eliminated NANS and inf
dim(etD) # ok to here #[1] 339 339
class(etD)
############
#"numeric"
etD<-etDiff #as.matrix(etDiff)
class(etD)
#[1] "matrix"
Old<etD
#class(etD)
#at this point, save as *.csv, read as *.Rdata
diag(etD)<-0
etD<-1000*Old
etD<-etD/rowSums(etD) # will then work
etD[1:3,1:3]
etD[337:339,1:3]
etD[337:339,337:339]
rowSums(etD) #not working
write.csv(etD,"etD.csv")
#now clear col 1
etD<-read.csv("etD.csv", header=TRUE)
#etD[,1]<-0
save(etD,file='etD.Rdata')
load("etD.Rdata",.GlobalEnv)
rowSums(etD)
ETD<-etD/rowSums(etD)
ETD<-ETD/rowSums(ETD)
rowSums(etD) *THERE ARE NANS etD<-etD/rowSums(etD) # will then work WATCH OUT FOR COLUMN 1
write.csv(etD,"etD.csv")
etD<-read.csv("etD.csv", header=TRUE)
save(etD,file='etD.Rdata')
load("etD.Rdata",.GlobalEnv)
Now copy labels left col, top row from xee.Rdata
etD<-read.csv("etD.csv", header=TRUE)
save(etD,file='etD.Rdata')
load("etD.Rdata",.GlobalEnv)
load("xdd.Rdata",.GlobalEnv)
write.csv(xdd,"xdd.csv")
#Error in base::rowSums(x, na.rm = na.rm, dims = dims, ...) : 'x' must be numeric class(rowSums(etD))
try typing 'class(etDiff)' to see what kind of object it is. If it's not a matrix then you can convert it to one by saying as.matrix(etDiff)
#round(R,2)
junk
ww<-read.csv("2colBin.csv")
ww<-SpatialPoints(ww[,c(2,1)])
ww<-spDists(coordinates(ww),longlat=TRUE)
ww<-1/(ww+.000001)^2
diag(ww)<-0
ww<-ww/rowSums(ww)
dim(ww) #distBin<-round(ww[1:339,1:339],3) #save(distBin,file="distBin.Rdata")
To look at the matrix dim(ww) round(ww[1:11,1:11],3)
ww <- round(ww[1:339,1:339],3)
distBin339 <- round(ww[dim],3) distBin339[1:11,1:11]
Create a continent/country matrix
setwd("C:/My Documents/Binford")
library(spdep)
input<-read.csv("regnMatBin339.csv") ##//Although you probably want to throw away
#the rows that have all missing values
source("proximity.R")
W<-proximity(input,4,1,-1) #//in this example 4 is the column representing continent,
# 1 means to use 1 if two are the same continent, -1 otherwise.
dim(W)
round(W[1:339,1:5],3)
W=W+1
W<-1/(W+.000001)^2
diag(W)<-0
W<-W/rowSums(W)
round(W[1:339,1:5],3)
W=W+1
W<-1/(W+.000001)^2
diag(W)<-0
W<-W/rowSums(W)
round(W[1:339,1:5],3)
save(W,file="regionMatBin339.Rdata")
save(W,file="regBin.Rdata")
#======== #Code for proximity.R, stored at "C:/My Documents/Binford" #======== #You should create a new .R file and paste the following code into it:
proximity <- function(data,class_idx,same_score,diff_score) {
size<-dim(data)1 M <- matrix(0,size,size)
for (x in 1:size) {
continent_x<-c(data[x,class_idx])
for (y in 1:size) {
continent_y<-c(data[y,class_idx])
if (continent_x == continent_y) {
M[x,y] <- same_score
} else {
M[x,y] <- diff_score
}
}
}
}
Create the language distances matrix with the spdep package
- Eff, E. Anthon. 2008. Weight Matrices for Cultural Proximity: Deriving Weights from a
Language Phylogeny. Structure and Dynamics: eJournal of Anthropological and Related Sciences. Vol. 3, No. 2, Article 9. http://repositories.cdlib.org/imbs/socdyn/sdeas/vol3/iss2/art9
- for the Binford Forager data language phylogeny will be difficult or impossible.
- Marcus Hamilton: "Unfortunately I don't think there are any language codes in the Binford data, and not sure how easy they would be to add as I'm not sure how well these language families are understood. We have run into the same problem as at various stages we would have liked to have controlled for phylogeny, but the best we could do was to introduce "continent" and "region" into a glm, and look for those effects, which end up basically controlling for colonization effects and spatial proximity."
So here is what we do,
setwd("C:/My Documents/Binford")
library(spdep)
regnBin<-read.csv("regnMatBin339.csv")
save(regnBin,file="regnMatBin339.Rdata")
- For the 100 or so continents, read the regnMatBin339 column and
- for each pair of countries add 10 to a similarity matrix ww if from the same continent.
- For the 100 or so countries, read the regnMatBin339 column and
- for each pair of countries add 5 to a similarity matrix if ww from the same country.
save(ww,file="regnWwBin339.Rdata") ww<-ww/rowSums(ww)
dim(ww) distBin<-round(ww[1:339,1:339],3) save(distBin,file="distBin.Rdata")
To look at the matrix dim(ww) round(ww[1:11,1:11],3)
ww<-SpatialPoints(ww[,c(2,1)]) ww<-spDists(coordinates(ww),longlat=TRUE) ww<-1/(ww+.000001)^2 diag(ww)<-0 ww<-ww/rowSums(ww) distBin<-ww save(distBin,file="distBin339.Rdata")
Create a fake language matrix from Distance with random error
Random number is R: rnorm R: matrices setwd("C:/My Documents/Binford") library(spdep) ww<-read.csv("2colBin.csv") v1 <-rnorm(339)/5 v2 <-rnorm(339)/5 matRnd<-as.matrix(v1[1:339],v2[1:339]) ## I want to multiply 2 columns of random numbers #to simulate geocoordinates to those of ww before the next line Fake<-matRnd+ww ww<-SpatialPoints(Fake[,c(2,1)]) ww<-spDists(coordinates(ww),longlat=TRUE) ww<-1/(ww+.000001)^2 diag(ww)<-0 ww<-ww/rowSums(ww) fakeBin<-ww save(fakeBin,file="fakeBin339.Rdata")
Measure effective sample size from W matrix
Griffith, Daniel A. 2005. Effective Geographic Sample Size in the Presence of Spatial Autocorrelation. Annals of the Association of American Geographers 95(4): 740–760. http://onlinelibrary.wiley.com/doi/10.1111/j.1467-8306.2005.00484.x/full
- 2011
- recent papers
- Hurlbert SH. 1984. Pseudoreplication and the design of ecological field experiments. Ecol Monogr 1984;54:187-211.
library(foreign)
ww<-as.matrix(read.dta("/Users/drwhite/Documents/sccs/examples/data/dist25wm.dta")[,c(-1,-2,-1,-189)])
dim(ww)
round(ww[1:11,1:11],3)
ww<-ww/rowSums(ww) #making sure that ww is row normalized
n<-NROW(ww)
w_inv<-solve(ww) #inverse of V -V-1
round(w_inv[1:11,1:11],3)
chk<-ww%*%w_inv # verifying that w_inv is indeed the inverse
table(diag(chk));table(colSums(chk))
TRw_inv<-sum(diag(w_inv)) #trace of inverse
one<-matrix(1,n,1) #column vector of ones
nstar<-TRw_inv/(t(one)%*%w_inv%*%one)*n #formula
nstar
n* is calculated with a covariance matrix. Its inverse for an SAR model is (I-rho*W)'(I-rho*W). You need to estimate the spatial autocorrelation parameter, rho, first. I also include a very good approximation equation for n*, which requires only the estimate of rho, and does not require the matrix inversion.
ww<-as.matrix(read.dta("dist25wm.dta")[,c(-1,-2,-1,-189)])
#To look at the matrix
dim(ww)
round(ww[1:11,1:11],3)
ww<-ww/rowSums(ww) #making sure that ww is row normalized
w_inv<-solve(ww) #inverse of V -> V^-1
#dim(w_inv)
round(w_inv[1:11,1:11],3)
trace<- sum(diag(w_inv)) #trace of inverse #sum( w_inv * t(w_inv) )
transpose<-t(w_inv)
tr<-transpose
tr
n<-NROW(ww)
one<-matrix(1,n,1) #column vector of ones
nstar<-n*trace/(t(one) %*% t(w_inv) %*% one) #formula
nstar
nI<-c(n) I <- matrix(0,nrow=nI,ncol=nI) I[row(I)==col(I)] <- 1 trI<-t(I) n_ef<-n*trace/(trI*w_inv*I) n_ef
tr1<-t(1) n_ef<-n*trace/(tr1*w_inv*I) n_ef
Construct vaux.Rdata: Take out all columms with missing data
EduFor1 foragers.zip
Doug: the vaux data was based on the complete series from the SCCS (i.e., those with 186 observations), as well as data that I generated from GIS coverages, using the coordinates of the SCCS societies. So, for this forager data set, you can generate similar data by using those series that are complete, plus whatever data you can generate from a GIS, using the 339 coordinates. Once you have these data, you can group variables into groups with similar meanings (e.g., subsistence, climate, topography) and then create principal components for each group.
As far as the syntax of prcomp:
f<-prcomp(Vauxforag)
If you take a look at what f contains:
names(f) [1] "sdev" "rotation" "center" "scale" "x"
The object f$x contains the principal components; f$x[,1:3] is the first three principal components of Vauxforag.
f$rotation is the 35x35 matrix you were observing--it is the loadings for each component.
Anthon (Tony Eff)
From Doug: Anthon: I made an index of the completely coded variables in the Marcus Hamilton et al database (csv attached for this index and the data) focusing on group fractality and related variables. Some of these variables were collected from GIS. They were heavily analyzed by Binford into meaningful measures. I factor analyzed and rotated the factors, paid attention to factors 1-5. Looking over those variables I classified into three general categories and three single variables.
- FOOD - those features conducive to more or less food production, regardless of subsistence type, loaded on factor 1, including temperature, evapotranspiration
- SUBS - those features loaded on factor 2, contrasting hunting-gathering versus fishing, and loading with food type, mobility, rainfall,
- Dense - Pop.sqkm, sqkm territory, distance from Greenwhich (E/W),
- SOIL (did not load on any of the 3 factors)
- SNOW (did not load on any of the 3 factors)
- POPulation (did not load on any of the 3 factors)
Would you take a look and see if these would seem to produce reasonable factors in the first set (1-3) and separate variables in the second (4-6)? I wanted to start with the Hamilton data only as it bears on fractality (those data have missing values) and we wanted to some causal analysis of their dataset alone.
I have many other variables from the original Binford data that other researchers coded in excel. My idea is to add those later and at that point redo the vaux data.
From what I understand, I need to get 339 columns, with column labels for each society, and rows with the PCA values for the significant components, in this case a few from my FOOD, SUBS, Dense PCAs and three from my SOIL, SNOW and POP. Or do I have my axes reversed? Should I use rotated factors for the FOOD, SUBS, Dense PCAs or the unrotated PCA factors (if so, what is the command to get those?). And in general, what is the command for the factor loadings on societies?
Thanks,
Doug
Code on Old PC
setwd("C:/My Documents/sccs")
library(sccs)
data(sccs) #this works
#Binford47vars339narrowCompleteOnlyNo5No7.xlsx
setwd("C:/My Documents/Binford/Best_files/Vaux5_7")
Vauxforag <- read.csv("Binford47vars339narrowCompleteOnlyNo5No7.csv")
save(Vauxforag,file="BinfordVaux.RData")
data(Vauxforag) #doesnt work but no mind for now (needed when running main program)
library(stats)
princomp(Vauxforag) #works
#now make VauxB.csv from the principal components
setwd("C:/My Documents/Binford/Best_files/Vaux5_7")
VauxB <- read.csv("VauxB.csv")
save(VauxB,file="VauxB.RData")
Create Binford data
--Binford data--
setwd("/home/anthon/Dropbox/males")
#setwd("e:/Dropbox/males")
#setwd("/home/eaeff/males/")
rm(list=ls(all=TRUE))
options(echo=TRUE)
library(foreign)
library(psych)
sessionInfo()
.libPaths()
ls("package:foreign")
Binford data
uu<-read.spss("Binfords_HG_file_with_value_labels_current_Dec2008.sav",
use.value.labels = TRUE, to.data.frame = TRUE,
max.value.labels = 30,trim.factor.names = TRUE)
gg<-read.spss("Binfords_HG_file_with_value_labels_current_Dec2008.sav",
use.value.labels = FALSE, to.data.frame = TRUE,
max.value.labels = 30,trim.factor.names = TRUE)
z<-which(sapply(gg,function(x) class(x))=="factor") gg[,z]<-sapply(gg[,z],function(x) (as.character(x)))
table(sapply(uu,function(x) class(x))) table(sapply(gg,function(x) class(x))) head(gg[,z]) names(gg)[z]
gSimpStat<-function(ww){
a<-data.frame(psych::describe(ww)) a<-round(a[,c(2:4,8:9)],3) return(a)
}
auu<-sapply(uu,function(x) class(x))
agg<-sapply(gg,function(x) class(x))
ncats<-sapply(gg,function(x) length(table(as.character(x))))
a<-cbind(auu,agg,data.frame(ncats),matrix(NA,length(ncats),5))
- z<-which(a$a1!="factor")
z<-which(sapply(gg,function(x) class(x))!="character") f<-gSimpStat(gg[,z]) names(a)[-(1:3)]<-names(f) k<-match(rownames(f),rownames(a)) a[k,names(f)]<-f a[,"n"]<-apply(!is.na(gg),2,sum) pff<-a head(pff)
pff[c("subpop","perogat"),] levels(uu$subpop) table(uu$subpop,useNA="ifany") table(gg$subpop,useNA="ifany")
table(uu$perogat,useNA="ifany") table(gg$perogat,useNA="ifany") g<-levels(uu$perogat) levels(uu$perogat)<-paste(g,1:length(g))
table(uu$secno,useNA="ifany") table(gg$secno,useNA="ifany") gg$secno<-gsub(" ","",gg$secno) uu$secno<-as.factor(gg$secno)
write.table("Binford Sample",file="z.txt", sep="\t",append=FALSE,
row.names=FALSE,col.names=FALSE,quote=FALSE,eol="\r\n")
knn<-rownames(pff) for (ii in knn){
k<-match(ii,knn)
write.table(" ",file="z.txt", sep="\t",append=TRUE,row.names=FALSE,col.names=FALSE,quote=FALSE)
zzz<-paste(pff[ii,"agg"],"; ",pff[ii,"auu"],sep="")
if (as.character(pff[ii,"agg"])==as.character(pff[ii,"auu"])){zzz<-pff[ii,"agg"]}
write.table(paste(k,". ",ii,"; ",zzz,"; nobs=",pff[ii,"n"],"; ncats=",pff[ii,"ncats"],sep="")
,file="z.txt", sep="\t",append=TRUE,row.names=FALSE,col.names=FALSE,quote=FALSE)
if(pff[ii,"ncats"]<30){
b1<-table(factor(uu[,ii]),useNA="ifany")
b2<-table(gg[,ii],useNA="ifany")
b<-paste("DUPLICATED labels in factor: number of labels=",length(b1),"; number of factor categories=",length(b2),sep="")
if (length(b1)==length(b2)){
b<-cbind(data.frame(b2),names(b1))
names(b)<-c("value","frequency","level")
}
write.table(b,file="z.txt", sep="\t",append=TRUE,row.names=FALSE,col.names=FALSE,quote=FALSE)
}
if(pff[ii,"ncats"]>=30 & pff[ii,"agg"]=="numeric"){
b<-t(pff[ii,c("mean","sd","min","max")])
write.table(b,file="z.txt", sep="\t",append=TRUE,row.names=TRUE,col.names=FALSE,quote=FALSE)
}
}
ff<-read.csv("Binford_merged_dataset.csv",as.is=TRUE)
names(ff)<-paste("W",tolower(names(ff)),sep="")
head(ff)
k<-match(gg[,1],ff[,1]) j<-match((names(ff)),(names(gg))) names(ff)[which(!is.na(j))]
hh<-cbind(gg,ff[k,]) hh[,c("name","Wname")]
table(sapply(ff,function(x) class(x))) head(hh)
table(hh$Wsubpop,useNA="ifany") table(gg$subpop,useNA="ifany")
LRB<-hh
save(LRB,file="LRB.Rdata")
write.table(" ",file="z.txt", sep="\t",append=TRUE,row.names=FALSE,col.names=FALSE,quote=FALSE) write.table("Variables from InterSciWiki Binford_merged_dataset",file="z.txt", sep="\t",append=TRUE,
row.names=FALSE,col.names=FALSE,quote=FALSE,eol="\r\n")
knn<-names(ff) for (ii in knn){
k<-match(ii,knn)
write.table(" ",file="z.txt", sep="\t",append=TRUE,row.names=FALSE,col.names=FALSE,quote=FALSE)
write.table(paste(k,". ",ii,"; ",class(ff[,ii]),"; nobs=",sum(!is.na(ff[,ii])),
"; ncats=",length(table(ff[,ii])),sep=""),
file="z.txt", sep="\t",append=TRUE,row.names=FALSE,col.names=FALSE,quote=FALSE)
if(length(table(ff[,ii]))<30){
b1<-table(ff[,ii],useNA="ifany")
b<-cbind(data.frame(b1))
write.table(b,file="z.txt", sep="\t",append=TRUE,row.names=TRUE,col.names=FALSE,quote=FALSE)
}
if (length(table(ff[,ii]))>=30 & class(ff[,ii])!="character"){
write.table(t(gSimpStat(ff[,ii])),file="z.txt", sep="\t",append=TRUE,row.names=TRUE,col.names=FALSE,quote=FALSE)
}
}
NewMac Code 2012, sort LRBinford.Rdata
Add Ethnologue data
library(foreign)
setwd('/Users/drwhite/Documents/3sls/sccs/BinfordPrinComm')
load("LRBEff.Rdata",.GlobalEnv)
#load('LRBEffnologue.Rdata',.GlobalEnv)
#load("LRBinford.Rdata",.GlobalEnv)
LRBethno<-LRB #(for some reason LRBEff attenuates to LRB)
#write.csv(LRBethno, file = "LRBethno.csv")
LRBethno<-LRB #(for some reason LRBEff attenuates to LRB)
write.csv(LRBethno, file = "LRBethno.csv")
#transfer new data to cols in read(LRBsortPlusWvRecod3.csv
LRBsorted <- read.csv("LRBsortPlusWvRecod3.csv", header=TRUE)
#save.csv("LRBsortPlusWvRecod3.csv",file="
LRBsorted <- read.csv("LRBsortPlusWvRecod3.csv", header=TRUE)
B<-LRBsorted
sys3rec=LRBsorted$systate3recod
table(B$systate3recod, round(B$Wpet/200,0)) ##r=-0. p-value = 0. (curvilinear)
cor.test(B$systate3recod, round(B$Wpet/200,0))
1 2 3 4 5 6 7 8 9 r=-0.43 P=0.0000000000000056
2 1 2 0 2 2 4 3 3 3
3 0 6 4 0 2 3 5 6 2
4 11 13 26 16 19 14 11 20 4
5 0 5 7 4 3 1 1 0 0
6 5 12 36 24 3 1 0 0 0
7 0 2 21 3 2 0 0 0 0
8 0 6 17 2 0 1 0 0 1
#plot(B$shaman, round(B$Wpet/200,0)) #plot(round(B$Wpet/200,0),B$shaman) #pet=round(B$Wpet/10,0) #pet=round(B$Wpet,0) #sham=B$shaman #plo=cbind(pet,B$crr,2) #this works plo=cbind(pet,sys3rec,2) #this works plo2=na.omit(plo) pet=plo[,1] sys3rec=plo[,2] #ploo=cbind(pet,B$system3recod,2) # #plo=na.omit(ploo) #xxxxxxxxxxx # #sys=plo[,2] #pt=plo[,1] #sys #pt #plo=cbind(pet,B$crr) #this works plot(plo, main = "lowess(plo)") #plot(plo, main = "lowess(plo)",ylab="system3recod") plot(plo, main = "lowess(plo)",ylab="sys3rec") lines(lowess(plo, f=.5, iter=250), col = "green") lines(lowess(plo, f=.05, iter=50), col = "red")
save(LRBsorted, file="LRBsorted.Rdata")
LB<-LRBsorted
save(LB, file="LB.Rdata")
load("LB.Rdata",.GlobalEnv)
B<-LB
LRBsorted <- read.csv("LRBsortPlusWvRecod.csv", header=TRUE)
#LRBsorted <- read.csv("LRBsortPlusWvRecod.csv", header=TRUE)
save(LRBsorted, file="LRBsorted.Rdata")
LB<-LRBsorted
save(LB, file="LB.Rdata")
load("LB.Rdata",.GlobalEnv)
B<-LB
LRBsortPlusWvRecod.csv
library(foreign)
setwd('/Users/drwhite/Documents/3sls/sccs/BinfordPrinComm')
load("LRBsortPlusWvRecod.csv,.GlobalEnv")
LRBsorted <- read.csv("LRBsortPlusWvRecod.csv", header=TRUE)
#LRBsorted <- read.csv("LRBsortPlusWvRecod.csv", header=TRUE)
save(LRBsorted, file="LRBsorted.Rdata")
LB<-LRBsorted
save(LB, file="LB.Rdata")
load("LB.Rdata",.GlobalEnv)
B<-LB
#table(B$shaman,B$systate3recod)
table(B$shaman,B$systate3recod) #r=0.534 p-value = 0.00000000000000022
table(B$shaman,B$systate3recod,B$forcol)
#table(B$shaman,B$crr)
table(B$shaman, round(B$Wpet/200,0)) ##r=-0.16 p-value = 0.004326
#table(B$shaman,B$pet)
#FROM BELOW:
cor.test(LRB$shaman,pca$scores[,1]) #r=0.320 p-value = 0.0000000065
cor.test(LRB$shaman,pca$scores[,2]) #r=0.114 p-value = 0.04306
Use LRBsortPlusWvRecod5.csv to make Vaux Ethno and VauxEnvirEthno.csv, Rdata
library(foreign)
setwd('/Users/drwhite/Documents/3sls/sccs/BinfordPrinComm')
Bin_Vaux <- read.csv("VauxEnvirEthno.csv", header=TRUE)
save(Bin_Vaux, file="Bin_Vaux.Rdata")
library(foreign)
setwd('/Users/drwhite/Documents/3sls/sccs/BinfordPrinComm')
#load("LRBsorted.Rdata",.GlobalEnv)
#load("LRBsortPlusWvRecod5.csv,.GlobalEnv)
LRBsorted <- read.csv("LRBsortPlusWvRecod7.csv", header=TRUE)
save(LRBsorted, file="LRBsorted.Rdata")
LRB<-LRBsorted
save(LRB, file="LRB.Rdata")
load("LRB.Rdata",.GlobalEnv)
library(labdsv)
year has eight zeros: societies 56, 69, 88, 92, 150, 152, 153 , 249
B$YEAR[56] <- 1970
B$YEAR[56] <-1970
B$YEAR[69] <-1970
B$YEAR[88] <-1930
B$YEAR[92] <-1925
B$YEAR[150] <-1870
B$YEAR[152] <-1920
B$YEAR[153] <-1900
B$YEAR[249] <-1880
#write.csv(LRBethno, file = "LRBethno.csv") #read.csv("LRBsortPlusWvRecod6.csv", header=TRUE)
B$hunting[70]=2
B$forcol[173]=1.5
write.csv(B, file = "LRBsortPlusWvRecod7.csv")
1570 1600 1700 1715 1760 1770 1800 1805 1830 1840 1848 1850 1860 1865 1870
1 1 1 1 1 1 4 1 3 3 2 32 73 1 38
1880 1890 1900 1901 1903 1906 1909 1910 1912 1920 1922 1924 1925 1926 1928
47 10 21 1 1 1 1 6 1 16 1 1 1 1 2
1930 1933 1934 1936 1937 1938 1940 1941 1946 1950 1952 1954 1960 1962 1963
8 1 2 1 1 1 3 1 1 6 1 1 4 1 3
1965 1968 1970 1972 1973 1974 1975 1976 1977 1978 1979 1980 1989 1990
3 7 4 1 1 1 1 2 1 2 1 3 2 1
#pca<-princomp(data = B, cbind(B$owners, B$packord, library(foreign)
setwd('/Users/drwhite/Documents/3sls/sccs/BinfordPrinComm')
#load("LRBsorted.Rdata",.GlobalEnv)
#load("LRBsortPlusWvRecod5.csv,.GlobalEnv)
LRBsorted <- read.csv("LRBsortPlusWvRecod6.csv", header=TRUE)
save(LRBsorted, file="LRBsorted.Rdata")
LRB<-LRBsorted
save(LRB, file="LRB.Rdata")
load("LRB.Rdata",.GlobalEnv)
library(labdsv)
year has eight zeros: societies 56, 69, 88, 92, 150, 152, 153 , 249
B$YEAR[56] <- 1970
B$YEAR[56] <-1970
B$YEAR[69] <-1970
B$YEAR[88] <-1930
B$YEAR[92] <-1925
B$YEAR[150] <-1870
B$YEAR[152] <-1920
B$YEAR[153] <-1900
B$YEAR[249] <-1880
#pca<-princomp(data = B, cbind(B$owners, B$packord, B$YEAR))
#pca<-princomp(data = B, B$YEAR, B$owners, B$packord)
cb=cbind(B$owners, B$packord, B$comstfun, B$money, B$dkinex, B$polpos, B$class, B$headman, B$war1, B$YEAR)
pca<-princomp(cb)
pca$scores
pca$loadings
# 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
#18 19 20 21 22 23 24 25*H
pca$scores[,1:3]
plot(pca)
summary(pca)
biplot(pca)
#pca<-princomp(data = B, B$YEAR, B$owners, B$packord) cb=cbind(B$owners, B$comstfun, B$packord, B$money, B$dkinex, B$polpos, B$class, B$headman, B$war1) pca<-princomp(cb) pca$scores pca$loadings pca$scores[,1:3] plot(pca) summary(pca) biplot(pca)
#pca<-princomp(data = B, B$YEAR, B$owners, B$packord) cb=cbind(B$owners, B$comstfun, B$packord) #cmplxMod BEST MODEL pca<-princomp(cb) pca$scores pca$loadings pca$scores[,1:3] plot(pca) summary(pca) biplot(pca) #Make Vaux variables pca$scores[,1] pca$scores[,2] pca$scores[,3] # is this needed? #stack(pca$scores[,1:3]) http://cran.r-project.org/doc/manuals/R-data.html write.csv(pca$scores[,1],"col1.csv") write.csv(pca$scores[,2],"col2.csv") write.csv(pca$scores[,3],"col3.csv")
cb=cbind(B$area, B$tlpop, B$systate3recod, B$occspe) #THESE ARE NA, B$Wgroup1, B$Wgroup2) #scope--> B$intrd not NA=0 #cb=cbind(B$sed, B$grpat, B$mobpat, B$mobpat2, B$nomov, B$dpmov) cb=cbind(B$sed, B$mobpat, B$mobp2) cb=cbind(B$kincon, B$kinstr, B$kinbia1, B$kinder, B$kinbia2, B$nenept, B$kinexo) #B$commun, cb=cbind(B$subsp, B$subspx, B$forcol, B$hunt, B$huntfil, B$huntfil2, B$hunting, B$gath, B$gatherin, B$fish) ###, B$fishing) cb=cbind(B$area, B$tlpop, B$systate3recod, B$occspe) #THESE ARE NA, B$Wgroup1, B$Wgroup2) pca<-princomp(cb) pca$scores pca$loadings pca$scores[,1:3] plot(pca) summary(pca) biplot(pca) #Make Vaux variables pca$scores[,1] #stack(pca$scores[,1]) write.csv(pca$scores[,1],"col.csv")
LRBsortPlusWvars.csv
Corrections: Recode latitude NA for 68 Hai//om Northern Namibia to 18.40111 Recode forcol NA for 173 Northern Pomo and take value for 163 Southern Pomo as 1
library(foreign)
setwd('/Users/drwhite/Documents/3sls/sccs/')
load("LRBsorted.Rdata",.GlobalEnv)
load("LRBsortPlusWvars.csv,.GlobalEnv)
LRBsorted <- read.csv("LRBsortPlusWvars.csv", header=TRUE)
save(LRBsorted, file="LRBsorted.Rdata")
LRB<-LRBsorted
save(LRB, file="LRB.Rdata")
load("LRB.Rdata",.GlobalEnv)
- pca<-princomp(data = B, Wae,bio5,bar5,nagp,snowac,growc,wltgrc,watret,density, defper,lden,setting,dposit, mrain,mtemp,sdrain,clim,et,rrcorr,rrcorr2,crr, Wmwm,mcm,pet,tlpop,area,subsp)
library(labdsv)
pca<-princomp(cbind(B$Wae,B$bio5,B$bar5,B$nagp,B$snowac,B$growc,B$wltgrc,B$watret)) #v2$loadings
pca$scores
pca$loadings #PERFECTLY INDEPENDENT
Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
SS loadings 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
Proportion Var 0.125 0.125 0.125 0.125 0.125 0.125 0.125 0.125
Cumulative Var 0.125 0.250 0.375 0.500 0.625 0.750 0.875 1.000
# 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
pca<-princomp(cbind(B$Wae,B$bio5,B$bar5,B$nagp,B$snowac,B$growc,B$wltgrc,B$watret,B$density,B$defper,B$lden,B$setting,B$dposit,B$mrain,B$mtemp,B$sdrain,B$clim, B$et,B$rrcorr,B$rrcorr2,B$crr,B$Wmwm,B$mcm,B$pet,B$tlpop,B$area,B$subsp,B$Wsoil,B$hunt,B$wltgrc)))
#18 19 20 21 22 23 24 25*H pca$scores[,1:3] plot(pca) summary(pca) #biplot(pca)
pca<-princomp(cbind(B$crr,B$Wmwm,B$mcm,B$pet,B$tlpop,B$area,B$subsp,B$Wsoil,B$hunt))
# 1* 2 3 4*** 5** 6 7 8 9 pca$scores[,1:3] biplot(pca) summary(pca) #plot(pca)
LRBsortPlusWvars.csv ENVIRONMENT PCA BEST MODEL lowess
table(B$shaman, round(B$Wpet/200,0)) #works pet=B$Wpet sham=B$shaman plo=cbind(pet,B$crr) #this work plo=cbind(pet,B$system3recod,NA=2)
pet=round(B$Wpet,0) sham=B$system3recod plo=cbind(pet,B$crr) #this works plo=cbind(pet,B$crr) #+sham) plo=cbind(pet,sham) #this plots but doesnt do lowess plo=cbind(pet,B$crr) #this works plot(plo, main = "lowess(plo)") plot(plo, main = "lowess(plo)",ylab="System3recod") lines(lowess(plo, f=.5, iter=250), col = "green") lines(lowess(plo, f=.05, iter=50), col = "red")
pet=round(B$Wpet/10,0)
sham=B$shaman
plo=cbind(pet,sham)
- plo=cbind(pet,B$crr)
plot(plo, main = "lowess(plo)") xlab=Ubb["pdens","label"],ylab="Effect on Freq. Ext. War")
lines(lowess(plo, f=.5), col = "green") lines(lowess(plo, f=.05), col = "red")
- GOT TO GET RID OF NAs
lines(lowess(plo), col = "red") lines(lowess(plo, f=.1), col = "blue") legend(5, 120, c(paste("f = ", c("2/3", ".2"))), lty = 1, col = 2:3)
plo=cbind(pet,B$crr) lowess(plo) plot(pet,sham), lines(lowess(pet,B$shaman),col="red") abline(h=0,col="blue",lty=3)
- TAKEN FROM EXAMPLE:
plot(g[,"pdens"],jj*g[,"pdens"], main=paste("Effect of ",Ubb["pdens","label"],sep=""), xlab=Ubb["pdens","label"],ylab="Effect on Freq. Ext. War") lines(lowess(g[,"pdens"],jj*g[,"pdens"]),col="red") abline(h=0,col="blue",lty=3)
pca<-princomp(cbind(B$crr,B$pet,B$tlpop))
pca<-princomp(cbind(B$crr,B$pet,B$wltgrc)) ###########BEST MODEL
# 1* 2 3 pca$scores[,1:2] biplot(pca) summary(pca) #plot(pca) env1=pca$scores[,1] env2=pca$scores[,2] env1=pca$scores[1,] env2=pca$scores[2,] library(nnet) cor.test(LRB$shaman,pca$scores[,1]) #r=0.320 p-value = 0.0000000065 cor.test(LRB$shaman,pca$scores[,2]) #r=0.114 p-value = 0.04306 #if LRB$shaman=1 LRB$shaman=8
pca<-princomp(cbind(B$crr,B$pet,B$wltgrc,B$Wsoil))
# 1* 2 3 4*** 5** 6 7 8 9 pca$scores[,1:2] biplot(pca) summary(pca) #plot(pca)
Loadings:
Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8 Comp.9 Comp.10 [1,] 0.275 0.287 0.887 -0.167 0.147 [2,] -0.836 0.400 -0.317 -0.200 [3,] [4,] -0.192 -0.569 -0.788 0.102 [5,] -0.157 0.126 0.494 -0.203 0.799 0.177 [6,] [7,] [8,] 0.529 0.518 -0.441 -0.488 -0.118 [9,] -0.181 0.716 0.574
[10,] 0.102 -0.576 [11,] [12,] [13,] [14,] -0.288 0.216 [15,] [16,] 0.577 -0.523 [17,] [18,] [19,] [20,] [21,] 0.655 0.640 -0.382 [22,] [23,] [24,] 0.994 [25,] 0.104 -0.700 0.707 [26,]
Comp.11 Comp.12 Comp.13 Comp.14 Comp.15 Comp.16 Comp.17 Comp.18 [1,] [2,] [3,] 0.308 -0.929 -0.116 [4,] [5,] [6,] 0.129 -0.174 0.294 [7,] -0.149 -0.501 0.410 [8,] [9,] -0.285 -0.160 0.115
[10,] -0.753 -0.274 [11,] [12,] 0.587 0.688 [13,] -0.108 -0.101 [14,] -0.300 0.818 0.278 -0.143 [15,] -0.228 0.696 -0.147 0.470 0.139 [16,] 0.392 0.157 0.429 0.111 [17,] -0.111 0.159 [18,] -0.158 -0.392 0.429 [19,] -0.108 -0.691 [20,] -0.108 -0.691 [21,] [22,] 0.175 -0.781 0.188 -0.320 [23,] -0.192 0.551 -0.226 -0.170 [24,] [25,] [26,] -0.216 -0.131
Comp.19 Comp.20 Comp.21 Comp.22 Comp.23 Comp.24 Comp.25 Comp.26 [1,] [2,] [3,] [4,] [5,] [6,] -0.919 [7,] -0.117 -0.708 0.106 [8,] [9,]
[10,] [11,] 0.175 0.274 0.875 -0.342 [12,] 0.289 0.274 [13,] 0.194 0.291 0.917 [14,] [15,] -0.100 0.395 -0.106 [16,] [17,] 0.256 0.105 0.927 -0.111 [18,] 0.617 0.327 -0.347 [19,] 0.707 [20,] -0.707 [21,] [22,] 0.201 0.359 -0.103 [23,] 0.107 0.143 0.156 -0.690 0.161 [24,] [25,] [26,] 0.891 0.172 -0.288 -0.145
THIS HAS BEEN DONE
library(foreign)
setwd('/Users/drwhite/Documents/3sls/sccs/')
load("LRBinford.Rdata",.GlobalEnv)
write.csv(LRB, file = "LRBpresort.csv")
#hsb2 <- read.table('LRBpresort.csv', header=T, sep=",")
#attach(hsb2)
#hsb2[1:10, ]
#sorted hand with excel
LRBsorted <- read.csv("LRBsort.csv", header=TRUE)
LRBsorted <- read.csv("LRBsortPlusWvars.csv", header=TRUE)
save(LRBsortPlus, file="LRBsorted.Rdata")
LRB<-LRBinford LRB$subpop [1] "x" "n" "n" "n" "n" "n" "x" "x" "x" "x" "x" "x" "n" "n" "x" "x" "n" "x" "x" "n" "n" "x" "n" "n" "x" [26] "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "x" "n" "n" "n" "n" [51] "n" "n" "n" "n" "x" "n" "x" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" [76] "n" "n" "n" "n" "n" "n" "x" "n" "x" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" [101] "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" [126] "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" [151] "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "x" "x" "x" "n" "n" "n" "x" "n" "x" "x" "x" "x" "x" "x" [176] "x" "x" "x" "x" "x" "x" "x" "n" "n" "n" "x" "x" "x" "n" "x" "n" "x" "n" "n" "n" "n" "x" "x" "x" "x" [201] "x" "n" "x" "x" "x" "n" "x" "x" "n" "x" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "x" "n" [226] "n" "x" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "x" "x" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" [251] "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" [276] "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" [301] "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" [326] "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" LRB$systate3 [1] 1 2 2 2 6 4 2 2 4 2 4 4 1 4 3 3 4 2 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 [51] 4 4 4 4 2 4 2 4 4 4 4 4 4 4 1 4 4 4 4 4 4 1 4 4 1 1 1 4 4 4 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 [101] 6 6 6 6 5 4 6 6 5 5 3 3 7 6 3 3 3 3 5 1 4 4 4 6 7 4 6 4 4 4 4 6 5 4 5 3 4 7 6 6 4 6 5 6 5 6 4 4 4 6 [151] 6 4 4 4 4 4 4 4 4 6 4 3 2 3 2 4 4 2 4 3 3 3 3 3 3 2 3 3 3 3 3 3 5 6 7 4 2 2 4 2 7 5 4 4 4 4 3 3 3 3 [201] 3 4 2 2 2 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 4 4 4 4 6 4 4 4 5 2 6 4 5 6 6 6 6 5 6 6 6 [251] 7 5 6 6 6 7 4 6 6 6 6 6 5 6 6 6 6 6 6 6 6 5 6 6 6 6 6 6 6 6 6 6 6 6 5 4 6 4 4 4 4 4 6 4 5 5 6 6 6 6 [301] 6 7 6 6 7 7 6 6 6 6 7 7 6 7 7 7 7 7 7 7 7 7 7 6 7 7 6 6 6 6 6 6 7 6 5 6 7 7 6
noantrap mhs kmov kspmov lkspmov lkmov famsz sz1fam szjoint szcomu szmean forcol packing
library(foreign)
setwd('/Users/drwhite/Documents/3sls/sccs/')
SPSSforag <- read.spss("Binfords_HG_file_with_value_labels_current_Dec2008.sav", use.value.labels = FALSE)
#SPSSforag <- VauxSPSSforag
save(SPSSforag,file="Binford80vars339.RData") #not edited
write.table(SPSSforag,file="SPSSforag.csv",sep=",",row.names=T)
#save(SPSSforag,file="Binford80vars339.RData") #edited
#SPSSforag <- read.csv("Binford80vars339extra3a.csv", use.value.labels = FALSE)
SPSSforag <- read.csv("Binford80vars339extra3a.csv") #edited
s11=round(round(as.numeric(SPSSforag$s11,digits=0))/100, digits=0)
s11
v11=round(round(as.numeric(SPSSforag$v11,digits=0))/100, digits=0)
v11
table(v11,s11)
s11
v11 0 1 2 3
0 50 0 0 0 1 0 99 0 0 2 0 0 101 0 3 2 0 0 92 s=SPSSforag$s13 v=SPSSforag$v13 rs=1 rv=10 s=round(round(as.numeric(s,digits=0))/rs, digits=0) v=round(round(as.numeric(v,digits=0))/rv, digits=0) table(v,s) cor(v,s) s
v 0 1 2 3 4
0 34 0 0 0 0 1 1 93 0 0 0 2 0 19 122 0 0 3 0 0 21 39 0 4 1 0 0 0 14
> cor(v,s) [1] 0.9215152
s=SPSSforag$s14 v=SPSSforag$v14 rs=1 rv=1 s
v 0 1 2 3 4 5
0 37 0 0 0 0 0 1 1 97 0 0 0 0 2 0 36 79 0 0 0 3 0 0 1 49 0 0 4 0 0 0 2 32 0 5 1 0 0 0 1 8
> cor(v,s) [1] 0.9430436
s=SPSSforag$s15 v=SPSSforag$v15 rs=10 rv=10
s=round(round(as.numeric(s,digits=0))/rs, digits=0) v=round(round(as.numeric(v,digits=0))/rv, digits=0) table(v,s) cor(v,s) s
v 0 1 2 3 4 5 6
0 53 0 0 0 0 0 0 1 1 64 0 0 0 0 0 2 0 19 55 1 0 0 0 3 0 0 2 53 0 0 0 4 0 0 0 12 27 0 0 5 0 0 0 0 12 24 0 6 1 0 0 0 0 6 12 7 0 0 0 0 0 0 2
> cor(v,s) [1] 0.9624305
s=SPSSforag$s16 v=SPSSforag$v16 rs=10 rv=10
NONONO
s=SPSSforag$s18 v=SPSSforag$v18 rs=10 rv=10 s
v 0 1 2 3 4 5 6
0 127 20 8 2 3 5 3 1 0 22 5 0 0 0 1 2 2 0 31 2 0 0 0 3 3 0 1 25 0 0 0 4 2 0 0 2 27 3 0 5 2 0 0 1 2 24 0 6 1 1 0 2 0 11 6
> cor(v,s) [1] 0.7826439
> s=SPSSforag$v018 > v=SPSSforag$v18 > rs=10 > rv=.1 > > s=round(round(as.numeric(s))/rs, digits=0) > v=round(round(as.numeric(v))/rv, digits=0) > table(v,s)
s
v 0 1 2 3 4 5 6
0 165 0 0 1 0 0 2 1 3 25 0 0 0 0 0 2 0 4 31 0 0 0 0 3 0 0 1 28 0 0 0 4 0 0 0 0 34 0 0 5 0 0 0 0 0 29 0 6 1 0 0 0 0 0 20
> cor(v,s) [1] 0.9568045
NOW DO 26
s=SPSSforag$s27 v=SPSSforag$V27 rs=10 rv=10
cor(v,s) [1] 0.964592
s
v 0 1 2 3 4 5 6 7 8 9 10 11
0 208 3 3 2 3 0 1 3 4 1 2 0 1 3 5 4 0 0 0 0 0 0 0 0 0 2 0 0 8 4 0 0 0 1 0 0 0 0 3 0 1 0 4 6 1 0 0 0 0 0 0 4 0 0 0 0 4 8 1 0 0 0 0 0 5 2 1 0 0 0 2 10 0 0 0 0 0 6 2 0 0 0 0 0 3 13 0 0 0 0 7 1 0 0 0 0 0 0 0 7 0 2 0 8 2 0 0 0 0 0 0 0 1 8 0 0 9 1 0 0 0 0 0 0 0 0 0 7 1 10 0 0 0 0 0 0 0 0 0 0 0 1
> cor(v,s) [1] 0.7966424 > s=SPSSforag$s28
v=SPSSforag$v28 rs=100 rv=100 s
v 0 1 2 3
0 56 0 0 1 1 0 101 6 0 2 0 7 118 2 3 2 0 2 49 SPSSforag$subsp [1] 1 3 1 1 3 2 3 1 1 2 1 1 1 1 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 1 2 1 1 1 2 1 1 2 2 1 1 1 2 [83] 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 3 1 3 3 3 1 1 1 1 1 3 1 1 1 1 1 1 1 1 1 1 1 1 1 3 1 1 1 3 1 1 1 3 1 3 1 1 3 1 3 1 3 1 3 3 3 1 1 1 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2
[165] 2 3 2 2 2 2 2 2 2 3 2 2 2 2 2 2 2 2 3 3 3 2 3 2 2 2 3 2 2 3 3 3 2 2 2 2 2 2 2 2 3 3 2 3 3 3 2 2 2 3 3 2 3 3 3 3 2 3 3 2 2 1 2 3 2 3 2 2 2 3 3 2 3 3 2 2 2 2 2 2 2 2 [247] 2 2 2 2 3 2 2 2 3 3 3 2 2 2 3 2 3 3 2 2 2 2 2 2 3 3 3 2 3 3 3 1 3 2 3 1 2 3 3 2 2 2 2 1 1 2 1 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 [329] 3 3 3 3 3 3 3 3 3 3 3 attr(,"value.labels")
Herding Agriculture Aquatics Gathering Hunting
"5" "4" "3" "2" "1"
> table(VauxSPSSforag$subsp)
1 2 3 77 142 120
# setwd("C:/My Documents/sccs") #<-- changed
library(sccs)
data(sccs) #this works
# setwd("C:/My Documents/sccs") #<-- changed
# library(sccs)
# data(sccs) #this works
# Binford47vars339narrowCompleteOnlyNo5No7.xlsx
# Vauxforag <- Binford47vars339narrowCompleteOnlyNo5No7.csv")
# setwd("C:/My Documents/Binford/Best_files/Vaux5_7")
setwd('/Users/drwhite/Documents/3sls/sccs/')
Vauxforag <- read.csv("Binford47vars339extra1a.csv")
Vauxforag <- read.csv("Binford80vars339extra1a.csv")
> Vauxforag[1:12,1:91]
v1 v2 v3 v4 AtlasName Atlas. EA Lang v5 v6
1 NA NA SUBPOP VEGNU
2 NA NA p118 p118
3 1 Punan Indonesia ASIA PUNAN . . 1243 Mp suspect 1
4 2 Batek Philippines ASIA NA suspect 1
5 3 Kubu Indonesia ASIA KUBU. . . 1114 suspect 1
6 4 Shompen Nicobar Islands ASIA NA HGF only 1
7 5 Onge Andaman Islands ASIA NA HGF only 1
8 6 Jarwa Andaman Islands ASIA NA HGF only 1
9 7 Ayta-Pinatubo Philippines ASIA NA suspect 1
10 8 North Island Andaman Islands ASIA NA HGF only 1
11 9 Semang Malaysia ASIA SEMANG. . 148 suspect 1
12 10 Veddah Sri Lanka ASIA VEDDA . . 145 suspect 2
v7 v8 v9 v10 v11 v12 v13 v14
1 VEG SOIL SNOWAC GROWC CRR NAGP GATH HUNT
2 p118 p118 p86 p86 p60 p86 p118 p118
3 Tropical rain forest U 0 12 3444.31 4,738.19 65 30
4 Tropical rain forest H 0 12 2546.75 3,852.15 65 30
5 Tropical rain forest U 0 12 3138.13 4,603.80 70 25
6 Tropical rain forest O 0 12 3189.8 5,514.85 50 15
7 Tropical rain forest O 0 12 3081.84 4,400.17 35 20
8 Tropical rain forest O 0 12 2301.48 3,713.79 50 20
9 Tropical rain forest H 0 12 3051.2 2,922.91 87 8
10 Tropical rain forest O 0 12 1752.74 2,966.27 60 35
11 Tropical rain forest U 0 12 2560.7 3,315.37 50 40
12 equatorial savannah woodland/broadleaf tree savannah O 0 12 1728.16 3,251.70 65 30
v15 v16 v17 v18 v19 v20 v21 v22 v23 v24 v25 v26 v27
1 FISH SUBSP GRPPAT %polygyny HHSZ HHSZdry HHSZwet FAMSZ GROUP1 GROUP2 GROUP3 TLPOP DENSITY
2 p118 p118 p118 p281 p288 p288 p288 p288 p118 p118 p118 p118 p118
3 5 2 1 4 2.91 3.61 22 30 62 349 0.117905405
4 5 2 2 5 5.06 3.95 19 58 424 0.432653061
5 5 2 1 12 11800 0.092000624
6 35 2 1 7 3.05 3.62 19.3 31 342 0.397674419
7 45 3 1 0 14.04 2.08 26.00 3.11 10 23 70 393 0.401020408
8 30 2 1 0 17.86 3.72 32.00 8.6 25.8 65 255 0.447368421
9 5 2 2 8 11 35 200 1645 0.918994413
10 5 2 1 4.10 3.77 11 43 80 464 0.33381295
11 10 2 1 3 13.21 2.91 23.50 3.45 17 34 71 366 0.175961538
12 5 2 1 5 4.23 3.85 14 29 72 72 0.184615385
v28 v29 v30 v30.1 v31 v32 v33 v34 v36 v37 v38 v39 v40
1 AREAx1000 PET AE ET LAT LONG R HIGH R LOW CRR CMAT MCM MWM male Height
2 p118 p86 p86 p60 p60 p60 p60 p60 p60 p60 p60 p60 p183
3 2960 1637.66 1637.66 25.2727273 3 114 361.02 211.58 26.4 26 26.8 4 1532.5
4 980 1778.69 1445.65 25.1911021 10 119.11 445.26 26.04 27.84 26.89 28.78 5 1531
5 128260 1609.52 1609.52 24.3897764 3.04 102.69 378.42 141.98 26.2 25.5 26.89 1578
6 860 1794.49 1794.47 24.3272727 7 93.77 485.8 49.1 28.2 26.7 29.7 7 1594.5
7 980 1679.34 1566.25 24.275122 10.7 92.47 521.72 28.7 27.17 26.04 28.29 0
8 570 1650.56 1414.14 23.6888889 12.19 92.37 388.75 12.95 26.54 25.36 27.71 0
9 1790 1678.16 1224.17 23.3745455 15.5 120.33 753.87 9.14 26.89 25.39 28.39 8 1464
10 1390 1605.91 1235.08 23.2911628 13.32 92.89 391.92 0.02 26.49 25.11 27.86 1491
11 2080 1320.69 1320.69 23.2576419 5.86 101 349.44 144.08 24.6 24.02 25.18 3 1491
12 390 1779.68 1305.34 22.8852989 8.58 81.25 375.36 18.54 27.75 25.56 29.94 5 1484
v41 v42 v43 v44 v45 v46 v47 v48 CASE.. v49 v50
1 female Height male kg female kg MAAMm MAAMf TFR MALE PI AUSTRALIA CASE # AE BIO5
2 p183 p183 p183 p281 p281 MARLOWE MARLOWE? DOUG p86 BIOS
3 1468 0 1 1,637.66 56,660.51
4 1432 46.5 40.6 19.9 15.1 5.2 0 2 1,445.65 30,378.80
5 1462 0 3 1,609.52 56,157.12
6 1480 57.9 47.55 18 15 0 4 1,794.47 59,274.80
7 2.6 0 5 1,566.25 57,386.48
8 0 6 1,414.14 29,082.34
9 1388 14 12 0 7 1,224.17 57,366.23
10 1400 0 8 1,235.08 20,875.11
11 1408 19 14.5 4.85 80 0 9 1,320.69 50,350.34
12 1420 14 0 10 1,305.34 17,266.13
v51 v52 v53 v54 v55 v56 v57 v58 v339list X X.1 v59
1 BAR5 WATD WLTGRC WATRET DEFPER NOMOV DISMOV HYTYP NA NA NAME SYSTATE3
2 BARS p86 p86 p86 p86 p86 p288col 3 NA NA p320
3 11.96 0 0 174.52 0.00 45 250 (Fn) 1 1 Punan 2
4 7.89 333.05 3 84.98 25.00 6 50 (Fn) 2 2 Batek 2
5 12.12 0 0 174.52 0.00 7 140 3 3 Kubu 3
6 10.75 0 0 306.71 0.00 (N) 4 4 Shompen 3
7 13.05 113.09 0 259.95 25.00 (M) 5 5 Onge 4
8 7.83 236.42 2 196.74 41.67 M 6 6 Jarwa 4
9 19.63 453.99 4 65.80 41.67 7 7 Ayta-Pinatubo 2
10 7.04 370.84 3 178.00 41.67 8 8 North Island 4
11 15.19 0 0 174.52 0.00 N 9 9 Semang 4
12 5.31 474.32 2 129.28 41.67 Fm 10 10 Veddah 4
v60 v61 v62 v63 v64 v65 v66 v67 v68 v69 v70
1 MHSSET MHSSET2 FAMHOUS G2MHS G2HMSET2 G2HMSET3 G2BASORD PREVALUE PREDG2MH GIFAMSZ G2FAMSZ
2 p320 p320 p320 p320 p320 p320 p320 p320 p320 p320 p320
3 3 2 1.68 4.93 1 1 3 7.98 6.35 6.09 8.31
4 5 1 1.30 11.42 5 3 6 4.20 4.21 4.88 14.91
5 9
6 5 1 0.84 10.16 6 3 4 3.75 3.75 5.33 8.56
7 1 2 3.27 7.52
8 1 2
9 5 1 3.82 2.72 8.66
10 5 1 3.95 2.92 11.41
11 1 2 5.21 10.43
12 5 1 5.07 3.64 7.53
v71 v72 v73 v74 v75 v76 v77 v78 v79 c80 X.2
1 GIMHS POLYSCAL POLPOS CLASS PEROGAT MONEY OCCSPE COMMUN COMSTFUN OWNERS NO.
2 p320 p320 p320 p320 p320 p320 p320 p320 p320 p320 Case
3 2 3 1 2 2 1 4 1 1 1
4 3.752 2 3 1 1 1 1 2 1 4 2
5 2 3 1 1 1 1 5 1 1 3
6 6.332 2 3 1 1 1 1 3 1 2 4
7 0.882 2 1 1 1 1 1 1 1 1 5
8 0.272 2 1 1 1 1 1 1 1 1 6
9 2.622 2 3 1 1 1 1 1 1 3 7
10 4.392 2 1 1 1 1 1 4 1 1 8
11 0.592 2 3 1 1 1 1 4 1 1 9
12 3.312 2 3 1 1 1 1 5 1 1 10
X.3 X.4
1 STATE NAME
2
3 Indonesia Punan
4 Philippines Batek
5 Indonesia Kubu
6 Nicobar Islands Shompen
7 Andaman Islands Onge
8 Andaman Islands Jarwa
9 Philippines Ayta-Pinatubo
10 Andaman Islands North Island
11 Malaysia Semang
12 Sri Lanka Veddah
save(Vauxforag,file="BinfordVaux.RData") #<-- March 10, 2012 data(Vauxforag) #doesnt work but no mind for now (need with main program) data(BinfordVaux)
library(stats) princomp(Vauxforag) #works
#now make VauxB.csv from the principal components
setwd("C:/My Documents/Binford/Best_files/Vaux5_7")
VauxB <- read.csv("VauxB.csv")
save(VauxB,file="VauxB.RData")
save(VauxB,file="VauxB.RData")
Extract results from princomp
Std and Rotation #http://stat.ethz.ch/R-manual/R-devel/library/stats/html/princomp.html ##https://stat.ethz.ch/pipermail/r-devel/1998-August/018454.html 1998 revised ###http://www.stat.ualberta.ca/~kcarrier/STAT575/PCA_R.doc Labdsv library #Modify their example: princomp(Vauxforag, cor = TRUE) # =^= prcomp(Vauxforag, scale=TRUE) prcomp(Vauxforag, scale=TRUE) #THESE ARE ALL THE RESULTS NEEDED STOP HERE!!! ## Similar, but different: ## The standard deviations differ by a factor of sqrt(49/50) summary(pc.cr <- princomp(Vauxforag, cor = TRUE)) loadings(pc.cr) ## note that blank entries are small but not zero print(pc.cr, digits = 4, cutoff = 2.5) ## note that blank entries are small but not zero prcomp(Vauxforag, scale=TRUE) plot(pc.cr) # shows a screeplot. biplot(pc.cr) #loadings(x)
## S3 method for class 'loadings': #print(x, digits = 3, cutoff = 0.1, sort = FALSE, ...)
## S3 method for class 'factanal': #print(x, digits = 3, ...)
## Formula interface
princomp(~ ., data = Vauxforag, cor = TRUE)
## NA-handling
Vauxforag[1, 2] <- NA
pc.cr <- princomp(~ Murder + Assault + UrbanPop,
data = Vauxforag, na.action=na.exclude, cor = TRUE)
pc.cr$scores[1:5, ] #works
Chapter 2. Specific databases
Foragers
setwd("C:/My Documents/Binford/best_files")
forag <- read.csv("Binford47vars339.csv")
save(forag,file="Binford47vars339.RData")
#try:
forag <- read.csv("bin.csv")
save(forag,file="bin.RData")
save(forag,file="Binford47vars339.RData")
Hamilton, Marcus J., Bruce T. Milne, Robert S. Walker, Oskar Burger, and James H. Brown. 2007. The complex structure of hunter–gatherer social networks. Proceedings of the Royal Society B (UK). James H. Brown is known for The fractal nature of nature: power laws, ecological complexity and biodiversity with Vijay K. Gupta, Bai-Lian Li, Bruce T. Milne, Carla Restrepo and Geoffrey B. West.] Data supplement.
Read and change an entry in an Rdata file
edit the csv file
setwd("C:/My Documents/MIjaj")
jajman<-read.csv("jaj.csv")
save(jajman,file="jajman.Rdata")
EduMod85 - EduMod86
Chapter 3. Extending the database
Use Microsoft Access to