Anthon Eff

From InterSciWiki

Doug, I created this page in your wiki: http://intersci.ss.uci.edu/wiki/index.php/Materials_for_cross-cultural_research Anthon

If you check the pages for EA, SCCS, LRB, and WNAI, you will see that the codebooks appear: http://capone.mtsu.edu/eaeff/Dow-Eff%20functions.html

May 17 2017

Doug, that's a great idea. I remember editing a bit on your wiki some years ago, before I was locked out. So there are some older versions of EA, SCCS, etc. that I posted there. I can probably do that again, if I had access.

The problem right now appears to be that Dropbox has changed the way it handles publicly shared files. I think the best thing would be to try to offer people two paths--one through your excellent resources, the other through my Dropbox.

I'll try to figure out the Dropbox angle tomorrow.

Anthon

From: Doug White <douglas.white@uci.edu> Sent: Tuesday, April 18, 2017 6:30 PM To: Anthon Eff; Jeff Stern Subject: vulnerable online data at your personal site; could create

Dear Anthon,

Almost everything on the internet can be removed by others. This is beginning to affect our items online, including yours, I think (e.g. cross-cultural data, codebooks, etc., unless appropriately protected). As far as I can see none of your sites are protected, though I may be wrong.

I wanted you to know that my wiki is unusual in that all its items are immune to destruction via a giant immune "stack" of vast size, donated by Jeff Stern at UCI (949 824 2326), contributed in a way that includes my wiki, with specific items protected. Saved in a protected stack, it is immune to destruction. I use my name, for example, Douglas R. White, on our intersciwiki, and you can add your own password or I can loan you mine (or make you an Anthon Eff name and its password) so that you could create pages in the wiki (www) that are immune to destruction over the wikiweb (intersciwiki) once I give you the name and password that you want. Outside intersciwiki nearly everything in the open www is susceptible to destruction from outside and from outsiders without passwords.

Best, Doug

April 21, 2017

  • Doug: Couldn't figure out how to update the DropBox links, so I simply changed the links from DropBox to our university mainframe. I didn't have enough time to finish editing today, but if you check the pages for EA, SCCS, LRB, and WNAI, you will see that the codebooks appear: http://capone.mtsu.edu/eaeff/Dow-Eff%20functions.html

Tried to login to intersciwiki, but it didn't work. I'll try again tomorrow.

  • Anthon
  • Jon, can you help us with this? thanks, Doug
  • If you want to give Anthon his password too, that's fine; I don't need to know it.

April 19, 2017a

Where now are the equivalents of your dropboxusercontent.com/u/9256203/SCCScodebook.txt , the dataset(s) for http://socscicompute.ss.uci.edu , and the CoSSci components?

I'm going to try to put them together again at intersciwiki (did you get my email this AM providing you access to the wiki with an internal password?). We've had five years of usage of "CoSSci" components in a SocSci class in China taught by one of my former TAs (Ren Feng), now a tenured professor (time flies). He's teaching again in the fall. Meanwhile I've completely recovered from the stomach cancer, gracias a dios.

Best, Doug

April 19, 2017b

Doug, that's a great idea. I remember editing a bit on your wiki some years ago, before I was locked out. So there are some older versions of EA, SCCS, etc. that I posted there. I can probably do that again, if I had access.

The problem right now appears to be that Dropbox has changed the way it handles publicly shared files. I think the best thing would be to try to offer people two paths--one through your excellent resources, the other through my Dropbox.

I'll try to figure out the Dropbox angle tomorrow.

Anthon


April 18, 2017

  • Malcolm and Doug: The author appears to be a student of Gary King. The paper is in the spirit of the recent “replication” papers in psychology. From the abstract:

Political scientists increasingly recognize that multiple imputation represents a superior strategy for analyzing missing data to the widely used method of listwise deletion. However, there has been little systematic investigation of how multiple imputation affects existing empirical knowledge in the discipline. This article presents the first large-scale examination of the empirical effects of substituting multiple imputation for listwise deletion in political science. The examination focuses on research in the major subfield of comparative and international political economy (CIPE) as an illustrative example. Specifically, I use multiple imputation to reanalyze the results of almost every quantitative CIPE study published during a recent five-year period in International Organization and World Politics, two of the leading subfield journals in CIPE. The outcome is striking: in almost half of the studies, key results “disappear” (by conventional statistical standards) when reanalyzed.

Hope all is well…

Anthon

SCCS


home page - Middle Tennessee State x2387, x2520 for Economics & Finance Chair -- CV

Where to put the R package: Anthon

Ben: Speaking for myself only, I think the more people who work with these data the better, so I welcome your proposal.

So far, we have not put our data in an R package, but instead put them in an R workspace, which can be loaded from the cloud or downloaded to one's own machine. The workspace contains not only the latest version of the data (for LRB, SCCS, EA, and WNAI) but also a set of functions that allow models to be estimated correcting for Galton's Problem, in a multiple imputation context.

The R command to load the workspace: load(url("http://capone.mtsu.edu/eaeff/downloads/DEf01f.Rdata"))
The manual for the functions: https://dl.dropboxusercontent.com/u/9256203/Manual_DEf.pdf
My webpage where this kind of thing is listed: http://capone.mtsu.edu/eaeff/Dow-Eff%20functions.html

--Anthon
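To illustrate what estimating "in a multiple imputation context" usually involves, here is a minimal Python sketch of Rubin's rules for pooling one coefficient across imputed datasets. It is not taken from the workspace functions; whether they pool in exactly this way is an assumption, and the function name `pool_rubin` and the numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets (Rubin's rules).

    estimates: the coefficient estimated on each imputed dataset
    variances: the squared standard error from each imputed dataset
    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    pooled = mean(estimates)              # average of the m estimates
    within = mean(variances)              # average within-imputation variance
    between = variance(estimates)         # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)


# Hypothetical values for three imputed datasets
estimate, std_error = pool_rubin([0.50, 0.55, 0.45], [0.010, 0.012, 0.011])
print(round(estimate, 3), round(std_error, 3))
```

The pooled standard error exceeds the average per-dataset standard error because the between-imputation term carries the extra uncertainty due to the missing values.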


-----Original Message-----

From: Benjamin Marwick Sent: Sunday, July 31, 2016 6:42 PM To: ajohnson@truman.edu; drwhite@uci.edu; Anthon Eff <Anthon.Eff@mtsu.edu> Subject: reuse of Binford’s hunter-gatherer data in an R package

Dear Profs. Johnson, White, and Eff,

I'm writing about a public release of an R package containing the data used in Binford's 2001 book "Constructing Frames of Reference: An Analytical Method for Archaeological Theory Building Using Ethnographic and Environmental Data Sets". The package would be freely available and would make it easier for R users to work with these data. I'm motivated to do this by wanting to make it easier to use these data in my graduate seminar class.

I found the datasets on these web pages: http://ajohnson.sites.truman.edu/data-and-program/ & http://intersci.ss.uci.edu/ and I just wanted to check with you if it's ok to make a package like this. I would of course list you all as co-authors of the package because of your role in compiling the data.

The main reason I want to check is that I'm not sure about your intentions for the copyright status of these data files. I understand that in the US copyright law does not apply to data, but can apply to an original selection, arrangement, etc of data, such as what you provide on your websites.

I also understand that when data are provided without a license, then the default copyright laws apply, which usually means no permission to use, modify, or share the software. My guess is that this is an oversight in this case, since if you're making the data freely available, then you're happy for others to download and use the data.

If that's the case, may I suggest adding a license to the data? This can be as simple as including a statement like this near the link to download the data files:

"These data are released here with a CC0 license, for the full terms, see https://creativecommons.org/publicdomain/zero/1.0/ You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission."

There are many possible licenses (you may prefer one that requires attribution, e.g. CC-BY), but CC0 is recommended for scientific datasets. Victoria Stodden is a good authority on these issues: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1362040

If I might further suggest, you could archive a copy of the data files at zenodo.com (or similar data repository, there are several to choose from). If you create a repository at zenodo (or similar), you can get a DOI, which is useful for citing data (they are more persistent than faculty or project websites, where the URL often changes every 5-10 years). A data repository also makes versioning explicit, so if you update the dataset (for example, with minor corrections), then you can release a new version to zenodo and get a new DOI. Then it's very clear who is using what version, because each version is tagged with a DOI.

Anyway, these are just some thoughts based on what I see others doing in related fields (linguistics, biology, etc.). They might help make these important data easier to maintain, more visible to potential users, and more accessible and usable.

To return to my main reason for writing, if you have any questions about my proposed R package, please let me know. I see there is already an R package for the SCCS data (https://github.com/drwhite/Statistical-Inference-SCCS) so I think a package of the data for Binford's book would be a good fit.

warm regards,

Ben

-- Ben Marwick, Associate Professor, Department of Anthropology Box 353100, University of Washington Seattle, WA 98195-3100 USA

(+1) 206.552.9450 bmarwick@uw.edu @benmarwick http://faculty.washington.edu/bmarwick/

Could do

ToDo Did *.csv: Added the headers on h[]

2013 Monogamy

Dow, Malcolm M., and E. Anthon Eff. 2013. "When One Wife is Enough: A Cross-Cultural Study of the Determinants of Monogamy." Journal of Social, Evolutionary, and Cultural Psychology 7(3):211-238. (Link)

2013 article with map app that shows autocorrelation regions

EA Eff, Abhradeep Maiti. 2013. A measure of technological level for the Standard Cross-Cultural Sample. Department of Economics and Finance Working Paper Series. http://capone.mtsu.edu/berc/working/SCCStechnology.pdf

Madagascar case: #81 Tanala http://eclectic.ss.uci.edu/~drwhite/worldcul/Sccs34.htm

Imputing missing data

Previous draft in pdf

  1. pp 94 E. Anthon Eff and Malcolm M. Dow. Updated scripts for R in Eff and Dow (2009) Issue 3#1 art 1. Structure_and_Dynamics_contents#Issue_5#2_2012

External war

Anthon Eff & Philip W. Routon. 2012. Farming and Fighting: An Empirical Analysis of the Ecological-Evolutionary Theory of the Incidence of Warfare. Structure and Dynamics 5(2). Structure_and_Dynamics_contents#Issue_5#2_2012

Routon Eff 1.R update

Parental Investment and per capita GDP

Eff, E. Anthon and Giuseppe Rionero. 2011. The Motor of Growth? Parental Investment and per capita GDP. World Cultures eJournal, 18(1).

Belief in Moralizing Gods

http://www.ehbonline.org/article/S1090-5138(02)00134-4/abstract

Markets and prosocial behavior

  • Henrich, Joseph, Robert Boyd, Samuel Bowles, Colin Camerer, Ernst Fehr, Herbert Gintis, et al. (2004). Foundations of Human Sociality: Economic Experiments and Ethnographic Evidence from Fifteen Small-Scale Societies. Oxford University Press. GROUP SELECTION
Henrich et al. (2004: 33-35) found that individual-level data such as age, sex, exposure to markets, and wealth do not explain individual-level variations in offers *(prosocial behaviors in cooperative games)* and rejections; the crucial determinants are group-level measures in market integration <fn 1> and payoffs to cooperation.
fn 1: The term “Market Integration,” as employed by Henrich et al. (2004: 28-29), refers to a composite measure containing three variables: 1) frequency of market exchange (market integration); 2) amount of centralized decision-making taking place above the household (sociopolitical complexity); and 3) size of local settlements (settlement size). Each of these variables is formulated as the rank of a particular society (among all of the Henrich et al. societies) for that dimension. The composite is simply the mean of the three ranks.
On the other hand, querying players on their ideological or social preferences typically yields variables that explain game results quite well (Henrich et al. 799). What is new is the notion that results could vary so much across different societies, and that market integration of each society could explain a significant portion of that variation.
Market integration, as formulated by Henrich et al., reflects the frequency and importance of contact with strangers. Societies organized in semi-autarkic households, with scant need to interact with strangers, might therefore exhibit little prosocial behavior when interacting with the anonymous other of a game (Henrich et al. 2004: 40). Likewise, cooperation with non-kin would be low in societies with semi-autarkic households, since the household can furnish most of its own needs. From this perspective, prosocial behavior would be emphasized in societies where persons come in frequent contact with others and require that contact in order to gain their livelihood. For example, the society with the most prosocial behavior in the Henrich et al. sample (the Lamalera) engages in whale hunting, which requires that men work together in boat crews, dividing the catch (Henrich et al. 2004: 39).
Abstract: Recent experimental games conducted by ethnographers (Henrich et al. 2004) have shown that groups with higher levels of market integration exhibit higher levels of prosocial behavior. In order to see whether these results are confirmed in a broader ethnographic sample, this paper draws from the Standard Cross-Cultural Sample variables measuring the degree to which a culture seeks to inculcate generosity, honesty, and trust. Using these as dependent variables, models are developed where market-related variables are among the independent variables. The paper uses the methodology developed by Dow (2007) to correct for Galton’s problem, and uses multiple imputation to deal with the problem of missing data. The results fail to confirm a systematic association between generalized prosocial behavior and market integration.
  • Eff, E. Anthon, and Malcolm M. Dow. 2010. Market integration and pro-social behavior. In, Cooperation in Economy and Society, Robert C. Marshall, Editor. Cooperation in Economic and Social Life. Society for Economic Anthropology Monographs Vol 28. AltaMira Press: Walnut Creek, CA.
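The composite described in fn 1 (the mean of three ranks) can be sketched in a few lines of Python. This is only an illustration: the data values are hypothetical, and two details are assumptions not stated in the text, namely that rank 1 goes to the lowest value and that tied societies share the average rank.

```python
def ranks(values):
    """Rank values from 1 (smallest); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # extend the run while the next value ties with the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1      # average of the tied rank positions
        for k in range(start, end + 1):
            result[order[k]] = shared
        start = end + 1
    return result


def composite_market_integration(market, complexity, settlement):
    """Mean of the three per-dimension ranks for each society."""
    columns = [ranks(market), ranks(complexity), ranks(settlement)]
    return [sum(col[i] for col in columns) / 3 for i in range(len(market))]


# Hypothetical scores for four societies
market = [0.1, 0.5, 0.9, 0.5]        # frequency of market exchange
complexity = [1, 3, 2, 4]            # sociopolitical complexity
settlement = [50, 200, 1000, 100]    # settlement size
print(composite_market_integration(market, complexity, settlement))
```

Because each dimension enters only through its rank, the composite is insensitive to the units in which the three variables are measured.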

Galton's problem

Wikipedia:Galton's problem

Comparative_research_tools#Anthon Eff's SAR Procedures Simultaneous AutoRegression
  • Dow, Malcolm M., and E. Anthon Eff. 2009 Multiple Imputation of Missing Data in Cross-Cultural Samples. Cross-Cultural Research, Vol. 43(3): 206-229 (2009) DOI: 10.1177/10693971093333. Abstract: Listwise deletion of cases with missing data prior to statistical analysis, the approach overwhelmingly used by cross-cultural survey researchers, requires the assumption that the missing data are missing completely at random. This assumption is not often likely to hold for cross-cultural sample data, and when it fails statistical analysis based only on complete-case subsamples introduces the possibility of biased estimates and standard errors. Over the past 20 or so years statisticians have made major advances in specifying the conditions under which missing data can be ignored when making inferences based on incomplete data. We review these conditions since they have a direct bearing on when the usual approaches to dealing with missing cross-cultural survey data are invalid.
  • Dow, Malcolm M. and E. Anthon Eff 2008 Global, Regional, and Local Network Autocorrelation in the Standard Cross-Cultural Sample Cross-Cultural Research 42(2):148-171 DOI: 10.1177/1069397107311186. Abstract. There is now considerable evidence in the cross-cultural literature that cultural networks need not be based strictly on spatial propinquity but may develop along other dimensions such as common language, religion, and levels of cultural complexity. In this article, the authors generate networks based on sociocultural distance metrics for these three network dimensions in addition to the usual geographical distance measure and a measure of overall ecological niche similarity. The authors report overall levels of autocorrelation for all five networks using 1,156 Standard Cross-Cultural Sample (SCCS) variables at the global level and for a subset of 422 variables within four regions. The extent to which cultural trait distributions appear to be influenced by combinations of network processes also are assessed. Results from an analysis based on a local autocorrelation statistic provide confirmation of the regional levels of autocorrelation within the SCCS data set.
  • Dow, Malcolm M, and E. Anthon Eff. 2009a. Cultural Trait Transmission and Missing Data as Sources of Bias in Cross-Cultural Survey Research: Explanations of Polygyny Re-examined. Cross-Cultural Research. 43(2): 134-151. Abstract: Ember, Ember, and Low (2007) recently reported male mortality in warfare and environmental pathogen stress as statistically significant predictors of nonsororal polygyny. Two sources of bias can be identified in their data analysis: 1) omitted variable bias due to not including a variable for cultural trait transmission, that is, Galton's Problem; and 2) bias caused by extensive deletion of cases when the basic assumption required by listwise deletion, that the missing data are missing completely at random, does not hold. We first re-estimated Ember et al.'s model after adding a trait transmission variable using the listwise deletion subsample, and then again after using contemporary multiple imputation procedures to deal with missing data. Our findings indicate that the significant effects reported for male mortality and pathogen stress are the result of these two sources of bias. The only significant predictor of the world-wide distribution of nonsororal polygyny in the current analyses is cultural trait transmission.
The Cultural Trait Transmission variables here are developed in Dow (2007) and correspond to vertical transmission (language family proximity) and horizontal transmission (spatial proximity) in the Standard Cross-Cultural Sample.
See: http://ccr.sagepub.com/cgi/reprint/41/4/428
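A trait transmission variable of this kind is typically a network-lagged version of the dependent variable: for each society, a weighted average of the trait among the societies it is close to. The Python sketch below shows only that lag term, with a hypothetical three-society weight matrix; the estimation procedure in Dow (2007) is not reproduced here, and the function names are illustrative.

```python
def row_normalize(weights):
    """Scale each row of a proximity matrix so that it sums to 1."""
    normalized = []
    for row in weights:
        total = sum(row)
        normalized.append([w / total if total else 0.0 for w in row])
    return normalized


def network_lag(weights, y):
    """Return Wy: each society's proximity-weighted average of y among the others."""
    w = row_normalize(weights)
    return [sum(wij * yj for wij, yj in zip(row, y)) for row in w]


# Hypothetical example: society 0 is linked to societies 1 and 2
proximity = [
    [0, 1, 1],
    [1, 0, 0],
    [1, 0, 0],
]
trait = [1.0, 2.0, 4.0]
print(network_lag(proximity, trait))
```

One lag can be built per proximity matrix (language, distance), which is how separate vertical and horizontal terms would enter a model.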

World Cultures 15#2

Eff, E. Anthon. 2004. Does Mr. Galton Still Have a Problem? Autocorrelation in the Standard Cross-Cultural Sample. World Cultures 15(2):153-170. http://www.mtsu.edu/~eaeff/downloads/EffsWC15no2.pdf

There are three programs.

The article uses Eff's SAR Procedures

Structure and Dynamics 2008 and R programs

Eff, E. Anthon. 2008. "Weight Matrices for Cultural Proximity: Deriving Weights from a Language Phylogeny." Structure and Dynamics: eJournal of Anthropological and Related Sciences 3(2), Article 9. http://repositories.cdlib.org/imbs/socdyn/sdeas/vol3/iss2/art9

Bivand, Roger, with contributions by Luc Anselin, Olaf Berke, Andrew Bernat, Marilia Carvalho, Yongwan Chun, Carsten Dormann, Stéphane Dray, Rein Halbersma, Nicholas Lewin-Koh, Jielai Ma, Giovanni Millo, Werner Mueller, Hisaji Ono, Pedro Peres-Neto, Markus Reder, Michael Tiefelsdorf and Danlin Yu. 2007. spdep: Spatial dependence: weighting schemes, statistics and models. R package version 0.4-4

http://sal.uiuc.edu/csiss/Rgeo/

Bivand, R. 2006. Implementing spatial data analysis software tools in R. Geographical Analysis 38: 23-40.

Loftin, Colin 1972 Galton's problem as spatial autocorrelation: comments on Ember's empirical test Ethnology 11: 425-35.

Loftin, Colin, Robert H. Hill, Raoul Naroll, Enid Margolis 1976 Murdock-White Interdependence Alignment of Ethnographic Atlas Culture Clusters Cross-Cultural Research 11(3): 213-223.

Loftin, Colin, and Sally K. Ward. 1981 Spatial Autocorrelation Models for Galton's Problem Cross-Cultural Research, Vol. 16, No. 1-2, 105-141.

White, Douglas R., Michael L. Burton, and Malcolm M. Dow. 1981. “Sexual Division of Labor in African Agriculture: A Network Autocorrelation Analysis.” American Anthropologist. 83:824-849.

Proximity measures

The first program creates three proximity matrices: physical distance, language phylogeny, and cultural complexity. Cultural complexity is a Euclidean distance measure based on 10 SCCS variables, and one could easily rewrite the program a bit to make other proximity measures based on SCCS variables--I've tried things like subsistence and ecology. http://www.mtsu.edu/~eaeff/downloads/mkwmat.sas
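For readers without SAS, the idea behind the cultural complexity matrix can be sketched in Python: compute Euclidean distances between societies over a set of variables, then turn distance into proximity. This is not a translation of mkwmat.sas. The conversion 1 / (1 + distance) and the two-variable data are assumptions made for the illustration, and the SAS program may scale or transform the variables differently.

```python
from math import sqrt


def distance_matrix(rows):
    """Euclidean distance between every pair of societies.

    rows: one list of variable scores per society.
    """
    n = len(rows)
    return [
        [sqrt(sum((a - b) ** 2 for a, b in zip(rows[i], rows[j]))) for j in range(n)]
        for i in range(n)
    ]


def proximity_matrix(dist):
    """Convert distances to proximities, with zeros on the diagonal.

    The 1 / (1 + d) transform is an assumption chosen for this sketch.
    """
    n = len(dist)
    return [
        [0.0 if i == j else 1.0 / (1.0 + dist[i][j]) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical scores on two variables for three societies
scores = [[1, 2], [1, 2], [4, 6]]
d = distance_matrix(scores)
w = proximity_matrix(d)
print(d[0][2], w[0][2])
```

Swapping in other variables (subsistence, ecology) only changes the rows passed to `distance_matrix`.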

The 186x186 language similarity matrix can be downloaded from http://intersci.ss.uci.edu/wiki/drw/AnthonEff/wmlang02112006.xls and earlier codes from http://intersci.ss.uci.edu/wiki/drw/AnthonEff/wmlngp.xls and http://intersci.ss.uci.edu/wiki/drw/AnthonEff/wmlngs.xls

Eff, E. Anthon. 2004. Spatial, Cultural, and Ecological Autocorrelation in U.S. Regional Data. MTSU Department of Economics and Finance Working Papers. September 2004. (Link)

Regression residuals

The second program tests regression residuals for autocorrelation, using the three weight matrices. IML is used to calculate the Moran statistic and its z value. I enter the IML code directly in the program, rather than calling it from a module--I think that makes it a bit easier to understand the program, though it is still pretty complicated. http://www.mtsu.edu/~eaeff/downloads/ac_resid.sas

Tests of SCCS variables for autocorrelation

The third program tests a set of SCCS variables for autocorrelation, using the three weight matrices. http://www.mtsu.edu/~eaeff/downloads/ac_variab.sas
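As a rough guide to what such a test computes, here is a Python sketch of Moran's I for a single variable with its z value under the normality assumption. It is not the IML code from the SAS programs, and the four-society chain is hypothetical. For regression residuals, as in the second program, the expected value and variance of the statistic differ from the ones used here.

```python
from math import sqrt


def morans_i(x, w):
    """Moran's I and its z value (normality assumption) for variable x.

    x: list of n values; w: n x n weight matrix with a zero diagonal.
    """
    n = len(x)
    mean_x = sum(x) / n
    z = [v - mean_x for v in x]

    s0 = sum(sum(row) for row in w)                      # sum of all weights
    cross = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    i_stat = (n / s0) * cross / sum(v * v for v in z)

    expected = -1.0 / (n - 1)
    s1 = 0.5 * sum((w[i][j] + w[j][i]) ** 2 for i in range(n) for j in range(n))
    s2 = sum(
        (sum(w[i]) + sum(w[j][i] for j in range(n))) ** 2 for i in range(n)
    )
    var = (n * n * s1 - n * s2 + 3 * s0 * s0) / ((n * n - 1) * s0 * s0) - expected ** 2
    return i_stat, (i_stat - expected) / sqrt(var)


# Hypothetical chain of four societies, each linked to its neighbours
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
values = [1, 2, 3, 4]
print(morans_i(values, weights))
```

Running the function once per weight matrix (distance, language, cultural complexity) gives one statistic per network, which matches the structure of the tests described here.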


The SAS dataset

I also include the SAS data set as it was when I last worked with the SCCS (probably two years ago). http://www.mtsu.edu/~eaeff/downloads/mrg.sas7bdat

I run SAS on a UNIX mainframe. By changing the path in the libname statement, you should be able to run these programs on any machine, but I haven't tried.

-- E. Anthon Eff, Associate Professor
Dept. of Economics, Middle Tennessee State University
Box X050, Murfreesboro TN 37132
Phone: 615.898.2387
http://www.mtsu.edu/~eaeff/

The Spss dataset in R

through statehoodkoro and new combative sports (http://dl.dropbox.com/u/9256203/ccc.txt) codebook... at the end are environmental variables

Below is a link to the data I usually use, with the addition of the new data you sent me. The data frame includes some variables I pulled from GIS data, as well as my language phylogeny variables. Variables that only make sense as characters are formatted as characters, the others are numeric. None are formatted as factors.

http://dl.dropbox.com/u/9256203/sccsA.Rdata

I've written some R code that will produce a rough draft of a codebook. The headings include a few things that I thought would be useful--whether the variable is character or numeric, and whether it is categorical or ordinal (or unknown). Other information could be added fairly easily, such as the author of the code, to make citing easier. I think this would be the easiest way to produce the first draft of a codebook, and would be very glad to do this when the new data officially come out. Below is the codebook for the data above.

http://dl.dropbox.com/u/9256203/ccc.txt


sor.oth - numeric - ordinal - Dummy indicating sororal polygyny predominant
  row  count  value
    1    116      0
    2     44      1
    3     26    NaN

v0. - numeric - unknown - v0.StatesPaige
  row  count  value
    1     92      1
    2     94      2

agricsystempryor - numeric - unknown - AgricSystemPryor
  row  count  value
    1      8      1
    2     14      2
    3      6      3
    4     13      4
   NA    145     NA

agricsysreliability - numeric - unknown - AgricSysReliability
  row  count  value
    1     31      1
    2      5      2
    3      4      3
    4      1      4
   NA    145     NA

statehoodkoro - numeric - unknown - statehoodKoro
  row  count  value
    1     45      0
    2     17      1
    3     19      2
    4     10      3
    5     11      4
   NA     84     NA

id66 - numeric - unknown - id66 n 186 mean 94.5 sd 53.84 med. 94.5 min 2 max 187 skew 0 kur. -1.22

overwar - numeric - unknown - GC Combative Sports: overwar

... continues
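The logic behind entries like these can be sketched as follows. This is a Python illustration, not the R code mentioned above: the function name, the `max_levels` cutoff, and the exact layout are assumptions. Variables with few distinct values get a frequency table, and the rest get summary statistics.

```python
from collections import Counter
from statistics import mean, median, stdev


def codebook_entry(name, values, label, scale="unknown", max_levels=10):
    """Draft one codebook entry; None marks a missing value."""
    present = [v for v in values if v is not None]
    kind = "character" if any(isinstance(v, str) for v in present) else "numeric"
    lines = [f"{name} - {kind} - {scale} - {label}"]

    levels = Counter(present)
    if kind == "character" or len(levels) <= max_levels:
        # few distinct values: list each value with its count
        for value in sorted(levels):
            lines.append(f"{levels[value]:>6}  {value}")
        missing = len(values) - len(present)
        if missing:
            lines.append(f"{missing:>6}  NA")
    else:
        # many distinct values: report summary statistics instead
        lines.append(
            f"n {len(present)} mean {mean(present):.2f} sd {stdev(present):.2f} "
            f"med. {median(present)} min {min(present)} max {max(present)}"
        )
    return "\n".join(lines)


# Counts taken from the statehoodkoro entry above
koro = [0] * 45 + [1] * 17 + [2] * 19 + [3] * 10 + [4] * 11 + [None] * 84
print(codebook_entry("statehoodkoro", koro, "statehoodKoro"))
print(codebook_entry("id66", list(range(2, 188)), "id66"))
```

Extra fields, such as the author of a code, could be added as further parameters and appended to the heading line.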

Spatial geography blog r.geo

http://blog.gmane.org/gmane.comp.lang.r.geo/

Syllabi

History of Economic Thought 2012 Seminar