Human Social Complexity and World Cultures 2009-2010

From InterSciWiki
  • ANTH174-09 Class talk
  • Instructor Doug White drwhite(at)uci.edu 4-5893, in class noon-12:30 and office hour 2-2:30 New Gateway (SBSG) Building, Rm 3544 https://eee.uci.edu/
  • TA Janny Li jli18(at)uci.edu, office hour Mon 10-11 am Starbucks on campus (look for her inside!)
  • Winter 2010 Anthro 174A 60640 Human Social Complexity - World Cultures Survey 2010‎, 11:00-12:20 Rm 155 SSTower
  • Fall 2009 Anthro 174AW 60330 Human Social Complexity, World Cultures, 12:30-1:50 Rm 155 SSTower
  • Abstract: Run a powerful regression program that estimates the effects of causal factors on variation in human cultures. You select variables for study from a database of two thousand variables on features of the full range of diverse human cultures (the SCCS). Take the open source package R home with you, along with the database and software. Receive tutorials to learn these methods, learn from other student projects, and start from where they left off. Present a 10-minute powerpoint on your findings, and participate with the instructor, optionally, in preparing a journal publication. Learn how to understand complex human systems and write about them (the fall 2009 writing course, but not the winter course, has a proposal, draft, and 10 page paper). We build off the 2009 course in the winter 2010 sequel, which makes the course of extra value. Hats off to the 2009 students!
Unsolicited note from Anthon Eff, who wrote our R program: I had asked a recoding question and he answered re fall 2009:
> Thrilled that your students got into it--my more inquisitive students
> also get excited. These data can address big questions, and with good
> methods they will provide persuasive answers.
> 
> Anthon

Hints for the wiki

  • If you are looking for something, type, for example: SCCS (in the searchbox to the left).
  • Hint on printing these pages: Don't print wiki pages directly or your page will be cluttered with URLs for external links. Rather, copy what you want to print (control-A for the whole page), paste into Word, and reset the margins and type sizes if needed, then print from Word.

Themes

The theme of last year's course, Human Social Complexity and World Cultures 2008, was the cross-cultural study of the social dynamics leading to war, the role of beliefs about conflict, and the moral principles that support peaceful coexistence among some societies and warfare among others. Last year the educational exhortation in learning valid and reliable research practices was that your generation would have to do a great deal of work in reinventing our social, political, and economic institutions in ways that actually work, and to learn how to develop research and scientific practices, including research writing, that are both honest and sane. In the midst of last year's class, your generation and its countrywide majority of allies delivered on the promise of a new politics in the U.S. and in the U.S. role in the world order, distinct from the old "new world" order of the ideological neoconservatives.

The theme this year is "the big lies" leading to social and economic collapse, the suppression of scientific evidence that might help in the current long-term crisis (gathered by methods that would be needed to pull human societies out of our current dilemmas), and the use of replication studies to validate scientific findings about the dynamics of human societies. "The big lie" over the past decade has been the use of systematic disinformation as government policy and political practice by the U.S. administration under Bush, Cheney, Rumsfeld, Rove and the party of the right (not that the Democrats weren't engaged in the same lobbyist and pork barrel economics), the feeding of doctored "news" from the administration to willing news organizations, the disbanding of financial regulation and total abandonment by our banks, investment houses, insurance companies and regulatory agencies of sane practical, ethical and moral principles of exchange, and similarly in the global economy, and, among other things:

Greenhouse gases year 0 to 2005
The Keeling curve 1958-2006

The censoring of scientific reports from government agencies by lobbyists hired to do so by the administration so as to disguise the dangers of global warming, climate change, the devastating effects of pollution by land, sea, and air, and the indoctrination of the public in the U.S. that these problems didn't exist.


The academic focus this year is on research methods of scientific replication and accurate evaluation of evidence in the form of writing about research findings from projects initiated in this class.

Theme and Texts

We are in a race in your lifetime either for selfish entrepreneurs suffering from the money illusion to corral all the world's resources without regard to speedup of predatory destruction of the planet or to slow down the frenzy and the technology of greed and instead create sustainable practices. The core readings focus on the dynamics of growth, conflict and destruction using methods of historical dynamics (Turchin) and comparative study of what we can learn from the conditions of peaceful coexistence and the importance of examining and changing beliefs (Fry). Just as literature well in advance of World War I envisioned future horrors, you are already well equipped from our visual literature of images of our future, although your vision of how fast nonlinear accelerating climate change will happen catastrophically is nowhere near accurate because of the doctored information that Americans have been fed. Now would be a good time to learn the techniques of investigative science.

Chapters 1-3 Imperiogenesis: Factors that explain the Rise of Empires
Chapters 4-6: The Life Cycles of Imperial Nations, Asabiya in the Desert
Missing: Chapters 7-9 [7] A Medieval Black Hole: The Rise of the Great European Powers [8] The Other Side of the Wheel of Fortune: From the Glorious Thirteenth Century into the Abyss of the Fourteenth [9] A New Idea of Renaissance: Why Human Conflict is Like a Forest Fire and an Epidemic
Ch. 10-12: Life Cycles of Imperial Nations. 10. The Matthew Principle. 11. Wheels Within Wheels. 12. War and Peace and Particles "Why the rich get richer and the poor get poorer. Secular cycles affect practically all facets of social life ..." ---- same chapters: another view
  • Fry, Douglas. 2007. Beyond War. Oxford: Oxford University Press. Required but not in bookstore. Read weeks 3-4. Amazon $14

Background readings

Prospectus: Human Complexity

Day one Sept 24 exercise and discussion of key reading

  • For political science, anthropology, sociology and international studies students, see the example by Jahn Detlef of the approach we take (but we use a larger world cultures sample, the SCCS). Please read Detlef from day one so as to discuss on day two.
  • Exercise: Pick a society from the SCCS. Open the Spss dataset used in class. Find and write down the latitude and longitude of the society. At home study the ArcMap Tutorial. Be prepared to map and explore the region of the society you have chosen in ArcMap, copy maps into a *.doc or *.rtf file, include the urls, and describe what you learned about the region and this culture in particular, if it still exists. Alternately, use Google maps, Google Earth, or Virtual Earth. Use Wikipedia to find out about the society and its current status.
  • View a classroom exercise with R: Free software you can install at home. To start R in the classroom click Start and then click the program list. Two R programs will be used for developing the models of cross-cultural causality you will use for developing a class project and paper. You create your own directory under "C:My Documents" and create a "C:My Documents/MI" for your downloads of the software, and those files will be stored on your computer for the quarter. Make a "C:My Documents/MI/YOUR-NAME-HERE" to store your impdat (.Rdata) results, which are output by program 1 in the exercise, or you may want to bring a flash disk to store those files. You need to have the impdat (.Rdata) in "C:My Documents/MI" to run program 2. You can copy files to your desktop but they will be erased when the computer is booted up. Later you will modify (see EduMod) these programs to create instructions for your particular model.
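
A minimal sketch in R of the directory setup described above. The base path is a temporary folder standing in for the lab's My Documents directory, and the file name impdat.Rdata and the placeholder impdat object are illustrative assumptions, not the actual output of program 1.

```r
# Stand-in for the lab's "My Documents" folder (assumption: use your own path).
base_dir <- file.path(tempdir(), "My Documents")

# "MI" holds the downloaded software; the subfolder holds your own results.
mi_dir <- file.path(base_dir, "MI")
my_dir <- file.path(mi_dir, "YOUR-NAME-HERE")
dir.create(my_dir, recursive = TRUE, showWarnings = FALSE)

# Placeholder for the imputed data that program 1 would produce (made-up values).
impdat <- data.frame(society = 1:3, v1 = c(2, NA, 1))

# Keep one copy with your results and one in MI, where program 2 expects it.
save(impdat, file = file.path(my_dir, "impdat.Rdata"))
save(impdat, file = file.path(mi_dir, "impdat.Rdata"))

# Later, before running program 2: load the file back and see what it contains.
env <- new.env()
loaded_names <- load(file.path(mi_dir, "impdat.Rdata"), envir = env)
```

Saving to a flash disk works the same way: only the folder in file.path() changes.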

Class notes

Day two Illustrative tools and projects for term papers Sept 29 (includes Eff and Dow)

Class notes

Readings and Discussions

Compare to Eff and Dow comments on Heinrich et al. 2004 in Markets and prosocial behavior (example of a potential term paper topic)
0. To download the R files for your analysis, if C:My Documents/MI does
1. Make the imputed datasets
2. Estimate model, combine results
SCCS Variables in R for your assignment
Human Social Complexity and World Cultures 2009 (Course website)
For an EduMod example of a student paper using a variant of this method see Sarah Baitzel's project - Finding a better model for Eff's study of Average Adult Female Contributions to Subsistence.

Day three - Understanding key problems Oct 1

Class notes

We only got this far

  • Read: Ember, M., C. R. Ember, and B. S. Low. 2007. Comparing Explanations of Polygyny. Cross-Cultural Research 41(4):428-440. ("Low’s (1988b, 1990) pathogen research, using the SCCS (Murdock & White, 1969), links polygyny to an ecological predictor: high exposure to pathogens meeting Hamilton’s criteria (Hamilton, 1980, 1982; Hamilton & Zuk, 1982), which are potentially lethal and leave marks apparent to others. These pathogens have been implicated in sexual selection (Hamilton, 1980, 1982; Hamilton & Zuk, 1982). Pathogen stress was coded by Low from medical and public health sources on the latitude and longitude of the sample societies, using data as close as possible to the defined dates for the sample societies’ SCCS data. A total of seven pathogens (leishmanias, trypanosomes, malaria, schistosomes, filariae, spirochetes, and leprosy) were each rated on a 3-point scale for frequency; the individual scores were summed for a total pathogen stress score." p.432).
Other sources for a possible term project on polygyny:
White, D. R. polygyny pages
White, D. R. (1988). Rethinking polygyny: Co-wives, codes, and cultural systems. Current Anthropology, 29: 529-558.
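
The pathogen stress score quoted from Ember, Ember and Low is a simple sum of seven ratings. A small sketch of that arithmetic in R follows; the three societies and their ratings are made up, and coding the 3-point scale as 1 to 3 is an assumption (the quotation does not give the numeric codes).

```r
# Seven pathogens named in the quotation.
pathogens <- c("leishmanias", "trypanosomes", "malaria", "schistosomes",
               "filariae", "spirochetes", "leprosy")

# Hypothetical ratings on an assumed 1-3 frequency scale for three made-up societies.
ratings <- matrix(c(1, 1, 3, 2, 1, 1, 1,
                    3, 2, 3, 3, 2, 1, 2,
                    1, 1, 1, 1, 1, 1, 1),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(c("society_A", "society_B", "society_C"),
                                  pathogens))

# Total pathogen stress: sum of the seven ratings for each society.
pathogen_stress <- rowSums(ratings)
```

With a 1 to 3 coding, the total can range from 7 (all pathogens at the lowest rating) to 21.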

Day 4 project example Oct 5

Class notes: 2 sets needed each class

What does a completed project look like?

  • Let's now look at a potential polygyny project (polygyny as a dependent variable) that could be done by a student, EduMod-2: polygyny Imputation and Regression. We're going to start with the independent (predictor) variables used by Eff and Dow (2009) and we're going to use their program in R, as posted on this wiki. R is installed in our lab but you can also install it at home and work from the wiki pages. The amazing thing is that once you have your results, the program will tell you if your prediction (in this case, 43%, which is high) is wrong: misspecified, that is, which means that variables have been left out. To study this problem you need to read White and Burton (1988). They use some variables in the SCCS codebook which were not included in Eff and Dow's study, and the R program will tell us there was misspecification, so these omitted variables are likely candidates. The student project is to ADD THOSE TWO VARIABLES to the R code, rerun the analysis, and see if the prediction is better specified. Can we find the causes of polygyny? If so, there will be some surprises. Monogamy, for example, is not increasing with time as a causal variable. That is, the apparent increase in monogamy in modern nations is not an inevitable result of evolution; it is caused by factors other than time.
White, D. R., & Burton, M. L. (1988). Causes of polygyny: Ecology, economy, kinship, and warfare. American Anthropologist 90: 871-887.
But also: (Ember, Ember and Low say of the White & Burton model: "only two of the White/Burton predictors are significant, namely, fraternal interest groups (p = .002, one-tailed) and absence of the plow (p = .01, onetailed)." p. 436. "White and Burton (1988) found that various other conditions also predict polygyny in multiple regression analyses (simple amount of polygyny, not nonsororal polygyny). The significant predictors in their multiple regression were fraternal interest groups (as indicated by patrilocal residence and bridewealth), marriage of captives multiplied by small population, war for plunder, absence of the plow, fishing, climate zone ordered in terms of access to rich ecological resources, and female contribution to subsistence. The mechanisms involved are not always clear to us, but it behooves us to examine whether the White/Burton predictors also predict appreciable nonsororal polygyny when we control on our predictors. Bivariately, most of the White/Burton predictors are significantly related to appreciable nonsororal polygyny. Accordingly, we retained all of their significant predictors for the logistic regression analyses we will now describe." p.435). SO WHO IS RIGHT? White and Burton? Ember, Ember and Low? The new results from R? And what about Ember, Ember and Low's idea that "pathogens like parasites cause polygyny"? Can this be true? Or is it a spurious correlation? Here is anthropological critique at its best where actual anthropological evidence collected by ethnographers bears on the issues. This is a real detective story.

Day 5 your project Oct 8

Class notes

ANTH174-09 Lyndon Forrester's Day 5 class notes

Figure3.png

ANTH174-09 Jacqueline_Duong's Day 5 class notes

Day 5 problems and solutions

Polygyny example for term paper analysis

  • Today we will do the Polygyny example: EduMod 2: Polygyny (shortcut)
  • Now we'll take another detective story and potential student project: We need some articles on the subject so do a Google Scholar search: "Standard Cross-Cultural Sample"+your topic word, like polygyny. You now have the tools to do your quarter-long research and writing project:
Zotero research tools for PDFs, your bibliography, and notes
research literature on your topics, as treated in the Standard Cross-Cultural Sample
the codebook of variables for the Standard Cross-Cultural Sample
selection of one or several dependent variables from the codebook, each with a reasonable number of cases (more than one quarter of the sample, i.e., 46+, or better yet 93+) and a reasonable spread of variation
review your articles to see what variables might be related to your dependent variables
make a copy of programs 1-2, Eff and Dow, in EduMod, in your word processor, add some new independent variables, define a dependent variable
run, correct, and debug your changes, noting your edits with ###comments in those lines
get your results
evaluate R2 (R squared is the % of variation in the dependent variable that is explained, if the model is well specified)
evaluate from the statistical results whether the model is well specified
evaluate other statistical results
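
Two of the checks in the list above can be sketched in a few lines of R: counting usable cases for a candidate dependent variable, and computing R squared by hand. The data frame here is a toy stand-in with made-up values (v9999 is a hypothetical variable number), and the regression is an ordinary lm() fit rather than the Eff and Dow program.

```r
# Toy stand-in for the SCCS: 186 societies, one candidate depvar with many
# missing codes (all values here are simulated).
set.seed(42)
n_societies <- 186
toy <- data.frame(v9999 = sample(c(1, 2, 3, NA), n_societies, replace = TRUE,
                                 prob = c(0.2, 0.2, 0.1, 0.5)))

n_coded <- sum(!is.na(toy$v9999))            # societies with a usable code
enough_cases <- n_coded >= 46                # about a quarter of 186
spread <- table(toy$v9999, useNA = "ifany")  # how cases fall across categories

# R squared by hand for an ordinary regression on a simulated predictor.
toy$x <- rnorm(n_societies)
fit <- lm(v9999 ~ x, data = toy)             # lm() drops rows with NA by default
y <- model.frame(fit)$v9999                  # the cases actually used in the fit
r2 <- 1 - sum(residuals(fit)^2) / sum((y - mean(y))^2)
```

The hand-computed r2 matches summary(fit)$r.squared; printing spread shows at a glance whether nearly all cases sit in one category.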

Random ideas on topics

  • TOO FEW CASES: Types of Magico-religious practitioners should be interesting as a dependent variable (codebook: Winkelman codes). One sure independent variable is settlement patterns. Good test of the missing data imputation procedure because codes are for only 46 cases.
  • Finding tailored reading for student projects, e.g., from Google Scholar search above, and a very very preliminary beginning below (we can add to this list in class):
review of Lewis Binford's attempt, with only 4-8 pieces successful, to link macroecology of forager behavior in a sample of 339 hunter-gatherer communities by Kim Hill (UNM), J. Anth. Res. 58:416-419. Such as:
climatic data and food storage. Latitudes N and S: p256 >35 latitude and food storage; p388 >42 latitude and formal leadership >42 latitude and weapons for land animals and aquatic resources; p389 COLDEST MONTH > -4 Centigrade and complex facilities to obtain food, low niche effectiveness and complex facilities to obtain food. p431 FISHING and storage of dried/salted fish.

Your ideas on topics

Day 6 Tues Oct 13: Your paper proposal due soon; studying clustering and cultural interdependence in Maps, Language phyla, Religion, and Ecology

Class notes

  • One to record for the assignment due Thurs: Anth174-09 Day 6 DepVar and IndVars Notes, by User:Bryanm. Take down the depvars and independent variables and add each contributor's User ID as [[User:Name]], Depvar; IndVars; 1 Main ref. Once we go quickly through the class to fill these in, we will go back through for each person to correct their username. You can put more information on your own User pages. Everyone should be able to complete this short pass at specifying your problem by the end of class or after class today.
  • And one other set of notes.

1 page proposal DUE TODAY combines your depvar with readings to do this:

  1. Introduce the topic and depvar(s) and your questions about what the independent variables might be, and relate this to your readings.
  2. List the codebook numbers for your variables, and copy the relevant codes from the codebook as an appendix (can go beyond 1 page)
  3. State some hypotheses about what might be the expected effect of the independent variables on the depvar(s).
  4. For example, one student two years ago took frequency of crime and used variables on the support of mother in childhood and support of father in childhood. Lots of variables there to choose from. In a straight regression analysis (without autocorrelation or Eff and Dow missing data imputation) the finding is that crime frequency is reduced by positive support from mother AND from father; both have an effect.

Tools

  1. the codebook (and its index)
  2. Google Scholar search "Standard Cross-Cultural Sample" + "your depvar topic"
  3. Your EduMod-XX page -- add new headings at the end, and keep adding, alternating A| B| as in Depvars#Polygyny_series_1
  4. Put your depvar and bibliography of SCCS studies for your topic on your User page

Mac Version is working

Hi Prof., I was wondering if you could help me run the R-program properly for my Mac. Thank you, User:Christina Park = User talk:Christina Park - Christina Park --- Doug 15:59, 9 October 2009 --- I'll get to work on this on Saturday. OK, it's working now. SEE Mac users Make Directory. Doug 08:19AM, 10 October 2009. Follow those instructions and let me know if you have problems.

Leading spaces problem

Pc Users and the lab

  • Mike Migalski is installing TextPad on all the computers in the Lab. Open TextPad, add a space on line one, copy and paste your bbb and ccc results after the space in TextPad. There should now be a blank in front of every line. Use Ctrl-A to copy the TextPad page, then in the wiki use your new page (open with the edit tab and put at the top a new ==heading for results==), then under that heading paste in your results text from TextPad and don't forget to SAVE the wiki page.

Mac Users and at home

  • Use the OLSresults.csv file in MI to save to the wiki to create leading blanks for your results (as pasted into wiki). Or use open-source editors that can be downloaded at home for leading blanks for your results (as pasted into wiki). I have asked our lab manager to install "SciTE" (open source, free) on our machines. YOU WILL NEED TO KEEP RESULTS ON THE WIKI TO COMPLETE THE PAPER REQUIREMENTS (that way I can check your results-DRW). Temporarily you could paste your results into WORD and print at home. But your results might get mixed up with the programs you ran that way.
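
The leading blank can also be added in R itself, since the wiki shows any line that begins with a space as preformatted text. A minimal sketch follows; the three result lines are made-up stand-ins for the bbb and ccc tables, and the output file name is arbitrary.

```r
# Made-up lines standing in for the bbb and ccc tables printed by the program.
results <- c("          coef   pvalue",
             "fyll     0.412    0.031",
             "fydd     0.105    0.402")

# Prefix every line with one space so the wiki keeps the column alignment.
wiki_ready <- paste0(" ", results)

# Write to a text file you can open, select all, and paste into the wiki.
out_file <- file.path(tempdir(), "results_for_wiki.txt")
writeLines(wiki_ready, out_file)
```

This does the same job as typing a space in TextPad before pasting, but for every line at once.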

Warning messages

Warnings at the end of the program writing the OLSresults.csv are just telling you it is appending to that file on MI. You can ignore.
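
A hedged illustration of where such a warning can come from: in base R, write.table() with append = TRUE and column names switched on warns that it is appending column names to the file, but still writes the rows. Whether the course program uses exactly this call is an assumption; the demo file and its contents are made up.

```r
out_file <- file.path(tempdir(), "OLSresults_demo.csv")  # demo file, not the real one
if (file.exists(out_file)) file.remove(out_file)

run1 <- data.frame(variable = "x1", coef = 0.5)    # made-up results
run2 <- data.frame(variable = "x2", coef = -0.2)

# First run creates the file.
write.table(run1, out_file, sep = ",", row.names = FALSE)

# Second run appends; capture the warning text instead of printing it.
msg <- NULL
withCallingHandlers(
  write.table(run2, out_file, sep = ",", row.names = FALSE, append = TRUE),
  warning = function(w) {
    msg <<- conditionMessage(w)
    invokeRestart("muffleWarning")
  }
)

n_lines <- length(readLines(out_file))   # header plus one row, written twice
```

The file ends up with both runs in it, which is why the warning can be ignored.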

Spatial clustering and MAKING MAPS WITH SPSS

If you have network effects of distance or language, your cases are not independent. Then you need to understand the clustering of your variables.
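
One common way to measure this kind of clustering is Moran's I. The sketch below computes it in base R for six made-up societies in two regional clusters, using inverse-distance weights on raw coordinates. This is a generic illustration under those assumptions, not the distance and language weighting used in the Eff and Dow program.

```r
# Made-up coordinates and scores for six societies in two regional clusters.
lon <- c(10, 11, 12, 80, 81, 82)
lat <- c( 5,  6,  5, 40, 41, 40)
x   <- c( 3,  3,  2,  1,  1,  1)

# Inverse-distance weights (plain Euclidean distance on degrees, for simplicity).
d <- as.matrix(dist(cbind(lon, lat)))
w <- 1 / d
diag(w) <- 0                      # a society is not its own neighbor
w <- w / rowSums(w)               # row-standardize so each row sums to 1

# Moran's I: positive values mean nearby societies tend to have similar scores.
z <- x - mean(x)
n <- length(x)
morans_i <- (n / sum(w)) * sum(w * outer(z, z)) / sum(z^2)
```

Here the scores are similar within each cluster, so morans_i comes out clearly positive; values near zero would suggest no spatial clustering.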

Making maps in Spss and GIS of the dependent and independent variables -- Instructions here. We will use Spss in the lab depending on how far we get with the following:

Tasks from Day 5

Day 5 catchup: Polygyny analysis using R at EduMod 2.

Your paper proposal due Thursday, Day 7

By today try to pick one or two possibilities for a dependent variable:

http://eclectic.ss.uci.edu/~drwhite/courses/stdsvars.html index of variables by topics
http://eclectic.ss.uci.edu/~drwhite/courses/stdsstud.html index of variables by study
http://eclectic.ss.uci.edu/~drwhite/courses/SCCCodes.htm codebook (only here do you see actual categories to help define your independent variables in the R program)

Then:

Get readings for your proposal with GOOGLE SCHOLAR SEARCH

Do a GOOGLE SCHOLAR SEARCH for "Standard Cross-Cultural Sample"+"your topic word" as we did in class for Day 5/6, e.g.,

"Standard Cross-Cultural Sample"+polygyny


Findings on Spatial autocorrelation in the SCCS

Dow, M.M., & Eff, E.A. 2008. Global, regional, and local network autocorrelation in the Standard Cross-Cultural Sample. Cross-Cultural Research 42: 148-171. pp 158-159 show variables in the SCCS with high spatial autocorrelation.

White, Douglas R. 1993. Spatial Levels in Cultural Organization: An Empirical Study. Handbuch der Ethnologie, pp. 459-88. Edited by Thomas Schweizer, Margarete Schweizer, and Waltraud Kokot. Berlin: Reimer Verlag. http://eclectic.ss.uci.edu/~drwhite/pub/DRW1993Spatial.pdf pp 487-488 show variables in the SCCS with high spatial autocorrelation.

Day 7 discussion of your paper proposals - due Thursday Oct 15

Class notes

  • CLASS NOTES: 2 sets needed

Grades

depend on 100 points, summed from

  1. completion of each assignment, and revisions, improvements
  2. participation in class, wiki, providing class notes
  3. drafts and final paper, and revisions, improvements
  4. use of literature, construction and logic of your argument

Discussion of paper proposals

Paper proposal due

(Our class notes on her user page)

Day 8 revising your proposals Tues Oct 20

Class notes

Student questions

PPT. More info? Are we presenting after we finish our papers complete? Or after our research and R^2 findings. DW: after your research and R^2 finding

- Correlating to the PPT question. Papers due when PPT is done? Or by the end of quarter? DW: end of quarter

I wonder how I'm going to use a map on my depvar... DW: looks like we have some fancy GIS software for that, a free GIS browser for each of you, permanent possession, and a link to a GIS server that has custom SCCS maps for our variables (as background see Wikipedia:Standard Cross-Cultural Sample#Cultures_in_the_standard_cross-cultural_sample).

Written feedback on paper topic proposals

Your feedback here Anth174-09 Day 6 DepVar and IndVars Notes, on that page, check also the link to your user or talk page - consult eee for your grade (you may resubmit).

How to make the program work

Practical lessons in getting your Eff and Dow code to run

INSTRUCTIONS FOR Modeling the changes for Amanda's example Day 7

Tomorrow: A talk about this course, UC electronic journals we use, human complexity wiki

Wed Oct 21, 3:30 reception, 4:00 talks (Notes) -- Come to my Open Access Day talk tomorrow, Oct 21, at the new Graduate Student Center in the Student Center. I will discuss the Structure & Dynamics eJournal (where Eff & Dow were published), the new format of this journal and the World Cultures eJournal that we host, and will discuss this course, EduMod and our class projects, the InterSci Wiki and the multi-UC campus Human Complexity colloquia series. If you registered for 1.33 extra credits for Anth240A you can review this talk for your 240A paper credit.

What's the philosophy for getting your program to run?

Got it? -- Mod it -- copy successful programs (check that they run) modify to make the program do your project.

Thurs Day 9 paper draft and presenting your proposals and findings in 6-minute powerpoints Thurs Oct 22

Class notes

Good progress

Doug 14:36, 26 October 2009 (PDT) Three Cs went to Bs today

  • AND:
  • Amanda ready to present
  • Bryan could be ready to present
  • Patrick Kim could be ready to present
  • Bui, possibly

Startup for a new project

  • You create a new project whenever you change the depvar, and it should start with an xUR (unrestricted model). You can copy an existing example to your EduMod-XX, check that it runs, change the depvar and test whether it still runs. E.g., run the part of the program down through the depvar, then run longer chunks of Program 1. After you have eliminated errors resulting from your edits, run the whole program and save the results to your EduMod-XX.
A way to do the latter is to edit the last module (segment) of your page, go to the bottom, and create a new model headed ==A| My startup xUR program depvar="my var here" copied from "GOOD START"==. Save and check that this header is correct. Then edit this header, putting your cursor within the edit space, so you are ready to copy.
Now go to "GOOD START" at EduMod-11, click on edit, press Ctrl-A inside that module, copy that program text, and paste back into your page where you left off, then save your module. Open R, copy the program (still in your copy buffer), and run the program. Make sure that program runs.
Now change the depvar and test again if it runs.
Once you get results create a ==B| header "xUR model with name of depvar"== below the program that just ran. Run your cursor from >bbb in the program output through the rest of the table, including the >ccc output, and copy. Open TextPad. Type one space. Paste your copy of results into TextPad after the space. Do Ctrl-A and copy, then paste the results (with leading blanks in the right column) into your module with the ==B| header ...== and close.
  • Stage 2 of your project: Creating your xR. Now comes the hard work to make your xR (Restricted model). Copy your successful xUR page into a new page, well described in a ==A| new depvar header with "depvarname" and SCCS number==. Save the page. Check if it runs down to the end of the depvar. Print the results of this model from ==B| header== above, and note the significant variables.
Now add between one and three indepvars at the end of these sections, and when adding lines comment the end of the line with #3# so you can easily find each of these changes with the "find" option when editing this page again:
fx<-
indpv<-
x<-xUR
x<-xR (currently identical to xUR). Edit out the variables that are not significant, and add one nondepvar.
Save and run the program. Since you haven't changed Program 1, you can run the whole program all at once.
Find the first error that occurs (it should occur in Program 2), and correct the error.
Copy and paste only Program 2 that you have corrected.
Repeat the last two steps until your program runs.
This is your xR and you can paste the results to a new ==B| header "xR model with name of depvar"== into a next window.
  • Stage 3 of your project: Adding additional independent variables. Repeat Stage 2 but add at most three variables at a time, i.e., so they fit on the same line.
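
The logic of moving from an unrestricted to a restricted model can be sketched with ordinary regression in R. This is a simplified illustration on simulated data using lm(), not the Eff and Dow program (no imputation and no autocorrelation terms); the names xUR and xR are borrowed only as labels for the two fits, and the 0.05 cutoff is an assumption.

```r
set.seed(1)
n <- 186
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
dat$y <- 2 * dat$x1 - 1.5 * dat$x2 + rnorm(n)  # only x1 and x2 matter by construction

# Unrestricted model (xUR): every candidate independent variable.
xUR <- lm(y ~ x1 + x2 + x3 + x4, data = dat)

# Keep the predictors whose p-value falls below the chosen cutoff.
p <- coef(summary(xUR))[-1, "Pr(>|t|)"]        # [-1] drops the intercept row
keep <- names(p)[p < 0.05]

# Restricted model (xR): refit with the retained variables only.
xR <- lm(reformulate(keep, response = "y"), data = dat)
```

Comparing summary(xUR) with summary(xR) shows how little R squared is lost when the nonsignificant variables are dropped; new independent variables would then be added to the short xR list, a few at a time.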

Avoid this error Amanda is at xR stage in her So drop everything page. However, instead of adding her indepvars to an xR model, she added them to the xUR model. This was not necessary and interferes with debugging. She already had the So drop everything xR page with only a few variables in the results, since all the nonsignificant variables had been dropped. There was no point in including those nonsignificant variables in the xR model, which should have only a few variables. It is to that small list of variables that new Indepvars should be added.

GOOD NEWS FOR THE AmandasEduMod-11 project

Eventually we see it is male toughness and interpersonal violence that affect rape (23% R2 causality), with no significant autocorrelation.

Some more ways of checking for errors

  • Define depvar<-v9999 with the short arrow <-, not the long arrow <--. Writing depvar<--v9999 would be wrong: it assigns the negative value, -(v9999).
  • If you define a new depvar<-v9999, begin to debug by pasting that part of Program 1 and 2 only down to where the depvar<- NUMBER AND NAME ARE DEFINED. That will tell you if you made an error in that change.
  • DON'T USE A DASH within a variable name like marr-arrange=SCCS$v888
  • WHEN EDITING THE PROGRAM, work within an "edit window" and you won't get lost on your Edu-Mod page.
  • In general, test to see whether you are starting with a program that works, start only by changing the depvar, then run and debug if you get an error. Then add one to three indepvars that all fit on one line, no more. Otherwise you will get too many errors and be unable to debug them. Each time copy a working copy to a new module and ==A| descriptive header==
  • Always describe what you are doing, in the header, the first few lines, and ###comments. Then add a ==B| Results and its descriptive header== and paste the bbb and ccc table BUT ALSO add the name of the depvar.
  • A number of the programs that are working use a variable for rape v667, but one uses v173 with fewer variables.
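
The two naming pitfalls above are easy to reproduce in R. In this small sketch v9999 is a hypothetical variable holding a made-up value.

```r
v9999 <- 5

depvar <- v9999        # correct: depvar is 5
wrong  <-- v9999       # read as wrong <- (-v9999), so wrong is -5

# A dash inside a name is read as subtraction, so this line fails with an error:
#   marr-arrange = v9999
# An underscore is fine:
marr_arrange <- v9999

# Confirm that the dashed version really is an error, without stopping the script.
attempt <- try(eval(parse(text = "marr-arrange = v9999")), silent = TRUE)
dash_fails <- inherits(attempt, "try-error")
```

R gives no warning for the double-dash arrow, so the only symptom is a depvar with the wrong sign.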

It will be expedient to define an indep var with this logic: depvar<-(SCCS$v173==1.or.SCCS$v667==2...but no missing values)*1 but this is more complicated so Anthon Eff is doing the example for the rape variables.
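
In R syntax, the "or" in that logic is written with | rather than .or., and missing values need explicit handling. The following is a sketch under assumptions: the codes are made up, only the two conditions spelled out above are included (the elided part is left out), and "no missing values" is read here as requiring both variables to be coded.

```r
# Made-up codes for five societies; NA marks a missing code.
SCCS <- data.frame(v173 = c(1, 2, NA, 2, 1),
                   v667 = c(2, 1, 2, NA, NA))

# TRUE only where both variables are coded.
both_coded <- !is.na(SCCS$v173) & !is.na(SCCS$v667)

# 1 if either condition holds, 0 if neither, NA unless both variables are coded.
depvar <- ifelse(both_coded, (SCCS$v173 == 1 | SCCS$v667 == 2) * 1, NA)
```

Without the both_coded guard, R would return 1 for a society where one condition is TRUE and the other variable is missing, which may or may not be what the analysis intends.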

Potentially strong results

EduMod-28: (your dep var here) Imputation and Regression in xUR --> Patrick needs to cull down to xR

Day 10 First powerpoint presentation Tues Oct 27

If not then next time. This time we will exemplify what could go in them.

Class notes

Overview for anyone: YOUR ANALYSIS INVOLVES CAREFUL ATTENTION TO THESE DETAILS

  • Make a clear plan, keep it simple, don't create confusion for yourself. Don't skip any of these steps.
  • Copy from Amanda's page a GOOD START for your xUR model at the bottom of your EduMod page, clearly labeled for content, source, and objective. Keep your notes on progress and problems within this and each of the new series of Edit Windows you will make.
  • Always edit and run within an Edit window
  • Edit your depvar and run to test down to that part of the code. Then run the rest to make sure your GOOD START works before you go on. If you can't debug your code after changing only the depvar, check the instructions or email the instructor.
  • Compile all the helpful hints above to help you add just a very few indepvars and to debug to get new xUR results (that's the hard part: make only a few changes then run to see if it works)
  • Don't get overambitious and try to make too many changes at once, since you are guaranteed NOT to be able to find the mistakes or to back out what you have done wrong, and it will be very difficult to go further or to go back to what was running correctly.
  • For this reason, every time you get a new A| program edit window running, save the B| results in a new edit window, and then copy the A| edit window program below B| with a new ==Edit Window Header that describes what you are doing next==.
  • Don't change your depvar. If you do, you have to start all over again.
  • The best source for good indepvars is the publication of the authors who did the codes
  • Pay close attention to duplication as measured by the VIFs, cull variables out
  • For further indepvars, add only one or a few at a time, run, debug
  • When you have xUR results then cull the xR variables down to the significant ones and fyll, fydd
  • Based on the experience of this class, the Eff and Dow EduMods for the future will have a preprocessor to do the naming and checking for the depvar and each new batch of a few indepvars to simplify the debugging. This is how successful organizations (e.g., Amazon) evolve in our new information economy. You are part of the pioneering period for successful survey and cross-cultural research.

Speaker Amanda McDonald

Powerpoint presentation

Discuss powerpoints and Paper outlines

First powerpoint presentations followed by rough drafts - all due by Nov 10th.

Day 11 Learn from powerpoint presentations of other students and DRW amendments Thurs Oct 29

Class notes

What's in the Powerpoints?

Discuss the Unrestricted model (xUR)

Your chosen independent variables: significant or not
Other significant independent variables
R2 in Unrestricted
Cannot show in full just discuss: how many significant, which ones?

Show the Restricted model (xR)

All the significant variables
R2 in Restricted (significant variables and others close to significance)

For the Restricted model (xR) only

Show and discuss the Diagnostics
Any variables "missing" according to the diagnostics?
Significance of the fyll (language) and fydd (distance) effects

For Dep and each signif Indep Code

Use one slide to display each pair of dep/indep vars and how related

Evaluate your hypotheses

and where you would go from here

The problem with Guttman Scales

Each set of variables below goes into a scale and is not independent of that scale

  • if 657-662 then not 663
  • if 664-668 then not 669
  • if 849-855 then not 877
  • if 851-853 or 855 then not 878

Discoveries about Debugging

  • Be VERY careful that your indepvar or depvar isn't already an indepvar in Eff and Dow.
For example, "lrgfam" v80 and "FamSize" v68 ought not to co-occur in the same study.
YOU CAN CHECK the list of Eff-Dow Variables and names.
  • If there are VIFs in the xUR (unrestricted model) way out of the range 1-3 (variance inflation, e.g., 50, 100, 1000) THEN THE SIGNIFICANCE IS NOT CALCULATED CORRECTLY. This happened in Stephen Kim's EduMod-16. Reduce the complexity of your xUR (drop the variables with VIFs over 50, for example) and rerun the model. In Stephen Kim's model this changed the significance of agr_late_boy to p=.05.
  • Although you can't have agr-late-boy (those are minuses) as a variable name, you can use agr_late_boy.
  • If your new indepvars aren't working, check what the variable looks like with a line such as
SCCS$v999; the problem may be that there are too few cases.
  • One indepvar gave the error "computationally singular" and the program worked after this was dropped.
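
The VIF check above can be tried outside the Eff and Dow program. This is a minimal sketch in base R on invented data (x1, x2, x3 and vif_one are illustrative names, not SCCS variables or functions from the course code): each VIF is 1/(1 - R2) from regressing one indepvar on the others, and a near-duplicate pair shows up far outside the 1-3 range.

```r
# Toy data: x3 is almost a copy of x1, so both should show inflated VIFs.
set.seed(1)
n  <- 186                       # same size as the SCCS, but the values are invented
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- x1 + rnorm(n, sd = 0.05)  # near-duplicate of x1
X  <- data.frame(x1, x2, x3)

# VIF for one predictor: regress it on the others, then take 1 / (1 - R^2).
vif_one <- function(name, data) {
  others <- setdiff(names(data), name)
  fit <- lm(reformulate(others, response = name), data = data)
  1 / (1 - summary(fit)$r.squared)
}

vifs <- sapply(names(X), vif_one, data = X)
print(round(vifs, 1))

# Flag variables far outside the 1-3 range; these are candidates to cull.
high <- names(vifs)[vifs > 50]
print(high)
```

Dropping one variable of a flagged pair and rerunning is the same "make one change, then run" habit recommended above.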

Day 12 Learn from powerpoint presentations of other students, get help in class Tues Nov 3

Class notes

MAKING YOUR LIFE EASIER AND how to get caught up!

Doug 07:36, 3 November 2009 (PST) To get caught up I would suggest making a list of your 2-3 best dependent variables and 4-5 independent variables, then using the R crosstab program from day 12 (below) for each pair (max 3 x 5 = 15 crosstabs) to see which depvar-indepvar pairs are the most significant. Then start with your favored depvar. E.g., see EduMod-29: (your dep var here) Imputation and Regression.
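
A rough sketch of that screening step in base R, assuming a data frame of dichotomized variables. The data frame toy and its column names are invented stand-ins for SCCS variables, and chisq.test is used here rather than the class crosstab program:

```r
# Invented stand-in data: two candidate depvars and three candidate indepvars.
set.seed(2)
n <- 186
toy <- data.frame(
  dep_a = sample(1:2, n, replace = TRUE),
  dep_b = sample(1:2, n, replace = TRUE),
  ind_a = sample(1:2, n, replace = TRUE),
  ind_b = sample(1:2, n, replace = TRUE),
  ind_c = sample(1:2, n, replace = TRUE)
)
depvars   <- c("dep_a", "dep_b")
indepvars <- c("ind_a", "ind_b", "ind_c")

# One row per depvar-indepvar pair.
pair_tab <- expand.grid(dep = depvars, indep = indepvars,
                        stringsAsFactors = FALSE)

# Pearson chi-squared p value for each pair's crosstab.
pair_tab$p <- unname(mapply(function(d, i) {
  tab <- table(toy[[d]], toy[[i]])
  chisq.test(tab, correct = FALSE)$p.value
}, pair_tab$dep, pair_tab$indep))

# Most significant pairs first.
pair_tab <- pair_tab[order(pair_tab$p), ]
print(pair_tab)
```

These p values are only descriptive, so treat the ranking as a way to pick a starting depvar, not as a finding.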

Problems with DIVORCE as a topic: SEE Day 6 groups

PROBLEMS WITH DIVORCE AS DEPVAR

Example of learning to scan for errors in adding indepvars

http://intersci.ss.uci.edu/wiki/index.php/EduMod-26:_%28your_dep_var_here%29_Imputation_and_Regression#A.7C_after_eliminating_high_VIFs_and_nonsignificant

Making Maps

You might want to go back to Day 6: MAKING MAPS IN SPSS and add (1) maps for your dependent variable (are societies clustered by similarity?) (2) your statistically significant independent variables with p <.05.

But we will have a new GIS package for the SCCS and for mapping your variables. These will be CDs with ArcGIS 9.3.1 with a 1 year license (till Sept or so) from ESRI. I will have a downloadable Geodatabase for the SCCS from which you can make maps.

Results from the McDonald powerpoint (downloadable), interpretations and amendments re: Galton's problem

depvar rape (pptx) Amanda McDonald powerpoint

  • New Discussion: Inferential versus Descriptive Statistics (Solution of the Galton Problem)
Male Toughness (N=92)   Interpersonal violence (N=73)
 Rape  -    +               Rape -    + 
664 – 22 | 22           666 -   14 | 20
      ---+---                   ---+---
664 +  5 | 43 89%       666 +    2 | 37 95%
p = 0.0001             p = 0.0006 <--- DESCRIPTIVE STATISTICS ("there is also a moderate 
(R program below)      distance effect & missing data uncertainty that counteracts selection 
                       bias in searching for only the significant variables" - DRW)
p = 0.024              p = 0.007 <--- OUR EFF AND DOW INFERENTIAL STATISTICS deflate significance
     240 times deflated      11 times deflated (i.e., your Eff and Dow results are LESS significant)
  • The DESCRIPTIVE STATISTICS are 240 to 11 times more significant (inflated, more so with larger N in this case) than our EFF and DOW inferential statistics. That is what makes RANDOM results often appear as significant, which is GALTON'S PROBLEM. The EFF and DOW estimates correct those spurious results in two ways:
(1) Controlling for Galton effects of language and distance clustering of similarities and
(2) Adding the uncertainties introduced by missing data as balanced by larger sample size.

Spuriously significant correlations can also result when the researcher searches for significant relationships among a large number of correlations: random variation across that many correlations will produce some that appear significant when they are not. THE LARGER THE SAMPLE the less likely this is. Thus when missing data are imputed the full sample is used, which reduces the likelihood of this result.
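
The search problem described in the first sentence above can be seen in a small simulation. This sketch uses purely random numbers (no SCCS data; the count of 200 candidate variables is an arbitrary choice) and counts how many unrelated variables still reach p < .05 against a random depvar:

```r
# Correlate one random "depvar" with many unrelated random "indepvars".
set.seed(3)
n_soc  <- 186   # societies
n_vars <- 200   # candidate variables, all pure noise

y <- rnorm(n_soc)
pvals <- replicate(n_vars, cor.test(y, rnorm(n_soc))$p.value)

# By construction none of these relationships is real.
n_sig <- sum(pvals < 0.05)
cat(n_sig, "of", n_vars, "random variables reach p < .05\n")
```

Anyone who reports only the "significant" variables from such a search is reporting noise, which is one reason to retest published results.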

So:

  • 1. Effectively, we have solved for Galton's problem.
  • 2. We have shown that in the huge literature of cross-cultural or other survey approaches, use of Chi-squared or null hypothesis significance tests may result in a HUGE NUMBER OF SPURIOUSLY significant results.
  • 3. So if you select independent variables from the results of previous or the original researchers, you are likely to be able to DISPROVE the statistical validity of many of their results, thus improving the scientific QUALITY of the literature with new and original results worthy of publication.

Computing crosstabs for rape by McDonald's indepvars (Interpersonal Violence, Ideology of Male Toughness)

Crosstabs ---- Amanda McDonald crosstabs (computation) You will see that these are very easy to compute (the R program runs in seconds) and you should include in your powerpoint results that look like or draw on the first R table below, as DRW has done succinctly for McDonald's powerpoint. (DRW edited the table a bit to provide labels for the variables, rows, and columns). See my additional McDonald slide for how to include this new information very succinctly in your powerpoint and term paper. (Your first reaction will be "Hey! Why didn't we do this in the first place" and that's one of the main points of the class and the reason for all of our society's challenger disasters: Detroit cars, vacuum cleaners, health systems and wars that don't work and all the rest. We are more and more technologically adept and scientifically stupid when it comes to the social, policy, and design sciences).

Cell Contents
|-------------------------|
|                       N |
|              Expected N |
| Chi-square contribution |
|-------------------------|
Total Observations in Table:  90   
           x | no rape   | rape=     | Row Total |
             | y =     1 |         2 |           | 
-------------|-----------|-----------|-----------| 664. Ideology of Male Toughness
no         1 |        22 |        22 |        44 | 
ideomaltough |    13.200 |    30.800 |           | 
SCCS$v664    |     5.867 |     2.514 |           | 
-------------|-----------|-----------|-----------|
yes        2 |         5 |        41 |        46 | <-- powerful effect here:
             |    13.800 |    32.200 |           | 41/46 = 89% of male toughness societies 
             |     5.612 |     2.405 |           |  have rape 
-------------|-----------|-----------|-----------|
Column Total |        27 |        63 |        90 | 
-------------|-----------|-----------|-----------|
Statistics for All Table Factors
Pearson's Chi-squared test 
------------------------------------------------------------
Chi^2 =  16.39752     d.f. =  1     p =  5.13525e-05 
                                      =  0.0000513525
Pearson's Chi-squared test with Yates' continuity correction 
------------------------------------------------------------
Chi^2 =  14.58710     d.f. =  1     p =  0.0001338277

Chi-squares (Chi^2) are converted to p values by the use of chisq tables, and more precisely with the Fisher exact test, which also gives chi-squared. In the above case Pearson's chi-squared = 16.398 uncorrected (displayed as p = 0.0000 when rounded), which corresponds to the p = 0.0000513525 above. The Fisher exact two-sided p-values for p(O>=E|O<=E) give the p-value = 0.0000649501 but often give a much smaller value.
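
The Male Toughness table above can be re-entered by hand to check the printed statistics. This is a sketch using base R's chisq.test and fisher.test on the four cell counts, not the class crosstab program itself. The object names are arbitrary.

```r
# Cell counts from the table above: rows = Ideology of Male Toughness (v664),
# columns = rape absent / present.
tab <- matrix(c(22, 5, 22, 41), nrow = 2,
              dimnames = list(ideomaltough = c("no", "yes"),
                              rape = c("no rape", "rape")))

pearson <- chisq.test(tab, correct = FALSE)  # uncorrected Pearson chi-squared
yates   <- chisq.test(tab, correct = TRUE)   # with Yates' continuity correction
fisher  <- fisher.test(tab)                  # exact test on the same counts

print(pearson$expected)   # the "Expected N" rows of the table
print(pearson$statistic)  # about 16.3975
print(yates$statistic)    # about 14.5871
print(fisher$p.value)

# Share of male-toughness societies with rape: 41/46
print(round(100 * tab["yes", "rape"] / sum(tab["yes", ]), 0))
```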

INTERPRETATION OF R TABLE OUTPUT
Total Observations in Table:  73 
            x | no rape   | rape=     | Row Total |
              | y     j=1 |       j=2 |           | 
 -------------|-----------|-----------|-----------| 666. Moderate or 
 no       i=1 |O   =   14 |        20 |        34 |      Frequent Interpersonal Violence 
 freintovio   |E   =7.452 |    26.548 |           | 
 SCCS$v666    |O-E =6.548 |           |           | Chi = \chi
ChiSq=6.548sq/7.452=5.754 |     1.615 |           | ChiSq = \chi^2
 -------------|-----------|-----------|-----------|
 yes      i=2 |         2 |        37 |        39 | <-- powerful effect here:
              |     8.548 |    30.452 |           | 37/39 = 95% of freq. IP violence societies
              |     5.016 |     1.408 |           |  have rape
 -------------|-----------|-----------|-----------|
 Column Total |        16 |        57 |    T = 73 | T is the Total Observations in Table
 -------------|-----------|-----------|-----------|
Statistics for All Table Factors
  Cell Contents T = Total N
|----------------------------|
|          O = Observed N    | These are descriptive statistics
|          E = Expected N    | = row total * col total / T
|----------------------------|
  Chi-square contribution i,j  =  \chi_{i,j}^2 = \frac {(O-E)_{i,j} ^2} {E_{i,j}}
|----------------------------|
 Pearson's Chi-squared test = TOTAL \chi^2 = \sum_{i,j}  \frac {(O-E)_{i,j} ^2} {E_{i,j}}
------------------------------------------------------------
Chi^2 =  13.79241     d.f. =  1     p =  0.0002041589 
Pearson's Chi-squared test with Yates' continuity correction 
------------------------------------------------------------
Chi^2 =  11.76646     d.f. =  1     p =  0.0006030748
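
The formulas in the annotated table can also be worked through by hand. A minimal sketch for the Interpersonal Violence table, computing E, each cell's contribution, and the total from the observed counts (the variable names O, E, Tn are just the labels used above):

```r
# Observed counts: rows = Moderate or Frequent Interpersonal Violence (v666),
# columns = rape absent / present.
O <- matrix(c(14, 2, 20, 37), nrow = 2,
            dimnames = list(freintovio = c("no", "yes"),
                            rape = c("no rape", "rape")))

Tn <- sum(O)                                # T = total observations (73)
E  <- outer(rowSums(O), colSums(O)) / Tn    # E = row total * col total / T
contrib <- (O - E)^2 / E                    # chi-square contribution per cell
chisq   <- sum(contrib)                     # total Pearson chi-squared
p       <- pchisq(chisq, df = 1, lower.tail = FALSE)

print(round(E, 3))
print(round(contrib, 3))
cat("Chi^2 =", chisq, " d.f. = 1  p =", p, "\n")
```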

Day 13 Learn from powerpoint presentations of other students, get help in class Thurs Nov 5

Class notes

YOU MAY POST NOTES HERE FOR PROBLEMS

Comments for EduMod Pages

THE KINDS OF COMMENTS I HAVE BEEN ADDING TO EDUMOD PAGES

Discuss paper draft and outline

  • Intro: set the problem
  • Relevant literature and view of authors
  • The hypotheses and coded variables
  • The xUR and xR model results (first introduce the method of Eff and Dow)
  • xR findings evaluation, significance and R2
  • Diagnostics (review Eff and Dow for these)
  • Conclusion

Review McDonald Powerpoint

depvar rape: The McDonald pptx

Day 14 Learn from powerpoint presentations of other students, get help in class Tues Nov 10

Class notes

A Roadmap and Tutorial to Eff and Dow (2009) projects

New Edition 11/10 impt changes p. 4 Inferential Statistics with Digital Learning Media: A Roadmap for SCCS Causal and Autocorrelation Estimates - pdf for this class - the doc file

Assigning powerpoint presentations

  • Today
-28 Patrick Kim* Kim powerpoint with a correction
  • 12th, add name here to move up - have results
-22 Peyman Taeidi*
-17 Nathan Gallinger*
-27 Lyndon Forrester*
 -6 Kat Plummer
-29 Kimberly Nguyen*
-10 Deborah*
  • 17th, add name here to move up
-21 Ralph D'Ignazio*
-13 Lawrence
-15 Ambrose
-18 Chiu - see warning above on divorce as a dependent variable
-20 Bui*
-40 Bryan*
-26 Abiha*
  • 19th - still needs clear model
-12 Villy*
-16 Stephen
-23 Jacqueline*
-24 Michael*
-30 Jessica
M-6 Ryan*
M-8, 41 Christina*
  • Must catch up
-32 Scott - tutorial with Ren Feng
-31 Alex - tutorial with Ren Feng
-33 Natasha - tutorial with Ren Feng

Problems working from home

Unzipping the Eff-Dow data download

A new review site for findings

Final Models and Commentary

Term paper rough drafts due date changed to Thursday

See Day 11 What's in the Powerpoints?

Day 15 TA help and tutorials Thurs - and rough paper drafts due Nov 12

Class notes

Patrick Kim's Powerpoint from day 14

Kim powerpoint with a correction

Today's Powerpoint presentations- Notes on talks Day 15

* agreed to day
? able to present
Day xx = notes on main page
Notes? - put A/Y/N at end of line === A-notes have ANTH174-09 search code === Y Notes in my personal page === N no notes posted
  • -22 Peyman Taeidi* Day 10
  • -27 Lyndon Forrester* Day 5
  • -10 User:Deborah Blumenthal* Day 1
  • -29 User:Kimberlynguyen* Notes===Y Notes in my personal page
  • -6 User:Kat Plummer* Day 9-10
  • -40 Bryan Martinez* Day 6 (Doug 09:17, 17 November 2009 (PST) Sorry, my mistake, I said to take out fydd but I corrected that by putting it back in to get a more final and successful set of results for "Male Toughness". To help interpret: while interpersonal violence is a predictor, (1) so is polygyny and (2) the spatial clustering of Male Toughness is very high. That is, there are spatial clusters of more violent, tough societies (probably correlated with the Fraternal Interest Group complex identified by Paige and Paige, which includes males residing patrilocally with their own male kin and a capability for local violence (e.g. revenge) and warfare with neighboring societies).)

Meaning of diagnostics in the Eff-Dow Restricted Model results

These are tests of null hypotheses. For an example see Peyman's powerpoint results page 3.

  • Nonlinear transformations of independent variables ~ 0 (if p>.05)
  • True coefficients of excluded variables ~ 0 (if p>.05)
  • Heteroscedasticity ~ 0 (autocorrelation errors not bunched) (if p>.05)
  • Autocorrelation errors are normally distributed (if p>.05)
  • Additional network effects in this variable ~ 0 (if p>.05)
                 Fstat        df pvalue
RESET            4.209  1814.131  0.040 <-- Nonlinear transformations of independent variables ~ 0  (if p>.05)
Wald on restrs. 20.579    50.803  0.000 <-- True coefficients of excluded variables ~ 0 (if p>.05)
NCV             30.289   332.725  0.000 <-- Heteroscedasticity ~ 0 (autocorrelation errors not bunched) (if p>.05)
SWnormal        34.676  2038.668  0.000 <-- Autocorrelation errors are normally distributed  (if p>.05)
lagll            2.197 16641.181  0.138 <-- Additional network effects in this variable ~ 0 (if p>.05)
lagdd           45.786   232.955  0.000 <-- Additional network effects in this variable ~ 0 (if p>.05)

In this study, then:

RESET            4.209  1814.131  0.040 <-- There are nonlinear transformations of independent variables
Wald on restrs. 20.579    50.803  0.000 <-- There are significant excluded variables 
NCV             30.289   332.725  0.000 <-- Autocorrelation errors ARE bunched
SWnormal        34.676  2038.668  0.000 <-- Autocorrelation errors are NOT normally distributed
lagll            2.197 16641.181  0.138 <-- No Additional network effects in this variable
lagdd           45.786   232.955  0.000 <-- There are Additional network effects in this variable
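
The reading rule used above (the null hypothesis is rejected when p < .05) can be applied mechanically. A small sketch that re-enters the printed diagnostics as a data frame (diagn is an arbitrary name) and flags which nulls are rejected:

```r
# Diagnostics copied from the results above.
diagn <- data.frame(
  test   = c("RESET", "Wald on restrs.", "NCV", "SWnormal", "lagll", "lagdd"),
  Fstat  = c(4.209, 20.579, 30.289, 34.676, 2.197, 45.786),
  pvalue = c(0.040, 0.000, 0.000, 0.000, 0.138, 0.000),
  stringsAsFactors = FALSE
)

# p < .05 means the null hypothesis for that test is rejected.
diagn$reject_null <- diagn$pvalue < 0.05
print(diagn)

# Tests whose null hypothesis stands (no problem detected).
print(diagn$test[!diagn$reject_null])
```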

But for these tests to be valid, once you have a truly final model, the dropt<- list of dropped independent variables has to include all the variables in xUR<- (the full list) that are NOT in xR<- (the few left in your final Restricted Model).
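
If the variable names are available as character vectors, the dropped list is a set difference. A sketch under that assumption (the vector names xUR_vars and xR_vars and the variable names in them are hypothetical; how your EduMod page actually stores xUR<-, xR<- and dropt<- may differ):

```r
# Hypothetical variable names, for illustration only.
xUR_vars <- c("fyll", "fydd", "popdens", "polygyny", "agr_late_boy", "famsize")
xR_vars  <- c("fyll", "fydd", "polygyny")

# Everything in the unrestricted model that did not survive into the
# restricted model belongs in the dropped list.
dropt <- setdiff(xUR_vars, xR_vars)
print(dropt)
```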

You do not have to bother with this for this class. Just know what these diagnostics mean. Someone else can go into your EduMod page and EASILY run the last working version of your program that produced your final Restricted Model after fixing the dropt<- list and rerun the program to get the correct diagnostics.

You need not:

find the nonlinear transformations of independent variables that are better predictors, if there are any.
find the significant excluded variables, if any are missing. Don't obsess about this! You've done enough work already!
worry about autocorrelation errors that are NOT normally distributed.
worry about additional network effects.

You can, if you wish:

If your fyll and/or fydd have pvalues > 0.50, run a version of your Restricted Model that excludes one or both accordingly, and see what the resulting R2 is (DRW did this for EduMod-40 only, as an experiment).

Day 16 Making maps for your final paper Tues Nov 17 - Notes on Talks Day 16

Class notes

Problems and solutions with Chisquared

Sorry, should have noted this earlier: Chi-squared tables where the expected values are less than 5 are not accurate (they will give greater significance than is justified, e.g. p=.008 when nonsignificant). So try to avoid chisq for variables with many categories, as with Hiu Kwan Chiu's last EduMod-18 table. If this happens, you may postpone your powerpoint. Instructions from a week or so ago told how to dichotomize or trichotomize a variable, and this will allow a crosstab chi-squared to be done. A crosstab with many categories on one or both variables can also be broken into dichotomies on each variable, with the sums in each quadrant of the table entered into the Fisher exact test table to get a significance test. The significance, however, will usually be exaggerated tenfold or 100-fold compared with what you might expect from the Eff and Dow regression.
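
A sketch of the workaround in base R, using invented five-category variables (x and y are not SCCS codes, and the cut point at category 3 is an arbitrary choice): first check the expected counts, then dichotomize and run the Fisher exact test on the resulting 2 x 2 table.

```r
# Invented data: two 5-category variables on 60 societies.
set.seed(4)
n <- 60
x <- sample(1:5, n, replace = TRUE)
y <- sample(1:5, n, replace = TRUE)

# Full 5 x 5 crosstab and its expected counts.
full <- table(factor(x, levels = 1:5), factor(y, levels = 1:5))
E <- outer(rowSums(full), colSums(full)) / sum(full)
any_small <- any(E < 5)   # TRUE means chi-squared on this table is unreliable
print(any_small)

# Dichotomize each variable (categories 1-2 = low, 3-5 = high).
x2 <- factor(ifelse(x >= 3, "high", "low"), levels = c("low", "high"))
y2 <- factor(ifelse(y >= 3, "high", "low"), levels = c("low", "high"))
small <- table(x2, y2)
print(small)

# Fisher exact test on the 2 x 2 table.
p_fisher <- fisher.test(small)$p.value
print(p_fisher)
```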

Today's Powerpoint presentations

* agreed to day
+ trimming xR<-
? able to present
Notes? - put A/Y/N at end of line === A-notes have ANTH174-09 search code === Y Notes in my 
personal page === N no notes posted
http://eclectic.ss.uci.edu/~drwhite/courses/SCCCodes.htm
  • -17 Nathan Gallinger* Y - Notes for Day 2-3 in my personal page
  • -20 User:BL Le Bui* Y:in personal page Doug 10:23, 17 November 2009 (PST) I would think that @polygyny or one of the other variables like polygyny=v860 would predict extramarital sex. It's interesting that more settled communities with superordinate jh hierarchies have more extramarital sex, while it's those lower in population density. It is remarkable that it is the more MONOGAMOUS societies that have more extramarital sex!! Though significant at pvalue = 0.10, not 0.05. If you can, you are able to see that I have obtained three different variables, and I have a negative regression coefficient for popdens. Please clarify, thanks! OK, for depvar extramarital sex it means the denser the population, the less extramarital sex.
  • -21 User:Ralph D'Ignazio*+ Day 13-14 Doug 22:13, 16 November 2009 (PST) : you might want to prune a couple more xR<- variables
  • -18 Hiu Kwan Chiu+ === Y:Notes in my personal page Doug 22:07, 16 November 2009 see notes just above on chisquared
  • -15 User:Ambrose Lee+ Day 16 Doug 22:07, 16 November 2009 cull down your xR<- as shown in your last results page and possibly cull again and you will have results for your powerpoint.
  • -13 User:Lawrence Lam== Y Notes in my personal page Doug 22:07, 16 November 2009 (PST) 7 variables to keep here: keep on pruning down your xR<-

Day 17 Getting your final paper together Thurs Nov 19

Class notes

Pdfs of Powerpoint presentations from previous presentations

Quick news: OUR MAP SERVER IS UP

GIS maps must be accessed in FIREFOX MOZILLA, GOOGLE CHROME ETC, NOT I-EXPLORER. If each of you puts your variable numbers for depvar and indepvars into the file below you can make maps in a jiffy:

Today's Powerpoint presentations

+ trimming xR<-
* agreed to day
Notes? - put A/Y/N at end of line === A-notes have ANTH174-09 search code === Y Notes in my personal page === N no notes posted

Day 18 How to write up your argument and evaluate and report your findings Tues Nov 24

Class notes

Make maps of your variables for your paper

Start with the Indep/Depvar list - All to choose a variable to make a map.

Pdfs of Peyman's powerpoint from previous presentations

Use Peyman's powerpoint to review:

Review Meaning of diagnostics on the Eff-Dow Restricted Model

Today's Powerpoint presentations

  • Mac-8 - -41 - Christina Park* Day 3
  • -24 - User:Michael Grout*+Y Day 9 - professor I can't present without your help. I left questions on my edumod page, please look. Doug 22:20, 16 November 2009 (PST): Your depvarname<-"pre_mar_sex" v167 needs more variables than those in Eff and Dow, rather obviously: try adding a climate variable dealing with warmth, an environment variable dealing with tropical forests, absolute value of latitude, and other rather obvious factors discussed in class. ok but how do i do this???

XGiving -- Nov 26

Day 19 Feedback sessions on papers submitted early Tues Dec 1

Class notes

Christina Park's powerpoint from the 24th

Rape and women's political participation

Evolution of Evolution site

NSF launches Evolution of Evolution site for the 150th anniversary of Darwin's On the Origin of Species

Where we are in the finale

Edu-Mod 2009: The Individual Studies‎

Powerpoints

Day 20 Summing up: What have we learned? Thurs Dec 3 Last class Don't miss: Student evaluations of class

Class notes

The role of mothers

Here's a rather interesting inadvertent finding from Stephen Kim where Exclusive care by mother --> leads to the Highest level of control of Boys&Girls (no wonder matriliny is relatively rare). Spatial and language autocorrelation are also very high. Stephen's struggle to get results

Powerpoints

From Anthon Eff - congratulations

An unsolicited note from Anthon Eff, who wrote our R program: I had asked a recoding question and he answered re fall 2009:

> if one did wish to convert the values==18 to missing, here is the
> code:
> 
> war<-SCCS$v1648
> is.na(war[which(war==18)])<-TRUE
> 
> Thrilled that your students got into it--my more inquisitive students
> also get excited. These data can address big questions, and with good
> methods they will provide persuasive answers.
> 
> Anthon
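
The same recode can be tried on a toy vector before touching the real data. In this sketch the values are invented stand-ins for SCCS$v1648, and the plain assignment of NA is an equivalent alternative to the is.na()<- form quoted above. Counting the remaining cases afterwards also shows whether too few are left to use the variable.

```r
# Toy stand-in for SCCS$v1648; the values are invented for illustration.
war <- c(1, 5, 18, 9, 18, 17, NA)

# Convert the code 18 to missing.
war[which(war == 18)] <- NA
print(war)

# How many usable (non-missing) cases remain?
n_valid <- sum(!is.na(war))
print(n_valid)
print(table(war, useNA = "ifany"))
```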

Villy's powerpoint and the debate over warfare and violence

The network of causal relations for our causal and dependent variables

Indep/Depvar list - All -- click the Network and then click the square nodes for URLs. Maps of Ownership of dwellings show no tendency for spatial clustering, but might indicate ecological zones for male versus female ownership.

Diagram of the interactive effects model for human societies

Eff&Dow in CHINESE BOX OF INTERACTIVE EFFECTS2.png

Using the codebook, you specify a DV (dependent variable) for the 186 societies in the SCCS database. This involves changing two lines of computer code on the wiki page where you copy one of the available Eff and Dow 2009 models. That model contains the IVs (independent variables). The model gives you coefficients and significance for the following:

fydd - the effects (positive coeff.) of neighbors on the DV
fyll - shared history (positive coeff) or differentiation (negative coeff.) from the language family
each IV - positive or negative effect on the DV, in addition to those of the fydd and fyll.
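
To make the idea of a neighbor effect concrete, here is a rough sketch of a distance-weighted "neighbors' average" of a DV, computed on invented coordinates. The inverse-distance weights and all names (coords, W, Wy) are assumptions of this illustration; the course program's actual construction and estimation of fydd and fyll is more involved and is not reproduced here.

```r
# Invented locations and DV values for six societies.
set.seed(5)
n <- 6
coords <- cbind(runif(n), runif(n))
y <- rnorm(n)

# Pairwise distances, turned into closeness weights (assumed: inverse distance).
D <- as.matrix(dist(coords))
W <- 1 / D
diag(W) <- 0            # a society is not its own neighbor
W <- W / rowSums(W)     # each row of weights sums to 1

# For each society, the weighted average of the other societies' DV values,
# with nearer societies counting more.
Wy <- as.vector(W %*% y)
print(round(cbind(y, Wy), 3))
```

A positive coefficient on a term like Wy would mean that societies tend to resemble their neighbors on the DV. Replacing geographic distance with language-family closeness gives the analogous language term.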

Thurs Dec 10 Final papers due, email doc files or place in my mailbox SSPA 3rd floor

Final contributed papers

y = posted with permission of and (c) copyrighted by the author

Map of language families of 186 societies in the SCCS (Standard Cross-Cultural Sample)

Click image to the right to open the map. May be copied under the Creative Commons license.


Course comments

(please feel free to add!) Anth174AW-09 course comments

General

UCI courses

Miscellaneous

Subsample replication