NSF World Cultures Project work plan

From InterSciWiki

Detecting and Envisioning Sociocultural Processes in a Global Historical Anthropology: A Standard Cross-Cultural Sample NSF Proposal

Now get a CGI that links "Change Depvar" to EduMod-52
CGI and HTML tutorial

Project goal

The goal of the proposed research is to test as many plausible causal models as are feasible, given model selection and processing rules, and to investigate the largest feasible set of plausible causal relationships among variables coded in the SCCS and forager databases as well as environmental and other variables (e.g., weather station data, geographic features) that can be coded for the geocoordinates and timeframes of the ethnographic cases. Further, many of these selection processes will be automated in a programming team approach using Agile software development, a group of software development methodologies based on iterative development, where requirements and solutions evolve through self-organizing collaborations of cross-functional teams (anthropological, statistical, econometric, GIS and data-systems).


  • 1. Numbered variables in the SCCS codebook are assigned variable names and keyword attributes to classify them, such as conflict, warfare, behavior, attitudes, within community, between community, etc.
  • 2. Similar variables are automatically placed into clusters on the basis of cluster analysis of their keywords. Clusters of variables are analyzed by a modification of the Eff-Dow program that uses single-factor tests along with multiple imputation.
  • 3. Sets of variables that conform to single-factor clusters are automatically ranked by number of cases and factor covariance. In choosing combinations of independent variables (IVs) and dependent variables (DVs) for 2SLS, only one top-ranked variable from each factor structure will be allowed. This prevents multiple measures of the same IV (which would produce collinearity and high variance inflation factors, or VIFs) and excludes IVs that are not logically independent of the DV.
  • 4. All of the top-ranked variables from distinct factors (i.e., coded for more SCCS cases and having high single-factor scores) will be scanned by the PI for their suitability as DVs.
  • 5. An automated process will then perform the following selection process for constructing a 2SLS regression model for each DV:
    • a. Insert each suitable DV, in turn, into the R code.
    • b. Begin with the IVs used in the Eff-Dow R code, eliminating any lower-ranked variable for which a higher-ranked variable is present in the model.
    • c. Run the model and iteratively eliminate variables that are not significant at p < .10 until only significant IVs remain; then eliminate the language and/or distance effect variables if they are not significant at p < .20.
    • d. Output pairs of high VIF variables to an alerts file reviewed by the PI.
    • e. Check, iteratively, whether R-squared is improved by adding any unused top-ranked potential IV that does not conflict with an existing IV ranked in the same factor, or by substituting it for an existing lower-ranked IV from the same factor.
    • f. Save each series of models that run successfully, and save the R execution files that yield errors so that debugging criteria can be identified for the next set of runs in step 6.
  • 6. When done with all DVs, add all new IVs in the final Restricted Model(s) for each run to the initial IV set used in step 5, and rerun step 5 for all DVs.
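
The control flow of steps 3 and 5c can be sketched in a few lines. This is a minimal illustration in Python, not the Eff-Dow R code: the `Variable` record, the lexicographic ranking by (number of cases, absolute loading), the `fit` callback, and the choice to drop one IV per round (the one with the largest p-value) are all assumptions made for the example. In a real run, `fit` would wrap the 2SLS model and return one p-value per IV.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Variable:
    name: str       # hypothetical variable name
    factor: str     # label of the single-factor cluster it belongs to
    n_cases: int    # number of SCCS cases coded for this variable
    loading: float  # single-factor score for this variable


def top_ranked_per_factor(variables: List[Variable]) -> List[Variable]:
    """Step 3: keep only the top-ranked variable from each factor.

    Ranking by (n_cases, |loading|) is an assumption; the plan only says
    variables are ranked by number of cases and factor covariance.
    """
    best: Dict[str, Variable] = {}
    for v in variables:
        current = best.get(v.factor)
        key = (v.n_cases, abs(v.loading))
        if current is None or key > (current.n_cases, abs(current.loading)):
            best[v.factor] = v
    return sorted(best.values(), key=lambda v: v.factor)


# Hypothetical callback: takes (dv, ivs) and returns a p-value for each IV.
Fitter = Callable[[str, List[str]], Dict[str, float]]


def backward_eliminate(dv: str, ivs: List[str], fit: Fitter,
                       alpha: float = 0.10) -> List[str]:
    """Step 5c: refit repeatedly, dropping the least significant IV each
    round, until every remaining IV has p < alpha."""
    remaining = list(ivs)
    while remaining:
        pvalues = fit(dv, remaining)
        worst = max(remaining, key=lambda iv: pvalues[iv])
        if pvalues[worst] < alpha:
            break  # all remaining IVs are significant
        remaining.remove(worst)
    return remaining
```

Refitting after each removal matters because p-values shift when a correlated IV leaves the model. The same loop could be rerun with `alpha=0.20` on the language and distance effect variables alone to cover the second half of step 5c.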

Then, to compare the 2SLS results, find those significant relationships among variables in previous SCCS studies that can be found by an automated process.

  • 7. Google Scholar searches for “Standard Cross-Cultural Sample” + “…” keywords and keyword sets are used (as already begun at http://intersci.ss.uci.edu/wiki/index.php/Sources_for_codes_on_the_SCCS) to find all online cross-cultural literature sources, which are then compiled in a reference list.
  • 8. Each reference is automatically searched for the locations of words that indicate hypotheses and hypothesis tests, and chunks of text are saved in a concordance database. The database will help to identify the statistical or hypothesized relations among variables proposed by authors who use the SCCS data to posit or test hypotheses.
  • 9. The findings in the literature (and in previous compilations of results from the literature) are compared with the 2SLS findings.
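
The concordance step in item 8 could look roughly like the following Python sketch. The cue-word pattern is a hypothetical stand-in, since the plan does not list the actual terms, and the row layout (`ref`, `cue`, `offset`, `context`) is likewise an assumption about what a concordance record might hold.

```python
import re
from typing import Dict, List

# Hypothetical cue words that may signal a hypothesis or a test of one.
CUE_PATTERN = re.compile(
    r"\b(?:hypothes\w*|predict\w*|correlat\w*|significan\w*)\b",
    re.IGNORECASE,
)


def concordance(ref_id: str, text: str, window: int = 60) -> List[Dict[str, object]]:
    """Return one row per cue word found in `text`, with the surrounding
    `window` characters on each side saved as context."""
    rows: List[Dict[str, object]] = []
    for match in CUE_PATTERN.finditer(text):
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        rows.append({
            "ref": ref_id,                   # key into the reference list
            "cue": match.group(0).lower(),   # the matched cue word
            "offset": match.start(),         # character position in the text
            "context": text[start:end],      # chunk of text to store
        })
    return rows
```

Each row can be written to a database table keyed by reference, so that the chunks mentioning the same variables can later be grouped and compared with the 2SLS results.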

Karim Chalak

Subject: Re: Consultant category G-3
From: "scott w" <scottblanc@gmail.com>
Date: Mon, January 11, 2010 12:34 pm
To: "Karim Chalak" <chalak@bc.edu>
Cc: "hal white" <drhalwhite@yahoo.com>

Hi guys --

Looking forward to working with everyone as well. Personally, for me this
will be a great opportunity not only to familiarize myself with some more
advanced causal methods but also see it applied to deeply interesting
problems in anthropology.


On Mon, Jan 11, 2010 at 11:14 AM, Karim Chalak <chalak@bc.edu> wrote:

> Thanks Doug! I look forward to this.
> Best regards,
> Karim