CoSSci Background, Screenshots and Instructions
Doug will be away till May 9. Meanwhile Tolga Oztan <firstname.lastname@example.org> is willing to take questions and can make corrections in the wiki pages. Thanks, Tolga -- Doug
Email invitation: Potential participants in the Wiley Companion to Cross-Cultural Research and the accompanying Complex Social Science Galaxy (CoSSci)
The site http://socscicompute.ss.uci.edu now has the two instructional YouTube videos (saved histories, 2 min, and an overview of CoSSci Galaxy, 20 min), and http://capone.mtsu.edu/eaeff/DEf01SCCS.html has the codebooks and R gui code used in CoSSci. Let me know about doing a chapter for the Wiley Companion based on one or more CoSSci-derived models. The trick to improving a model that starts from the depvar, a few independent variables and the UNrestricted independent variables (used to help with covariates for imputation) is to inspect the *.csv output for the "To Try" variables, which may be tested as additions; each, taken one at a time, tends to increase predictive variance and may provide a meaningful addition to the model. The "Galaxy" framework is commonly used in many of the Science Gateways listed at http://intersci.ss.uci.edu/wiki/index.php/Main_Page (https://www.xsede.org/gateways-listing).
-- Tolga will respond to requests for help; you can pepper me with more general questions. You and your students will find CoSSci Galaxy much easier to use than the R gui, which R aficionados will like.
Douglas R. White
http://intersci.ss.uci.edu/wiki/index.php/DW_home
Editor, Structure and Dynamics: eJournal of Anthropological and Related Sciences
https://submit.escholarship.org/ojs/index.php/imbs_socdyn_sdeas
What is DEf? And what is CoSSci? http://socscicompute.ss.uci.edu and the codebooks at http://capone.mtsu.edu/eaeff/DEf01SCCS.html are the entry points for the Complex Social Science Gateway (CoSSci) at UCI, which is also the starting point for chapters of the Wiley Companion to Cross-Cultural Research. It has been used in a fall 2013 class by Ren Feng, and by Doug and 9 prospective authors at the session in Albuquerque for Celebrating, a site that now contains all the PowerPoints and two YouTube videos with instructions.
(1) See the YouTube videos: How to share CoSSci histories with Students (see your Published Histories) and the 20 min clickable CoSSci overview. See also Gateway Instructions.
(2) We may be building "An inventory of CoSSci data entry variables" so that users can see the kinds of entries to put into the screen for a given model.
(3) We may also build better screenshots of the site, with explanations of the steps in analysis.
(4) A series of screenshots that "open the site": items 6 (NoVA) and 7 are new as of Jan 25th 2014 and will be edited to explain in better detail what the screenshots show by way of instructions.
(5) Output comes as a *.csv file downloaded to your computer (just click the image of a diskette to open the *.csv results).
- Users can log in, set their password, then change the name of the upper left "History" file and share the history on CoSSci or publicly by pressing the upper right *. That history can include models saved at that time, so instructors can see students' work, students can share with other students, etc. Don't rely much on the screenshots below: this was day one, and all of this will be edited (and made ready for online courses or use of CoSSci from classrooms). Each student can open their own CoSSci site.
- What you'll see when CoSSci starts is a yellow window above the green (upper right). When the run is finished the yellow window turns green, and eventually a "Diskette" image appears. Click that for your *.csv output.
- All the work can now be done at CoSSci, but some of it can be replicated at home once certain R packages are installed. Some can be downloaded from the http://intersci.ss.uci.edu/ Main Page site: go to Examples and then List of Rgui to CoSSci Models (in transition). Also check StartSimple.
- This page consists of Instructions for Use in Cross-Cultural Research.
- 1 Other Screenshots 1: Using a Gateway shell (one used by biologists)
- 2 Screenshots 2 getting output
- 3 Screenshots 3 setting input for a new model
- 4 Screenshots 4: How to add the Named variables that are from recodes of the original dataset dx$v000 numbers?
- 5 Temporal Sequence of Screenshots
- 6 THIS YOU CAN IGNORE - Illustrating NoVA - Networks of Variables Analysis
- 7 Sharing, Publishing, and viewing Shared Data
- 8 From Tom Uram (our Argonne Labs Galaxy -python- Programmer) Jan 24 2013
Other Screenshots 1: Using a Gateway shell (one used by biologists)
- The site we took as a shell can be studied at http://wiki.galaxyproject.org/Learn/Screencasts and https://main.g2.bx.psu.edu/u/aun1/p/galaxy101.
Tom and I were discussing the statistical packages used by biologists. Note that they are using lm() without missing-data imputation or 2SLS detection of autocorrelation (Galton's problem).
Our tools are further up on this page. We will leave the biology tools up until we learn to use some of them, like text manipulation, filter and sort, statistics, graphing and sorting data, etc., at the upper left of this screen.
On the upper right is a slideable window showing Tom and Doug's review of Tom's History, where 57:EAF1 is his 57th use of the "EAF1" tool. EAF1 is one of Anthon's scripts: it takes the variables of interest, maps them into the Dow-Eff functions format, and prints the compilation of input variables (where we see "Dependent variable='valchild'", from the article Eff and Dow (2009) in Structure and Dynamics, an early prototype now obsolete). So for the time being, ignore the biology-specific tools.
This is the top half of that page:
The X, Y and lm are from the biologists' tools for regression, not ours. We cannot use their tools for our data and Dow-Eff functions. But we can get some ideas from exploring their use of a Galaxy site, which is a common format for building Science Gateways (https://portal.xsede.org/science-gateways).
Now let's go to our own CoSSci Galaxy, http://socscigate.oit.uci.edu/uci/root, installed at UCI, thanks to OIT's John Saska and Francisco Lopez. The "Gateway" was installed at UC Irvine but links to XSEDE, which is part of UCSD's San Diego Supercomputer Center, SDSC, right near Doug's home.
Screenshots 2 getting output
At the top dark row of the previous screens you'll see "Analyze data" at one end and "User" at the other. I just logged in with my email id and made up a password. Now we see 56: EAF1 under 57: EAF1, so we're working backward. At the bottom of that green text-filled rectangle there is an image of a whirling blue arrow. When clicked it says "Run this job again."
Now the history that Tom has been sharing with me disappears, and the screen is filled with text that is not aligned. The file that is downloaded to your PC, however, is a *.csv file, and it looks like this:
That's the top of one of the olsresults.csv outputs that you would have gotten if you had run this script at home in R. To see how this was generated, click the (static image of the) whirling blue arrow, which will say "Run this job again." When done, click the diskette image in the row of three and you get the output directly on the white screen.
Screenshots 3 setting input for a new model
Click the little i in the row of three and you get the input specifications of the job just run, on the screen. (BTW, if after pressing i you see "Filesize: ?" rather than a size in KB, the run is still waiting for output.)
But where did the definition of the depvar disappear to? Well, let's do the data definitions again: click the swirling blue arrow to the right. In the screen below we see where to define the dependent variable as (dx$v473+dx$v474+dx$v475+dx$v476)
We can see now that the entire model can be defined by:
- depvar name
- all UNRESTRICTED indep var definitions, e.g., v1260,v203,v204,
- all RESTRICTED indep var definitions, e.g., v1260,v155,v233d4,
- the DATASET Name of the depvar dx$valchild
- the DATASET Definitions of the depvar (dx$v473+dx$v474+dx$v475+dx$v476)
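The dataset definition above is an R expression: it adds four coded columns of the data frame dx row by row, and in R a missing value (NA) in any of the four makes the sum missing for that society. A minimal Python sketch of that arithmetic follows. The sample values and the helper name `row_sum` are made up for illustration, None stands in for R's NA, and this is not the CoSSci code itself.

```python
from typing import Optional, Sequence


def row_sum(values: Sequence[Optional[float]]) -> Optional[float]:
    """Add one row's values; a missing value (None) makes the result missing, as NA does in R."""
    if any(v is None for v in values):
        return None
    return sum(values)


# Hypothetical codes for three societies (invented, not real SCCS data).
dx = {
    "v473": [1, 2, None],
    "v474": [0, 3, 1],
    "v475": [2, 2, 4],
    "v476": [1, 0, 2],
}

# Same idea as the definition (dx$v473+dx$v474+dx$v475+dx$v476) named valchild.
columns = ["v473", "v474", "v475", "v476"]
dx["valchild"] = [row_sum(row) for row in zip(*(dx[c] for c in columns))]

print(dx["valchild"])  # [4, 7, None]
```

The third society has no value for v473, so its valchild is missing; this is why the imputation step matters before a regression is run.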
But what about other variables that require DATASET Names and DATASET Definitions?
Ingeniously, Tom Uram has observed that these are defined IN THE SAME WAY as the depvar, so JUST CLICK |Add a new variable| and you will see:
- the same combination of a DATASET Name and DATASET Definition for each new variable
- and then you simply name these added variables to the list of UNRESTRICTED or RESTRICTED independent variables.
Screenshots 4: How to add the Named variables that are from recodes of the original dataset dx$v000 numbers?
Well, here you have it: these added Named variables can be continued indefinitely, but when adding to the RESTRICTED variables list you must also add to the UNRESTRICTED list (although not necessarily in the same order).
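The pieces of a model described in this and the previous section (a depvar, Named variables with their dataset definitions, and the two lists of independent variables), together with the rule that every RESTRICTED variable must also be UNRESTRICTED, can be written down as a small record with a consistency check. This is a hedged Python sketch for planning a model before typing it into the form. The class `ModelSpec`, its field names, and the added variable `myvar` with its definition are hypothetical; none of them is part of CoSSci or the Dow-Eff functions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ModelSpec:
    """Hypothetical planning record for one model; not a CoSSci data structure."""
    depvar: str
    definitions: Dict[str, str] = field(default_factory=dict)  # Named variable -> definition
    unrestricted: List[str] = field(default_factory=list)
    restricted: List[str] = field(default_factory=list)

    def missing_from_unrestricted(self) -> List[str]:
        """RESTRICTED variables that were not also listed as UNRESTRICTED."""
        return [v for v in self.restricted if v not in self.unrestricted]


spec = ModelSpec(
    depvar="valchild",
    definitions={
        "valchild": "(dx$v473+dx$v474+dx$v475+dx$v476)",
        "myvar": "(dx$v203+dx$v204)",  # invented example of an added Named variable
    },
    unrestricted=["v1260", "v203", "v204", "myvar"],
    restricted=["v1260", "myvar", "v155"],
)

problems = spec.missing_from_unrestricted()
print(problems)  # ['v155']
```

Order is not compared, only membership, which matches the note that the two lists need not be in the same order.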
Now, when you're done with Added variables (as many as needed), press EXECUTE. Once EXECUTE is finished, click the diskette image and receive the *.csv as a download. Once it is downloaded, just click to open the *.csv results.
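Once the *.csv results are on your computer, they can also be read by a script rather than opened by hand. Here is a minimal Python sketch, assuming only that the download is a comma-separated text file. The file written below is a stand-in with invented contents, since the real column layout depends on the model that was run.

```python
import csv
import os
import tempfile


def read_results(path):
    """Return the rows of a CSV file as lists of strings."""
    with open(path, newline="") as handle:
        return list(csv.reader(handle))


# Stand-in for a downloaded results file; the header and values are invented.
sample_path = os.path.join(tempfile.mkdtemp(), "olsresults.csv")
with open(sample_path, "w", newline="") as handle:
    csv.writer(handle).writerows([["variable", "coef"], ["v1260", "0.12"]])

rows = read_results(sample_path)
for row in rows:
    print(row)
```

To use it on a real run, replace `sample_path` with the location of the file you downloaded.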
So print, memorize, split screen or annotate these pages and go to http://socscigate.oit.uci.edu/uci/ or http://socscigate.oit.uci.edu/uci/root Doug (talk) 8:55, 25 January 2013 (PST) (Previous 18:55 the 24th)
Temporal Sequence of Screenshots
This is what you see as a few minutes go by. The system works, but we need more explanation.
THIS YOU CAN IGNORE - Illustrating NoVA - Networks of Variables Analysis
These are put together from four dependent-variable models. Illustrations from Wiley Companion Chapter 5 Diagrams 4 & 5 <-- click for the pdf of the draft article - Invitations to Critique
<-- Legend: Arrows point to Dependent variables. Blue arrows are consistent positive regressions among variables that are anti-female. Red arrows are consistent negative regressions among variables that are anti-female. Together they form a cluster of anti-female variables. Below: --> Names of variables reversed to form the same cluster of variables, now pro-female. All dotted (red) arrows show negative regression coefficients; solid (black) arrows show positive coefficients.
Sharing, Publishing, and viewing Shared Data
In the middle of the top black bar you'll see -->Shared Data<--. If you click there, the first drop-down item is:
- Published Histories
And there you will see one shared history, that of Thomas Uram. You may run any of his models, although 56: EAF1 and 57: EAF1 are the most recent (working) models.
From Tom Uram (our Argonne Labs Galaxy -python- Programmer) Jan 24 2013
You can learn plenty about Galaxy from these screencasts:
Some of them will be very bio-centric. If you can ignore the bio details and concentrate instead on datasets and tools, you'll see how it can support your work with your tools.
There's also this Galaxy101 page, which is a good example, but you'll have to gloss over the bio details here, too.
And lest you think it's only about biology, it's being applied in several other domains: at Argonne, we are using it for high energy physics simulations.
These were some draft tables from an early Wiley Chapter draft, but there are better ways now: Illustrations from Wiley Companion Chapter 5 Diagrams 4 & 5 <-- click for the pdf of a draft article