Galaxy

From InterSciWiki

CoSSci Galaxy

... is the name of the computational site that runs the Dow-Eff analysis: http://socscicompute.ss.uci.edu. Galaxy CoSSci is the name given at that site. Galaxy has been co-developed by James Taylor's group and his collaborator Anton Nekrutenko at Penn State. They each have 5-6 people working on Galaxy in their labs currently (2015?). You can see more information about the team here:

https://wiki.galaxyproject.org/GalaxyTeam

How to Organize beyond the 3rd year October 1 - Jan 22 2015

Yes, James Taylor and Anton are total stars in the genomics world – very much in demand in speaking circuits and such. They’ve changed the world with Galaxy.

There’s been more and more pressure in ECSS to not continue to grant support for projects that have received it year after year. The idea behind all ECSS projects is that they’re temporary – work with a group for a year, jumpstart a new activity, leave the group in a place where the group can carry on afterward. This month a spreadsheet was circulated among the management team indicating where we’d provided multi-year support – in some cases since the inception of XSEDE 3.5 years ago. The thinking of course is that if we continue to work with the same people, we have no cycles for something new. So we’ve always approached this conundrum by supporting returning projects only if we had spare folks who weren’t otherwise assigned. That has worked well, but it hasn’t stopped the insistence that we shouldn’t, year after year, support the same project.

I say all that so that you understand some of the background for how these projects (and not just yours) are viewed. In an ideal world the CoSSci gateway receives some small amount of its own funding – enough to pay a part time person to maintain the website, add new features, collect accounting data and such. Sometimes very large gateways serving thousands can limp along with half an FTE or less for some years. The CIPRES gateway is an example of this.

It would be good to keep thinking about a plan for once this third ECSS award concludes in October. I know it’s hard and I’m not saying we absolutely won’t support the work, I’m just saying there is a lot of building pressure on the multi-year awards.

Nancy

From: James Taylor <james@taylorlab.org> Jan 21 2014 AM

cc: Anton Nekrutenko <anton@nekrut.org>  Nekrutenko Lab at Galaxy project - http://www.bx.psu.edu/~anton/labSite/

Doug,

We will shortly be submitting our renewal application for the NIH grant that supports the bulk of core Galaxy development. I was wondering if you would be able to provide a support letter for our application. The Complex Social Science Gateway (CoSSci) is one of the most interesting applications of the Galaxy framework outside of genomics, and I think it really demonstrates the versatility of Galaxy. A support letter from you expressing this would be very helpful.

Thanks, James

--
James Taylor
Ralph S. O'Connor Associate Professor of Biology
Associate Professor of Computer Science
Johns Hopkins University
Taylorlab Johns Hopkins University

Answer by Doug to James Taylor, 1:30 PM

Dear James: Sure. I found http://taylorlab.org but how would you describe, shortly, or with a url, what size etc is your

"the bulk of core Galaxy development"

e.g., a core of all Galaxy development? That's very exciting to get to know those doing this. I've tried to do workflows through the Kepler project, now headed at SDSC by Ilkay Altintas, for the user phase, but I have trouble figuring out how to do the workflows we might use, e.g., through our Argonne programmer, who hasn't worked with other Galaxy sites so far as I know. In the first year of the project my Argonne programmer was Thomas Uram, who had worked with your type of Galaxy in the biological sciences. Maybe we could talk about this on the phone.

Doug White 858 457 0077 in La Jolla -- all the best to you and Anton; I'm sure once we talk Nancy will have some advice and ideas. As emeritus I have plenty of time on my hands and live 7 minutes from Nancy's SDSC.

ps my problem is to keep our project running for an emerging community. Our Wiley Companion to Cross-Cultural Research comes out in 2 years or more, and it will support a portion of the host of researchers and instructional use. My former TA Ren Feng has taught courses using CoSSci in two successive years: fall 2013 had 13 students, all but one of whom did well; this last fall of 2014, 80 students signed up and 40 completed the work, and did well. That's much better than most MOOCs, but it is a much more hands-on class. I have a VM at my university (UCI) and support the VM computing option at $25/month. But when my ECSS and similar support runs out, what do I do for funding at SDSC's Trestles HPC for basic low-end instructional provision? As we develop we will use HPC more timewise as aspects of our analysis become more complex, and as those projects and uses develop, those who develop them with our help can apply for grants. But we need some more basic support or mechanism to fund instructors. And I need a platform where I can teach the use of CoSSci the way that Ren Feng does. He's an assistant professor headed for tenure; I'm an emeritus professor in a postmodernist anthropology department that doesn't want to spend the $6,000 per year funding (that's not really necessary; I can do without as I have a good retirement income) or, more importantly, allow me to teach the course that I taught before retirement, now with infinitely better Galaxy software, because they are opposed to scientific work in anthropology. Point is, I need some collateral colleagues and the ability to teach from some campus in the world that I can teach from remotely without leaving home or the nearby SDSC facility. Nancy has provided two opportunities for me to give hour-long talks to the ECSS group, but any ideas for help in solving these two problems would be much appreciated.

Suresh Marru got me started in implementing Galaxy CoSSci (this is its 3rd year) but gave me full awareness that it is a user community willing to chip in to maintain a new Galaxy project that has to be thought about as it develops. After our Wiley Companion to Cross-Cultural Research comes out in 2-3 years, with half the authors writing about examples of their use of CoSSci, we should be able to attract the kind of user community that Suresh is talking about, but we need to get through the next four years to make this a user-community success.

Nancy's reply

Yes, Doug, it would be great if you could write a letter of support for James Taylor.

Basically, the decision to go with Galaxy as a framework for your gateway was made by Tom Uram at U Chicago when your project first came through ECSS. That framework exists because of James Taylor and Anton. If you like what you’ve been able to do using it, that’s what you’d want to convey in a letter of support. You haven’t done the actual programming, but as PI you’ve used the results. James is right, yours is one of the more unique projects using Galaxy – both for the subject matter and also the incorporation into the classroom and eventual online Wiley book.

As far as the compute, Doug, you can continue to run on XSEDE computing resources (and XSEDE could even host the front end VM – on Indiana’s Quarry system, on Comet at SDSC, on Jetstream). People get compute allocations for years and years. It is only the ECSS support that is more limited in nature.

So, hopefully you’re in a good place to keep computing into the future.

Nancy

James' reply after reading Nancy's

From: James Taylor <james@taylorlab.org> Date: Tuesday, January 20, 2015 at 1:52 PM To: Doug White <douglas.white@uci.edu> Cc: Nancy Wilkins-Diehr <wilkinsn@sdsc.edu>, Anton Nekrutenko <anton@nekrut.org> Subject: Re: Support Letter for Galaxy

Hi Doug,

Galaxy has been co-developed by my group and my collaborator Anton Nekrutenko at Penn State. We each have 5-6 people working on Galaxy in our labs currently. You can see more information about the team here:

 https://wiki.galaxyproject.org/GalaxyTeam

For the last four years we have been supported by a U41 from NHGRI titled "Democratization of Data Analysis in Life Sciences Through Galaxy"; we are now attempting to renew that grant. The funding opportunity is described here:

 http://grants.nih.gov/grants/guide/pa-files/PAR-14-191.html

As far as scaling up goes, we were successful in getting a reasonable Gateway allocation from XSEDE, I imagine you could do the same. There will also be other new resources coming along if you really want a VM environment. Jetstream will be online in a year and available through the XSEDE allocations process.

Honestly, funding compute is hard. We've never actually managed to do it, unless Jetstream counts but it has many constituencies. All of our funding has been to support development, outreach, etc. The main Galaxy instance was first funded using borrowed/left-over hardware, some support from Penn State, and most recently support from TACC and iPlant.

-- James Taylor

Letter of Support for core Galaxy Jan 22, 2015

The creation of core Galaxy at Johns Hopkins Taylor Lab and Penn State Nekrutenko lab was intended for Comparative Genomics and Bioinformatics but its spread to fields in Biology and other sciences is invaluable. We developed Galaxy CoSSci (Complex Social Science: High Performance Computing for Anthropology and the Social Sciences) with ECSS collaborators from the Argonne Labs not only for researchers but classroom instructors and students. Complexly entwined coded datasets on ethnographically described societies with additional ecological and climate variables keep growing through researcher contributions. The Galaxy interface allows many more people to run these complex calculations, which focus on results with exogeneity, equivalent to Bayesian causality with networks of variables further analyzed with R library bnlearn. We’re publishing a Wiley Companion to Cross-Cultural Research that features the Galaxy interface. Students and researchers need only specify their dependent variable name from the codebooks, the independent variables and other covariates, choices of what background variables (language phyla, distances, etc.) affect interdependencies between observations (the controls for autocorrelation), maps to output, and model histories to be saved among researchers and between instructors and students. Our Galaxy makes possible orders of magnitude greater classroom and research usages of these HPC resources and the R code by our PIs Anthon Eff and Malcolm Dow that underlies them.

Recruiting Ben Jester Jan 23 2014

Dear Ben, rj2004@fastmail.fm

I think it was Norma who told me a couple of months ago that you were interested in joining our team using Galaxy CoSSci at our UCI VM and the SDSC supercomputer -- it's a pretty interesting program. I'm attaching a letter I wrote yesterday for the core Galaxy project that is our Galaxy's grandfather project.

Also attaching my introduction to the Wiley Companion. The last section describes Ren Feng's 1st and 2nd courses using this software at CoSSci (his students' results are available as shared histories; I can send those as well if you're interested). There is the statement (also due to Nancy) that the small amounts of SDSC funding for education and researchers will go on and on indefinitely, but starting October 1 we need "a part time person to maintain the website, add new features, collect accounting data and such." Dow and Jonathan maintain the website at Irvine, on a VM or low-cost machine from user services, and pay about $26/month for the VM; the connection to the SDSC machine is free. Lots of this stuff is on my intersciwiki.

If you are interested, please check out Nancy's statement below and give me a call at 858 457 0077 or email. I don't have funds for a part time person to "collect accounting data and such" but that might be something you would like to do in order to explore the supercomputer system that we use for the project. And once Ren Feng finishes translating his powerpoints, which he is sharing with us, this might be a basis to find a college that would pay you for a course modeled on his (he's an asst professor in Sociology at Xiamen). Maybe there are others you know who want to get into this line of instruction. There is no cost to Ren Feng or his university.

Anyway I am having great fun with this system (have lots of examples at "Clips" on my wiki) and the Wiley companion is moving along. Thing is, Tolga has participated quite a bit with the SDSC group but is finishing his dissertation this spring and will be moving on after that into teaching, industry or a postdoc elsewhere.

All the best, Doug