Peter McMahan

From InterSciWiki

Cohesive blocking using iGraph in R - latest version - Gabor Csardi pairwise cohesion

Copy http://intersci.ss.uci.edu/wiki/pub/mwExample1.net to your directory, then 
Cut and paste into R as a command
source("http://intersci.ss.uci.edu/wiki/pub/MW.R")
#See:  Cohesive blocking
You can also copy http://intersci.ss.uci.edu/wiki/pub/mwExample1.net to your directory, use getwd() to get and setwd() to set the working directory, then change the source of the input file to that directory in your copy of MW.R

Doug's R package installation list -- R FAQ

The R code referred to will soon appear in Network_tools with documentation.

PetersenGraph.jpg

Cohesive blocking

Cut and paste into R

source("http://intersci.ss.uci.edu/wiki/pub/MW.R")

On 11 Oct 2007 at 12:16, Peter McMahan wrote:

Hello,
I am a research assistant for John Padgett at the University of
Chicago. I've been working recently on implementing in the R language
the cohesive blocking algorithm you put forth in "Structural cohesion
and embeddedness: A hierarchical concept of social groups" (2003).
I've recently finished work on the code, and I am planning on
submitting it for inclusion in the next version of R's "igraph"
package. However, I want to send the code to both of you first to let
you have first go at it. I've checked it against the SAS algorithm on
a number of small examples as well as on John Padgett's current
Florentine marriage network data, and the results are consistent.
(One caveat is that while the SAS code returns all cohesive blocks,
the R version only returns those with higher cohesion than any
containing subgraphs).

The code requires a few R packages to run: recent versions of "igraph", 
"digest" and optionally "RBGL", all available through CRAN - from 
http://www.stathy.com/cran/ 
I'd love to hear any feedback or comments you have. In particular I
think that the concept of cohesive blocking is very useful, but not
widely implemented. Any thoughts on how to get this algorithm and
idea to a wider audience would be very much appreciated.
Thank you,
Peter McMahan

iGraph news Jan 1 2008

News - posted from http://cneurocvs.rmki.kfki.hu/igraph/ igraph 0.4.5 (igraph reference manual). See for example igraph_write_graph_pajek(gBlocks). How does one do the conversion to Pajek and to a Fruchterman spring embedding? Doug 19:36, 8 January 2008 (PST)

Released January 1, 2008

  1. New: Cohesive block finding in the R interface, thanks to Peter McMahan for contributing his code. See James Moody and Douglas R. White, 2003, "Structural Cohesion and Embeddedness: A Hierarchical Concept of Social Groups," American Sociological Review 68(1):103-127.
  2. Biconnected components and articulation points.
  3. R interface: better printing of attributes.
  4. R interface: graph attributes can be used via '$'.
  • New in the C library: igraph_vector_bool_t data type.
  • Bug fixed: Erdos-Renyi random graph generators rewritten.

2008 Cohesive blocking in R from Peter

Peter McMahan's page, that is Tips in R

After installing igraph you may need to explicitly load it with:

library(igraph) #---- "built under R 2.2.1" and runs with 2.6.1  Doug :) 16:12, 8 January 2008 (PST) thanks Peter

You can tell if igraph is loaded by typing:

search() #---- successful

and making sure that "package:igraph" is somewhere in that list.

Also, an updated version of the code I sent out before is now included in the package itself, so you no longer need to source any files. After loading igraph, simply running:

example(cohesive.blocks) #---- successful, installing only igraph, digest

will run a quick example of the function's use, creating an object called gBlocks

To see information and usage, type

help(cohesive.blocks) #---- 

To print some info about number of blocks etc. Entering:

plot(gBlocks) #---- 

after you've run the example will give a visual display of the block structure. For better layouts, use one of these:

plot.bgraph(gBlocks,layout=layout.kamada.kawai)
plot.bgraph(gBlocks,layout=layout.fruchterman.reingold)
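
In more recent igraph releases the same analysis appears to be exposed as cohesive_blocks(), with blocks(), cohesion() and parent() as accessors. That naming is an assumption about the installed version (igraph 1.0 or later) and is not part of the 2008 code described here. A minimal sketch on a made-up graph of two 4-cliques that share one vertex:

```r
library(igraph)

# Hypothetical toy graph: two 4-cliques (vertices 1-4 and 4-7) sharing vertex 4.
clique_edges <- function(v) as.vector(combn(v, 2))
g <- make_graph(c(clique_edges(1:4), clique_edges(4:7)), directed = FALSE)

# Assumes igraph >= 1.0, where cohesive.blocks() is spelled cohesive_blocks().
gBlocks <- cohesive_blocks(g)

blocks(gBlocks)    # vertex sets of each block
cohesion(gBlocks)  # vertex connectivity of each block
parent(gBlocks)    # index of the enclosing block (0 for the root)
```

The whole graph forms the root block because vertex 4 is a cut vertex, and each clique shows up as a more cohesive block nested inside it.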

For larger graphs install 'RSQLite' (all so far from PACKAGES/INSTALL PACKAGE(s))

Pdf output

For cleaner-looking (antialiased) output from R you can write directly to a PDF or PostScript file. Once you have the commands written to do the output you can use something like this:

    pdf(file="Rfigure.pdf",width=6,height=4)
    ## plotting commands here
    dev.off()

If your setup doesn't have the pdf() function, try replacing it with something like "postscript(file="Rfigure.ps",width=6,height=4)"
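
As a complete, runnable version of the pattern above, the following sketch writes a simple base-R plot to a PDF. The temporary output path and the plotted data are made up for illustration; any plotting commands, including plot(gBlocks), could take their place.

```r
# Write a simple figure to a PDF file in R's temporary directory.
out_file <- file.path(tempdir(), "Rfigure.pdf")

pdf(file = out_file, width = 6, height = 4)  # width and height are in inches
plot(1:10, (1:10)^2, type = "b", main = "Example figure")
dev.off()  # closes the device so the file is completely written

file.exists(out_file)
```

The file is only complete once dev.off() has been called, so a missing or empty PDF usually points to a device that was never closed.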

 Peter

Cohesive blocks analysis

Cohesive blocking

Optional: Parallel Computing

The snow package from CRAN speeds up computationally demanding procedures such as bootstrapping, Markov chain Monte Carlo, and cohesive blocking by running them on several connected computers in parallel.
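
To give a feel for this style of parallelism, here is a minimal sketch using R's bundled parallel package, which is swapped in for snow because it offers the same makeCluster/parLapply/stopCluster interface without an extra install. The slow_square function is a made-up stand-in for an expensive computation. This does not parallelize cohesive blocking itself.

```r
library(parallel)  # ships with R; used here in place of snow

# Hypothetical stand-in for an expensive per-item computation.
slow_square <- function(x) {
  Sys.sleep(0.1)
  x^2
}

cl <- makeCluster(2)                        # two local worker processes
results <- parLapply(cl, 1:8, slow_square)  # split the work across the workers
stopCluster(cl)                             # always release the workers

unlist(results)
```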

Let me know if you still can't get it running, because at this point it should be completely cross-platform and built in to the igraph package. Gabor Csardi, the package maintainer, uses Windows and he has been able to run the code.

Thanks, Peter

More: Using igraph

Show objects and plot gBlocks and g (the original graph)

plot.bgraph(gBlocks) 
ls()
plot(g)
help (plot)
plot.bgraph(gBlocks,layout=layout.kamada.kawai)
plot.bgraph(gBlocks,layout=layout.fruchterman.reingold)
see: http://cneurocvs.rmki.kfki.hu/igraph/doc/R/plot.bgraph.html
plot(g,layout=layout.fruchterman.reingold)
plot(g,layout=layout.kamada.kawai)

More: R and Pajek

3.15 Tools: Pajek Manual • R

– Send to R – Call statistical package R [49] with one vector/network, vectors/networks
  selected by cluster or all currently available vectors and/or networks.
– Locate R – locate position of statistical program R (Rgui.exe or Rterm.exe) on the disk.

igraph has very good abilities to read *.net files with the function "read.graph". Calling:

g <- read.graph("filename.net","pajek")

should load the .net file as an igraph object named "g".

(By the way, a nice shortcut to specifying files in R is to replace the quoted file path with the unquoted function call "file.choose()". This should bring up a familiar OS-specific file chooser so you can locate the file more easily. So the above command would be ' g <- read.graph(file.choose(),"pajek") '.)
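
The read.graph call above can be tried without an existing data file. The sketch below writes a tiny made-up Pajek file and reads it back; it assumes igraph 1.0 or later, where read.graph() is also available under the name read_graph().

```r
library(igraph)

# Write a tiny hypothetical Pajek file so the example is self-contained.
net_file <- file.path(tempdir(), "toy.net")
writeLines(c(
  "*Vertices 3",
  '1 "a"',
  '2 "b"',
  '3 "c"',
  "*Edges",
  "1 2",
  "2 3"
), net_file)

# Assumes igraph >= 1.0; older versions use read.graph(net_file, "pajek").
g <- read_graph(net_file, format = "pajek")

vcount(g)             # number of vertices read
ecount(g)             # number of edges read
vertex_attr_names(g)  # attributes recovered from the file
```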

The "read.graph" function does a good job of getting all of the graph, edge and vertex attributes from the pajek file, so no information should be lost ("?read.graph" will give you more details on this). Note that Pajek doesn't have a good way to represent overlapping subsets like cohesive blocks, so if you import a file "original.net", run cohesive.blocks on it, then export it, it will end up as a set of files instead of just one. This is inevitable — I explain it a bit more in the help file "?write.pajek.bgraph". -- Peter McMahan

write.pajek.bgraph(gBlocks, filename="g", hierarchy = TRUE) 
write.pajek.bgraph(gBlocks, "g", hierarchy = TRUE)

Connecting to Pajek

Your output file from the command above was written to file "g.net" in your directory. You can just open it in Pajek. If you don't know what directory this is, type

getwd()
list.files()

to print your current "working directory". If you don't specify a full path ("C:\\...") in R functions that interact with the file system, they assume you're talking about the current working directory. All of the graphical R versions have a menu command somewhere to change or set the working directory. -- Peter McMahan
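
For setups without a menu command, the working directory can be handled from the console. A short sketch, using R's temporary directory and a made-up empty "g.net" purely for illustration:

```r
old_wd <- getwd()   # remember where we started
setwd(tempdir())    # switch to another directory (here: R's temp dir)

# Create a hypothetical output file, then list only Pajek-style files.
file.create("g.net")
list.files(pattern = "\\.(net|clu)$")

# A full path works regardless of the current working directory.
normalizePath("g.net")

setwd(old_wd)       # restore the original working directory
```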

Now, putting TRUE in the commands above will save "g.clu" in addition to the Pajek "g.net" file, but each line of "g.clu" will end with an extra ASCII character 13 (a carriage return, displayed as a musical note) that cannot be read by Pajek. DRW has a DOS program, the "Lite.exe" editor, to get rid of these, but -- help!! -- there must be an easier way.

Probably some clever R programmer can write an R routine using ascii character recognition (http://tolstoy.newcastle.edu.au/R/help/01a/1877.html) to input an ascii file and output the same file minus an ascii 13 at the end of each line, if any. This program could then be run right after the *.clu file is created. Or Peter McMahan could change the write.pajek.bgraph code.

Ok, I think I got it. It looks like in an attempt to be more cross-platform, R for Windows translates "\n" into "\r\n", so each of my "\r\n"s were being turned into "\r\r\n", which was (unsurprisingly) confusing to Pajek. I was redoing work R already took care of. The code below should fix it. Unless you tell me otherwise I'll ask Gabor to include this in the new version of igraph. -- Peter 2-28-08
In the meantime, here is the code: http://intersci.ss.uci.edu/wiki/pub/write.pajek.bgraph.R WikiSysop 20:51, 18 February 2008 (PST)
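
For files that were already written with doubled carriage returns, a cleanup routine along the lines requested above could be sketched as follows. The helper name strip_extra_cr and the demo file are made up for illustration, and this is not the fix that went into write.pajek.bgraph. It reads and writes in binary mode so that R does not translate line endings again.

```r
# Hypothetical helper: collapse runs of carriage returns before a line feed
# (e.g. "\r\r\n") into a single line ending.
strip_extra_cr <- function(infile, outfile = infile, eol = "\r\n") {
  size <- file.info(infile)$size
  txt <- rawToChar(readBin(infile, what = "raw", n = size))
  txt <- gsub("\r+\n", eol, txt, useBytes = TRUE)
  writeBin(charToRaw(txt), outfile)  # binary write, so R adds no extra "\r"
  invisible(outfile)
}

# Demonstration on a made-up cluster file with doubled carriage returns.
clu_file <- file.path(tempdir(), "g.clu")
writeBin(charToRaw("*Vertices 2\r\r\n1\r\r\n2\r\r\n"), clu_file)
strip_extra_cr(clu_file)
cleaned <- readBin(clu_file, what = "raw", n = file.info(clu_file)$size)
rawToChar(cleaned)
```

Passing eol = "\n" instead would produce Unix-style line endings.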

Files FROM Pajek TO R require that you first click Pajek/Tools/Find program... on your directory. Then in Pajek click Tools/R/All Networks & Vectors/ which generates "C:\\Program Files\\R\\R-2.6.1\\PajekR.r". Open that in R and you get

  • >source("C:\\Program Files\\R\\R-2.6.1\\PajekR.r") which says
Use objects() to get list of available objects           
Use comment(?) to get information about selected object  
Use savevector(v?,'???.vec') to save vector to Pajek input file  
Use savematrix(n?,'???.net') to save matrix to Pajek input file (.MAT) 
    savematrix(n?,'???.net',2) to request a 2-mode matrix (.MAT) 
Use savenetwork(n?,'???.net') to save matrix to Pajek input file (.NET) 
    savenetwork(n?,'???.net',2) to request a 2-mode network (.NET) 
Use v?<-loadvector('???.vec') to load vector(s) from Pajek input file  
Use n?<-loadmatrix('???.mat') to load matrix from Pajek input file  

http://pajek.imfm.si/doku.php?id=how_to is not much help

Convert a Pajek file to R

The help page titled "Read a Pajek Project or Network File and Convert to an R 'Network' Object" is listed as read.paj {network}, so you first have to install and load the network package:

install.packages("network")
library(network)
read.paj(file="Garfa.net", verbose = FALSE, debug = FALSE, edge.name = NULL, simplify = FALSE)

Here I got

Error in set.edge.attribute(temp, network.names[nnetworks], dyads[, 3]) : STRING_ELT() can only be applied to a 'character vector', not a 'NULL' #so asked for
help(read.paj) # and used the examples:
test.net.1 <- read.paj("http://vlado.fmf.uni-lj.si/pub/networks/data/GD/gd98/A98.net")
plot(test.net.1,main=test.net.1$gal$title)
test.net.2 <- read.paj("http://vlado.fmf.uni-lj.si/pub/networks/data/mix/USAir97.net")
plot(test.net.2,main=test.net.2$gal$title)  #then I tried
test.net.3 <- read.paj("Garfa.net")
plot(test.net.3,main=test.net.3$gal$title)
plot(test.net.3,layout=layout.fruchterman.reingold)
plot(test.net.3,layout=layout.kamada.kawai)
cohesive.blocks(test.net.3)

If you don't find your file, try checking and resetting the working directory with:

getwd()
setwd("C:/Program Files/R//R-2.6.1/") #or
setwd("C:/Program Files/R//R-2.6.2/")

Mark S. Handcock

How to read a Pajek file in igraph

g <- read.graph("filename.net","pajek") --> This code doesn't work in R interface. I got stuck, getting the error message "Cannot read Pajek file, File operation error".


It should load the .net file as an igraph object named "g".
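
A "File operation error" most likely means that R could not open the file at all, typically because "filename.net" is not in the current working directory. One way to rule this out before calling read.graph is sketched below. The helper locate_input and the demo file are made up for illustration.

```r
# Hypothetical helper: check that a file can be found and fail early with a
# readable message if it cannot.
locate_input <- function(filename) {
  if (!file.exists(filename)) {
    stop("Cannot find '", filename, "' from ", getwd(),
         "; use setwd() or give the full path.")
  }
  normalizePath(filename)
}

# Usage with a made-up file in R's temporary directory.
demo_file <- file.path(tempdir(), "filename.net")
writeLines("*Vertices 0", demo_file)
full_path <- locate_input(demo_file)
# full_path can then be passed on, e.g. read.graph(full_path, "pajek")
```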

More: Using SNA

---- not working (need help) how do I transfer the g format into one for sna?
help(package=sna)
library(sna)
help (gplot)
gplot(g) ---- not working (need help) 
gplot(g,mode = "fruchtermanreingold") ---- not working (need help)


There are three major network/graph packages available for R (`igraph`, `sna`, and `network`), each with a totally different philosophy and totally incompatible data structures (although sna can natively use `network` objects). The easiest way to get between them is to convert to adjacency matrices, so something like

g.adj <- get.adjacency(g)
gplot(g.adj)

will work, but you'll lose all the cohesive blocking data in the process (sna doesn't have an easy way to represent the blocks in a graph object). -- Peter McMahan
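
With newer igraph versions the conversion may need one extra detail: the adjacency matrix can come back in a sparse format, while sna expects an ordinary matrix. The sketch below assumes igraph 1.0 or later, where as_adjacency_matrix() is the newer name for get.adjacency(), and uses a made-up ring graph in place of real data.

```r
library(igraph)

g <- make_ring(5)  # hypothetical small graph standing in for a real network

# Assumes igraph >= 1.0; sparse = FALSE asks for an ordinary dense matrix.
g.adj <- as_adjacency_matrix(g, sparse = FALSE)

# If sna is installed, the plain matrix can be handed to its plotting function.
if (requireNamespace("sna", quietly = TRUE)) {
  sna::gplot(g.adj, mode = "fruchtermanreingold")
}
```

Calling sna::gplot with the package prefix avoids attaching sna alongside igraph, since the two packages define several functions with the same names.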

Separate program obsolete

THE SEPARATE PROGRAM IS NOW OBSOLETE - SEE BELOW - NOW INSIDE 'igraph'

(Peter: I'm attaching two files, "CohesiveBlocks.R" (the functional code) and
"CohesiveBlocksExamples.R" (with some examples of use).) -- YOU DON'T NEED THIS NOW

Earlier problems

DRW: RSQLite needed for larger graphs

MAY NO LONGER BE NEEDED

For Windows, RBGL is under Repositories/Bioconductor
http://www.bioconductor.org/checkResults/1.9/bioc-20060912/derby-RBGL-checksrc.html

Is that the place to download from? Still useful (this was from the 2007 version of igraph).

http://www.stathy.com/cran/doc/packages/RBGL.pdf
http://www.stathy.com/cran/src/contrib/Descriptions/RBGL.html
plot(mwBlocks)

>THIS NO LONGER NEEDED IN 2008: The updated code (with some documentation) is at: https://webshare.uchicago.edu/users/jpadgett/Public/data_and_code/structuralcohesion.zip I'm not sure how to upload that to the intersci wiki, but feel free to put it up there. I think it should work with the currently available version of igraph, without the need for RBGL, but the example code currently on the wiki won't work with the updated source. Instead use the examples in the included cohesive.blocks.rd file.

Thanks for adding so much content to the intersci wiki page. That's a really good resource to have, for this project and others.

And one more note regarding required packages: The code should run with just igraph, but using `RSQLite' and `snow' helps with memory usage and speed, respectively, for larger graphs.

Thanks again, and let me know if you have any further questions. I'll be sure to send you an email when the igraph package with the code built in is released.

Peter

Attachments: NOW OBSOLETE IN 2008

On Oct 11, 2007, at 11:24, Doug White wrote:

Peter, this is marvelous that you are doing this ... I will get right on it. 
Also check out http://intersci.ss.uci.edu/wiki where under the portal and tools 
I have begun to show students how to use R and Python with Networkx to do 
comparative analysis, network simulation and the like. This code could be 
discussed there and if you login you can edit pages, post code as text, or as  
download, and add documentation. Feel free to utilize this wiki resource to 
your heart's content. I now teach my courses "on" the wiki.

On Oct 11, 2007, at 1:23 PM, James Moody wrote:

Hi Peter -

This is fantastic!  Glad to hear someone is putting that together.   
I thought Igraph had some functions for identifying multiple
connectivity, but I wasn't sure of the extent.
    (Doug’s comment: only for pairwise multiconnectivity.)
   
I think a note to SocNet announcing that it is there would help  
spread the word.  I'll also forward this on to the people doing STATNET,
as it would make a great diagnostic tool for the fit of their micro-models.

Again, thanks for implementing this.
Best,
Jim

P.S.

Just so I'm clear, if a node is embedded in both a 2 component and  
a 4 component, your routine would return "4", and for all 4 components
(say) you wouldn't know how they are nested.  Is this right?

I think part of the interest in cohesive blocking is being able to  
know if sets of nodes are mutually connected at multiple levels,
so as a future extension, getting the full nesting would be great.

On 11 Oct 2007 at 14:33, Peter McMahan wrote:

Hi Jim,

Sorry, I guess I wasn't quite clear. The model returns a number of things, including the nestedness structure of the blocks (and a tree representing this). So a node will be counted in both the 4-cohesive component and in the 2-cohesive component that is a superset. My comment was just to note that if there was also a 1-cohesive component between the two (a superset of the 4-component and a subset of the 2-component), it won't be recorded as it is in the SAS algorithm. (This is easily changed, though, in the code.)

Thanks for the references to SocNet and STATNET. I look forward to any other input you may have.

Thanks,
Peter

On Oct 12, 2007, at 11:14, James Moody wrote:

Ahh, that makes more sense.  This is not a major discrepancy.  There's really no strong 
theory need to look at the lower-order k-cohesion "between" two embeddedness levels.  This 
really reflects "extra" density in the lower level (See Doug and Frank's paper on conditional  
density).
Again, really great that you did this.  I forwarded it to the list for the STATNET folks...
PTs,
Jim

Requests

Dania Cordaz

Fixes

Re: slowdown problems for the San Juan network in the de Nooy, Mrvar, Batagelj book, as noted in Cohesive blocking

Hi Doug,

I finally had a chance to dive in to the code, and have identified the slowdown. The function is getting hung up inside of a subfunction that finds all k-cutsets between two nodes. Instead of using Kanevsky's algorithm for this I used a simpler and often faster algorithm that finds all minimal cutsets and simply discarded those larger than k. (C. Patvardhan, V. C. Prasad, and V. P. Pyara. Vertex cutsets of undirected graphs. Reliability, IEEE Transactions on, 44(2):347–353, June 1995)

For some reason on this specific graph it hangs for a very long time -- I'm not sure what it is specifically but it is having to iterate over far too many possible cutsets. I tried reimplementing the algorithm with the same results, so I don't think it's a bug but a flaw in my heuristics. However it's easily fixed. The function kComponents that does all this work has a "type" argument that can be set to "old" or "new". Changing this to "old" allowed the function to run in just a few seconds.

For now, changing this means getting the igraph source code, hard coding in "old" as the default argument to kCutsets (replace 'type="new"' with 'type="old"') in the file "igraph/R/CohesiveBlocks.R", and sourcing the whole file. I'll ask Gabor if he can feed it through as an argument to cohesive.blocks(), so there would be an argument like "cutsetAlgorithm" that can be set while calling the function.

Thanks for finding this! Peter McMahan 09:59, 21 July 2008 (PDT)