Python for networks

From InterSciWiki


http://www.awaretek.com/tutorials.html More than 300 Tutorials

Pandas

http://pandas.pydata.org/

http://pypi.python.org/pypi/networkx

Networkx

https://networkx.lanl.gov/wiki Networkx for feedback networks

Networkx is a Python package for building and analyzing networks, developed at LANL (2006).

It also works with the SAGE package, explicated in the Sage wiki.

For Absolute Beginners to Programming

If you are a non-programmer, you may be interested in checking out the video First 5 mins with Python at http://showmedo.com/videos/video?name=990000&fromSeriesID=98

This is also an excellent tutorial that will get you started on Python in about one week, assuming you have other work to do. http://www.freenetpages.co.uk/hp/alan.gauld/

Networkx is a module (aka 'plugin') for Python so you will also need to read this after you get the basics. http://networkx.lanl.gov/tutorial/index.html

Installing Python and networkx for Python

If you have not installed Python, you will need to install Python from here. http://wiki.python.org/moin/BeginnersGuide/Download

It will install in \Python26 or a similar directory. Go to that directory and type "python" to test whether it is installed.

After you have Python running, go to http://cheeseshop.python.org/pypi/networkx/0.34 for Networkx, including the Windows installer (that is the recommended download). In my Windows installation, Nataša Kejžar's two programs for the feedback networks simulation, feedback.py and to_Pajek.py, didn't recognize networkx as a valid module until I renamed C:\python25\share\doc\networkx-0.34 to networkx-x.xx. Then they ran, and left to_Pajek.pyc, a compiled file that Python uses to speed execution.

Second time through I get the error "ImportError: No module named networkx" Doug 08:50, 26 March 2009 (PDT)
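A quick way to diagnose an "ImportError: No module named networkx" is to ask Python whether it can locate the module and, if not, to list the directories it searched. The following is a minimal sketch for a current Python 3 (the Python 2.5 described on this page would need a different check); the helper name module_available is made up for this illustration.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if Python can locate a top-level module called `name`."""
    return importlib.util.find_spec(name) is not None


if module_available("networkx"):
    print("networkx can be imported")
else:
    # The module must sit in one of these directories to be importable.
    print("networkx was not found; Python searched:")
    for directory in sys.path:
        print("   ", directory)
```

If networkx was installed somewhere that is not in the printed list, that would explain the error.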

Illustration: Generating Social circles feedback networks

The two programs we are going to use are feedback.py and to_Pajek.py, new implementations of our code for the 2005 article written by Nataša Kejžar for her dissertation. They are zipped and can be downloaded from http://intersci.ss.uci.edu/wiki/pub/Feedback4Python.zip. Unzip and place them in your Python directory, then run feedback first, to_Pajek second. Here are the steps.

From Nataša Kejžar natasa.kejzar@fdv.uni-lj.si Sat Jun 30 04:28:17 2007 Date: Sat, 30 Jun 2007 13:32:21 +0200 (CEST) To: Doug White <drwhite@uci.edu>

Dear Doug.

The 2 files of Python functions, that I have sent, were made for some other simulations, that I did for my PhD (I have to warn about that, because they were not made robust on peculiar inputs, that a user, who sees them for the first time, might try).

When I run them, I do the following things (I actually did all steps that you explained in the previous mail to install Python on Windows - I usually use it in Linux - and everything works without renaming any networkx files).

With Python 2.5, you also get the IDLE interface program (Python GUI).

  1. First, place feedback.py and to_Pajek.py in the same folder.
  2. Open feedback.py with IDLE (you get a new window with code)
  3. At the end of the code, there are a few lines that start with "#" (a Python symbol for comment). The first line is

# G = simulate_feedback(1,1.2,0,5000,1)

This is actually the line with which the main function (simulate_feedback) of the program is run. The parameters of the function are:

  • a ... parameter alpha
  • b ... parameter beta
  • g ... parameter gamma
  • n ... number of vertices, when the model stops running

  • seed=None ... random seed (if used, then everybody should obtain the same result when running the model)

So, if you uncomment the commented line (simulate_feedback), you run a model with alpha=1, beta=1.2 and gamma=0, which stops with a network of 5000 vertices. The network is saved in variable G (a network in networkx format). --Doug June 2007: take care here and below not to indent this line when uncommenting!

The second line (if uncommented):

to_Pajek.Graph_to_Pajek(G,"feedback1120.net")

runs the function Graph_to_Pajek (from the to_Pajek.py file) and outputs a file in which network G (in networkx format) is saved as a Pajek file. So it can be read in Pajek and analyzed or drawn (if the number of vertices is suitable, of course).
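To give an idea of what such an export involves, here is a minimal sketch of a Pajek .net writer in plain Python. It is not Nataša's Graph_to_Pajek: the function name write_pajek, the edge-list input and the demo file name are assumptions for illustration, and only the basic *Vertices / *Edges layout of the Pajek format is produced.

```python
import os
import tempfile


def write_pajek(edges, path):
    """Write an undirected edge list to `path` in basic Pajek .net layout."""
    # Pajek numbers vertices 1..n, so give every label an index.
    labels = sorted({v for edge in edges for v in edge})
    index = {label: i for i, label in enumerate(labels, start=1)}
    with open(path, "w") as f:
        f.write("*Vertices %d\n" % len(labels))
        for label in labels:
            f.write('%d "%s"\n' % (index[label], label))
        f.write("*Edges\n")
        for u, v in edges:
            f.write("%d %d\n" % (index[u], index[v]))


# Demo: a triangle, written to a temporary file and printed back.
demo_path = os.path.join(tempfile.gettempdir(), "demo_sketch.net")
write_pajek([("a", "b"), ("b", "c"), ("a", "c")], demo_path)
with open(demo_path) as f:
    print(f.read())
```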

4. So if you uncomment the two lines described above, save the file and run it (with F5 or via the menus at the top of the IDLE window), you get a new Pajek file feedback1120.net.

>>> run

now wait a bit for the program to make the network, and you get two blank lines

>>>

>>>

Option to run in DOS or Linux

In Windows 7, type CMD in the search box to open a command window.

The other possibility is to run the Python files directly from the MS-DOS window (or a shell in Linux). In this case, you'd have to uncomment the two lines described before (for simulation and to_Pajek) and then run:

python feedback.py

In this case, the network G is lost, because the program terminates at the end. --Doug June 2007: but when it is followed by

python to_Pajek.py

the output files appear: feedback1120.net, with a name that indicates a model with alpha=1, beta=1.2 and gamma=0. What also appears is

feedback_za_R which has fdegree <- c(0, 3316, 690, 283, 174, 119, 73, 44, 40, 32, 26, 29,... which is the degree distribution for frequencies of nodes with 0, 1, 2, 3 etc. edges.

Analysis in Python

(If you have run feedback.py and to_Pajek.py...)

(4) In the interactive window ("Python shell"), G (the network in networkx format) is still saved, and you can analyze it further with networkx commands, for example to get the degree "histogram". The commands are:

>>> import networkx

>>> hist = networkx.degree_histogram(G)

>>> hist

[0, 3316, 690, 283, 174, 119, 73, 44, 40, 32, 26, 29, 23, 13, 9, 12, 4, 12, 7, 1, 6, 3, 7, 4, 8, 2, 3, 8, 2, 5, 1, 1, 6, 3, 1, ... truncated at 0

drop the leading 0

degree= [3316, 690, 283, 174, 119, 73, 44, 40, 32, 26, 29, 23, 13, 9, 12, 4, 12, 7, 1, 6, 3, 7, 4, 8, 2, 3, 8, 2, 5, 1, 1, 6, 3, 1]
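For readers who want to see what the histogram contains, the sketch below rebuilds the same kind of list in plain Python from an edge list: entry k counts the nodes that have exactly k edges. The small star graph and the function name are made up for illustration; this is not the networkx implementation.

```python
def degree_histogram(nodes, edges):
    """Return a list whose entry k is the number of nodes with degree k."""
    degree = {v: 0 for v in nodes}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    hist = [0] * (max(degree.values()) + 1)
    for d in degree.values():
        hist[d] += 1
    return hist


# A star: node 1 is linked to nodes 2, 3 and 4.
nodes = [1, 2, 3, 4]
edges = [(1, 2), (1, 3), (1, 4)]
hist = degree_histogram(nodes, edges)
print(hist)        # [0, 3, 0, 1]: no isolated nodes, three of degree 1, one of degree 3

degree = hist[1:]  # drop the leading entry (nodes with 0 edges)
print(degree)      # [3, 0, 1]
```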

Switch to R

Drop the leading 0, replace the square brackets with c( ) and add a variable name in R format, because you will paste it there.
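The retyping can also be left to Python. A small sketch, with the helper name to_r_vector and the shortened example list chosen only for illustration:

```python
def to_r_vector(name, values):
    """Format a Python list as an R assignment, e.g. deg <- c(1, 2, 3)."""
    return "%s <- c(%s)" % (name, ", ".join(str(v) for v in values))


hist = [0, 3316, 690, 283]           # shortened example histogram
print(to_r_vector("deg", hist[1:]))  # deg <- c(3316, 690, 283)
```

The printed line can be pasted at the R prompt as it is.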

> deg= c(3316, 690, 283, 174, 119, 73, 44, 40, 32, 26, 29, 23, 13, 9, 12, 4, 12, 7, 1, 6, 3, 7, 4, 8, 2, 3, 8, 2, 5, 1, 1, 6, 3, 1)

> x=c(1:34)

 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22
[23] 23 24 25 26 27 28 29 30 31 32 33 34

> plot(x,deg)

> plot(log(x),deg)

> plot(log(x),log(deg))

Your plot for α=1 and γ=0 (β=1.2) is approximately Pareto (Straight line in log log, noise in the tail)
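The "straight line in log log" impression can be checked roughly without leaving Python. The sketch below fits an ordinary least-squares line to log(deg) against log(x) for the 34 frequencies listed above. It is only a quick eyeball check under the assumption that all frequencies are positive, not a proper estimate of the exponent, and the function name is made up here.

```python
import math

# The 34 degree frequencies from the run above (degrees 1..34).
deg = [3316, 690, 283, 174, 119, 73, 44, 40, 32, 26, 29, 23, 13, 9, 12, 4,
       12, 7, 1, 6, 3, 7, 4, 8, 2, 3, 8, 2, 5, 1, 1, 6, 3, 1]


def loglog_slope(counts):
    """Least-squares slope of log(count) against log(degree), degrees 1..n."""
    xs = [math.log(k) for k in range(1, len(counts) + 1)]
    ys = [math.log(c) for c in counts]  # requires every count > 0
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


print("log-log slope: %.2f" % loglog_slope(deg))
```

A clearly negative slope with points close to the line is what "approximately Pareto" looks like here; the noisy tail pulls the fit around, which is one reason to prefer MLE.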

Try MLE Pareto II

Now try fitting with MLE Pareto in R and MLE Pareto II in R (Cosma Shalizi is currently helping Mark Handcock create an MLE procedure for Pareto II degree distributions)
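As a point of comparison that needs no extra files, here is a sketch of a power-law exponent estimate computed directly from the frequency table in Python. It uses the commonly cited approximation for integer data, alpha = 1 + n / sum(ln(k / (xmin - 0.5))), and is not the procedure in Tsal.R or Pareto.R. The function name is an assumption, the estimate is the exponent of the density (not the Pareto II shape and scale), and the frequency list is truncated at degree 34, so the upper tail is missing.

```python
import math

# Degree frequencies for degrees 1..34, as listed on this page.
deg = [3316, 690, 283, 174, 119, 73, 44, 40, 32, 26, 29, 23, 13, 9, 12, 4,
       12, 7, 1, 6, 3, 7, 4, 8, 2, 3, 8, 2, 5, 1, 1, 6, 3, 1]


def powerlaw_mle_exponent(freq, xmin=1):
    """Approximate MLE of alpha in p(k) ~ k**(-alpha) for integer k >= xmin.

    freq[i] is the number of nodes with degree i + 1.
    """
    n = 0
    log_sum = 0.0
    for k, count in enumerate(freq, start=1):
        if k >= xmin:
            n += count
            log_sum += count * math.log(k / (xmin - 0.5))
    return 1.0 + n / log_sum


alpha = powerlaw_mle_exponent(deg)
print("estimated exponent: %.2f" % alpha)
```

Note that the function works on frequencies per degree, so each degree value is weighted by the number of nodes that have it.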

Load Tsal.R (Shalizi will be fixing and documenting this for anomalies in estimates of scale parameters but the shape parameters are correct)

Load Pareto.R (used here for graphics; Shalizi is documenting the program for Pareto MLE generally)

Paste all of this (not the >)

> degree <- c(3316, 690, 283, 174, 119, 73, 44, 40, 32, 26, 29, 23, 13, 9, 12, 4, 12, 7, 1, 6, 3, 7, 4, 8, 2, 3, 8, 2, 5, 1, 1, 6, 3, 1)

degree.tsal.fit <- tsal.fit(degree,xmin=1) # Assigns the results of the fit to the object

degree.tsal.fit # Displays the estimated parameters and information about the fit

degree.tsal.errors <- tsal.bootstrap.errors(degree.tsal.fit,reps=50)

degree.tsal.errors # Displays the bootstrapped error estimates

temp.x<-c(1:5000)

degree.y<-(1-(1-degree.tsal.fit$q)*temp.x/degree.tsal.fit$kappa)^(1/(1-degree.tsal.fit$q))

plot.survival.loglog(degree.y)

tsal.sample <- rtsal(1e4,degree.tsal.fit$shape,degree.tsal.fit$scale)

plot.survival.loglog(tsal.sample, ylab="Pr(X >= x)") #, add=TRUE)

curve(ptsal(x, degree.tsal.fit$shape,degree.tsal.fit$scale,lower.tail=FALSE),add=TRUE,col="blue")

# curve(ppareto(x,threshold=1,exponent=4,lower.tail=FALSE),add=TRUE,col="red")

tsal.total.magnitude(degree.tsal.fit,mult=1000) #Magnitude=Y(0) Count=#Cities

ks.test(tsal.sample,ptsal,3,200)

WHAT'S WRONG HERE IS THAT THERE IS NO degree array x=(1,2,3,...,34) AND COSMA'S CODE DOESN'T YET INCLUDE THE DEGREE DISTRIBUTION CASE, SMALL INTEGERS ON X. IT WAS MADE FOR CITY SIZES ENTERED BY RANK FROM LARGE TO SMALL. POSSIBLY THE dNET IN StatNet can do this!

https://networkx.lanl.gov/Reference/ has all the routines

Then some instructions for networkx are needed ...

Miscellaneous

Also - if you want to simulate many different realizations of the same model, you have to write a short function (or just for loop) to do that (and the random seed should be None!)
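Such a loop might look like the sketch below. Because feedback.py is not reproduced on this page, toy_model is a hypothetical stand-in (a random tree) that only imitates the seed argument of simulate_feedback; with the real program one would call simulate_feedback inside the loop instead.

```python
import random


def toy_model(n, seed=None):
    """Hypothetical stand-in for simulate_feedback: a random tree on n vertices."""
    rng = random.Random(seed)  # seed=None gives a different stream each call
    edges = []
    for v in range(1, n):
        # attach each new vertex to a randomly chosen earlier vertex
        edges.append((rng.randrange(v), v))
    return edges


def many_runs(n, runs):
    """Collect several independent realizations of the same model."""
    return [toy_model(n, seed=None) for _ in range(runs)]


realizations = many_runs(50, 10)
print(len(realizations), "realizations of", len(realizations[0]), "edges each")

# With a fixed seed, every run repeats the same network.
print(toy_model(50, seed=1) == toy_model(50, seed=1))  # True
```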

The to_Pajek.pyc file is made as soon as you import this file from somewhere. It is just a compiled form of the original file (for Python to load it a bit faster).
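The creation of a .pyc file can be watched directly with the standard py_compile module. This is a small sketch for a current Python 3, where the compiled file lands in a __pycache__ subfolder rather than next to the source as in Python 2.5; the module name demo_module is made up for the demonstration.

```python
import os
import py_compile
import tempfile

# Write a tiny throwaway module into a temporary folder.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "demo_module.py")
with open(source, "w") as f:
    f.write("x = 1\n")

# Compile it by hand; importing the module would do the same automatically.
compiled = py_compile.compile(source)
print(compiled)
print(os.path.exists(compiled))
```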

There are no executables needed if you have Python installed on the computer. However, executables can also be made (if the code is made more robust, and maybe given some GUI)...

Hope this helps, Nataša



lanl.gov/wiki also recommends http://peak.telecommunity.com/DevCenter/EasyInstall http://peak.telecommunity.com/DevCenter/PackageNotes?action=highlight&value=networkx there seems to be a problem there

http://math.lanl.gov/Research/Highlights/networkx.shtml NetworkX: Python Software for the Analysis of Networks

http://packages.debian.org/unstable/graphics/python-networkx tool to manipulate and study more than complex networks -- Other Packages Related to python-networkx