Udell exploration

Jon

I used your data to whack out a square matrix [1], ran it through a social network analysis program called Ucinet, and made some pictures.

 

Two-dimensional multidimensional scaling gives you this.

 

 

Three-D factor analysis yielded this table, which I visualized using VPython [2].

This looks much better if you run the program and animate the display (right-mouse drag on the screen to rotate). Only at that point do the natural groupings seem to stand out.

 

But are they natural groupings?  You know these people?

 

A hierarchical cluster analysis is interesting, and may be the best and simplest visualization of all, since the "only reality" is the similarity of nodes to nodes and groups of nodes.
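To make "similarity of nodes to nodes and groups of nodes" concrete, here is a minimal Python sketch of average-link clustering. The ten similarities are copied from the raw matrix in the K-clusters output below, for five of the fifteen people. The function name `average_link` and the data layout are my own choices, and this is not Ucinet's routine, so its merge levels need not match the dendrogram.

```python
from itertools import combinations

# Raw similarities for five of the fifteen people, copied from the
# clustered data matrix in the K-clusters output.
PAIRS = [
    ("Gordon Weakliem", "Sam Ruby", 46.154),
    ("Gordon Weakliem", "Peter Drayton", 40.541),
    ("Sam Ruby", "Peter Drayton", 28.571),
    ("David Brown", "Gordon Weakliem", 6.349),
    ("David Brown", "Sam Ruby", 6.061),
    ("David Brown", "Peter Drayton", 7.921),
    ("Paul Snively", "David Brown", 23.529),
    ("Paul Snively", "Gordon Weakliem", 10.345),
    ("Paul Snively", "Sam Ruby", 16.393),
    ("Paul Snively", "Peter Drayton", 18.750),
]
SIM = {frozenset((a, b)): value for a, b, value in PAIRS}
NAMES = sorted({name for pair in SIM for name in pair})


def average_link(names, sim):
    """Repeatedly merge the two clusters with the highest average
    pairwise similarity; return a list of (level, merged cluster)."""

    def avg(a, b):
        # Mean similarity over every node-to-node pair across two groups.
        return sum(sim[frozenset((x, y))] for x in a for y in b) / (len(a) * len(b))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        a, b = max(combinations(clusters, 2), key=lambda pair: avg(*pair))
        level = avg(a, b)
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((round(level, 3), a + b))
    return merges


merges = average_link(NAMES, SIM)
for level, members in merges:
    print(level, ", ".join(members))
```

On this subset the first merge is Gordon Weakliem with Sam Ruby at 46.154, the same pair and level that head Ucinet's dendrogram.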

 

Ucinet details follow:

METRIC MULTIDIMENSIONAL SCALING

Picture

Data

--------------------------------------------------------------------------------

 

Starting config:              GOWER'S PRINCIPAL COORDINATES

 

Type of Data:                 Similarities

Input dataset:                C:\Program Files\Ucinet 5\DataFiles\udellsim

 

15 items

Initial Stress = 0.894

Final Stress = 0.266 after 6 iterations.

 

                          1      2

                     ------ ------

  1 Thierry Lalinne   0.398  0.550

  2    Paul Snively   0.308  0.059

  3      CE Granier   0.055 -0.068

  4     David Brown   0.117  0.032

  5      Joe Jennet   0.082  0.546

  6       Jim McGee   0.107  0.329

  7    Jenny Levine  -0.049  0.095

  8        Sam Ruby   0.635  0.270

  9     Jiri Ludvik   0.425 -0.175

 10     Jon's Radio   0.383  0.441

 11 Olivier Travers  -0.082  0.294

 12 Gordon Weakliem   0.629  0.200

 13   Peter Drayton   0.526  0.207

 14   Dann Sheridan   0.157 -0.189

 15     Marc Barrot   0.569 -0.058

 

Coordinates saved as dataset MetricMdsCoord

 

----------------------------------------

Running time:  00:00:01

Output generated:  30 May 02 16:55:34

Copyright (c) 1999-2000 Analytic Technologies
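The "Stress" figures above measure how badly the plotted distances disagree with the distances the data asks for, with 0 meaning a perfect picture. I don't know which stress formula Ucinet 5 reports or how it converts similarities into target distances, so the sketch below uses one common choice (Kruskal's stress-1) on made-up numbers. The function name `stress1` and the toy data are assumptions for illustration only.

```python
from itertools import combinations
from math import dist, sqrt


def stress1(coords, target):
    """Kruskal's stress-1: sqrt(sum((d - target)^2) / sum(d^2)),
    where d is the distance between two points in the picture."""
    num = den = 0.0
    for i, j in combinations(range(len(coords)), 2):
        d = dist(coords[i], coords[j])
        num += (d - target[i][j]) ** 2
        den += d ** 2
    return sqrt(num / den)


# Hypothetical toy data: three points forming a 3-4-5 right triangle.
coords = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
exact = [[0, 3, 4], [3, 0, 5], [4, 5, 0]]   # targets the picture matches
off = [[0, 3, 4], [3, 0, 6], [4, 6, 0]]     # one target the picture misses

print(stress1(coords, exact))
print(stress1(coords, off))
```

A picture that reproduces every target distance scores 0; each mismatch pushes the value up.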

JOHNSON'S HIERARCHICAL CLUSTERING

Data

--------------------------------------------------------------------------------

 

Input dataset:                C:\PROGRAM FILES\UCINET 5\DATAFILES\udellsim

Method:                       AVERAGE

Type of Data:                 Similarities

 

HIERARCHICAL CLUSTERING

 

           O             T   G       

           l             h   o       

           i           D i   r   P   

           v P       J a e   d   e   

           i a   D   e n r J o   t J M

         J e u C a   n n r o n   e i a

         o r l E v J n   y n     r r r

         e       i i y S   ' W S   i c

           T S G d m   h L s e a D   

         J r n r     L e a   a m r L B

         e a i a B M e r l R k   a u a

         n v v n r c v i i a l R y d r

         n e e i o G i d n d i u t v r

         e r l e w e n a n i e b o i o

         t s y r n e e n e o m y n k t

 

           1           1   1 1   1   1

 Level   5 1 2 3 4 6 7 4 1 0 2 8 3 9 5

------   - - - - - - - - - - - - - - -

46.154   . . . . . . . . . . XXX . . .

36.551   . . . . . . . . . . XXXXX . .

33.333   . . . XXX . . . . . XXXXX . .

22.437   . . XXXXX . . . . . XXXXX . .

21.858   . . XXXXX XXX . . . XXXXX . .

20.000   . . XXXXX XXX . . . XXXXX XXX

19.524   . . XXXXXXXXX . . . XXXXX XXX

18.182   . . XXXXXXXXX . XXX XXXXX XXX

17.449   . . XXXXXXXXXXX XXX XXXXX XXX

14.121   . . XXXXXXXXXXX XXX XXXXXXXXX

12.173   . XXXXXXXXXXXXX XXX XXXXXXXXX

11.405   . XXXXXXXXXXXXX XXXXXXXXXXXXX

 8.730   . XXXXXXXXXXXXXXXXXXXXXXXXXXX

 6.017   XXXXXXXXXXXXXXXXXXXXXXXXXXXXX

 

 

Clustering permutation saved as dataset hicluspermutation

 

 

                           5      11       2       3       4       6       7      14       1      10      12       8      13       9      15

                     Joe Jen Olivier Paul Sn CE Gran David B Jim McG Jenny L Dann Sh Thierry Jon's R Gordon  Sam Rub Peter D Jiri Lu Marc Ba

                     ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ------- -------

  5      Joe Jennet  100.000   5.578   4.348   8.791  10.501  15.789  10.314   8.617  15.000   8.252   3.731   5.479   4.826   2.857   6.017

 11 Olivier Travers    5.578 100.000  12.552  10.924  12.017  16.722  15.969  12.173   6.167  11.439   5.808   6.364   9.382   7.373   7.424

  2    Paul Snively    4.348  12.552 100.000  20.253  22.437  15.714  21.138  15.873  17.647   9.009  12.361  16.393  16.194  17.241  15.385

  3      CE Granier    8.791  10.924  20.253 100.000  33.333  15.827  21.311  25.806   8.955   3.636   0.000   0.000   7.579  14.035   7.843

  4     David Brown   10.501  12.017  22.437  33.333 100.000  15.472  19.524  19.243  12.130   7.472   6.253   6.061   9.570  14.168  13.272

  6       Jim McGee   15.789  16.722  15.714  15.827  15.472 100.000  21.858  11.382  10.938  18.713   5.594   6.612   9.161   6.780   5.357

  7    Jenny Levine   10.314  15.969  21.138  21.311  19.524  21.858 100.000  17.449   9.830   9.187   2.602   3.846   7.370   6.220   8.771

 14   Dann Sheridan    8.617  12.173  15.873  25.806  19.243  11.382  17.449 100.000   3.922   8.415   4.767   4.545   6.047  14.634   8.730

  1 Thierry Lalinne   15.000   6.167  17.647   8.955  12.130  10.938   9.830   3.922 100.000  18.182   9.879  12.245   8.237   8.696   9.565

 10     Jon's Radio    8.252  11.439   9.009   3.636   7.472  18.713   9.187   8.415  18.182 100.000  11.887   8.696  12.845   8.989  11.405

 12 Gordon Weakliem    3.731   5.808  12.361   0.000   6.253   5.594   2.602   4.767   9.879  11.887 100.000  46.154  36.551  10.826  17.374

  8        Sam Ruby    5.479   6.364  16.393   0.000   6.061   6.612   3.846   4.545  12.245   8.696  46.154 100.000  28.571  10.256  12.121

 13   Peter Drayton    4.826   9.382  16.194   7.579   9.570   9.161   7.370   6.047   8.237  12.845  36.551  28.571 100.000  10.817  14.121

  9     Jiri Ludvik    2.857   7.373  17.241  14.035  14.168   6.780   6.220  14.634   8.696   8.989  10.826  10.256  10.817 100.000  20.000

 15     Marc Barrot    6.017   7.424  15.385   7.843  13.272   5.357   8.771   8.730   9.565  11.405  17.374  12.121  14.121  20.000 100.000

 

Partition-by-actor indicator matrix saved as dataset Part

 

----------------------------------------

Running time:  00:00:01

Output generated:  30 May 02 17:03:52

Copyright (c) 1999-2000 Analytic Technologies

Picture

K-CLUSTERS VIA TABU SEARCH

Data

--------------------------------------------------------------------------------

 

Diagonal valid?               NO

Number of clusters:           2

Type of data:                 Proximities

Method:                       correlation

Input dataset:                C:\Program Files\Ucinet 5\DataFiles\udellsim

 

Starting fit: 0.819

Starting fit: 0.484

Fit: 0.484

Fit: 0.484

Fit: 0.484

Fit: 0.484         (smaller values indicate better fit.)

r-square = 0.266

 

Clusters:

 

    1:  Paul Snively CE Granier David Brown Joe Jennet Jim McGee Jenny Levine Olivier Travers Dann Sheridan

    2:  Thierry Lalinne Sam Ruby Jiri Ludvik Jon's Radio Gordon Weakliem Peter Drayton Marc Barrot

 

Clustered Data Matrix

 

                           4      2      3      7      5      6     11     14        1      9     10     12     13      8     15 

                      David  Paul S CE Gra Jenny  Joe Je Jim Mc Olivie Dann S   Thierr Jiri L Jon's  Gordon Peter  Sam Ru Marc B 

                     ------------------------------------------------------------------------------------------------------------

  4     David Brown | 46.154 23.529 33.333 21.875 14.433 15.172 12.295 17.647 | 10.959 12.698  8.621  6.349  7.921  6.061 14.035 |

  2    Paul Snively | 23.529 46.154 20.253 21.138  4.348 15.714 12.552 15.873 | 17.647 17.241  9.009 10.345 18.750 16.393 15.385 |

  3      CE Granier | 33.333 20.253 46.154 21.311  8.791 15.827 10.924 25.806 |  8.955 14.035  3.636        12.632         7.843 |

  7    Jenny Levine | 21.875 21.138 21.311 46.154  7.407 21.858 19.149 18.868 |  7.207  5.941  5.195  1.980  5.755  3.846  6.316 |

  5      Joe Jennet | 14.433  4.348  8.791  7.407 46.154 15.789  5.578  8.000 | 15.000  2.857  4.878  2.857  5.556  5.479  3.125 |

  6       Jim McGee | 15.172 15.714 15.827 21.858 15.789 46.154 16.722 11.382 | 10.938  6.780 18.713  5.085 11.538  6.612  5.357 |

 11 Olivier Travers | 12.295 12.552 10.924 19.149  5.578 16.722 46.154  9.009 |  6.167  7.373 14.074  5.530 11.765  6.364  5.687 |

 14   Dann Sheridan | 17.647 15.873 25.806 18.868  8.000 11.382  9.009 46.154 |  3.922 14.634  8.511  4.878  5.063  4.545 11.429 |

                    --------------------------------------------------------------------------------------------------------------

  1 Thierry Lalinne | 10.959 17.647  8.955  7.207 15.000 10.938  6.167  3.922 | 46.154  8.696 18.182  8.696  7.143 12.245 10.000 |

  9     Jiri Ludvik | 12.698 17.241 14.035  5.941  2.857  6.780  7.373 14.634 |  8.696 46.154  8.989 11.111 10.811 10.256 20.000 |

 10     Jon's Radio |  8.621  9.009  3.636  5.195  4.878 18.713 14.074  8.511 | 18.182  8.989 46.154 13.483 17.323  8.696 12.048 |

 12 Gordon Weakliem |  6.349 10.345         1.980  2.857  5.085  5.530  4.878 |  8.696 11.111 13.483 46.154 40.541 46.154 20.000 |

 13   Peter Drayton |  7.921 18.750 12.632  5.755  5.556 11.538 11.765  5.063 |  7.143 10.811 17.323 40.541 46.154 28.571 14.706 |

  8        Sam Ruby |  6.061 16.393         3.846  5.479  6.612  6.364  4.545 | 12.245 10.256  8.696 46.154 28.571 46.154 12.121 |

 15     Marc Barrot | 14.035 15.385  7.843  6.316  3.125  5.357  5.687 11.429 | 10.000 20.000 12.048 20.000 14.706 12.121 46.154 |

                     -------------------------------------------------------------------------------------------------------------

 

 

Partition saved as dataset TabuCluster

 

----------------------------------------

Running time:  00:00:01

Output generated:  30 May 02 17:10:28

Copyright (c) 1999-2000 Analytic Technologies
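The reported numbers are at least consistent with the fit being 1 minus the correlation between the off-diagonal proximities and a "same cluster" indicator: sqrt(0.266) is about 0.516, and 1 - 0.516 = 0.484. That reading is my assumption, not something the output states. Under that assumption, a minimal Python sketch of the fit measure looks like this; the names `pearson` and `partition_fit` and the toy matrix are hypothetical, and it scores a given partition rather than running a tabu search.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def partition_fit(matrix, labels):
    """Assumed fit: 1 - r between off-diagonal proximities and a 0/1
    indicator of whether the two items share a cluster."""
    pairs = list(combinations(range(len(labels)), 2))
    values = [matrix[i][j] for i, j in pairs]
    same = [1.0 if labels[i] == labels[j] else 0.0 for i, j in pairs]
    return 1.0 - pearson(values, same)


# Hypothetical toy proximities: items 0,1 are close, items 2,3 are close.
toy = [
    [0, 40, 10, 10],
    [40, 0, 10, 10],
    [10, 10, 0, 30],
    [10, 10, 30, 0],
]
good = partition_fit(toy, [0, 0, 1, 1])   # matches the obvious grouping
bad = partition_fit(toy, [0, 1, 0, 1])    # splits both close pairs
print(round(good, 3), round(bad, 3))
```

The partition that keeps the close pairs together gets the smaller value, in line with the "smaller values indicate better fit" note in the output.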

 



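[1]

import difflib

# Assumption: s was a difflib.SequenceMatcher; its set_seqs() and ratio()
# methods match the calls below.
s = difflib.SequenceMatcher()

# d (not shown) maps "url,name" strings to each person's data sequence.
print '---------'
print ' '.rjust(20),
for k1 in d.keys():  # column heads
    (k1Url, k1Name) = k1.split(',')
    print k1Name.rjust(20),

for k1 in d.keys():
    (k1Url, k1Name) = k1.split(',')
    print
    print k1Name.rjust(20),  # row head
    for k2 in d.keys():
        (k2Url, k2Name) = k2.split(',')
        s.set_seqs(d[k1], d[k2])
        # similarity as a percentage (Python 2 backtick repr)
        print `s.ratio()*100`.rjust(20),

print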
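For anyone trying this on Python 3, the same idea can be sketched with toy data. I am assuming the similarity came from `difflib.SequenceMatcher`, whose `set_seqs()` and `ratio()` methods match the calls in the footnote. The `feeds` dictionary is a hypothetical stand-in for the real data, and `similarity_matrix` is a name I made up.

```python
from difflib import SequenceMatcher

# Hypothetical stand-in for the real data: "url,name" keys mapped to
# each person's list of items.
feeds = {
    "http://a.example/,Alice": ["x", "y", "z"],
    "http://b.example/,Bob": ["y", "z", "w"],
    "http://c.example/,Carol": ["q"],
}


def similarity_matrix(d):
    """Return (names, rows), where rows[i][j] is the SequenceMatcher
    ratio between person i and person j, scaled to a percentage."""
    names = [key.split(",")[1] for key in d]
    s = SequenceMatcher()
    rows = []
    for k1 in d:
        row = []
        for k2 in d:
            s.set_seqs(d[k1], d[k2])
            row.append(s.ratio() * 100)
        rows.append(row)
    return names, rows


names, rows = similarity_matrix(feeds)
print(" ".rjust(10), *(name.rjust(10) for name in names))
for name, row in zip(names, rows):
    print(name.rjust(10), *(f"{value:10.3f}" for value in row))
```

Every item scores 100 against itself, which is why the diagonal of the similarity matrix above is 100.000.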

[2]

People = [
    ("Thierry Lalinne", -0.57,  0.24, -0.18),
    ("Paul Snively",     0.42,  0.02, -0.07),
    ("CE Granier",       0.56,  0.44,  0.05),
    ("David Brown",      0.44,  0.41,  0.00),
    ("Joe Jennet",      -0.45,  0.38,  0.11),
    ("Jim McGee",       -0.11,  0.20,  0.58),
    ("Jenny Levine",     0.44,  0.28,  0.48),
    ("Sam Ruby",        -0.09, -0.78, -0.14),
    ("Jiri Ludvik",      0.17,  0.09, -0.60),
    ("Jon's Radio",     -0.49, -0.07,  0.12),
    ("Olivier Travers",  0.07, -0.02,  0.57),
    ("Gordon Weakliem", -0.08, -0.86, -0.19),
    ("Peter Drayton",    0.04, -0.76,  0.01),
    ("Dann Sheridan",    0.44,  0.34, -0.08),
    ("Marc Barrot",      0.04, -0.11, -0.60),
]

from visual import *

for p in People:
    (name, x, y, z) = p
    label(pos=(x, y, z), text=name, height=14, box=0, opacity=0, color=(0, 1, 1))
    sphere(pos=(x, y, z), label=name, radius=0.05)
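Without the animation, one rough check on the groupings is to ask who each person's nearest neighbor is in the three factor coordinates. A minimal Python 3 sketch, reusing the coordinates from the table above; `nearest_neighbor` is a name I made up, and plain Euclidean distance is my assumption about what "close" should mean here.

```python
from math import dist

# Factor coordinates copied from the People table in footnote [2].
PEOPLE = {
    "Thierry Lalinne": (-0.57, 0.24, -0.18),
    "Paul Snively": (0.42, 0.02, -0.07),
    "CE Granier": (0.56, 0.44, 0.05),
    "David Brown": (0.44, 0.41, 0.00),
    "Joe Jennet": (-0.45, 0.38, 0.11),
    "Jim McGee": (-0.11, 0.20, 0.58),
    "Jenny Levine": (0.44, 0.28, 0.48),
    "Sam Ruby": (-0.09, -0.78, -0.14),
    "Jiri Ludvik": (0.17, 0.09, -0.60),
    "Jon's Radio": (-0.49, -0.07, 0.12),
    "Olivier Travers": (0.07, -0.02, 0.57),
    "Gordon Weakliem": (-0.08, -0.86, -0.19),
    "Peter Drayton": (0.04, -0.76, 0.01),
    "Dann Sheridan": (0.44, 0.34, -0.08),
    "Marc Barrot": (0.04, -0.11, -0.60),
}


def nearest_neighbor(name, people):
    """Return the other person closest to `name` in factor space."""
    here = people[name]
    others = (other for other in people if other != name)
    return min(others, key=lambda other: dist(here, people[other]))


nearest = {name: nearest_neighbor(name, PEOPLE) for name in PEOPLE}
for name, other in nearest.items():
    print(f"{name:>16} -> {other}")
```

Pairs that pick each other, such as Sam Ruby and Gordon Weakliem, are the ones that should sit visibly together when you rotate the display.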