Monthly Archives: January 2011

Acquiring Social Network Structure – Results Soon

I have completed the analysis of the first three (three!) experiments. Currently I am putting these results together for a talk here at UCSD. I will also be submitting a paper to CogSci 2011.

That means very soon I’ll have something to talk about here on the blog. Until then, enjoy the introductory slide for Friday’s talk.

Coming Soon: Acquiring Social Network Structure

Coming Soon: Acquiring Social Network Structure

Advertisements

Ceiling and Floor Computation in SPSS

Someone forgot to add CEILING and FLOOR functions to SPSS. Thankfully with a little logic and rounding, we can accomplish the same thing.

Antti Nevanlinna posted this solution on mathkb.com:

floor(x) = rnd(x) – 1*(rnd(x) > x)
ceil(x) = rnd(x) + 1*(rnd(x) < x)

The function rnd is a rounding function that does what you would expect: up for >= .50 and down for < .50.
The expression (rnd(x) < x) is a logical statement that will evaluate to 1 if true and 0 if false.

I am posting it here as a reminder for myself and as a service to anyone else looking to do the same thing.

Acquiring Social Network Knowledge

The shotgun approach isn’t just the name of the blog. I live it. I have so many projects going that I buy file folders by the pallet.

The one I’m really excited about at the moment is an attempt to bring together the two worlds I’ve been living in for the past year – cognitive psychology and social networks – and keep myself working on my dissertation at the same time.

For this project, I’ll be running several online experiments. The main goal is to characterize exactly how humans acquire and retain social network information.

I’ll be posting links to experiments and results here as time allows. For now, I’ll list the first few hypotheses I’ll be testing:

  • Human subjects will acquire a network’s structure more quickly if it resembles a true human social network rather than an arbitrary network. To operationalize this, I will measure learning curves as subjects learn the structure of random or scale-free graphs.
  • Human subjects will acquire a network’s structure more quickly if it is framed as a social network as opposed to the same network framed in some other manner (e.g. a computer or transport network.)
  • Some forms of representation of the network will lead to faster acquisition than others. For example, you might represent a network as a series of edges between vertices (e.g. friendships between people) or you might represent a network as a traversal of the links within it (think of following links in the Kevin Bacon Game). Some forms will lead to faster acquisition than others, and this will allow us to draw conclusions about how graph information is represented in the brain.

Check back for updates on this project, and please leave feedback and questions.

Politics or Sports

When tweeting, what words do people use when they are talking about politics? I did a fast analysis of the last 1000 tweets from the 16 most popular political bloggers and the last 1000 tweets from the 16 most popular sports bloggers.

Here are the overall word counts, the counts for political and sports tweets separately, and a measure of the politics/sports diagnosticity.

Word Count Political Count Sports Count Chi Square
obama 1137 1132 5 558.54
gop 659 659 0 329.50
game 758 28 730 325.07
house 551 540 11 253.94
yankees 500 2 498 246.02
senate 452 452 0 226.00
party 432 413 19 179.67
vs 598 69 529 176.92
democrats 319 319 0 159.50
health 346 336 10 153.58
tea 324 319 5 152.15
president 370 349 21 145.38
ufc 293 1 292 144.51
angels 282 0 282 141.00
election 271 271 0 135.50
political 265 262 3 126.57
vote 345 319 26 124.42
rangers 256 3 253 122.07
reform 242 242 0 121.00
update 568 100 468 119.21
#yankees 232 0 232 116.00
giants 248 5 243 114.20
#ufc 225 0 225 112.50
lakers 225 2 223 108.54
#mma 216 0 216 108.00
on 4265 2605 1660 104.69
watch 446 374 72 102.25
government 204 204 0 102.00
campaign 217 213 4 100.65
law 230 222 8 99.56
obama’s 199 199 0 99.50
us 406 343 63 96.55
dodgers 197 1 196 96.51
bowl 200 2 198 96.04
(video) 211 206 5 95.74
rally 258 240 18 95.51
football 205 6 199 90.85
republicans 181 181 0 90.50
tax 194 189 5 87.26
today 516 407 109 86.05
polls 185 181 4 84.67
obamacare 169 169 0 84.50
republican 169 169 0 84.50
palin 168 168 0 84.00
coach 175 2 173 83.55
bush 184 179 5 82.27
dems 163 163 0 81.50
nfl 189 8 181 79.18
voters 180 174 6 78.40
basketball 160 1 159 78.01
o’donnell 153 153 0 76.50
fans 178 7 171 75.55
players 162 3 159 75.11
news 389 315 74 74.65
race 214 196 18 74.03
[delicious] 147 0 147 73.50
obamas 143 143 0 71.50
kings 158 4 154 71.20
team 307 49 258 71.14
#rangers 139 0 139 69.50
care 228 202 26 67.93
play 230 27 203 67.34
congress 139 137 2 65.56
season 199 19 180 65.13
kobe 130 0 130 65.00
bill 253 217 36 64.75
america 170 159 11 64.42
politics 124 124 0 62.00
in 5766 3304 2462 61.48
et 149 142 7 61.16
sox 122 0 122 61.00
jobs 137 133 4 60.73
economy 125 124 1 60.52
notes 178 16 162 59.88
debate 163 151 12 59.27
cam 116 0 116 58.00
democratic 116 116 0 58.00
125 115 0 115 57.50
brandon 115 0 115 57.50
cbs 125 122 3 56.64
christine 112 112 0 56.00
player 123 3 120 55.65
preview 169 16 153 55.53
elections 115 114 1 55.52
democrat 111 111 0 55.50
sen 110 110 0 55.00
americans 121 118 3 54.65
supreme 113 112 1 54.52
dem 111 110 1 53.52
rep 122 118 4 53.26
speech 110 109 1 53.02
pelosi 106 106 0 53.00
fox 143 133 10 52.90
its 300 239 61 52.81
#playoffs 105 0 105 52.50
security 116 113 3 52.16
espn 113 3 110 50.66
usc 108 2 106 50.07
deal 199 29 170 49.95
mosque 99 99 0 49.50
american 194 166 28 49.08

Chi-square (the last column) was calculated with the chi-square formula: (Observed frequency – Expected Frequency)^2 / Expected Frequency. The “Expected Frequency” in this case was half the total number of times the word appeared. In other words, we assume each word has an equal chance of appearing in a sports tweet or political tweet, and then measure how much that assumption was violated.

This table lists only the top most diagnostic words. Of course there were tens of thousands more. However, if you ever need to build a quick-and-dirty classifier or settle a bet on which words separate the pols from the jocks, here’s your answer. 🙂