Monthly Archives: July 2011

Learning a Social Graph Does NOT Depend on Method of Training

Learning a social graph does NOT depend on method of training. So far, at least.

One of my initial hypotheses when I began to study the acquisition of social network structure was this:

Some forms of representation of the network will lead to faster acquisition than others.

What I meant was that there are many ways you can represent a social network. You can put people’s names in circles and draw lines between the circles to represent relationships. You could simply list all the dyads – pairs of people who are friends, for example. Formally, you might create an adjacency matrix. Some of these should make learning who is connected to whom easy and some should make it less easy, right?
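To make two of these representations concrete, here is a minimal Python sketch that turns a dyad list into an adjacency matrix. The four names and their friendships are made up for illustration and are not the networks used in the experiments.

```python
# Hypothetical four-person network; names and friendships are illustrative only.
people = ["Alice", "Bob", "Cindy", "Frank"]
dyads = [("Frank", "Bob"), ("Bob", "Alice"), ("Alice", "Cindy")]


def adjacency_matrix(people, dyads):
    """Build a symmetric 0/1 matrix: cell [i][j] is 1 if persons i and j are friends."""
    index = {name: i for i, name in enumerate(people)}
    matrix = [[0] * len(people) for _ in people]
    for a, b in dyads:
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1  # friendship is treated as mutual
    return matrix


matrix = adjacency_matrix(people, dyads)
for name, row in zip(people, matrix):
    print(f"{name:<6}", row)
```

Both forms carry exactly the same information; they differ only in how it is laid out for the reader.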

Well, not so far. However, I’m just beginning to explore the space, and the manipulations I’ve implemented so far may differ too subtly to matter. For instance, I first trained subjects with either Random Edge or Network Walk training.

In Random Edge training, subjects were told they would be shown one dyad at a time – two names representing a pair of people who were friends. The friendships shown to the subjects were randomly drawn from the set of existing friendships. (An “edge” in a graph is a connection between two nodes.)
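A rough sketch of how such a trial sequence could be generated follows. The function name `random_edge_trials` and the example dyads are hypothetical, and drawing with replacement is an assumption; the actual experiment software may have sampled differently.

```python
import random

# Hypothetical friendships, for illustration only.
dyads = [("Frank", "Bob"), ("Bob", "Alice"), ("Alice", "Cindy")]


def random_edge_trials(dyads, n_trials, seed=None):
    """Draw n_trials dyads uniformly at random from the existing friendships.

    Assumption: draws are independent (with replacement), so the same
    friendship can appear more than once in a session.
    """
    rng = random.Random(seed)
    return [rng.choice(dyads) for _ in range(n_trials)]


for a, b in random_edge_trials(dyads, n_trials=5, seed=1):
    print(f"{a}-{b}")
```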

In Network Walk training, subjects similarly saw pairs of friends’ names. However, one name in each pair was always the same as a name in the previous pair. For example, a subject would see the sequence Frank-Bob, Bob-Alice, Alice-Cindy. Unlike Random Edge training, this type of training emphasizes the larger structure of the graph by taking you on a “walk” through the “network.”
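One simple way to produce a sequence with this property is a random walk over the friendship graph, sketched below. The function name `network_walk_trials` and the example dyads are hypothetical, and this version assumes the walk is allowed to backtrack along the edge it just used, which the actual training procedure may not have permitted.

```python
import random

# Hypothetical friendships, for illustration only.
dyads = [("Frank", "Bob"), ("Bob", "Alice"), ("Alice", "Cindy")]


def network_walk_trials(dyads, n_trials, seed=None):
    """Generate pairs by walking the graph: each pair starts where the last one ended."""
    rng = random.Random(seed)
    # Map every person to the list of their friends (edges are undirected).
    neighbors = {}
    for a, b in dyads:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    current = rng.choice(sorted(neighbors))  # random starting person
    trials = []
    for _ in range(n_trials):
        nxt = rng.choice(neighbors[current])  # step to a random friend
        trials.append((current, nxt))
        current = nxt
    return trials


for a, b in network_walk_trials(dyads, n_trials=5, seed=1):
    print(f"{a}-{b}")
```

Because every person in the dyad list has at least one friend, the walk can never get stuck.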

This does not seem to make much difference to learners, however. In the graph below, you will notice that the type of training (Edge = Random Edge, Walk = Network Walk) makes no detectable difference in how quickly subjects acquire the information.

Random Edge vs. Network Walk Training

The type of graph affects speed of acquisition as expected. Random graphs take longer to learn than scale-free graphs.
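For readers unfamiliar with the two graph types, the sketch below generates one of each. The generators are assumptions chosen for illustration (uniformly sampled edges for the random graph, a basic preferential-attachment process for the scale-free graph) and are not necessarily how the experimental networks were built.

```python
import random
from itertools import combinations


def random_graph(n, m, seed=None):
    """Pick m distinct edges uniformly at random among n nodes."""
    rng = random.Random(seed)
    return rng.sample(list(combinations(range(n), 2)), m)


def scale_free_graph(n, seed=None):
    """Grow a graph by preferential attachment, adding one edge per new node."""
    rng = random.Random(seed)
    edges = [(0, 1)]
    endpoints = [0, 1]  # each node is listed once per edge it belongs to
    for new in range(2, n):
        target = rng.choice(endpoints)  # well-connected nodes are picked more often
        edges.append((target, new))
        endpoints += [target, new]
    return edges


def max_degree(edges):
    """Return the largest number of connections held by any single node."""
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return max(degree.values())


n = 20
print("random graph, max degree:    ", max_degree(random_graph(n, n - 1, seed=1)))
print("scale-free graph, max degree:", max_degree(scale_free_graph(n, seed=1)))
```

Preferential attachment tends to produce a few highly connected hubs, while uniformly sampled edges spread connections more evenly across nodes.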

The similarity in the pace of acquisition can also be conveyed in the form of learning curves. Here we see the number of errors decline as subjects are given more and more of each type of training.

Learning Curves - Random Edge vs. Network Walk

The difference between the two curves is not statistically reliable. Most of the confidence intervals around the data points overlap.

What should we make of the very similar performance for these two types of training? Being presented the connections in a graph in a systematic way (walking from edge to edge) seems to present no advantage over being given the list of edges in a random order. Of course, there is the usual caution about not accepting the null hypothesis. It may be that the effect is just very small and the experiment lacked the power to detect it. However, the same experiment was powerful enough to detect the difference between random graph and scale-free graph acquisition rates, so if nothing else we can infer that any effect of training type is smaller than the effect of graph structure.

I am still convinced there must be better and worse ways to learn the structure of a graph, and I’ll be experimenting with various training methods over the coming months. Check the blog for new results!