Visualizing Time in Social Networks with TeCFlow

Peter A. Gloor,
Center for Coordination Science, MIT, Cambridge MA, pgloor@mit.edu

Yan Zhao
Center for Digital Strategies, Dartmouth College, Hanover NH, yan.zhao@dartmouth.edu

Abstract

This paper introduces TeCFlow – A Temporal Communication Flow Visualizer for Social Network Analysis. TeCFlow automatically generates interactive movies of communication flows among individuals by mining e-mail log files and other communication archives. Combining those movies with measures of social network analysis such as the change over time in group betweenness centrality and group density leads to insights into organizational dynamics. In addition we have defined a contribution index, which measures the activity of individual actors as senders and receivers of messages relative to a group.

We have applied our tool to the analysis of different organizational scenarios such as management of large software projects, sales force effectiveness, mergers of groups, and research and development teams. Through this analysis we have gained an intuitive understanding of the inner workings of these virtual teams, which is hard to obtain by conventional means.

 

1.    Introduction

In this paper we propose an approach for the visual identification and analysis of the temporal dynamics of communication flows in social networks. Much like a movie of satellite images displaying weather patterns, our tool plays back an interactive movie depicting the interaction between members of a team based on their e-mail traffic (figure 1).

 

Figure 1. Movie illustrating parallels between weather patterns and temporal communication patterns (click here for QuickTime movie)

 

Weather forecasting allows making predictions about weather phenomena based on observations of weather related factors such as atmospheric pressure, wind speed and direction, precipitation, cloud cover, temperature, and humidity. In a weather-forecasting system, weather patterns consisting of time series of recently collected data are fed into a computer model to predict the weather patterns of the next five to fifteen days. Analyzing social networks over time is very similar to weather forecasting. By comparing dynamic interaction flows with the output produced by virtual teams, we hope to identify characteristic dynamic communication patterns of high-performing groups of knowledge workers based on the work done, among others, by (Bulkley & Van Alstyne, 2004) and (Cross & Cummings, 2003).

 

2.    Related Work

Our Temporal Communication Flow Visualizer for the temporal analysis of social networks (TeCFlow) addresses three areas of related research: (1) visualization of social networks, (2) temporal analysis of social networks in animated visualizations, and (3) analysis of e-mail networks. This section briefly reviews the three areas.

(1) visualization of social networks – In the 2000 inaugural issue of this journal Linton Freeman gave an overview of existing tools for the visualization of social networks (Freeman, 2000). There has been substantial research dedicated to computing and visualizing static social networks as directed graphs or adjacency matrices. AGNI (Varghese & Allen, 1993), ucinet (Borgatti, Everett & Freeman, 1992), pajek (Batagelj & Mrvar, 1998), KrackPlot (Krackhardt, Blythe & McGrath 1995), VisOne (Brandes & Wagner, 2003), Agora (Mazzocchi, 2003) and NetMiner (Netminer, 2004) are some of many tools for analysis and visualization of social networks. Zoomgraph (Adar & Tyler, 2003) is a recent system for social network analysis that combines graph visualization with a built-in programming language for graph manipulation and analysis. While SNA visualization tools such as pajek and ucinet also include the option to animate the graph, they do not allow for temporal visualization of changes in the network structure over time.

(2) Temporal analysis and visualization by movies of social networks – Sonia (Moody, McFarland & Bender-deMoll, 2004) is one of the few tools that temporally visualize social networks by creating movies of animated graph structures. It was originally created to visualize observations of social interactions in high school classrooms. It models and animates dynamic network representations of social interaction over time. McGrath & Blythe (2004) have shown that motion has a positive effect on the accuracy of viewers' perceptions of change in status in a social network. PieSpy visualizes the evolution of social networks of chat users over time (Mutton, 2003).

(3) Analysis of e-mail networks – In a frequently quoted early study, Freeman & Freeman (1980) studied the impact of e-mail on a social community. Recently, researchers have started to analyze e-mail communication flow to investigate community structure (Smith & Fiore, 2001; Tyler, Wilkinson & Huberman, 2002; Van Alstyne & Zhang, 2003; Danah, Potter & Viegas, 2003; Ebel, Mielsch & Bornholdt, 2002). Others automatically identify communities, based solely on the frequency of the message exchange between individuals, by partitioning the total graph (Girvan & Newman, 2001; Guimera et al., 2002). Further research has used e-mail data to map communication patterns from the perspective of the individual. This work typically creates representations of past messages that allow an individual to see the ego-network implied by their prior e-mail traffic (Bulkley & Van Alstyne, 2004).

3.    Design and Architecture of TeCFlow

Our tool for the visual temporal analysis of social networks takes as input communication archives such as e-mail logs, mailing lists, phone logs, and chat transcripts and automatically generates static and dynamic visualizations of the calculated communication networks. The static visualizations allow users to step through a chosen time period by looking at communication networks at subsequent time intervals. The dynamic visualization consists of an interactive movie showing the evolution over time of the communication network within the group. Active relationships are displayed in a sliding time window, with inactive relationships decaying over time. TeCFlow also calculates and plots the evolution of group betweenness centrality and density over time to discover interesting events in the lifetime of a virtual team. The interactive movie can be stopped anytime to drill-down into the messages that are currently exchanged between actors. Multiple e-mail addresses can be combined into an online personality, reflecting the fact that people frequently use different e-mail addresses.

We have implemented an open architecture. Communication messages are processed locally in three steps (figure 2). In the first step, the messages are parsed and stored in decomposed format in a MySQL database. In the second step the database can be queried to select messages sent or received by a group in a given time period. In the third step the selected communication flows can be represented in our visual browser using our own netgraph (Varghese & Allen, 1993) and static and dynamic views (Gloor et al., 2003). TeCFlow has been developed in Java and is available for free download from http://www.ickn.org/ickndemo.
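The first two steps of this pipeline can be sketched as follows. This is a minimal illustration, not TeCFlow's actual code: Python's built-in sqlite3 module stands in for MySQL, and the table name `message`, its columns, and the function names are hypothetical. The third step (visualization) is not covered here.

```python
import sqlite3
from email import message_from_string
from email.utils import getaddresses, parseaddr, parsedate_to_datetime


def parse_message(raw):
    """Step 1a: decompose one raw e-mail into sender, recipients, day, subject."""
    msg = message_from_string(raw)
    sender = parseaddr(msg.get("From", ""))[1].lower()
    headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    recipients = [addr.lower() for _, addr in getaddresses(headers) if addr]
    day = parsedate_to_datetime(msg["Date"]).date().isoformat()
    return sender, recipients, day, msg.get("Subject", "")


def store(conn, raw_messages):
    """Step 1b: store one row per (sender, recipient) pair (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS message "
        "(sender TEXT, recipient TEXT, sent TEXT, subject TEXT)"
    )
    for raw in raw_messages:
        sender, recipients, day, subject = parse_message(raw)
        conn.executemany(
            "INSERT INTO message VALUES (?, ?, ?, ?)",
            [(sender, r, day, subject) for r in recipients],
        )
    conn.commit()


def select_flows(conn, group, start, end):
    """Step 2: messages sent or received by group members in [start, end]."""
    marks = ",".join("?" * len(group))
    sql = (
        "SELECT sender, recipient, sent FROM message "
        "WHERE sent BETWEEN ? AND ? "
        f"AND (sender IN ({marks}) OR recipient IN ({marks})) "
        "ORDER BY sent"
    )
    return conn.execute(sql, [start, end, *group, *group]).fetchall()
```

Dates are stored as ISO strings so that a plain string comparison orders them correctly; `group` is a list of addresses, and `start` and `end` are ISO dates such as "2001-12-01".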

Figure 2. TeCFlow System Architecture

This architecture provides a testbed of high scalability and flexibility. The number of actors, ties, and messages to be analyzed is only limited by the size of the database and the amount of RAM available, and temporal queries can be run in an ad hoc way. We are also able to experiment with different visualizations of the retrieved structure.

4.    The “Sliding Time Frame” Algorithm for Temporal Visualization

We base our algorithm on the Fruchterman-Reingold graph drawing algorithm (Fruchterman & Reingold, 1993) for force-directed placement, which is commonly used to visualize graphs of social networks. This method compares a graph to a mechanical collection of electrically charged rings (the vertices) and connecting springs (the edges). Every two vertices repel each other with a repulsive force, and adjacent vertices (connected by an edge) are pulled together by an attractive force. Over a number of iterations, the forces modeled by the springs are calculated and the nodes are moved in a bid to minimize the forces felt.

In our algorithm, we treat the exchanges of messages between actors as an approximation of social ties. In our visualization a communication initiated by actor A to actor B is represented as a directed edge from A to B, i.e. a message sent from A to B is depicted as an arc. The more interactions between actors A and B occur, the closer the two representing vertices will be placed. The most connected actors are positioned in the center of the graph. This means that the actors who send and receive the largest number of e-mail messages in a given time frame are placed in the center of the graph. Similarly, the more messages A and B exchange, the shorter their connecting arc becomes.
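The force model described above can be sketched in a few lines. The repulsive force k²/d, the attractive force d²/k, and the temperature-limited movement follow Fruchterman & Reingold (1993). Scaling the attraction by the number of messages exchanged is an assumption made here to reproduce the "more messages, shorter arc" behavior, and the function name, parameter defaults, and cooling factor are illustrative only, not TeCFlow's implementation.

```python
import math
import random


def weighted_fr_layout(nodes, weights, size=1.0, iterations=50, seed=0):
    """Force-directed placement in the style of Fruchterman-Reingold.

    nodes:   non-empty list of actor names
    weights: dict mapping a pair (u, v) to a message count; scaling the
             attraction by this count is an assumption of this sketch
    """
    rng = random.Random(seed)
    pos = {v: [rng.uniform(0, size), rng.uniform(0, size)] for v in nodes}
    k = size / math.sqrt(len(nodes))  # ideal distance: sqrt(area / |V|)
    temperature = size / 10.0
    for _ in range(iterations):
        disp = {v: [0.0, 0.0] for v in nodes}
        # Repulsion between every pair of vertices: f_r(d) = k^2 / d
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                d = max(math.hypot(dx, dy), 1e-9)
                f = k * k / d
                disp[u][0] += dx / d * f
                disp[u][1] += dy / d * f
                disp[v][0] -= dx / d * f
                disp[v][1] -= dy / d * f
        # Attraction along edges: f_a(d) = d^2 / k, scaled by message count
        for (u, v), count in weights.items():
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            d = max(math.hypot(dx, dy), 1e-9)
            f = count * d * d / k
            disp[u][0] -= dx / d * f
            disp[u][1] -= dy / d * f
            disp[v][0] += dx / d * f
            disp[v][1] += dy / d * f
        # Move each vertex, capped by the temperature and kept inside the frame
        for v in nodes:
            dx, dy = disp[v]
            d = max(math.hypot(dx, dy), 1e-9)
            step = min(d, temperature)
            pos[v][0] = min(size, max(0.0, pos[v][0] + dx / d * step))
            pos[v][1] = min(size, max(0.0, pos[v][1] + dy / d * step))
        temperature *= 0.95  # simple cooling schedule (illustrative choice)
    return pos
```

A call such as `weighted_fr_layout(["a", "b", "c"], {("a", "b"): 5})` returns a dictionary of x/y coordinates inside the unit square.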

To display the evolution of communication patterns over time, we developed a dynamic visualization algorithm where the layout of the graph is automatically recalculated every day, resulting in an interactive movie. The simplistic approach would have been, for any given day, to base the graph structure on the communications that occurred during this day. However, this approach does not take into account interactions that happened before this particular day, and would result in a jerky animation of low quality. For our dynamic visualization, we therefore propose a new algorithm: the sliding time frame algorithm, where we are always looking at a time interval consisting of a flexibly chosen number of days.

Figure 3. Sliding time frame algorithm in “with history” mode

The basic idea of the sliding time frame algorithm is to display active ties between actors in a sliding time frame covering a flexibly selected interval of n days starting from the current day d the visualization is showing (figure 3). The window frame moves forward day by day, and new ties (i.e. e-mail messages exchanged) are added to the graph each day until the desired width of n days of the sliding time frame is reached. This time frame window allows users to see all activities happening inside the time frame after the current day. In “with history” mode, old communication activities before the current time frame window are accumulated in the layout of the graph. This reflects persistent ties that, once established, stay active for the remainder of the lifetime of the team. Once an e-mail message has been sent, it will influence the positioning of the actors in the graph for the rest of the animation, meaning that a link does not decay. Rather, after it moves out of the n-days wide time frame, it is displayed in the visualization as a dimmed out arc.

In figure 3 the time frame begins at day d. The algorithm is looking n days ahead. Thus, day d is the current day that the visualization is showing and the current time frame is [d, d+n]. All communications through day (d+n) are calculated and displayed, while communications taking place before day d are displayed as dimmed out arcs. We call this the “with history” version of the algorithm, because the accumulated “history” of the interaction is used for the visualization.

In “no history” mode, only the edges in the current time frame are included (figure 4), i.e. only the e-mail messages exchanged within the time frame are used to calculate the graph. This means that the lifetime of ties (i.e., e-mail messages) is limited to the width of n days of the time frame; afterwards links decay and disappear. In this version of our system the tie decay function is binary: ties have a fixed lifespan of n days and then cease to exist. Day d is the current day that the visualization is showing and the current time frame is [d, d+n]. Only communications inside the current time frame are calculated and displayed.
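The difference between the two modes can be illustrated with a short sketch. It assumes messages are given as (sender, recipient, day) tuples; the function name and the return structure are hypothetical and not taken from TeCFlow.

```python
from datetime import date, timedelta


def frame_edges(messages, d, n, with_history=True):
    """Split messages into the current time frame [d, d+n] and its history.

    Returns (active, history): two dicts mapping (sender, recipient) to a
    message count. In "no history" mode the history dict stays empty, so
    ties older than the frame simply disappear.
    """
    end = d + timedelta(days=n)
    active, history = {}, {}
    for sender, recipient, day in messages:
        edge = (sender, recipient)
        if d <= day <= end:
            active[edge] = active.get(edge, 0) + 1
        elif day < d and with_history:
            history[edge] = history.get(edge, 0) + 1
    return active, history
```

In “with history” mode, an edge that appears in `history` but not in `active` would be drawn as a dimmed arc, while both dictionaries would feed the layout calculation.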

Figure 4. Sliding time frame algorithm in “no history” mode

 

Figure 5 gives an example, comparing “with history” and “no history” views of the same data set. It displays a snapshot of the communication flow of a subset of the online community described in section 6. In this example, messages exchanged between Nov 30, 2001 and May 28, 2002 are shown. Figure 5 shows the “with history” and “no history” screen shots of May 28, 2002. The left view of figure 5 displays accumulated links since Nov 30, 2001 “with history”. The time frame is set to 30 days, which means that all the links between April 28, 2002, and May 28, 2002, are shown in full color, while the links older than 30 days are dimmed out. The right side of figure 5 shows the same data set, but “without history.” This means that only the links within the 30-day time window between April 28, 2002, and May 28, 2002, are calculated and shown. Note the differences in group betweenness centrality in the small window!

Figure 5. Temporal view with history (left) and without history (right)

click here for “with history” and “no history” QuickTime movies

To define the amount of “animated action” and optimize the animation speed we are using “keyframes”. The distance between keyframes is calculated based on the number of new messages exchanged. A frame is treated as a keyframe if a predefined number of messages has been accrued since the previous keyframe. The graph layout for the keyframe is calculated using the FR algorithm. The animation of the changing layout is interpolated between the keyframes. This process of getting a smooth transition between two keyframes is called “inbetweening” in computer animation.
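A minimal sketch of keyframe selection and inbetweening follows. It assumes a per-day message count and linear interpolation of node positions; the paper does not state which interpolation TeCFlow uses, so the linear form and the function names are assumptions.

```python
def keyframe_days(daily_counts, threshold):
    """Return the day indices treated as keyframes.

    Day 0 is always a keyframe; afterwards a day becomes a keyframe once
    at least `threshold` messages have accrued since the previous one.
    """
    keys, accrued = [], 0
    for day, count in enumerate(daily_counts):
        accrued += count
        if day == 0 or accrued >= threshold:
            keys.append(day)
            accrued = 0
    return keys


def inbetween(start, end, steps):
    """Linearly interpolate node positions between two keyframe layouts.

    start, end: dicts mapping node -> (x, y). Returns `steps` frames, the
    last of which equals `end`. A node missing from `start` is placed at
    its end position in every frame (an assumption of this sketch).
    """
    frames = []
    for i in range(1, steps + 1):
        t = i / steps
        frame = {}
        for node, (ex, ey) in end.items():
            sx, sy = start.get(node, (ex, ey))
            frame[node] = (sx + (ex - sx) * t, sy + (ey - sy) * t)
        frames.append(frame)
    return frames
```

With daily counts [3, 1, 1, 4, 0, 2, 5] and a threshold of 5, days 0, 3, and 6 become keyframes; the layouts of consecutive keyframes would then be passed to `inbetween` to fill the frames between them.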

This means that all locations of the actors and all arcs between actors are preprocessed for the entire time period of the animation and cached in arrays in main memory, allowing for fast animation and drill down into the visualization in real time.

 

5.    Treatment of Individual Actors

In addition, TeCFlow also allows the user to define personalities consisting of multiple e-mail addresses, and to create groups consisting of multiple personalities.

Figure 6. Merging multiple e-mail addresses of persons and groups

In figure 6, actor yan.zhao@dartmouth.edu consists of three e-mail addresses. The group devlab@cs.dartmouth.edu is composed of three actors. Maintaining the organization and domain parts of the e-mail addresses permits automatic analysis on the domain and organizational level.
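Resolving addresses to personalities, groups, and domains can be sketched as a simple lookup. The alias addresses and the function name below are hypothetical; only the actor and group names are taken from figure 6.

```python
def build_resolver(personalities, groups=None):
    """Build a lookup that maps e-mail addresses to an analysis unit.

    personalities: dict mapping a personality name to its e-mail addresses
    groups:        dict mapping a group name to its member personalities
    """
    alias = {}
    for person, addresses in personalities.items():
        for address in addresses:
            alias[address.lower()] = person
    member_of = {}
    for group, members in (groups or {}).items():
        for member in members:
            member_of[member] = group

    def resolve(address, level="person"):
        # Unknown addresses resolve to themselves (lowercased)
        person = alias.get(address.lower(), address.lower())
        if level == "group":
            return member_of.get(person, person)
        if level == "domain":
            # Keeping the domain part allows analysis per organization
            return person.rsplit("@", 1)[-1]
        return person

    return resolve
```

For example, with the hypothetical aliases `yz@example.org` and `yan@example.net` registered for `yan.zhao@dartmouth.edu`, messages from all three addresses would be attributed to the same actor.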

We also looked at the frequency with which individuals send and receive messages. We have defined a measure, which we call the “contribution index” (Gloor et al., 2003; Gloor, 2004):

contribution index = (messages sent − messages received) / (messages sent + messages received)

The contribution index is +1, if somebody only sends messages and does not receive any message. The contribution index is –1, if somebody only receives messages, and never sends any message. The contribution index is 0, if somebody has a totally balanced communication behavior, sending and receiving the same number of messages. We then plotted the contribution index against the total number of messages sent and received of each participant.
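The following sketch computes the contribution index as (sent − received) / (sent + received), which is the form consistent with the three boundary cases just described. It assumes one (sender, recipient) pair per message; the function name is illustrative.

```python
from collections import Counter


def contribution_indices(messages):
    """Map each actor to (contribution index, total messages sent and received).

    messages: iterable of (sender, recipient) pairs.
    """
    sent, received = Counter(), Counter()
    for sender, recipient in messages:
        sent[sender] += 1
        received[recipient] += 1
    result = {}
    for actor in set(sent) | set(received):
        total = sent[actor] + received[actor]
        result[actor] = ((sent[actor] - received[actor]) / total, total)
    return result
```

Each resulting pair gives the coordinates of one actor in the plot: the total message count on the x-axis and the contribution index on the y-axis.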

Figure 7. Contribution Index

In figure 7, actor A only sent n messages, never receiving any, actor B only received p messages, never sending any, while actor C sent and received m messages (C is located on the x-axis, because C sent and received the same number of messages).

Looking at the evolution of actor betweenness centrality over time produces a three-dimensional "social surface". Sorting the centrality vector for each day by increasing centrality makes for a smooth surface at the price of losing the capability to track individual actors over time. The social surface allows us to get a quick overview of the temporal dynamics of a group. Figure 8 shows the changes in betweenness centrality of 80 actors over a period of 300 days. This picture has been produced in “no history” mode, with a time window of 50 days. We are easily able to identify peaks in activity of individuals and of large parts of the group as “elevated planes” and “slopes” in the surface.


Figure 8. Social Surface of a Group – Evolution of Actor Betweenness Centrality (n=80)

 

In figure 8 we can see that the group is growing steadily over time as actors successively join the group (betweenness centrality has been set to -0.1 for inactive actors). There is a large group of peripheral actors (with betweenness centrality 0), and there are three recognizable levels of activity in the group, marked by the "plateaus" in the social surface.
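Constructing the data behind such a surface can be sketched as follows, assuming the per-day betweenness values have already been computed. The function name and input format are illustrative; the value of -0.1 for inactive actors follows the convention stated for figure 8.

```python
def social_surface(centrality_by_day, inactive=-0.1):
    """Build the height matrix of a social surface.

    centrality_by_day: list with one dict per day, mapping each active
    actor to its betweenness centrality. Actors not active on a given
    day get the `inactive` value. Each row is sorted by increasing
    centrality, so a column no longer corresponds to a fixed actor.
    """
    actors = sorted({a for day in centrality_by_day for a in day})
    return [
        sorted(day.get(a, inactive) for a in actors)
        for day in centrality_by_day
    ]
```

Each row of the result is one day and can be passed to any 3D surface plotting routine.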

 

6.    The Diffusion of Innovation in Collaborative Knowledge Networks (CKN)

We have been applying our TeCFlow tool to the evolution of social knowledge networks over time in online communities. We call such communities “Collaborative Knowledge Networks” (CKNs) (Gloor, 2004). Our work generalizes what Peters (1983) calls "skunkworks" and Leavitt and Lipman-Blumen (1995) describe as "Hot Groups". The diffusion of innovation in collaborative knowledge networks follows a “ripple effect.” Collaborative Innovation Networks (COINs) are at the center of a set of concentric communities, where each community is included in the subsequent, larger community. The dissemination of new ideas in online communities is very similar to the ripple effect when a pebble drops into water. Innovations ripple from the innermost COIN circle to the next larger Collaborative Learning Network (CLN) circle, and then to the surrounding Collaborative Interest Network (CIN) community.

Figure 9. The ripple effect of CKN-based innovation diffusion

Figure 9 illustrates the ripple effect of CKN-based innovation diffusion by the example of the Linux Open Source developers.

We are aiming to distinguish temporal communication patterns typical of these different types of Collaborative Knowledge Networks. These CKNs are core/periphery structures (Borgatti & Everett, 1999) with small world properties (Watts, 1999). They consist of a central cluster of people, the core team, forming a high-density network with low group betweenness centrality (GBC). The external part is a network forming a ring around the core team. It has comparatively low density, but high group betweenness centrality, thanks to the central core team. The actors in the outer ring (CLN/CIN) have a low betweenness, as they are only connected to core team members, but not among themselves.
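The contrast between a dense core and a star-like periphery can be reproduced with a short sketch. The paper does not spell out its formula for group betweenness centrality; the code below assumes Freeman's betweenness centralization on an undirected, unweighted graph, with actor betweenness computed by Brandes' algorithm. All function names are illustrative.

```python
from collections import deque


def betweenness(adj):
    """Normalized betweenness per node (Brandes' algorithm).

    adj: dict mapping each node to the set of its neighbours (undirected).
    """
    cb = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack, preds = [], {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)
        dist = dict.fromkeys(adj, -1)
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:  # breadth-first search counting shortest paths
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:  # accumulate dependencies in reverse BFS order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                cb[w] += delta[w]
    n = len(adj)
    # Each undirected pair is counted from both ends, hence (n-1)(n-2)
    scale = (n - 1) * (n - 2)
    return {v: (c / scale if scale > 0 else 0.0) for v, c in cb.items()}


def density(adj):
    """Share of possible undirected ties that are present."""
    n = len(adj)
    ties = sum(len(nb) for nb in adj.values()) / 2
    return ties / (n * (n - 1) / 2) if n > 1 else 0.0


def group_betweenness_centrality(adj):
    """Freeman centralization of betweenness (assumed definition of GBC)."""
    cb = betweenness(adj)
    n = len(cb)
    if n < 3:
        return 0.0
    top = max(cb.values())
    return sum(top - c for c in cb.values()) / (n - 1)
```

Under this definition a star network reaches the maximum GBC of 1 at low density, while a fully connected team has density 1 and a GBC of 0, which matches the qualitative description of the outer ring and the core team.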

The TeCFlow tool is used in three steps:

(1)  Watch social interaction pattern movies to find dense clusters indicating potential emergence of COINs.

(2)  Look for peaks and troughs in the temporal evolution of group betweenness centrality and density to find changes in the collaboration patterns of groups.

(3)  Look at the contribution index to better understand the roles of the individuals in groups.

Combining steps (1) to (3) leads to new insights by giving an understanding of temporal evolution of group dynamics during the chosen time interval. In the second part of this paper we illustrate the use of TeCFlow by describing four different social interaction scenarios:

Pattern 1:  Innovation in research and development: Visualization of a globally active research and development community of a global management consulting firm.

Pattern 2:  Learning through online innovation dissemination: Visualization of preparation and execution of a Web conference (“Webinar”).

Pattern 3:  Project management: Visualization of communication in a distributed software development team.

Pattern 4:  Sales force: Visualization of account management processes in a consulting practice focusing on one large Fortune 500 client.

Our sample data set consists of an e-mail archive of a virtual consulting practice with 200 members of a global consulting firm covering the time period from mid-2000 to early 2002. It is composed of the ego-networks of the practice leader and the practice coordinator (i.e. their e-mailboxes). Those e-mailboxes are taken as an approximation of the organizational memory of the consulting practice, as the practice leader and the coordinator were informed of all major events in the practice. The mailboxes were partitioned manually into mail folders by subject areas. Mail folders included one of eight service offerings, a folder for each project currently active, sales efforts, marketing activities, and the organization of two practice-wide seminars conducted over the Web (“Webinars”). The major advantage of this data set is that one of us was intimately involved in the analyzed work processes. The disadvantage is that the mailboxes of the practice coordinator and the practice leader do not include the direct one-to-one communication among the practice members bypassing the practice coordinator or the practice leader.

7.    Innovation Pattern: Creation of a New Service Offering

This example illustrates the interaction pattern of an innovation community. It shows how to recognize the signal of a new innovation embedded into the general noise of communication. It includes 502 messages exchanged between 615 actors on general topics in the consulting practice over a period of 800 days. To spot innovation, messages have been searched and tagged for the first-time occurrence of the name of a new service offering with a unique name. The first time a message mentioning the name of the new service offering is sent to an actor, the actor is considered “infected” and remains colored red for the rest of the movie.
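The tagging rule can be sketched as a single pass over the messages in chronological order. The tuple layout and function name are hypothetical, and treating the sender of a mentioning message as infected as well is an assumption; the text above only states the rule for recipients.

```python
def infection_days(messages, keyword):
    """Return the first day on which each actor becomes "infected".

    messages: iterable of (day, sender, recipients, text) in any order.
    An actor is infected by the first message mentioning `keyword` that
    they send or receive (including the sender is an assumption).
    """
    infected = {}
    for day, sender, recipients, text in sorted(messages, key=lambda m: m[0]):
        if keyword.lower() in text.lower():
            for actor in (sender, *recipients):
                # setdefault keeps the earliest day for each actor
                infected.setdefault(actor, day)
    return infected
```

In the movie, an actor would be colored red from the returned day onward.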

 

 

Figure 10. Creating an innovation (click here for QuickTime movie)

 

Figure 10 displays four movie snapshots visualizing four steps of the communication flow of the diffusion of the innovation. The cluster nurturing the new idea is clearly recognizable in all four snapshots. It takes about 400 days until the innovation is named for the first time and the cluster turns red. Afterwards the movie makes it possible to track the diffusion of the innovation from the original cluster to the remainder of the consulting practice.

 

Figure 11. GBC plot

Figure 11 illustrates the evolution of group betweenness centrality (GBC) of the consulting practice. The first peak in GBC in figure 11 appears when the practice was officially started; GBC went up because of the broadcast announcements made by the leaders and coordinators to the entire practice. The second, smaller spike occurs when the innovation team is at its peak, working feverishly and communicating with high frequency among themselves to create the new idea, which means that in relation to the entire social network, centrality of the innovation subgroup is higher.

Figure 12. Contribution Index

 

The contribution index plot in figure 12 identifies the different roles in the practice. The colored circles represent locations of generic roles identified in previous work (Gloor et al., 2003): the grey circle identifies the typical location of coordinators, the yellow circle the location of creators, the pink circle of knowledge experts, and the green circle of communicators. There are three individual roles of “practice leader”, innovation “creator”, and “practice coordinator” that stand out. There are also two clusters with similar contribution indices: the “marketing team” members, and the members of the “innovation team”. The “practice leader” is the most active actor, getting many more e-mails than he is sending. The “creator” of the new service offering comes second, demonstrating the immense effort she put into creating this new service offering. The “coordinator” displays typical coordination behavior, sending more than he receives. The “marketing team” displays a “sales” pattern, sending more than they receive. The “innovation team” exhibits a “knowledge expert” attitude, receiving more than they send.

Main results of this visual analysis are:

 

8.    Learning Pattern: Information Dissemination by Webinar

The next example illustrates how new concepts are taught to an online audience forming a Collaborative Learning Network (CLN) as introduced in section 6. The dataset for this example consists of e-mail messages on the subject of organizing a global Web-based seminar (“Webinar”); allocation of messages to the dataset was done manually. The archive includes 607 messages exchanged among 197 actors, covering a time period of about 190 days. The Webinar was prepared by a small team of self-selected members of the consulting practice over a multi-month period. One main speaker (the “practice leader”) then delivered the Webinar during one hour, assisted by his team members. The audience of the Webinar was spread out globally, and had the opportunity to ask questions to the speakers via e-mail during the talk. Questions that could not be covered during the talk were answered in the next few days. Because of overwhelming demand, the team decided to revise and rerun the Webinar a few weeks later. The team worked together on some minor changes, until the seminar was delivered again, this time coordinated by another main speaker, the “practice coordinator”.

The TeCFlow movie illustrates this switch between the “innovation community” pattern and the “learning community” pattern. In the preparation phase, the core team collaborated as a COIN; in the delivery phase, speakers and audience collaborated as a CLN.

 

Figure 13. Four screen shots of movie of Webinar communication flow (click here for QuickTime movie)

 

Figure 13 illustrates the changes in communication flow from preparing to conducting the Webinar. The picture in the top left of figure 13 shows the structure of the team preparing the Webinar. This group is operating as a COIN, with high density and low group betweenness centrality. The picture in the top right of figure 13 shows a screen shot of the communication pattern during the first time the Webinar was delivered. The “practice leader” (black dot) is sending and receiving information in a star structure. During and after the first run of the Webinar, questions are asked to and answered mostly by the “practice leader.” The third picture at the lower left displays the team preparing a rerun of the Webinar, again working as a COIN and communicating with lower group betweenness centrality. The final screen shot in the lower right of figure 13 displays the “practice coordinator” (blue dot) rerunning the Webinar, communicating with his audience and answering their questions in a star structure with high group betweenness centrality.

Figure 14. GBC of Webinar

Figure 14 illustrates changes in group betweenness centrality. The three phases in the organization of the Webinar can be recognized, with a decline in GBC (red line) for the organization of the first and second run of the Webinar, and a spike in GBC when the speaker delivers the event. In the preparation phase GBC is high because of the core/periphery structure of the core group involved in a dialogue with potential speakers for the event.

Figure 15. "Social Surface" for actors participating in Webinar

 

Compared with the temporal evolution of group betweenness centrality displayed in figure 14, the social surface shown in figure 15 conveys much richer information. The four phases of initial preparation, first run, second preparation, and wrap-up can again be distinguished, but now we can also see that the second peak in attendance (the actors on the first plateau) is smaller than the attendance of the first run of the Webinar. We also see that during the first run there was a considerable number of active actors involved (the red slope at the foot of the yellow peaks). The final wrap-up was done by a few actors with high centrality.

 

 

Figure 16. Contribution Index of Webinar

Figure 16 shows that the “practice coordinator” is the most active actor, followed by the “practice leader”. The cluster of innovation team members working together to prepare the Webinar can also be identified.

 

9.    Project Management Pattern – Communication in a Software Development Team

The next example visualizes a software project, where the consulting firm was acting as a general contractor to develop a bespoke software application on behalf of a client firm. The core team of the consulting company consisted of about 20 consultants, led by a project manager and a project partner. The client team consisted of the client project manager, a senior manager, and six subject matter experts.

The analysis described in this section (317 messages, 93 actors) focuses on the phase of the project right before “go live” and software handover to the client. During this period there were intensive negotiations between client project management and senior managers of the consulting practice. The following four snapshots illustrate the communication flow within the team in critical phases of this period. The top left of figure 17 shows the communication flow during the first phase, when an addition to the original contract was negotiated by the “legal team” while the technicians were working on implementing the technical system.

Figure 17. Four snapshots of project movie (click here for QuickTime movie)

In the top left window of figure 17 the “legal team” of the consulting firm forms a dense cluster, intensively discussing contractual details. The “consulting project leaders” as well as the client project leaders are in the center of the structure, communicating with everybody else. The technical team members are forming another cluster at the top, mostly communicating among themselves. During the subsequent testing phase shown in the top right of figure 17, the “testing coordinator” has a centralized role coordinating the developers. The legal team is still tightly clustered, while the clients (green dots) are (too) peripheral.

During the next phase depicted in the bottom left of figure 17, legacy data is converted from the old to the new system. The “database administrator” is clearly recognizable in the center of the technical core team. The “legal team” is not active anymore, and client and consulting leaders collaborate somewhat more intensively. In the handover phase (bottom right) client and consulting leaders collaborate closely, while the technical team forms a separate cluster. The client technicians and the consulting technicians are not collaborating very much.

Figure 18. Group betweenness centrality of project team over 150 days

Figure 18 illustrates the main activities during this phase of the project. The first, barely noticeable reduction in GBC happens when testing starts. The next change in GBC is caused by an increase in activity during legacy data conversion while testing is still going on in parallel, leading to a decline in GBC. The final increase in group activity, again leading to a reduction in GBC, occurs during handover.

Figure 19. Contribution Index of different people in the e-banking project

Figure 19 illustrates the communication patterns of the different actors in the project. The consulting project leaders receive more messages than they send, and are also the most active senders. The programmers receive more than they send, while the administrative staff sends more messages than they receive. This is fairly typical behavior for a commercial software project.

Findings from this communication analysis are:

As this example illustrates, TeCFlow could help in identifying critical issues in project communication, assisting project managers to better manage their projects.

 

10.    Sales Force Pattern – Large Account Management in a Consulting Firm

Our final example illustrates the use of TeCFlow to analyze the communication flow of a sales team. A new sales manager of an account management team of the consulting practice, focusing on serving a single Fortune 500 client, was taking over this responsibility from the previous sales manager, who went into retirement. As his final assignment, the previous sales manager introduced the new sales manager to the client. Figure 20 displays four snapshots of the TeCFlow movie automatically generated from the e-mail archive (455 mails, 202 actors, 10 months).

Figure 20. “Sales” movie snapshots (click here for QuickTime movie)

The top left of figure 20 illustrates the old sales manager (blue dot) introducing the new sales manager (black dot) to the internal account management team. The top right of figure 20 shows a snapshot of the TeCFlow movie where the old sales manager is introducing the new sales manager to client executives. Consultants are red dots, clients are green dots. At this stage the old sales manager is still more and better connected to the client executives than the new sales manager, although new ties between the new sales manager and client executives are building up.

The bottom left of figure 20 illustrates coordination of a large proposal for the client where the new sales manager is not involved, as he is only peripheral to this cluster. The bottom right shows another proposal preparation where the new sales manager is in the center of the proposal team, while the old sales manager is in a more peripheral role. At the same time, a group of consultants and client executives is trying to arrange a social event, represented by the cluster of green dots at right. The new sales manager is too preoccupied with proposal preparation and misses this opportunity to connect with a new group of potential customers.

Figure 21. Group betweenness centrality of account management team

Figure 21 displays a history of changes in group betweenness centrality and density, identifying the major sales opportunities of the consulting sales team. Troughs in the group betweenness centrality curve indicate gatherings of a proposal team. As it turns out, the new sales manager only found out about some of these proposals through the TeCFlow analysis.
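Reading troughs off such a curve can also be automated. The following is a minimal sketch, assuming the GBC curve is available as one value per time window. The function name, the threshold, and the sample values are hypothetical and only illustrate the idea of flagging windows in which a proposal team may have gathered.

```python
def find_troughs(series, max_value):
    """Return indices of local minima in `series` that lie at or below `max_value`."""
    troughs = []
    for i in range(1, len(series) - 1):
        is_local_min = series[i] < series[i - 1] and series[i] <= series[i + 1]
        if is_local_min and series[i] <= max_value:
            troughs.append(i)
    return troughs


# Invented GBC values, one per time window.
gbc_by_window = [0.82, 0.79, 0.41, 0.77, 0.80, 0.35, 0.36, 0.74, 0.70, 0.78]
trough_windows = find_troughs(gbc_by_window, max_value=0.5)
```

The threshold keeps shallow dips, such as the one in window 8 of the sample data, from being reported. Any window flagged this way would still need to be checked against the movie and the message content.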

Figure 22. Contribution index of account management team

The contribution index plot (figure 22) shows that the most active consultant is a mid-level consultant working as a project manager at a client site. Figure 22 also illustrates the somewhat passive communication behavior of the new sales manager. The old sales manager, although officially retiring from his function, is still communicating more actively. Finally, figure 22 points out the two most active clients.

Findings of the TeCFlow analysis are:

A system such as TeCFlow might have helped to identify key clients and key proposal opportunities, thus potentially increasing the number of new opportunities for submitting proposals.

 

11.    Future Work and Conclusions

As has been shown in these four scenarios, the temporal visualization of social networks by movies of communication flow offers a novel visual way to discover different phases in the life cycle of an online community. It conveys insights that might be difficult to obtain by other means. The visual approach makes it possible to find periods of low and high group betweenness centrality, and to identify potential periods of high productivity and information dissemination. To obtain a full understanding of the activities, it needs to be complemented by other contextual cues, such as interviews with community members and a content analysis of the messages exchanged.

Users have found our movies intuitively useful for gaining a quick overview of the dynamics of communication flow in groups. We are currently working on a more systematic study comparing analysis of longitudinal social networks by conventional means with our dynamic movie-based approach, to get a more in-depth understanding of the strengths and weaknesses of our method.

We have created a multiuser version of our system, where users can upload anonymized communication data sets to a “Global Social Web” under strict privacy and anonymity safeguards. We hope that this will encourage users to share their communication data, so that we can get a much broader view of social interactions than has been possible until now. We are also extending our tools to other types of communication activities. Because TeCFlow runs on top of a database, it is straightforward to import, for example, phone logs, instant messaging logs, and blogs into the database instead of e-mail archives.

Another area of future work is to refine the decay function of ties. While our current decay function is binary, meaning that a tie is either there or it is gone, we plan to explore more realistic decay functions, where the strength of a tie gradually decreases over time. We also intend to add subcategories for different genres of communication such as inform, direct, commit, discuss, express, and request (Yoshioka et al, 2001), to further study the structure of different communication networks. In addition, we intend to explore a more fine-grained model of diffusion of innovation based on k-cores or n-plexes.
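The difference between a binary and a gradual decay function can be sketched as follows. This is a minimal sketch under stated assumptions: the tie strength depends only on the number of days since the last message, the binary variant uses a fixed window, and the gradual variant uses exponential decay with a half-life. The function names and all parameter values are hypothetical; the source does not specify which gradual function will be used.

```python
import math


def binary_tie_strength(age_days, window_days):
    """Binary decay: the tie is fully present inside the window, then gone."""
    return 1.0 if age_days <= window_days else 0.0


def exponential_tie_strength(age_days, half_life_days):
    """Gradual decay: the strength halves every `half_life_days`."""
    return math.exp(-math.log(2) * age_days / half_life_days)


# Days since the last message on a tie (invented values).
ages = [0, 10, 30, 40]
binary = [binary_tie_strength(a, window_days=30) for a in ages]
gradual = [exponential_tie_strength(a, half_life_days=10) for a in ages]
```

With these parameters the binary function keeps the tie at full strength up to day 30 and drops it abruptly afterwards, whereas the exponential function has already reduced it to one eighth by day 30 and never removes it entirely.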

Our continuing goals are to gain deeper insights into the evolution of online group dynamics and to develop a theory of member roles in virtual communities using more detailed communication pattern analysis.

 

Acknowledgements

The authors are grateful to Thomas J. Allen, Hans Brechbuhl, Scott Dynes, M. Eric Johnson, Rob Laubacher, Fillia Makedon, and Thomas W. Malone for their help and encouragement. This project has been supported by the MIT Center for Coordination Science, the Center for Digital Strategies at Tuck at Dartmouth, and the Devlab at Dartmouth.

 

References

Adar, E. & Tyler, J. (2003), Zoomgraph, HP Laboratories, Palo Alto, www.hpl.hp.com/research/idl/projects/graphs/zg2003.pdf

Ahuja, M. & Carley, K. (1999), Network Structure in Virtual Organizations. Organization Science, Vol. 10, No 6. 741-757.

Batagelj, V. & Mrvar, A. (1998), Pajek - Program for Large Network Analysis. Connections Vol. 21, No. 2, 47-57.

Borgatti, S., Everett, M. & Freeman, L.C. (1992), UCINET IV, Version 1.0, Columbia: Analytic Technologies

Borgatti, S.P. & Everett, M.G. (1999). Models of Core/Periphery Structures. Social Networks 21: 375-395.

Brandes, U. & Wagner, D. (2003), visone - Analysis and Visualization of Social Networks. In Michael Jünger and Petra Mutzel (Eds.): Graph Drawing Software, pp. 321-340.  Springer-Verlag.

Bulkley, N. & Van Alstyne, M. (2004), Why Information Should Influence Productivity. (forthcoming) In The Network Society: A Cross-Cultural Perspective. Manuel Castells, ed. Edward Elgar Publishing.

Cross, R. & Cummings, J. (2003), Relational and Structural Network Correlates of Performance in Knowledge Intensive Work. Academy of Management, Seattle, WA. Paper published in Proceedings.

Danah, B. Potter, J. & Viegas, F. (2003), Fragmentation of identity through structural holes in email contacts. Proc. Sunbelt XII. 2003

Ebel, H. Mielsch, L. & Bornholdt, S. (2002), Scale-free topology of e-mail networks. arXiv:cond-mat/0201476v2 12 Feb.

Freeman, L. (2000), Visualizing Social Networks. Journal of Social Structure (Vol. 1, No. 1), http://www.cmu.edu/joss/content/articles/volume1/Freeman.html

Freeman, L.C. & Freeman, S.C. (1980), A semi-visible college: Structural effects of seven months of EIES participation by a social networks community. In Henderson, M. M., and M. J. MacNaughton, eds. Electronic Communication: Technology and Impacts, 77-85. AAAS Symposium 52. Washington DC: American Association for the Advancement of Science.

Fruchterman, T.M.J & Reingold, E.M. (1991), Graph drawing by force directed placement. Software: Practice and Experience, 21(11), 1991.

Girvan, M. & Newman, M.E.J. (2001) Community structure in social and biological networks. arXiv:cond-mat/0112110v1, 7 Dec.

Gloor, P. Laubacher, R. Dynes, S. & Zhao, Y. (2003) Visualization of Communication Patterns in Collaborative Innovation Networks: Analysis of some W3C working groups. Proc. ACM CIKM 2003, New Orleans, Nov. 5-6.

Gloor, P. (2004) Net.Creators, online book manuscript, http://www.ickn.org/book/COINs.html

Guimera, R. Danon, L. Diaz-Guilera, A. Giralt, F. & Arenas, A. (2002), Self-similar community structure in organizations. ArXiv:cond-mat/0211498 v1, 22 Nov.

Krackhardt, D. Blythe, J. & McGrath, C. (1995), KrackPlot 3.0 User's Manual. Pittsburgh: Carnegie-Mellon University.

Leavitt, H.J. Lipman-Blumen, J. (1995) Hot Groups. Harvard Business Review. 73: 109-116.

Mazzocchi, S. (2003), Virtual Community Dynamics http://www.betaversion.org/~stefano/papers/AC2003-agora.pdf

McGrath, C. & Blythe, J. (2004), Do You See What I Want You to See? The Effects of Motion and Spatial Layout on Viewers' Perceptions of Graph Structure. Journal of Social Structure (JoSS), Vol. 5, No. 2.

Moody, J. McFarland, D. Bender-deMoll, S. (2004) Dynamic Network Visualization: Methods for Meaning with Longitudinal Network Movies. http://www.sociology.ohio-state.edu/jwm/NetMovies/Sub_CD/dynamic_nets_public.html

Mutton, P. (2003), PieSpy Social Network Bot. http://www.jibble.org/piespy/ (Accessed 14 October 2003)

NetMiner, Cyram Co. Ltd., (2003), http://www.netminer.com

Peter, T.J. (1983), A Skunkworks Tale. Reprinted in Katz, R. (ed.) (2004), The Human Side of Managing Technological Innovation. Oxford University Press.

Smith, M. & Fiore, A. (2001), Visualization components for persistent conversations, Proc ACM CHI

Tyler, J. Wilkinson, D. & Huberman, B (2002), Email as Spectroscopy: Automated Discovery of Community Structure within Organizations. http://www.hpl.hp.com/shl/papers/email/index.html

Van Alstyne, M. & Zhang, J. (2003), EmailNet: A System for Mining Social Influence & Network Topology in Communication. University of Michigan working paper.

Varghese, G. & Allen, T. (1993), Relational Data in Organizational Settings: An Introductory Note for Using AGNI and Netgraphs to Analyze Nodes, Relationships, Partitions and Boundaries. Connections, Volume XVI, Number 1 & 2, Spring.

Watts, D. (1999), Small Worlds, Princeton University Press.

Yoshioka, T. Herman, G. Yates, J. Orlikowski, W. (2001), Genre taxonomy: A knowledge repository of communicative actions. ACM Transactions on Information Systems, Vol. 19, No 4, Oct, 431-45