At EuroOSCON, Schuyler Erle presented the Gutenkarte project, which takes out-of-copyright texts, uses MetaCarta to find geolocatable words, and plots them on a map of the world, as done here for War and Peace by Leo Tolstoy. The immediate comment from a lot of people in the audience was "but what about the flow of time?". Getting that right is tricky, but Gervase Markham had a good suggestion: just use story time, that is, the line number where each word occurs.
I thought that was a good idea and wrote a quick script to try out just the story-time idea. I extracted all capitalized words that weren't suspiciously normal and built - not a tag cloud - but a tag flow, that you can find here. What you're looking at is each word listed in line number order based on when it occurs in the novel. As a tag builds up in use, the size of its letters grows. For convenience the current distance (in pixels/lines of text) from the top is listed to the left.
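For the curious, here is a minimal sketch of how such a tag flow could be built. It is not the original script: the `COMMON` stop list is a made-up stand-in for "suspiciously normal" words, the regular expression is just one plausible reading of "capitalized word", and the logarithmic `font_size` formula is an assumption chosen only so that heavily used names don't grow without bound.

```python
import math
import re

# Hypothetical stop list standing in for "suspiciously normal" capitalized
# words (mostly sentence openers). A real run would need a much longer one.
COMMON = {
    "The", "A", "An", "And", "But", "He", "She", "It", "I", "In", "On",
    "At", "To", "Of", "They", "We", "You", "What", "When", "Then", "This",
    "That",
}

# One capital letter followed by lowercase letters, as a whole word.
WORD = re.compile(r"\b[A-Z][a-z]+\b")


def tag_flow(lines, common=COMMON):
    """Return (line_number, word, running_count) in order of occurrence."""
    counts = {}
    flow = []
    for line_no, line in enumerate(lines, start=1):
        for word in WORD.findall(line):
            if word in common:
                continue
            counts[word] = counts.get(word, 0) + 1
            flow.append((line_no, word, counts[word]))
    return flow


def font_size(count, base=10, step=4):
    """Assumed sizing rule: grow with the log of how often the tag was used."""
    return base + int(step * math.log2(count))


if __name__ == "__main__":
    sample = [
        "The Prince met Pierre.",
        "Pierre left.",
        "Then Pierre and Natasha spoke.",
    ]
    for line_no, word, count in tag_flow(sample):
        print(f"{line_no:>6}  {word}  ({font_size(count)}px)")
```

Each tuple carries the line number, so the same list could feed an HTML page where every tag is placed at its line offset and styled with the computed size.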
This gives a surprisingly fast overview of the structure of the novel and of where Tolstoy is investing your attention:
We start off around Anna Pavlovna until, from approx line 1K to line 5K, Tolstoy builds up Pierre. He then leaves Pierre to focus on Prince Andrew. Pierre only reenters the story after Prince Andrew has been built to the same size as Pierre (which you can tell from the font size of course). Tolstoy briefly builds the epic back story (Kutuzov, Rostov, Bonaparte, French, German, Russian), whereafter the epic back story and Prince Andrew collide as Prince Andrew joins the battlefield (Rostov and Kutuzov).
While all this has been going on the heroine Natasha has been slowly building in the background.
It seems as if Pierre joins the battlefield as well and meets Prince Andrew (it is really a duel involving Pierre and some of the military men), and around line 21K Pierre and Prince Andrew meet. At around line 25K Tolstoy starts to develop Natasha seriously, and at around 26K the love triangle of Andrew, Natasha and Pierre is established. Natasha is further developed in family scenes with her brother Nicholas (whom we met previously at the military camp), whereafter Andrew's interest in Natasha is allowed to develop. Pierre is pretty absent until forcefully reintroduced around line 33K, where the triangle is again at the center.
Having fed our interest in the personal destinies of the main characters, Tolstoy has time for a lengthy history lesson on the Napoleonic Wars until it becomes time, around 37K, to develop Natasha and Pierre's relationship. It doesn't last long, as war reinserts itself in the story - even Pierre enters battle around 41K - and, intermixed with the personal fates of Andrew and Pierre, the war now takes over until Natasha is reintroduced at 47K. What follows is a longish double track of war and personal destiny. From 54500 to 55500 Napoleon is at Moscow and stays at the heart of the story until he is fought back at Borodino from 59K to 60K. It is only after this point that the triangle of Natasha, Andrew and Pierre is once again at the center and the story is finally resolved, as Natasha and Pierre are married and completely take over the story. The novel ends with a lengthy historical epilog until at the very end - with great effect - God gets the final word.
I found this exercise stimulating. Clearly the presentation leaves much to be desired. I'm envisioning something along the lines of this, only with a tag cloud and story-accurate timing of fade-ins and fade-outs. Ideally, however, I would do leitmotifs for each tag and turn the novel into accidental music.