Tubewhacking
Paul Clarke introduced me today to the pastime of Tubewhacking. Similar to Googlewhacking, it involves finding an English word none of whose letters appears in the name of a Tube station, and for that station to be unique in that quality. The most famous example I know is St. John’s Wood, none of the letters of the word mackerel appearing in its name, a claim that no other station can make.
I wondered whether any stations were themselves Tubewhacks of other stations. So I got to work.
Fortunately, the number of columns in Excel has increased recently—I needed 7,616 columns to complete my logic, along with a tidy 11.8Mb. And below is a summary of the results.
There are 59 stations that have Tubewhacks, although their Tubewhacks come from only 22 unique stations. Bank is the most common, being the Tubewhack of a whopping (not Wapping) nine stations. Each of Vauxhall and Woodford accounts for seven Tubewhacks.
Below is the full list—station on the left, Tubewhack on the right.
Barbican: Temple
Becontree: Vauxhall
Bermondsey: Vauxhall
Bond Street: Vauxhall
Boston Manor: Chigwell
Brent Cross: Vauxhall
Burnt Oak: Chigwell
Camden Town: Ruislip
Canary Wharf: Temple
Canning Town: Shepherd’s Bush
Charing Cross: Temple
Chorleywood: Bank
Colliers Wood: Bank
Dagenham Heathway: Ruislip
East Putney: Woodford
Eastcote: Kilburn
Edgware: Pimlico
Elephant & Castle: Woodford
Elm Park: St. John’s Wood
Farringdon: Temple
Fulham Broadway: Epping
Gants Hill: Woodford
Goldhawk Road: Upney
Green Park: Dollis Hill
Hainault: Woodford
Hampstead: Kilburn
Highgate: Woodford
Holloway Road: Epping
Ickenham: Woodford
Kew Gardens: Pimlico
Knightsbridge: Oval
Leyton: Chiswick Park
Limehouse: Bank
Liverpool Street: Bank
Mill Hill East: Woodford
Mornington Crescent: Vauxhall
Newbury Park: Dollis Hill
Perivale: St. John’s Wood
Piccadilly Circus: Kenton
Plaistow: Debden
Putney Bridge: Oval
Queensway: Pimlico
Richmond: St. Pauls
Rotherhithe: Bank
Royal Victoria: Debden
Shadwell: Brixton
Shoreditch: Bank
Southfields: Bank
St. James’s Park: Hillingdon
Tooting Bec: Vauxhall
Tower Hill: Bank
Upminster Bridge: Oval
Upton Park: Chigwell
Warren Street: Pimlico
West Brompton: Vauxhall
West Finchley: Moor Park
West Ham: Kilburn
West Hampstead: Kilburn
West Ruislip: Bank
Liverpool Street and Bank form the only pairing in the above list that are one stop away from one another.
Formatted hashtags
Paul Clarke speculates here about the usefulness of Twitter, with a specific focus on whether its penetration can be used to local effect: in reporting weather conditions or traffic disruption, for example.
He cites a couple of examples of hashtags (#uksnow, #uktraffic), both of which require(d) a specific syntax following the hashtag. For example, #uktraffic needs to of the form:
#uktraffic [road] [where: jcts or place] [direction] [description] RT&seewhathappens
By submitting hashtags in that form, it enables them to be parsed for use in other applications, allowing them to be mapped, summarised etc.
With all due respect, I don’t think the Twitter community is sufficiently diligent to obey what seem like simple instructions such as this. Yes, they can probably remember the hashtag. But I’m sceptical as to whether they can remember the specific items that follow, let alone the order in which they need to appear.
For syntax-dependent hashtags like this to work, there needs to be a mechanism by which they can be registered, complete with the required syntax. That way, end users can subscribe to such hashtags, allowing their client applications to prompt them with the required syntax once they’ve entered that hashtag. Much in the way that Excel prompts you with the arguments needed for a formula that you’ve started to use. (Hell, even I never remember the order in which SUMIF arguments should be entered.)
Without this, the hashtags might be used, but the data they generate will be next-to-useless.
Monday 6 July, 2009. Statistically significant
I’m sure you’re all aware of the double significance of Monday. But I thought I’d document it for completeness and for the heathens among you.
Monday 6 July, 2009 will mark the fifth anniversary of this humble little blog. And it will also be the 40,000th day since the start of the 20th century. *
The latter fact is, of course, significant given that Excel calculates time in days using 1 January, 1900 as its point of reference. (50,000 will occur on 21 November, 2036, in case you want to book a venue in advance.)
Happy fifth birthday for Monday, little blog. And happy 40,000th birthday, Excel.
* Actually, Tuesday 7 July, 1900 is the 40,000th day since the start of the 20th century. But Excel purposely included a bug making 1900 a leap year (it wasn’t; remember?) to comply with other flawed date systems. So Excel thinks 6 July, 2009 is represented as 40,000, even though it should be 39,999.
Re-categorising: I’m scared
I’ve been putting Google Spreadsheets posts in the All things Excel category for want of a better place to put them. Should I change the name of the category to Spreadsheets? It seems a bit like the end of an era to get rid of the Excel word. But times are achangin’, and I think I need to change with them. Thoughts?
Maps and Gantts in Google Docs. FTW
I’m not sure when the feature was introduced, but you can now incorporate maps within Google Spreadsheets, maps that display icons driven from mapping search terms (e.g. postcodes) as opposed to latitude/longitude combinations. You do so by inserting a Gadget. Thanks to Tom Viner for pointing me to this.
This is a great step forward, as you can publish the spreadsheet and put the script tag into any html page to display it there. The only issue is: it ain’t working for me. Apparently, my page is missing the Google Maps API. Google doesn’t give any insight into this error message, either in its Google Documents help or through what I thought was its rather comprehensive search engine.
It’s also got a bunch of other very cool gadgets to insert. The most useful I’ve found is the Gantt chart one. You put your MS Project data into one sheet and you can have a very good-looking Gantt chart in another, drawing from that data. Very cool indeed. There are some seemingly powerful chart gadgets too. Lots to explore.
But as for the maps, I’m flummoxed. And I don’t know what to do.
Twoogle enhanced: Twitter to Google Docs
Google Spreadsheets introduced new functionality the other day allowing sheets to be protected individually. Previously, you either protected a workbook in its entirety or you didn’t.
In my earlier version of the Twitter to Google Docs feed (now named Twoogle in deference to the continuing trend for Twitter apps to begin with the letters TW), I separated the Friends list from the Feed to save people breaking the latter.
I’ve just created a new, enhanced version, combining the two spreadsheets into a single one, and locking down the Feeds sheet to save people from themselves.
Click here to access the new version. Please leave comments here to let me hear your feedback.
Google Docs: getting round the IMPORTDATA limitation
There is such power to be harnessed in the link between Google Spreadsheets and Google Maps. But to me, the power is limited by a fundamental constraint.
For Google Maps to map a location from a spreadsheet, it needs two values: a latitude and a longitude. It can’t map places directly from the more human address and postcode data that you and I use on a daily basis.
Historically, I’ve used the following formula to get coordinate data into a Google Spreadsheet where all I have is a postcode.
=IMPORTDATA(”http://maps.google.co.uk/maps/geo?output=csv&q=”&A1)
With the postcode in A1, a mini-CSV file is created containing four fields that can each be accessed via the CONTINUE function that I’ve recently lamented: HTTP status code, accuracy level, latitude and longitude.
The trouble is, you’re limited to using a maximum of 50 IMPORTDATA functions in any one spreadsheet. So if I wanted to create a spreadsheet containing the postcodes of the 500+ Starbucks in the UK to drive a map, I’d have to find the latitudes and longitudes in batches of 50, pasting their values over their formulae to create static data. Hardly ideal.
Does anyone know of a way of surmounting this problem? If so, I’d love to know about it. I’d love to be able to generate maps showing things in common, but having a limit of 50 or having to use the ridiculous manual workaround described makes it not worth the bother.
Twitter to Google Docs: complete
Having done some more playing with Google Spreadsheets, I’ve got something close to a finished product: Twitter integrated with Google Spreadsheets.
Permissions in Google Spreadsheets are global. People can either view or edit the spreadsheet. You can’t specify which bits they can and can’t touch. So I made the tough decision to split the friends sheet from the feed sheet. If I’d kept them together, then by allowing people to add new friends, they would also have been able to tinker with (screw up) all the logic behind the feed.
So you add your friends here; and view your feed here. There is a small (20 seconds or so) latency between adding your friends and their tweets appearing in the feeds sheet. This is because the friends sheet publishes to a web page, from which the feed scrapes the content. Doesn’t happen in perfect realtime, for whatever reason. The good thing is that you can’t screw things up by deleting and inserting rows. Up to 20 feeds can be added.
The CONTINUE formula behaviour for IMPORTFEED and IMPORTRANGE has made this much harder than it should have been—such a stupid idea, Google. Thanks to Francis for discovering the friendly versions of the RSS URLs.
Anyway, enjoy playing. I did.
Google Docs does Twitter
I’ve just been futzing around with Google Documents (namely Google Spreadsheets in this instance) to figure out how far one can go in creating Twitter in a Google spreadsheet. While intellectually enjoyable for me, it would also provide a way of viewing Twitter updates in places where Twitter was banned.
The investigation came courtesy of a tweet from @davebriggs and a reply from @lesteph, each in response to Elliott Kember’s Spreadtweet.
So far, I’ve only managed to replicate the publicly-available feed of a single person. For want of a better subject, I chose @stephenfry as my proverbial guinea pig. And here’s the result.
It’s simply a list of the 20 most recent tweets from the subject of choice. Put the feed URL in C1 and the 20 tweets automatically appear. The page is best viewed in list view as per the link, but go back to spreadsheet view to see the logic, if that floats your boat.
The annoying thing is that unlike your standard, public-facing URLs (e.g. http://twitter.com/stephenfry), the RSS feed URLs are rubbish (in this case http://twitter.com/statuses/user_timeline/15439395.rss). So there’s no way that I’m aware of, apart from navigating via Twitter, of predicting a feed’s URL from the user’s name.
Also, the feed doesn’t seem to update live as new tweets come through. I’ll find out in the morning whether it’s static (and therefore nigh-on useless), or whether it is merely a little behind in picking up users’ more recent tweets.
Next step, we’ll try to aggregate a bunch of users’ tweets into a single feed to replicate what might be a user’s Twitter homepage.
Update: here’s a slightly hacked feed of four Twitterers’ last 20 posts. The way in which Google’s CONTINUE formula works sucks.The way in which an undefined number of cells relies on the value of the IMPORTFEED function makes artistic use of it difficult to say the least.
Update: the feeds don’t replicate live in Google Spreadsheets, but they do update. (Refreshing the spreadsheet at 8.30am gave me tweets up to 6.30am.) This is gonna be big! Next steps: figure out how to automatically get the RSS feed address from a twitter name; and scale to n names rather than the four currently catered for. FTW!
Britain’s Got Excel Talent
So, when will this show be airing? I’m envisioning Ant and Dec hosting a programme where spreadsheet-jockeys good and bad show off their relative Excel talents in front of a hysterical, screaming audience.
Cowell will be disparaging over sloppy VBA, Holden will focus more on spreadsheet aesthetics, while Morgan will be a stalwart for formula efficiency.
Whaddya think?
Redundant SUM
Why do people use SUM when they mean nothing of the sort. I received a spreadsheet recently that contained the following formula.
=SUM(C7-C6)/(C6/100)
I think way back when, some highly-read source must have mistakenly advised people that to create a basic arithmetic formula, you needed the SUM, irrespective of the calculation being performed.
Oh, and the 100 is to convert what would be a decimal into a number between 0 and 100 which looks more like a percentage. Schoolboy error.
Pivot Tables: data or objects?
This morning, I read Joel Spolsky’s article How to be a program manager. Therein lies the following quote:
"One of the most monumental debates I remember from the Excel 5 project was between a developer who wanted pivot tables to float on the drawing layer above the spreadsheet, and the program manager, who insisted that pivot tables live right in the cells on the spreadsheet. This debate went on for a really, really long time, and eventually, the program manager prevailed, but the final design came out much much better than any one individual’s design would have been."
I’ve always struggled with the fact that Pivot Tables sit within the Excel cells. In all honesty, it’s where they belong because they are made of data, each piece an element in its own right. But their variable size, and more importantly their ability to change size and shape unpredictably at the drop of a hat/drag of a mouse means that they don’t play well with the things around them.
There’s an argument for Pivot Tables to be cell-based, but for them not to sit on sheets with other non-Pivot Table objects. Multiple Pivot Tables could sit on a single sheet, as they could be positioned relative to other ones on the sheet (Pivot Table Y is to the right of Pivot Table X).
I’ve probably thought about this more than your average bod.
Conditional dropdowns. Lovely!
I read this morning Dany Hoter’s article on the Microsoft Excel blog on how to create contextual data validation.
I found the article difficult to follow, even with the in-spreadsheet instructions. So I thought I’d have a go at describing this wonderfully useful tool. I’ve also removed an unnecessary step that confused things hugely for me. Maybe this is no better. Hope not.
First of all, what does it do? Basically, it allows you to create contextual dropdowns, something I’ve wondered about often. Sad, huh? For example, your first dropdown can give the user the choice of continents, the latter giving a choice of countries within the chosen continent. The latter dropdown will only give a choice of countries within the chosen continent. Nice.
To help with the description below, the overarching groups will be called, er, groups; the items therein will be called values.
The first step is to create an area in which the countries within each continent are listed. Create columns headed with the group titles, or continents (unfortunately, they can’t contain spaces, so either keep them at one word or separate words with undescores). And below each group title list the items within that group, or the countries.
Now create a bunch of named ranges, one for each of the groups, but excluding the group titles. And create another named range called Continents containing just the column headings.
In Excel 2007, this can be done easily throughout the Create From Selection option in the Defined Names area of the Formulas ribbon. But if your sub-options vary in number (as would be the case with countries within continents), it’s best to create the ranges manually, otherwise you’ll have empty cells in some of the named ranges.
So now you have lists of countries for Asia, Europe, North America etc. with named ranges for each called Asia, Europe, North_America etc. And you have another named range called Continents containing a list of the values Asia, Europe, North America etc.
That’s the prep. work done. Now you need to create two cells: one in which people will select the continent, and one in which they will select the country. Let’s assume these are A1 and B1 respectively.
For A1, hit Data Validation, select List and type "=Continents" in the Source field. This will allow the user to select a continent from the list.
Now for the clever bit. Ready?
In cell B1, use Data Validation again, choose List and put "=INDIRECT(A1)" as the Source. The indirect function merely takes the value of A1 and uses it as text that can be in turn interpreted by Excel. Given that A1 can only contain the name of a continent and each continent has its own named range, the values allowed in B1 are defined by the named range associated with the continent selected in A1. And the two cells can be copied down the columns to make as many contextual dropdowns as you like.
It’s beautiful. It really is. Thanks, Dany.
Like watching format paint dry
Why on earth does Excel 2007 take so long to react to CTRL+1 (format cells)? My stopwatch just recorded 13.26 seconds before the dialogue box popped up.
Anyone?
Hiding errors where data doesn’t appear
Imagine a bunch of formulae copied down columns B, C, D etc. that are all driven by the data appearing in column A. An example might be a staff reference number in column A, with associated details (forename, surname, gender etc.) populated through formulae in the subsequent columns.
The trouble is, I don’t know how many rows will contain data—that is the decision of the operator of the spreadsheet. So I usually copy the formulae down beyond the point I think they will ever be used.
In the past, I’ve always added logic in the formulae in columns B and beyond to get rid of the horrible N/A# errors that appear when you try to look up a blank value, usually through one of the following
=if(ISERROR([INSERT LOGIC HERE]),”",[INSERT LOGIC HERE])
=if(A1=”",”",[INSERT LOGIC HERE])
The former checks whether the logic will return an error, in which case it will blank the target cell. The latter checks for the presence of data in column A.
Thinking about the problem today, I figured it probably makes more sense, both from a presentation and a memory perspective, to cater for this circumstance using conditional formatting as follows:
=isblank($A1)
The conditional format for the columns to the right of column A would simply contain ;;; as the custom number format, which hides the cells’ contents.
(In the course of writing this post, I remembered that the conditional formatting of number formats was only introduced in Excel 2007, so this tip is only applicable in the latest version, I’m afraid.)
Paste Special? Not that special
If you have two instances of Excel open, you can’t Copy | Paste Special between the two. Instead, you have to create a new spreadsheet in the destination instance, paste into that, and then Copy | Paste Special from that one to the destination.
How shit is that?
Signs of appreciation
"You f*cking c*nt. It’s like knowing someone with a superpower. […] I am actually genuinely grateful you small piece of stubborn Yorkshire granite."
Another satisfied spreadsheet customer.
Breaking rank
I was asked yesterday to enhance a spreadsheet to give a table of the top five-ranked items based on number of respondents to a question. With 115 possible respondents, there was a chance that there would be ties for some of the places, and I wanted the spreadsheet to override such ties with a random ordering. So if the most popular three question each attracted 85 responses, order them randomly into slots one, two and three nonetheless.
Excel’s RANK function ranks all of these equal first. To get around this, I added a random, miniscule amount (=RAND()*0.00001) to the total number of responses and used this revised figure to determine the ranking. After creating the random numbers, I copied and pasted values over them to make sure the ordering didn’t keep jumping with every action.
Not pure, but a creative way to solve the business problem, if I may say so.
Spreadsheets @ 20
I’m currently celebrating 20 years in the spreadsheet business—that’s 57% of my life, according to Excel. Hooray to my Dad for using child labour (me) to help him with his insurance calcuations in Lotus 1-2-3 on his Toshiba T2000 laptop back in 1988.
Google Spreadsheets to Google Maps: soup to nuts
Last week, I described how I’d used a Google API to allow a Google Spreadsheet to drive items appearing on a Google Map. I was proud of the work. Rob was less impressed, instead focusing on the fact that the latitude and longitude had to be looked up and input manually based on the postcode or town being mapped. He’s never impressed, that boy.
In a comment on the original post, Mercedes Car Finder (a person, it seems) suggested a Google formula, as follows:
=ImportData("http://maps.google.co.uk/maps/geo?output=csv&q="&A1)
where A1 contains the item to be looked up—the postcode or placename. Let’s assume the formula is entered into B1.
The CSV file resulting from the URL contains four fields. For "York", the first two numbers are 200 and 4. (I have no idea what these are. Anyone?) The last two (53.957702 and -1.082286) are the latitude and longitude respectively.
So the ImportData function brings back all four values, but only stores the first of these (200) in the cell containing the formula, in this case B1. The subsequent three values are accessed through the formulae
- =CONTINUE(B2,1,2)
- =CONTINUE(B2,1,3)
- =CONTINUE(B2,1,4)
Given that the CSV file returned has a single row of data, these formulae bring back the mysterious 4, 53.957702 and -1.082286 respectively.
Unfortunately, Google doesn’t allow you to create latitude in one step through the following syntax:
- =CONTINUE(ImportData("http://maps.google.co.uk/maps/geo?output=csv&q="&A2),1,3)
But nonetheless, if I have a column of postcodes or placenames, all I need is three columns of formulae (the ImportData formula containing the 200, the CONTINUE (3) formula containing the latitude and the CONTINUE (4) formula containing the longitude) to feed the map.
Fabulous. Thanks, Mercedes-selling-person.