So I did what all statisticians would do - go looking for data. Tweets can be geo-coded, but getting reliable info might be hard. I checked the USGS earthquake site, but it's not real-time. But then I found the British Geological Society earthquakes page, with true real-time reports from a bunch of sensors around the country. Under the Monitoring menu there's the Real-Time Data option.
Here you can choose the station and a day and get a seismogram. Here's what the one in Keswick (KESW) was showing:
BIG spike. It went off the scale of the plot. I looked at a few other plots and discovered most of them had recorded it too. And I noticed the quake had hit at slightly different times. Maybe I could use that information to work out the location of the quake.
At this point the BGS site was getting slow as worried Cumbrians checked to see if it was an earthquake or if the local nuclear power station had exploded. Anyway, it was bedtime, so I left it to the morning, as something to do in my lunch hour.
I couldn't find a way to get the raw data, so I decided I'd get what I could from the images. A quick script pulled all the GIF images from the sites, except for a couple that seemed to be off-line. Another script pulled all the station information book HTML files and from there I got the locations.
I then stepped through each of the 24 images, identifying the point near 11pm where the tremors started, and entered the data.
Now I had the location and time of the points where the quake had been sensed. This was all converted to OSGB coordinates using rgdal in R.
I didn't have a zero-time for the quake or a location. A quick plot did show the quake had to be somewhere in Cumbria.
I don't know how professional seismologists work out earthquake epicentres, so I came up with something quick and dirty.
Assume the velocity of the quake waves are constant across the UK. Then if the quake was at location (x,y) then a plot of distance against time should be a straight line, with all the points on it. Some quick experiments showed I could fit a straight line then use the sum of the residuals squared as a measure to be minimised over (x,y). A few lines of code later and I had a coordinate!
> fit$par [1] 307499.9 506035.7
By now the quake had hit the news and the USGS web site was showing it:
How did that compare to my point? I converted it to OSGB coordinates:
> coordinates(usgsos) coords.x1 coords.x2 [1,] 328994.6 500054.7
20km further east, and 6km south. Not bad considering I had to eyeball the time from the seismograms and various other assumptions. I don't know what kind of uncertainty the USGS put on their predictions, and I can't put an uncertainty on my location either since there's not a proper statistical model behind this. I can give you the data if you have a better idea - or you can scrape it again from the web site! Here's a comparison of the epicentre predictions on a map:
Here's what the distance/time plot looks like for my predicted epicentre:
And from the slope of this we can get the speed of travel of the quake waves - about 7,500 km/s - which is about what a quick google search on earthquakes tells me.
Footnote: I think on the bike home tonight I worked out a way to do this statistically. Suppose the earthquake is at (x,y), at time t0, and the waves move with speed v. Then suppose the time of arrival of the quake at a point at distance d is Normally distributed with mean time t0+d/v and variance k(d;theta), where k() is some increasing function of d. This allows us to consider the increasing noise in the system for points further away. Fit the whole thing by maximum likelihood and we get estimates (with errors) for location, time of origin, and wave speed. Something for next lunchtime...
There aren't any nuclear power stations in Cumbria – the nearest is the one at Heysham. That said, I'm sure most Cumbrians share the misconception that there is a power station at Sellafield.
ReplyDelete