Kimberley Kavanagh, Chris Robertson, Heather Murdoch, George Crooks, Jim McMenamin, Syndromic Surveillance of Influenza-Like Illness in Scotland During the Influenza A H1N1v Pandemic and Beyond, Journal of the Royal Statistical Society Series A: Statistics in Society, Volume 175, Issue 4, October 2012, Pages 939–958, https://doi.org/10.1111/j.1467-985X.2012.01025.x
Summary
Syndromic surveillance refers to the rapid monitoring of syndromic data to highlight and follow outbreaks of infectious diseases, increasing situational awareness. Such systems are based on statistical models to describe routinely collected health data. We describe a working exception reporting system that is currently used in Scotland to monitor calls received by the National Health Service telephone helpline NHS24. We demonstrate the utility of the system to describe the time series data from NHS24 both at an aggregated Scotland level and at the individual health board level for two case-studies: firstly during the initial phase of the 2009 influenza A H1N1v outbreak and secondly for the emergence of seasonal influenza in each winter season from 2006–2007 to 2010–2011. In particular, we focus on a localized cluster of infection in the Highland health board and the ability of the system to highlight this outbreak. Caveats of the system, including the effect of media reporting of the pandemic on the results and the associated statistical issues, are discussed. We discuss the adaptability and timeliness of the system and how this continues to form part of a suite of surveillance used to give early warnings to public health decision makers.
1 Introduction
As part of preparing for a possible outbreak of pandemic influenza Health Protection Scotland developed a syndromic surveillance system to monitor and detect changes in levels of influenza-like illness presenting in the community. A component of this system used data from all calls made to the National Health Service (NHS) helpline NHS24. With the emergence of pandemic influenza A (H1N1v) in April 2009 this system was developed further to provide realtime daily surveillance of NHS24 calls, tailored to the needs of the specific outbreak.
NHS24 is a nurse-led telephone helpline that is the means of access to out-of-hours general practice services for the Scottish population, approximately 5 million people (General Register Office for Scotland, 2008), which began operating as a special health board in 2002 (Scottish Government, 2002). NHS24 provides a similar service to NHS Direct which operates in England and Wales. Each weekday NHS24 receives around 2500–3000 telephone calls and on weekends and public holidays this rises to around 6000–7000 calls per day. Peak levels of around 10000–14000 calls per day have been known to occur during the Easter and Christmas holiday periods when general practitioner (GP) practices are closed.
The data that are provided from NHS24 facilitate syndromic surveillance in Scotland through an exception reporting system (ERS) that is monitored at Health Protection Scotland. This system gives potential for realtime monitoring of data and by generation of ‘alarms’ complements the suite of data that is available for situational awareness, namely attendance at GP practices, admissions to hospital, virological testing and deaths.
A range of factors influence the generation of alarms, including the size of an outbreak and the population affected by the type of data being monitored. Syndromic surveillance systems have become more commonplace following their inception, mainly in the USA, as a (bio)terrorism surveillance tool based on emergency department admission records (Heffernan et al., 2004; Fleischauer et al., 2004). The validation of syndromic surveillance for respiratory pathogen spread has also been demonstrated by using multiple data registries (van den Wijngaard et al., 2008). Elliot (2009) highlighted that
‘syndromic surveillance takes precedence as the primary means of estimating community burden of pandemic influenza infections’,
and syndromic indicators were used to monitor influenza activity in England and Wales during the pandemic (Smith et al., 2011).
A Scottish multidata source syndromic surveillance system, partly utilizing NHS24 data, was implemented by Health Protection Scotland before the summit of G8 countries in Gleneagles in July 2005 (Meyer et al., 2008). These data were processed through a number of parallel ERSs and reports were generated daily for 10 syndromes that were considered relevant to possible bioterrorism or disease outbreaks. Following the summit of G8 countries the NHS24 stream of this multidata source ERS has been updated weekly. In the original version the syndromes were derived from data which were precategorized into syndromes by approximately 200 clinical algorithms at NHS24—known as protocolled data. Owing to operational changes at NHS24, the proportion of calls processed by the clinical algorithms at NHS24 fell to around 60–70% of calls, reaching lows of around 50% during busy periods with the remainder of the calls recorded in free text. This change provided the impetus to create a free-text searching algorithm which classifies all calls received by NHS24 into one of the 10 monitored syndromes and an ‘other’ category, therefore allowing all calls to be analysed by the ERS. This development is particularly useful for periods when call numbers may reach peak levels, e.g. in the event of mass illness such as pandemic influenza.
In this paper we describe the development of the ERS to analyse all calls that were received by NHS24, in particular those associated with influenza-like illness. We present the results that were produced by the system at the beginning of the 2009 influenza A H1N1v pandemic outbreak in the UK, the index cases of which occurred in a couple returning home to Falkirk, Scotland, after a holiday in Mexico (Scottish Government, 2009a). From the date of the index case to July 1st, 2009, the UK pandemic was in an initial phase, whereby each possible case was tested virologically and all confirmed cases and close contacts were offered antivirals (Health Protection Agency and Health Protection Scotland, 2009). This unique situation allows informal validation of the output of the ERS against confirmed clusters of cases. In addition, we describe the flexibility and timeliness of the system and its usefulness to decision makers during the outbreak.
We also illustrate how the system can be used to identify the start of the influenza season (outside pandemic spread) in advance of alternative monitoring systems giving public health consultants and epidemiologists advance warning of possible clusters.
2 Methods
2.1 Data
The data that are provided to Health Protection Scotland by NHS24 include the date and time of call, gender, age and postcode district for each caller to the service. The data also record the reason for the call stated by the individual along with the protocol title(s) if the call is processed through the clinical algorithm process. All calls have a free-text entry whereas the proportion of calls which have a protocol syndromic category has ranged from 50% to over 90% over the period that the NHS24 ERS has been running.
By using postcode district as a proxy for the caller’s location, calls are classified as originating from one of the 14 Scottish health boards, which are shown in Fig. 1, allowing for surveillance at both the national and the health board level, the latter being of particular importance for highlighting localized clusters of infection. Only the 11 mainland health boards are routinely monitored as the populations in the island health boards are small and the small call numbers originating from them make prediction unfeasible.

Fig. 1. Illustration of the geographical location of the 11 mainland (and three island) health boards in Scotland (source: Health Protection Scotland)
Before the influenza A (H1N1v) pandemic, data were received weekly; in response to the outbreak, however, data receipt, processing and reporting of the findings occurred daily, with results for all calls received by NHS24 on the previous day available by midday on the day of analysis.
2.2 Syndromic classification
For the NHS24 data, there are 10 syndromic surveillance categories monitored as standard by the ERS for both the protocol (where present) and call reason data streams. These syndromes are based on groupings of symptoms and are colds and flu, coughs, fever, difficulty breathing, diarrhoea, vomiting, double vision, eye problems, lumps and rash. Four of these syndromes—colds and flu, fever, coughs and difficulty breathing—are specifically linked to influenza-like illness.
The precategorized protocol data are mapped to one of the 10 monitored syndromic surveillance categories or a generic other category. To enable classification of all calls in the call reason data stream into corresponding categories a free-text searching algorithm was devised to interrogate the call reason field for each call record. Each call is classified, as outlined in Table 1, by searching for words associated with the syndromes, excluding certain combinations of words which may convey a different meaning and accounting for common misspellings. For example the word ‘cold’ may be interpreted in a different way dependent on the words it appears with. Table 2 shows two calls to NHS24 and their eventual classification by the call algorithm. In the first, the call does not fall into any of the recorded syndromes and is therefore classified as other. For the second entry the keyword cold leads to a colds and flu categorization with the word ‘breathless’ leading to the call also being classified as difficulty breathing. In this way each call is categorized as one or more of the 10 monitored syndromes or other by the rule set.
Table 1. Rule set for classification of the 10 monitored syndromes from the free-text call reason data entry
| Syndrome | Classification rule |
|---|---|
| Colds and flu | Extract keyword ‘cold’ |
| | Ignore cases with ‘cold’ and any of the following: ‘sore’, ‘hot’, ‘hand’, ‘leg’, ‘shivery’, ‘clammy’, ‘sweat’ |
| | Extract keyword ‘flu’ |
| | Ignore cases where ‘flu’ is part of a word i.e. ‘reflux’, ‘fluid’, ‘flushed’ |
| Fever | Extract keyword ‘fever’ |
| | Extract keyword ‘temp’ |
| | Exclude keyword ‘attempt’ |
| Cough | Extract keyword ‘cough’ |
| Difficulty breathing | Extract keywords ‘breath’, ‘wheez’ |
| | Ignore any of those with ‘asthma attack’ |
| Eye problems | Extract keywords ‘eye’ and ‘dis’ (discomfort, discharge) |
| | Extract keywords ‘eye’ and ‘red’ |
| | Extract keyword ‘visu’ (visual disturbance) |
| Diarrhoea | Extract keyword ‘diar’ or ‘dior’ |
| Lumps | Extract keyword ‘lump’ and ‘neck’ or ‘under arm’ or ‘fac’ (face/facial) or ‘groin’ |
| | Extract keywords ‘swollen’ and ‘glands’ |
| Rash | Extract keyword ‘rash’ |
| | Extract keyword ‘spot’ |
| | Exclude ‘nappy’ and ‘rash’ |
| | Extract ‘chicken’ and ‘po’ (rash may be called chicken pox by caller) |
| Double vision | Extract keywords ‘double’ and ‘vision’ |
| Vomiting | Extract keywords ‘vom’ or ‘sick’ or ‘nausea’ |
| | Ignore those with any keyword and ‘blood’ or ‘pregnant’ |
Table 2. Example data with recognition of the word ‘cold’ in the call reason field and illustration of the resulting classification
| Gender | Date | Postcode district | Age | Call reason | Classification |
|---|---|---|---|---|---|
| Male | 05/31/2007 | KA19 | 83 | COLD SWEAT, CLAMMY 20 MINS | Other |
| Male | 05/31/2007 | AB25 | 41 | BREATHLESS, HEAVY COLD, SORE THROAT | Colds and flu; Difficulty breathing |
The flexibility of the method, by which the free-text searching algorithm interrogates the call reason field of the data, allowed a new pseudosyndrome to be created which identified those calls mentioning the key word ‘swine’. This term was of importance as influenza A H1N1v was most commonly reported in the media as ‘swine flu’, at the beginning of the pandemic.
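The keyword rules of Table 1 and the ‘swine’ pseudosyndrome can be illustrated with a minimal Python sketch. This is not the system's own code (which is written in R); the function names `classify` and `mentions_swine` are hypothetical, only the four influenza-related syndromes are covered, and the sketch assumes that an exclusion word cancels ‘cold’ only when both appear in the same comma-delimited phrase, an assumption chosen because it reproduces the two classifications in Table 2.

```python
import re

# Keyword rules transcribed from Table 1 for the four influenza-related
# syndromes only; the other six syndromes are omitted from this sketch.
COLD_EXCLUSIONS = ("sore", "hot", "hand", "leg", "shivery", "clammy", "sweat")
FLU_LOOKALIKES = ("reflux", "fluid", "flushed")


def classify(call_reason):
    """Return the list of matched syndromes, or ["Other"] if none match."""
    text = call_reason.lower()
    # Assumption: an exclusion word only cancels 'cold' when both occur in
    # the same comma-delimited phrase (e.g. 'cold sweat').
    phrases = [p.strip() for p in re.split(r"[,;]", text) if p.strip()]

    syndromes = []
    cold = any(
        "cold" in p and not any(w in p for w in COLD_EXCLUSIONS)
        for p in phrases
    )
    # Blank out words in which 'flu' is only part of a longer word.
    flu_text = text
    for word in FLU_LOOKALIKES:
        flu_text = flu_text.replace(word, " ")
    if cold or "flu" in flu_text:
        syndromes.append("Colds and flu")
    # 'temp' counts towards fever unless it comes from 'attempt'.
    if "fever" in text or "temp" in text.replace("attempt", " "):
        syndromes.append("Fever")
    if "cough" in text:
        syndromes.append("Cough")
    if ("breath" in text or "wheez" in text) and "asthma attack" not in text:
        syndromes.append("Difficulty breathing")
    return syndromes or ["Other"]


def mentions_swine(call_reason):
    """Flag for the 'swine' pseudosyndrome: any mention of the key word."""
    return "swine" in call_reason.lower()
```

Because a call can match several rules, the function returns a list rather than a single label, mirroring the second row of Table 2. Simple substring matching of this kind also tolerates truncated entries such as ‘wheez’.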
2.3 Prediction model
In most ERSs some of the historical data are used to predict one or more future values. The observed future values are then compared with the predictions and if the observed value is sufficiently greater than the predicted value an exception is highlighted. This exception will then trigger an action by the epidemiologist or public health scientist to investigate. Reviews of ERSs have been given by Sonesson and Bock (2003), Farrington and Andrews (2004), Burkom et al. (2005), Buckeridge et al. (2005) and Shmueli and Burkom (2010).
The prediction model of the NHS24 ERS is based on the overdispersed Poisson linear regression approach of Farrington et al. (1996). This model was created to process weekly micro-organism counts collated by the Communicable Disease Surveillance Centre in England and Wales and to highlight possible disease outbreaks. This model made use of the trend in the counts in historical data spanning a 7-week period (centred on the current week of interest) over the previous 5 years. Such an approach aimed to produce a routine automated surveillance system which is timely, with good sensitivity and specificity and has readily interpretable outputs.
Examining the time series of total calls received by NHS24 over a typical non-pandemic period in Fig. 2, it is clear that there is periodicity in the data, with steady call numbers Monday–Friday and increases at the weekends when NHS24 is more heavily used as an out-of-hours service. Call numbers are also increased at public holidays, such as the Easter and May Day holidays illustrated. Adapting the model of Farrington et al. (1996) to include separate effects for weekdays, weekends and public holidays allows for the development of a statistical model to predict the number of calls.

Fig. 2. Time series of NHS24 calls in the period March 1st–May 31st, 2008, illustrating the typical periodicity in the call numbers with call numbers of around 2000–3000 calls on weekdays and 6000–7000 calls on weekends: the effects of public holidays at Easter and May Day can also be seen with call numbers increased further
At the development of the system (Robertson, 2006; Meyer et al., 2008) there were limited historical data—less than 6 months—to develop a more detailed prediction model in time for the G8 conference and the use of 28 days of data was a pragmatic choice based on guidance from the Centers for Disease Control Working Group (2004). For a given syndrome the count for day i is modelled by using the data from day i−4 to day i−31. The preceding 3 days are excluded to limit the effect of large fluctuations or shocks, possibly attributable to random variation or the initial slow increase of an exception, which may bias the predicted values.
Mathematically for each syndrome, using the moving baseline data the expected count on day i, μi, is modelled as a linear regression on time with separate intercepts for weekdays and weekend days:
where βi is the linear trend over the 28-day baseline and weekday takes integer values from −4 to −31 over the 28-day period, αi is the intercept corresponding to day i, γi is the effect on a Saturday, or Sunday or public holiday and yi is the observed count on day i which is modelled as an overdispersed Poisson distribution, with dispersion parameter
with the weight and
where ri are the standardized Anscombe residuals when φ = 1 (Farrington et al., 1996).
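One way to write the model in the symbols defined above is sketched below. It assumes the log link and the weighting scheme of Farrington et al. (1996); the parameterization used in the NHS24 ERS may differ. The weekend indicator, the constant c, the baseline length n = 28, the number of regression parameters p = 3 and the hat-matrix diagonal h_jj are notation introduced here for illustration, with j running over the baseline days.

```latex
\begin{align}
\log \mu_i &= \alpha_i + \beta_i\,\mathrm{weekday} + \gamma_i\,\mathrm{weekend},
  \qquad \mathrm{E}(y_i) = \mu_i, \quad \mathrm{var}(y_i) = \phi\,\mu_i, \\
\hat{\phi} &= \max\Bigl\{ \frac{1}{n-p} \sum_{j} w_j\,
  \frac{(y_j - \hat{\mu}_j)^2}{\hat{\mu}_j},\; 1 \Bigr\}, \\
w_j &= \begin{cases} c\,r_j^{-2} & \text{if } r_j > 1, \\ c & \text{otherwise,} \end{cases}
  \qquad \sum_j w_j = n, \\
r_j &= \frac{3}{2}\,
  \frac{y_j^{2/3} - \hat{\mu}_j^{2/3}}{\hat{\mu}_j^{1/6}\,(1 - h_{jj})^{1/2}} .
\end{align}
```

Under this form, baseline days with large positive residuals receive weights proportional to the inverse of the squared residual, so that a past outbreak within the baseline inflates the prediction less than it otherwise would.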
For each syndrome, given the observed number of calls from the data and the expected number of calls from the prediction model and taking into account overdispersion to calculate an upper 99% prediction limit, the exceedance score is calculated according to the method of Farrington et al. (1996). Exceedance scores greater than 1 trigger an alarm marker indicating that the level of calls observed for that syndrome on that day exceeds what would be expected 99% of the time.
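The calculation can be sketched in Python with NumPy as follows. This is an illustrative sketch rather than the ERS code: it assumes a log-link quasi-Poisson fit, the upper limit and exceedance score formulae of Farrington et al. (1996), a one-sided normal quantile of 2.326 for the 99% limit, and it omits the residual-based reweighting step for brevity. The function names are hypothetical.

```python
import numpy as np


def fit_quasi_poisson(X, y, n_iter=50):
    """Log-linear Poisson regression by IRLS, with dispersion floored at 1."""
    beta = np.linalg.lstsq(X, np.log(y + 0.5), rcond=None)[0]
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu          # working response
        XtW = X.T * mu                   # IRLS weights equal mu for a log link
        beta = np.linalg.solve(XtW @ X, XtW @ z)
    mu = np.exp(X @ beta)
    n, p = X.shape
    phi = max(1.0, float(np.sum((y - mu) ** 2 / mu) / (n - p)))
    cov = phi * np.linalg.inv((X.T * mu) @ X)
    return beta, phi, cov


def exceedance_score(y_base, weekend_base, y_obs, weekend_obs, z=2.326):
    """Score for day i from the 28 baseline days i-31 .. i-4 (in that order).

    weekend_base flags Saturdays, Sundays and public holidays in the baseline;
    z = 2.326 is an assumed one-sided 99% normal quantile.
    """
    t = np.arange(-31, -3)               # time index -31 .. -4
    X = np.column_stack([np.ones(28), t, np.asarray(weekend_base, float)])
    beta, phi, cov = fit_quasi_poisson(X, np.asarray(y_base, float))
    x0 = np.array([1.0, 0.0, float(weekend_obs)])   # day i sits at t = 0
    mu0 = float(np.exp(x0 @ beta))
    var_mu0 = mu0 ** 2 * float(x0 @ cov @ x0)       # delta method
    # Upper limit on the 2/3-power scale, as in Farrington et al. (1996).
    upper = mu0 * (1 + (2 / 3) * z * np.sqrt(phi * mu0 + var_mu0) / mu0) ** 1.5
    return (y_obs - mu0) / (upper - mu0), mu0, upper
```

A score above 1 means that the observed count lies above the upper prediction limit, which is the condition under which an alarm marker would be raised.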
If a particularly large outbreak occurs, leading to increased call counts, then the increased counts become part of the data used to model the trend, albeit downweighted according to the inverse of the squared residuals. If the downweighting is not sufficient this can lead to a lag period when the model is unusable. This can be countered by simple adaptations to the baseline data that are used for predictions, and potential ways to do so, including the use of historical data which have accumulated in the years since model development, are considered. The first uses historical data from the 28 days centred on day i in each of the previous 2 years; this model is entitled ‘two years historic’. An adaptation of this uses the same historical data but also uses data from days i−4 to i−31 to include the recent trend, and is entitled ‘two years historic plus 28 days’.
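The three baseline choices can be sketched as sets of day indices. The helper names are hypothetical, and two details are assumptions rather than statements from the text: a year is taken as 365 days, and a 28-day window ‘centred’ on a day is taken to run from 14 days before it to 13 days after it.

```python
def recent_baseline(i):
    """Days i-31 .. i-4: 28 days, skipping the 3 days immediately before day i."""
    return list(range(i - 31, i - 3))


def historic_baseline(i, years=2, half_width=14, year_length=365):
    """28-day windows centred on day i in each of the previous `years` years.

    Assumptions: a year is `year_length` days and each window covers
    centre-14 .. centre+13.
    """
    days = []
    for k in range(years, 0, -1):        # oldest year first
        centre = i - k * year_length
        days.extend(range(centre - half_width, centre + half_width))
    return days


def historic_plus_recent(i):
    """'Two years historic plus 28 days': historical windows plus recent trend."""
    return historic_baseline(i) + recent_baseline(i)
```

Because the historical windows lie a year or more before day i, a current outbreak cannot contaminate them, which is the motivation for using them when the recent 28 days contain an exception.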
More complex prediction models have been investigated: Box–Jenkins methods (Box and Jenkins, 1976); incorporating a temporal trend through Holt–Winters smoothing (Holt, 2004; Winters, 1960); incorporating spatial trends in adjacent health boards. There was a marginal improvement in the accuracy of prediction by including a Holt–Winters trend in the prediction model but this was not so great as to invalidate the simple model that was described above (Wagner, 2010).
2.4 Data reporting
At each daily update, summary plots showing the time series and alarms that were generated for each health board–syndrome combination are produced along with a detailed list of the alarms triggered. The system is coded in R version 2.12.2 (R Development Core Team, 2011) and is fully automated to produce all outputs. These results are then disseminated to the relevant public health consultants and epidemiologists at Health Protection Scotland. From this information during the pandemic, daily and weekly situation reports were produced (Health Protection Scotland, 2009a, b). These were distributed to health boards and to the Scottish Government with the NHS24 system providing a component of a suite of surveillance which aided the understanding of the current infection burden of influenza within the community.
Since the pandemic, daily extracts of data have continued to be received. The system is run weekly and is used in the weekly published influenza update reports (Health Protection Scotland, 2011a). During peak influenza activity and during times of public health importance, such as the papal visit to Scotland in September 2010 and the Icelandic volcano eruption in spring 2010 (Elliot et al., 2010), the system is run daily and the results are analysed by consultants at Health Protection Scotland.
3 Results
3.1 Influenza A H1N1v pandemic
The time trend in the number of calls classified as the colds and flu syndrome by the free-text search algorithm over the initial phase of the influenza pandemic is shown in Fig. 3. Three distinct sets of concurrent alarms, marked by the black dots where the observed number of calls exceeds the upper limit predicted by the model, can clearly be seen in the observed colds and flu time trend at the Scotland level. The dates of the first two sets of these concurrent alarms correspond to four distinct events identified during the containment phase: the reporting of the UK index case in Falkirk in NHS Forth Valley and three clusters of infection, one in Dunoon in NHS Highland and outbreaks in Greenock and in Glasgow, both in NHS Greater Glasgow and Clyde.

Fig. 3. Cold and flu syndrome calls at (a) a national level and in (b) NHS Forth Valley, (c) NHS Highland and (d) NHS Greater Glasgow and Clyde health boards monitored over the period April 1st–July 31st, 2009, following the UK index case at the end of April. The plotted series are the count of colds or flu calls received each day by NHS24 as classified from the call reason data, the expected number predicted from the model, the predicted upper limit with a 1% level of significance, and the ‘alarms’, i.e. points where the observed count exceeds the expected upper limit
Examining the temporal trend in the proportion of calls to NHS24 which mention cold or flu as the reason for the call, as shown in Fig. 4, it is clear that there is a change point in the time series which corresponds to the confirmation of the UK index case of infection on April 27th, 2009. Analysis of the call records on April 27th–30th, when the first four alarms were triggered, shows that of the 849 calls classified as colds and flu 592 callers mentioned swine in their call, indicating a possible association with the media reporting of a swine flu outbreak. Of those who mentioned swine, 64.5% did not go through the call algorithm process, indicating either a medical emergency or that the protocol was not required as only information or advice was given. These callers were less likely to have a protocol diagnosis than those not mentioning swine (64.5% versus 30.4% without a protocol diagnosis; p ⩽ 0.0001), possibly indicating that a high proportion of these calls were of an advisory nature. As levels of infection in the community were low at this point and the majority of the calls mentioned swine, the first group of alarms can be attributed to the effect of the ‘worried well’ responding to media reports and official advice to contact NHS24 if concerned. This signal is evident not just in the Forth Valley health board, where the first cases lived, but also in the other two health boards illustrated (Highland and Greater Glasgow and Clyde) and over all Scotland, emphasizing that this exception is most likely to be due to the ‘worried well’ rather than those who have been in contact with symptomatic individuals.

Fig. 4. Proportion of all calls received by NHS24 which mention colds or flu in the call reason field from October 2008 to April 2010: the first UK case of influenza A H1N1 was confirmed on April 27th, 2009
The second sequence of alarms corresponds to the time of an outbreak in the Dunoon area, which is in the NHS Highland health board, and is illustrated in Fig. 3(c). In this case the alarms are likely to have been triggered by both actual cases and the ‘worried well’ in the area. The source of the outbreak was found to be linked to a football supporters’ bus which travelled to and from a football game on May 24th. Two individuals travelling were confirmed to have been infected with influenza A H1N1v at the time of travel and subsequently infected 16 of the remaining 27 passengers. This event was followed by a rapidly growing number of cases within the town and the surrounding area. Health Protection Scotland received notification of the initial confirmed cases on May 29th (Health Protection Scotland, 2011b). The first exceedances that were found by the system were on May 27th and May 30th for colds and flu. These alarms were triggered before the exclusive news report by Scottish Television (2009) on May 31st which broke the story, before the official press release from the Scottish Government on June 1st (Scottish Government, 2009a) and 7 days before the peak of the call numbers in this area. Furthermore the signal is most evident in the Highland health board. Therefore these exceedances cannot be attributed solely to media influencing the ‘worried well’.
In addition to the alarms that are signalled in NHS Highland, there are also alarms signalled in NHS Greater Glasgow and Clyde, the first of which occurs on May 13th. This large-scale alarm followed three confirmed cases in a family living in Greenock (part of NHS Greater Glasgow and Clyde) which led to the precautionary closure of a nursery on May 12th (Scottish Government, 2009c) and a primary school on May 13th (Scottish Government, 2009d). Of the 109 cold and flu calls received from NHS Greater Glasgow and Clyde on May 13th, analysis of the postal code of call origin indicated that 69.7% (76/109) of these calls were from Greenock postal codes and 31.5% (24/76) of these were for children under 11 years old (primary school age and below). Of the callers from the Greenock area with cold and flu symptoms on May 13th, 93.5% mentioned swine in their reason for the call. Analysing the protocol diagnosis for all 76 callers, 17% had a colds or flu protocol diagnosis, 14% had an influenza-related protocol diagnosis of coughs, fever or difficulty breathing and 63% had no protocol diagnosis. Taken together, this alarm indicates an effect of the ‘worried well’ alongside true infection in the community.
The scale of the alarm signalled on May 13th in NHS Greater Glasgow and Clyde led to the model’s overpredicting colds and flu call numbers in the 2 weeks which followed. During this time there was a cluster of infection in the south of Glasgow, again leading to a precautionary school closure with news releases on May 26th, 27th and 28th (Scottish Government, 2009e, f, g). During this period increases in the number of calls are observed but no alarms are signalled. The lack of alarms is due to both the overpredictions of the model and a smaller effect of the ‘worried well’. On the three days highlighted, 179 cold and flu calls were received from NHS Greater Glasgow and Clyde and only 50% of these made reference to swine, which is less than the proportion that was observed for the Greenock cluster earlier in the pandemic. The proportion with a colds and flu protocol diagnosis was also greater (28.5%) and fewer had no protocol diagnosis (26.8%). This may be indicative of a calming in the effect of the ‘worried well’ as the pandemic progressed.
At the beginning of June a series of alarms was triggered in NHS Greater Glasgow and Clyde owing to a gradual increase in the number of cases, as seen in the temporal maps in Fig. 5, coupled with the model’s overprediction subsiding.

Fig. 5. Spatial spread of laboratory-confirmed cases of influenza A H1N1v during the containment phase (points are proportional to the size of the cluster in that area relative to the final number infected at the end of the containment phase): (a) May 27th, 2009; (b) June 3rd, 2009; (c) June 10th, 2009; (d) July 1st, 2009
At the end of the initial phase, as illustrated in Fig. 5, all mainland health boards had confirmed cases with the majority in NHS Greater Glasgow and Clyde, which is the most populated health board. The final set of alarms, after the initial phase ended, can be seen in the three health boards illustrated and at the national level. This is indicative of the spread of the pandemic over a wide geographical area in July which can be seen in the increasing rates of consultation of GPs in this period (Health Protection Scotland, 2009a).
3.2 Seasonal influenza
The previous section has illustrated that the NHS24 ERS can respond in a timely fashion to outbreaks of H1N1v influenza in local areas. The volume of the calls is greater than the number of people who were symptomatic during the outbreak and so represents a combination of sick and ‘worried well’ calling. In Fig. 6 we illustrate the correspondence between the system and the time trend in GP consultation rates outside a pandemic period.

Fig. 6. (a), (c), (e), (g) NHS24 ERS output and (b), (d), (f), (h) GP consultation rates for influenza seasons (a), (b) 2006–2007, (c), (d) 2007–2008, (e), (f) 2008–2009 and (g), (h) part of 2010–2011 (note the change in date scale). The markers show dates where NHS24 signalled an alarm for the colds and flu syndrome at the national level; the plotted series are the observed values, the expected values and the upper limit
In the winter season 2006–2007 the NHS24 ERS has a few alarms in mid-October and the beginning of November corresponding to the rise in the GP consultation rates for influenza-like illnesses and acute respiratory symptoms (Figs 6(a) and 6(b)). Alarms are also signalled in the period from mid-December to the end of December and this precedes the sharp rise in consultation rates beginning in late December and peaking at the end of the first week in January 2007. In this season there is some temporal evidence that the syndromic surveillance system signals slightly before the effect on the health service.
Season 2007–2008 did not have such a sharp peak in GP consultations as season 2006–2007 but the peak again occurs in the first week of January 2008 with an earlier rise in the rates in late November 2007 (Figs 6(c) and 6(d)). The NHS24 ERS signals during the second week in November and again in the last week of December 2007. Both of these periods precede the rise in consultations by about 1 week.
Figs 6(e) and 6(f) show that the agreement is not so clear in season 2008–2009 where the peak in GP consultations appeared in the week before Christmas 2008. The NHS24 ERS signals four times in November and December, all preceding the increase in consultations in the week before Christmas. There are another two signals in late December and early January before the second peak in the second week of January 2009.
Season 2009–2010 was the pandemic season and did not have the same winter pattern of a peak around Christmas and New Year (the data are not presented). In the season 2010–2011 there were two signals before the first peak in late November 2010 but there were no signals to herald the beginning of the peak in GP consultations which came just before Christmas 2010 (Figs 6(g) and 6(h)). All the signals after this came when GP practices were closed for the Christmas holidays.
The comparison of the signals from the NHS24 ERS on colds and flu calls in relation to the GP consultation rates has demonstrated some consistent agreement with the ERS’s signalling about 1 week before the major rise in GP consultation rates; thus the system provides some early warning. However, this was not always so and in season 2010–2011 the major peak was not signalled through NHS24 colds and flu calls.
3.3 Effect of variation of the 28-day baseline
The results that were presented in previous sections are based on using 28 days of data to make predictions. Fig. 7 summarizes the results generated in the initial stages of the pandemic for the colds and flu syndrome over all Scotland by using different baseline data.

Comparison of NHS24 ERS output for the cold and flu syndrome calls across all Scotland over the initial period of the influenza A H1N1 pandemic, April 1st–July 31st, 2009, using different baseline data to make predictions (plotted series: observed, expected and upper limit): (a) 28 days centred; (b) 28 days centred, outbreak adjusted; (c) 2 years historic; (d) 2 years historic plus 28 days
In the original implementation of Farrington’s model (Farrington et al., 1996) historical data from the same time in previous years were used. Results obtained by using 2 years’ historical data, with a 28-day window centred on the equivalent date in previous years to that being predicted, are shown in Fig. 7(c). If only historical data are used, alarms are consistently generated over the entire period considered because the call volume was considerably higher than in previous years. An adaptation of the baseline, incorporating the previous 28 days with the historical data, is shown in Fig. 7(d). Such an adaptation leads to an upward trend in expected call numbers and, following large changes, to less volatile predictions than using the 28 days’ data alone. This leads to fewer alarms being generated in the period between the initial media reporting of the UK index cases and the first ‘true’ outbreak at the end of May.
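The combined baseline of Fig. 7(d) can be sketched as a selection of dates. The following is a minimal illustration only, not the implementation used at Health Protection Scotland: the function name, the 364-day shift (chosen here to keep the day of the week aligned) and the placement of the window at offsets −14 to +13 are all assumptions.

```python
from datetime import date, timedelta


def historic_baseline_dates(target, years_back=2, half_window=14, recent_days=28):
    """Return the sorted baseline dates used to predict `target` (hypothetical helper)."""
    baseline = set()
    # A 28-day window centred on the equivalent date in each previous year.
    # Assumption: shifting by 52 weeks (364 days) keeps the weekday aligned.
    for k in range(1, years_back + 1):
        centre = target - timedelta(days=364 * k)
        for offset in range(-half_window, half_window):
            baseline.add(centre + timedelta(days=offset))
    # The most recent days immediately before the date being predicted.
    for offset in range(1, recent_days + 1):
        baseline.add(target - timedelta(days=offset))
    return sorted(baseline)


dates = historic_baseline_dates(date(2009, 5, 13))
print(len(dates), dates[0], dates[-1])
```

Setting `recent_days=0` gives the purely historical baseline of Fig. 7(c). Any outbreak periods in the historical windows would have to be removed before fitting.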
The period of overprediction in the original model following the media reporting of the outbreak could also be countered by using a more manual adaptation without the need for historical data. By using a manually coded dichotomous variable to mark whether an outbreak, or in this case an unprecedented high volume of cold and flu calls, is occurring, the baseline 28 days’ data used can be altered to exclude the ‘outbreak’ data and to use the preceding 28 days’ ‘non-outbreak’ data. The results of this implementation with outbreaks defined as occurring between April 27th–May 11th and May 30th–July 31st are shown in Fig. 7(b). In this case, the model has no delay in usability ‘post outbreak’ and predicts a greater upward trend than the model using 2 years’ historical and 28 days’ data.
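The outbreak-adjusted baseline can be illustrated as follows. This is a sketch under stated assumptions rather than the code of the working system: the function name and the representation of the dichotomous variable as a list of Booleans indexed by day are hypothetical.

```python
def outbreak_adjusted_baseline(counts, outbreak, target, window=28):
    """Pick the `window` most recent non-outbreak days before day index `target`.

    counts   -- daily call counts, one per day
    outbreak -- manually coded flags, True where a day is marked as 'outbreak'
    Returns (indices, counts) of the selected baseline days in time order.
    """
    chosen = []
    i = target - 1
    # Walk backwards from the day before the target, skipping flagged days.
    while i >= 0 and len(chosen) < window:
        if not outbreak[i]:
            chosen.append(i)
        i -= 1
    chosen.reverse()
    return chosen, [counts[j] for j in chosen]


# Toy data: 30 ordinary days followed by 10 days flagged as outbreak.
counts = list(range(40))
outbreak = [False] * 30 + [True] * 10
indices, baseline = outbreak_adjusted_baseline(counts, outbreak, target=40)
print(indices[0], indices[-1], len(indices))
```

Because the flagged days never enter the baseline, the expected values are not inflated by the outbreak itself. The cost is the subjectivity of choosing the flags.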
4 Discussion
The NHS24 syndromic surveillance system successfully detected exceedances in syndromes related to influenza A (H1N1v) in advance of media reporting. This system was operating during the initial phase for influenza A (H1N1v) whereby possible cases, who had known contacts with a confirmed case and were exhibiting symptoms of the virus, were given relevant treatment and advice and reported by either their healthcare practitioner and/or NHS24 call operatives to the emergency response team at Health Protection Scotland. In this situation the alarms that were triggered by the system did not require to be investigated as possible cases had already been identified. The system did, however, provide useful information on the trends occurring within an area and in particular reassurance with the lack of exceedances in areas where outbreaks were not known to be occurring.
Following the initial ‘containment’ period, which ended on July 1st, 2009, the flexibility of the system has proved useful in providing outputs not only to highlight areas where influenza A (H1N1v) may be increasing and to inform the relevant health boards of this, but also to provide a general understanding of the levels of concern in the community via calls by the ‘worried well’. Monitoring of calls which mentioned swine gave an indication of the ‘worried well’, and extraction of the raw call data, in particular the geographical location of the call, aided epidemiological investigation during the containment phase. The value of the entire system in understanding the spread of influenza A H1N1v in the community is discussed by Health Protection Scotland (2011b).
In a routine surveillance situation without epidemic spread, one of the main criticisms of syndromic surveillance is that there is generation of non-specific signals or exceedances which may result in investigative action being taken unnecessarily (van den Wijngaard et al., 2008). As exceedances in this system are generated at a nominal 1% level of significance this indicates that there is a 1% probability that the observed number of calls is greater than the upper expected number of calls due to chance alone, i.e. due to natural variability in call numbers (excluding any day of the week or holiday effect). This level of significance is dependent on the weighting used, and in cases where the baseline values are downweighted the value of the type I error will be higher. Such an exceedance occurring due to chance could therefore be classified as a false positive alarm.
On a given day of monitoring NHS24 by Health Protection Scotland each of the 11 syndromes is examined for 11 health boards and this is repeated for both the reason for the call and the protocol data. This means that 242 exceedance tests are conducted each day, giving approximately two expected false positives out of all the health board–syndrome combinations per day and approximately 14 per week. However, there is no way to identify which exceedances are false positives and which are not and for this reason exceedances of note are generally looked for. A modification of the output could be to adjust the exceedances according to the false discovery rate approach that was outlined in Jones et al. (2008). Such an approach would favour the most extreme exceedances but is problematic for syndromes or health boards where the counts are low and small variation could lead to large alarms which may have very little epidemiological meaning. For the context of this surveillance it is therefore better to err on the side of caution and to investigate possible false positive signals rather than increasing the exceedance limit.
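The multiple-testing arithmetic can be checked directly. Unrounded, 242 tests at the 1% level give 2.42 expected false positives per day and about 17 per week; the weekly figure of 14 follows from rounding the daily figure to two first. The sketch below also includes the Benjamini–Hochberg step-up procedure as one standard way of controlling the false discovery rate. It is used here as an illustrative stand-in, and the approach of Jones et al. (2008) may differ in detail. Function names and the example p-values are hypothetical.

```python
def expected_false_positives(n_syndromes=11, n_boards=11, n_streams=2, alpha=0.01):
    """Number of daily tests and expected chance exceedances at level alpha."""
    tests = n_syndromes * n_boards * n_streams
    return tests, tests * alpha


def benjamini_hochberg(p_values, q=0.05):
    """Return the indices of the tests flagged at false discovery rate q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    # Find the largest rank whose p-value lies under the step-up line q * rank / m.
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])


tests, per_day = expected_false_positives()
print(tests, round(per_day, 2), round(per_day * 7, 2))
print(benjamini_hochberg([0.001, 0.2, 0.008, 0.04, 0.5]))
```

With few tests and low counts, a single extreme p-value can dominate the flagged set. This is the difficulty noted above for small health boards and rare syndromes.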
The issue of false positive alarms is not restricted to multiple-testing issues. The results presented clearly illustrate the influence which media reporting and the ‘worried well’ can have on the system whereby alarms are generated but not linked to any cluster of disease. False positive alarms occurring because of the ‘worried well’ are more likely in the call reason data stream. For this reason the subset protocol stream of calls, where an over-the-phone diagnosis has led to a syndromic categorization of the call at NHS24, is also routinely monitored.
Modelling of public holidays is also problematic as areas within health boards (the smallest area considered) can have different local holidays which are not modelled leading to a greater number of calls than predicted in that area and hence false alarm generation.
The role of the statistician monitoring the system is to identify exceedances and any patterns occurring. Any action required falls to the judgement of the consultant in charge. The protocol for dealing with the investigation of these exceedances is based on various factors which include the nature of the exceedances, the magnitude of the exceedances and whether the exceedances are concurrent. The epidemiological determination of whether these exceedances warrant management as a signal indicative of further investigation then involves additional information on the detailed nature of the exceedances, e.g. whether confined to a specific age group or sex, locality or whether the individuals required hospitalization. A protocol that was developed by Meyer et al. (2008) which describes the process of management of any exceedances and identifies when subsequent investigation and escalation are required is routinely in operation at Health Protection Scotland.
Conversely, problems may also arise with the lack of alarms signalled in the time period following a large increase in the number of calls received by NHS24 for a particular syndrome, as indicated following the large exceedance in NHS Greater Glasgow and Clyde on May 13th. Under normal circumstances the weighting that is employed in the prediction model as instructed by Farrington et al. (1996) counters the effect of sporadic fluctuations in the baseline 28 days that are used to model the expected values. During pandemic spread, however, the alarm system becomes less efficient as, following a chain of alarms where the observed count is substantially greater than expected, the baseline 28-day data predict larger expected values and hence larger upper limits despite the downweighting that is employed. The effect of these large values is slow to diminish if there are only small downward changes in the observed data and the alarm system takes time to recover. During this time, the system is useful to track temporal changes in the data by eye but the alarm system is less sensitive unless large upward fluctuations occur again. It can, however, be postulated that once epidemic spread begins in an area the signalling of alarms is pointless as the problem is apparent without the surveillance system. In the case of wishing to detect small changes over time, a differing methodology, such as that based on control charts, may be preferable. For a full discussion of the various methods that are used in syndromic surveillance see Unkel et al. (2011).
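As an illustration of the control chart alternative, the following sketches a one-sided cumulative sum (CUSUM) chart on standardized residuals. CUSUM is only one member of the control chart family and is an assumption made for this example; the reference value `k`, the decision limit `h` and the function name are illustrative choices, not values used by the system.

```python
def upper_cusum(observed, expected, sd, k=0.5, h=4.0):
    """One-sided CUSUM on standardized residuals (illustrative sketch).

    Returns the CUSUM path and the day indices on which it exceeded h.
    """
    s = 0.0
    path, alarms = [], []
    for t, (y, mu, sigma) in enumerate(zip(observed, expected, sd)):
        z = (y - mu) / sigma          # standardized residual for day t
        s = max(0.0, s + z - k)       # accumulate only the excess over k
        path.append(s)
        if s > h:
            alarms.append(t)
            s = 0.0                   # restart the chart after an alarm
    return path, alarms


# A sustained shift of one standard deviation: no single day is extreme,
# but the accumulated excess eventually crosses the decision limit.
observed = [10] * 5 + [12] * 10
path, alarms = upper_cusum(observed, expected=[10] * 15, sd=[2] * 15)
print(alarms)
```

In this toy series every daily residual is well below a 1% single-day threshold, yet the chart signals after the small excess has persisted.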
In light of experience during the pandemic, alternative baselines have been considered. The current implementation is sufficient for small-scale outbreaks but may not be robust if large-scale changes occur. Results show that using only historical data limits the interpretability of fluctuations within an outbreak period as alarms are consistently signalled. In the absence of historical data, the overprediction period could be avoided by using the most current data before the outbreak as the baseline for predictions. This approach is, however, hindered by its manual nature and would require the outbreak period to be defined for a given syndrome in a given area, which introduces subjectivity to the process. A more appealing choice may be to use a combination of historical data and the previous 28 days’ data. Such an approach modelled the upward trend in the data well and avoided the overprediction spikes. Such an implementation would, however, require the omission of any outbreak periods, such as the pandemic, from the historical data to avoid inaccurate predictions. This could be introduced for future versions of the model.
Overall, the prediction model is simple and has been tested over the 4 years of operation of the system. Extensions to this model to take into account serial correlation, better trend estimation and spatial associations have been investigated (Wagner, 2010) and, although some of the extensions are useful, their effect is not great. The most important development is to add an estimate of the term trend by using a Holt–Winters estimate as a prediction term in the regression model, which reduces the effect of serial correlation and increases the prediction accuracy.
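A trend estimate of this kind can be sketched with Holt's linear exponential smoothing, which is the non-seasonal part of the Holt–Winters method. The smoothing constants, the initialization and the function name below are assumptions for illustration. How the resulting forecast enters the regression model as a prediction term is not shown here.

```python
def holt_one_step(series, alpha=0.3, beta=0.1):
    """One-step-ahead forecast from Holt's linear (level plus trend) smoothing."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    # Simple initialization: first value as level, first difference as trend.
    level, trend = series[0], series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + trend


# A steadily rising series of daily counts (toy data).
print(holt_one_step([10, 12, 14, 16, 18]))
```

For an exactly linear series the forecast continues the line; for noisy call counts the constants `alpha` and `beta` control how quickly the level and trend react to recent days.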
The system has proven to be timely with the ability to update the system with a lag of one reporting day ensuring that up-to-date information regarding the current situation can be disseminated rapidly to NHS boards and local health protection teams.
5 Conclusions
The impetus to be able to make use of all calls to NHS24 rather than a possibly unrepresentative subset led to the development of a syndromic surveillance system based on a call reason interrogation algorithm. This system has been routinely used weekly to monitor influenza-like illness since 2006 and daily surveillance began when the influenza A H1N1v pandemic was declared, providing a timely up-to-date view of the picture in the community to decision makers.
The ERS highlighted the cluster of influenza A H1N1v in the Highland health board and provided an indication of public perception of the outbreak, i.e. the ‘worried well’. The flexibility of the call reason interrogation algorithm allowed the analysis to be tailored to consider terms that are specific to influenza A H1N1v and to extract relevant data for further epidemiological investigation.
The system has continued to be used throughout season 2009–2010 daily and again in season 2010–2011 weekly. In addition to tracking influenza-like illness in the community the system can also give reassurance that something is not happening, as was the case during the summit of G8 countries (Meyer et al., 2008). More recently the difficulty breathing syndrome was tracked to give reassurance that the community was not experiencing any ill effects due to the volcanic ash cloud which was present over the UK following the eruption of the Icelandic volcano (Elliot et al., 2010). It has also improved awareness during the recent papal visit to Scotland and during an outbreak of legionnaires’ disease in 2011.
The usefulness of the system, with an understanding of the caveats due to the influence of the ‘worried well’ and the media, has led to the system’s forming part of the suite of surveillance provided to decision makers in government to help to inform the community response to the influenza A H1N1v pandemic and has continued to be used in season 2010–2011 to monitor seasonal influenza.
The system is not static and is continuously developed. Future developments will implement the methods discussed which can improve predictability in the weeks after an outbreak occurs and will also consider modelling spatial correlation as discussed by Wagner (2010). The system is not specifically used for detection of outbreaks, and indeed there is no guarantee that all outbreaks will be detected, but the main strength of the system is as a general monitoring system which, along with other routine surveillance systems that are used at Health Protection Scotland, improves situational awareness and aids rapid investigation.
Acknowledgements
We thank NHS24 for supplying the data, the staff in the respiratory team at Health Protection Scotland for providing epidemiological context and finally the referees for their insightful comments on ways to improve the system.