Appendicitis in children with acute abdominal pain in primary care, a retrospective cohort study

Background: General practitioners (GPs) face a diagnostic challenge when assessing acute abdominal pain in children. However, no information is available on the current diagnostic process or the diagnostic accuracy of history and physical examination in primary care settings. Objective: To describe the diagnostic process for acute abdominal pain among children in primary care, focusing on appendicitis, and to assess the diagnostic accuracy of individual clinical features. Methods: A retrospective cohort study in Dutch primary care, using the Integrated Primary Care Information database. Children aged 4–18 years were included if they had no history of appendicitis and presented with acute abdominal pain during 2010–2016. We evaluated GP management and the diagnostic accuracy of clinical features for appendicitis. Pre- and post-test probabilities were calculated for each clinical feature and compared with the probability of appendicitis after GP assessment. Results: Out of 5691 children, 944 (16.6%) were referred and 291 (5.1%) had appendicitis, of whom 55 (18.9%) were initially misdiagnosed. The pre-test probability (i.e. of appendicitis in evaluated children) varied from 3% (rigidity) to 28% (migratory pain). Concerning post-test probabilities, positive values for rebound pain (32.1%) and guarding (35.8%) were superior to that of overall GP assessment (29.6%). Conclusions: Approximately 1 in 20 of the included children was diagnosed with appendicitis, one in five of these were initially misdiagnosed, and about one in six of the included children were ultimately referred to the hospital. We show that some signs and symptoms were not particularly useful for assessment, but when they were, signs detected by the GP examining the patient were more useful than symptoms reported by patients or parents. We recommend that GPs provide safety netting advice and examine the abdomen.

Key words: Appendicitis, child, Diagnosis, electronic health records, observational study, primary health care

Background
Appendicitis is a potentially serious, but relatively uncommon, presentation in primary care (1,2) and has a prevalence of 4.4% among children with abdominal pain (3). It is a diagnostic challenge for general practitioners (GPs) to differentiate appendicitis from self-limiting conditions. This is not only because clinical symptoms and signs may overlap between appendicitis and self-limiting conditions but also because there is potential for perforation and fatal peritonitis (4). When doubt remains, children may need reassessment or referral to secondary care for further diagnostic work-up, and the GP must further balance the need for timely referral of children with appendicitis against the potential risks of unnecessary referral. Accurate clinical assessment is, therefore, key to this process (5).
In secondary care, clinical prediction rules have been developed and validated that have sensitivities and specificities of 72%-100% and 34%-98%, respectively (6,7). There is some consensus that the Alvarado score, consisting of clinical features and inflammatory markers, can be used in the emergency department to exclude appendicitis (8). Dutch and British guidelines each detail clinical features that should increase the suspicion of appendicitis and prompt referral to secondary care (3,9). However, neither the diagnostic value of these features nor the available clinical prediction rules have been evaluated among children in primary care. Improving our insight into the diagnostic process and the value of clinical features in this setting could improve the detection of appendicitis.
We aimed to describe the diagnostic process for children with acute abdominal pain in primary care, focussing on appendicitis, and to determine the diagnostic value of clinical features. Registration data in primary care provide an invaluable resource for this purpose.

Study design
We conducted a retrospective cohort study of children presenting with acute abdominal pain in primary care between November 2010 and November 2016, using electronic medical records. Data were collected from the Integrated Primary Care Information database (IPCI), which is a longitudinal primary care database managed at the Erasmus Medical Centre, Rotterdam, the Netherlands.
The IPCI database contains complete pseudonymised medical records for 1.5 million patients from 600 Dutch GP practices and complies with European Union guidelines on the use of medical data for research (10,11). Participating practices supply medical records to the IPCI each year for research purposes only and without reimbursement. Once pseudonymised in the IPCI database, these data are shared with researchers after further approval by participating GPs. Clinical data are entered by GPs in the medical records in free text format. To enhance data quality, they are encouraged to code the clinical data using their record systems (10,11). The IPCI database contains data from six software platforms, and we used data from the three that contained the most complete specialist reports (368 practices, 32.9% of the total IPCI database). In the Netherlands, secondary care specialists are required to report every patient contact to a GP, regardless of whether the child was referred by the GP.
We aimed to comply with the Standards for the Reporting of Diagnostic Accuracy Studies (STARD) (12).

Study population
Children aged 4-18 years with acute abdominal pain were eligible. We selected the first contact with an International Classification of Primary Care (ICPC) digestive tract symptom or diagnosis code (i.e. D01-D99) and abdominal pain mentioned in the free text. Four medical students trained in coding then screened the medical records and assessed patient eligibility for inclusion based on the free-text entries. Children with abdominal pain lasting >7 days at presentation or with a history of prior appendicitis or appendectomy in their medical records were excluded.

Outcome measures
We evaluated management by GPs and the diagnostic accuracy of clinical features for appendicitis. Because GPs do not necessarily want to identify appendicitis, but rather want to identify children who need an urgent referral, we performed an additional analysis for the outcome 'emergencies requiring referral'. Relevant data were extracted automatically or by the coders from free-text entries, using standardised forms and instructions. Whenever doubt occurred, the case was discussed with an experienced GP (C.G.H. Blok).

Key Messages
• GP assessment will miss almost one-fifth of children with appendicitis.
• About two-thirds of all children referred by GPs will not have appendicitis.
• Clinical signs provide greater diagnostic accuracy than clinical symptoms.
• Careful physical examination can improve the diagnosis of appendicitis.
• Safety netting and reassessment can improve the timely diagnosis of appendicitis.

GP management
This concerned the GP's decision to refer the child to secondary care during the first consultation, to refer after a planned reassessment or not to refer. The GP diagnosis was taken to be the ICPC code recorded during the consultation. Referral details were extracted from free-text entries and ICPC codes were retrieved automatically.

Clinical features
We looked for clinical features and tests indicative of appendicitis based on existing clinical prediction rules (Supplementary Data 1) (13)(14)(15)(16)(17)(18)(19)(20). Demographic data, body temperature, C-reactive protein (CRP) levels and white blood cell (WBC) counts were retrieved automatically. Data concerning symptoms, signs and imaging were extracted by the coders. Symptoms and signs were then recoded as 'present', 'absent' or 'not recorded' (Supplementary Data 2). GPs in the Netherlands do not routinely measure neutrophil counts, so this was not recorded.

Diagnosis of appendicitis
Confirmation of appendicitis was based on the operation or imaging report, as relayed by a secondary care specialist. The absence of appendicitis was based on either the specialist's report or the GP's medical records during a 6-week follow-up period. Children with other emergencies warranting immediate referral were identified. Two experienced GPs resolved any cases in which there was diagnostic uncertainty.

Statistical analysis
GP management was analysed by descriptive statistics. Each clinical feature was evaluated among different groups of children based on whether that clinical feature was present or absent (i.e. no assumptions regarding the outcome were possible if a feature was missing). Assuming that GPs did not evaluate features with no impact on management decisions, this should have provided insights into their diagnostic reasoning. We then calculated the pre- and post-test probabilities of appendicitis for each group and used dumbbell plots to display changes graphically (21). Pre-test probability was calculated as the prevalence of appendicitis among children in whom the feature was recorded, and post-test probability as the probability of appendicitis if the feature was present (positive predictive value) or absent (one minus the negative predictive value). Next, we assessed the relative change in probability for appendicitis based on the positive (LR+) and negative (LR−) likelihood ratios for appendicitis given each clinical feature (22). Likelihood ratios and probabilities were calculated with 95% confidence intervals (95%CIs). We also performed an additional analysis in which we calculated likelihood ratios for the outcome 'emergency' (including appendicitis). We used IBM SPSS version 24.0 (IBM Corp., Armonk, NY) for all analyses.
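The measures defined above can be illustrated with a minimal Python sketch. It is not the SPSS analysis used in the study: the class name `TwoByTwo`, the function `accuracy_measures` and the counts are hypothetical and serve only to show how each quantity follows from a 2×2 table restricted to children in whom a feature was recorded. Confidence intervals are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts for one clinical feature, restricted to children in whom it was recorded."""
    tp: int  # feature present, appendicitis
    fp: int  # feature present, no appendicitis
    fn: int  # feature absent, appendicitis
    tn: int  # feature absent, no appendicitis


def accuracy_measures(t: TwoByTwo) -> dict:
    """Return pre-/post-test probabilities and likelihood ratios for one feature."""
    n = t.tp + t.fp + t.fn + t.tn
    sensitivity = t.tp / (t.tp + t.fn)
    specificity = t.tn / (t.tn + t.fp)
    return {
        # Prevalence of appendicitis among children with the feature recorded
        "pre_test": (t.tp + t.fn) / n,
        # Positive predictive value
        "post_test_present": t.tp / (t.tp + t.fp),
        # One minus the negative predictive value
        "post_test_absent": t.fn / (t.fn + t.tn),
        # Undefined (division by zero) if specificity is exactly 1
        "lr_pos": sensitivity / (1 - specificity),
        "lr_neg": (1 - sensitivity) / specificity,
    }


# Hypothetical counts, chosen for illustration only (not study data)
example = TwoByTwo(tp=40, fp=60, fn=10, tn=890)
measures = accuracy_measures(example)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

In this sketch the pre-test probability depends on which children had the feature recorded, which is why it differs from feature to feature in the results.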

Study group
In total, 15 607 children had one or more contacts with a GP for abdominal symptoms, and we included 5691 who presented with acute abdominal pain, during the study period. Of these, 291 (5.1%) had appendicitis and 52 (17.9%) had perforation. The prevalence of appendicitis was higher among boys (6.8%) than girls (3.6%) and among older children (9-12 and 13-18 years, both 6.9%) than younger children (4-8 years, 2.2%). The features of each group are summarised in Table 1. Figure 1 shows the flow of referrals and diagnosis for the 944 children (16.6%) referred to secondary care. Among these, 798 (84.5%) were referred at their initial presentation and 236 had appendicitis (37 also had perforation). Of the 4893 children not referred immediately, 55 were later diagnosed with appendicitis (15 also had perforation). Therefore, GPs initially failed to diagnose 55 (18.9%) children with appendicitis and only six (11%) of these had a reassessment planned.
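The percentages in this paragraph follow directly from the reported counts. The short Python sketch below reproduces that arithmetic; the variable names and the helper `pct` are illustrative, and only the counts themselves come from the text.

```python
def pct(numerator: int, denominator: int, digits: int = 1) -> float:
    """Percentage rounded to the given number of decimal places."""
    return round(100 * numerator / denominator, digits)


# Counts as reported in the text
included = 5691
referred_total = 944
referred_first = 798
appendicitis_total = 291
appendicitis_referred_first = 236
appendicitis_not_referred_first = 55
perforation_total = 52
reassessment_planned = 6

# Children not referred at the first consultation
not_referred_first = included - referred_first

flow = {
    "referred (% of included)": pct(referred_total, included),
    "appendicitis (% of included)": pct(appendicitis_total, included),
    "perforation (% of appendicitis)": pct(perforation_total, appendicitis_total),
    "referred at first visit (% of referred)": pct(referred_first, referred_total),
    "initially missed (% of appendicitis)": pct(appendicitis_not_referred_first, appendicitis_total),
    "reassessment planned (% of missed)": pct(reassessment_planned, appendicitis_not_referred_first, 0),
}
for label, value in flow.items():
    print(f"{label}: {value}")
```

The two appendicitis subgroups (236 referred at the first consultation and 55 diagnosed later) add up to the 291 cases overall.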

Management and assessment
The prevalence of appendicitis was 18.7% among children aged 4-8 years referred directly, compared with 37.6% and 30.0% among those aged 9-12 and 13-18 years, respectively. Perforations were more common in children aged 4-8 years (36.2%) than in those aged 9-12 years (24.0%) and 13-18 years (7.1%). Of all children referred during the first consultation, 37 (4.6%) had a perforation. The GP recorded ICPC code D88.00 for appendicitis in 81 referrals where the diagnosis was not confirmed by a specialist. The ICPC code recorded most by GPs was D01.00 (generalised abdominal pain/abdominal cramps), being recorded in 1884 cases (33.1%). Other diagnoses that required treatment or referral (1.2%) are detailed in Table 2.

Diagnostic accuracy of clinical features
Among the symptoms, the pre-test probability of appendicitis was lowest for anorexia (5.8%) and highest for pain migration (27.7%), corresponding with the prevalence of appendicitis in children evaluated for these symptoms. By contrast, the lowest and highest pre-test probabilities among the signs were for rigidity (2.7%) and pain on movement (17.0%), respectively. The dumbbell plots (Figure 2) show that guarding and rebound tenderness increased the probability of appendicitis to 35.8% (95%CI: 29.8%-42.3%) and 32.1% (95%CI: 27.2%-37.4%), respectively. Both were superior to the overall positive post-test probability of GP assessment, which was 29.6% (95%CI: 27.6%-31.6%), with reference to the percentage referred during the first consultation who had a confirmed diagnosis of appendicitis. The probability decreased to 0.6% (95%CI: 0.3%-1.1%) in the absence of right lower quadrant (RLQ) tenderness, which was below the negative post-test probability of 1.1% (95%CI: 0.9%-1.4%) for GP assessment overall. Of the symptoms, pain in the RLQ and symptom duration <24 hours had the highest LR+ (1.61), and pain in the RLQ had the lowest LR− (0.22). Of the clinical signs, guarding (13.1) and rigidity (10.44) had the highest LR+, whereas tenderness in the RLQ had the lowest LR− (0.08).

Summary
Appendicitis was the most common condition necessitating referral (5.1%) among children with acute abdominal pain, even when compared with other important conditions (1.2%). Furthermore, 653 (69%) of the referred children were ultimately not diagnosed with appendicitis, while 55 (19%) children with appendicitis were not referred when they first presented and only six (11%) of these had a reassessment planned.
The diagnostic accuracies of individual symptoms tended to be lower than those for signs like guarding, rigidity, rebound pain and RLQ tenderness. Thus, clinical suspicion depended on the GP eliciting signs on physical examination. The dumbbell plots also show that GPs recorded different clinical features depending on the probability of appendicitis (low or high), indicating selective information gathering depending on their assessments. This is consistent with the pragmatic approach of only collecting information until further data will not affect management decisions (23). These findings also support the assumption that GPs make the diagnostic process more efficient by relying on features with the strongest predictive values.

Risk of bias
We used routine data to evaluate the diagnostic process in children with acute abdominal pain who present to primary care. The IPCI database benefitted from containing data for many patients, thereby allowing sufficient cases to be identified in a low prevalence setting, which is an option that would be very difficult in a prospective cohort study (24). The use of a dumbbell plot also provided insight into the recording of clinical features by GPs, given that they carry useful information that may affect the predictive value (25). However, four important methodological problems arose from using routine data (23). First, it was challenging to select the study population based on clinical features alone. That said, the ICPC system did allow for an adequate study population to be selected based on symptoms and diagnoses, rather than coded diagnoses alone. Children were also excluded if they had an ICPC code related to the urinary tract, and because dysuria may be a symptom of appendicitis, this should be considered a limitation.
Second, the diagnosis of appendicitis was based on different reference standards (i.e. assessment by a secondary care specialist and outcomes at 6 weeks' follow-up), and indeed, some children in the follow-up group may have had spontaneously resolving appendicitis. Therefore, we cannot exclude verification bias (26,27). In the event of an unrecognised self-limiting appendicitis, surgery would be associated with more harm than benefit, and as such, we believe this verification bias has a limited clinical impact (4).
Third, extracting variables from free text is challenging because it may contain gradations that require interpretation. In turn, this may be influenced by knowledge of the outcome of the reference test, which was also available in the free text, producing information bias that may lead to an overestimation of diagnostic accuracy (28). We, therefore, employed four measures to reduce this bias: (1) coders used standardised forms and instructions, (2) coders were encouraged to assess the free text independently of the outcome, (3) predictors were discussed when in doubt and (4) the appendicitis outcome was reassessed by an expert panel. Given the high number of cases incorrectly coded as appendicitis (ICPC D88.00), we consider the free text analysis to have been an indispensable supplement to the ICPC codes (29).
Fourth, data about symptoms and signs were missing for many cases. Given that the recording of clinical information is known to be influenced by the perceived probability of disease, we assume that there was bias due to diagnostic suspicion, which often leads to overestimations of diagnostic accuracy (30). Therefore, the diagnostic accuracy is not applicable to all children with acute abdominal pain because the pre-test probability of each symptom and sign varied and the diagnostic accuracy was not evaluated in all such children. Finally, many of the diagnostic features had too many missing values to further analyse their diagnostic value using a logistic regression model.

Comparison with existing literature
The prevalence of appendicitis in our study (5.1%) was consistent with that in a Dutch cohort with acute and chronic abdominal pain (4.4%) (3). After excluding minor ailments (e.g. gastroenteritis and constipation), other important causes of acute abdominal pain were rare (1.2%), including hernia (0.18%) and pyelonephritis (0.09%). Among those who were eventually referred, the prevalence of appendicitis was 30.8%, which was low compared with reports from emergency departments (27%-72%) (31). Although children may have attended emergency departments for reasons other than appendicitis, the low prevalence in general practice suggests that GPs refer early when in doubt. Moreover, despite 19% of children with appendicitis being initially missed by GPs, this is comparable to secondary care (range 11%-28%) (32). Structured guidance or an algorithm could help clinical decision-making and reduce the number of children referred without missing cases. However, aside from a small descriptive study on the Alvarado score, clinical prediction rules for appendicitis have not been evaluated in primary care settings (33).
We report high LR+ and low LR− values compared with data from secondary care (34)(35)(36). For instance, the LR+ of 13.1 for guarding was high compared with previously reported values of 2.07-2.48 (34,35), while the LR− of 0.08 for RLQ tenderness was low compared with previously reported values of 0.25-0.55 (34,35). This indicates that clinical features have different diagnostic accuracies in primary and secondary care, associated with differences in the patient spectrum (including disease prevalence and severity). To unravel the mechanisms behind these observations, further research is warranted (37,38).
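To give a sense of what these differences in likelihood ratios mean in practice, the following Python sketch converts a likelihood ratio into a post-test probability through odds (post-test odds = pre-test odds × LR). The likelihood ratios are those quoted above; the common pre-test probability of 10% is a hypothetical value chosen purely for illustration, and the function name is an assumption of this sketch.

```python
def post_test_probability(pre_test: float, likelihood_ratio: float) -> float:
    """Apply a likelihood ratio to a pre-test probability via odds."""
    pre_odds = pre_test / (1 - pre_test)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Hypothetical pre-test probability, identical for every row so that
# only the likelihood ratio differs (not a value from the study)
PRE_TEST = 0.10

# Likelihood ratios as quoted in the text
quoted_lrs = {
    "guarding, LR+ in this study": 13.1,
    "guarding, LR+ in secondary care (lower bound)": 2.07,
    "guarding, LR+ in secondary care (upper bound)": 2.48,
    "RLQ tenderness, LR- in this study": 0.08,
    "RLQ tenderness, LR- in secondary care (lower bound)": 0.25,
    "RLQ tenderness, LR- in secondary care (upper bound)": 0.55,
}

results = {
    label: post_test_probability(PRE_TEST, lr) for label, lr in quoted_lrs.items()
}
for label, probability in results.items():
    print(f"{label}: {100 * probability:.1f}%")
```

Because the study calculated pre-test probabilities separately for each feature, these outputs are not expected to match the post-test probabilities reported in the results; they only show how strongly the size of a likelihood ratio shifts a fixed starting probability.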
Dutch guidance recommends evaluating 18 clinical features during clinical assessment, specifying acute presentation, fever, RLQ tenderness, signs of peritoneal irritation and abnormal bowel sounds as alarm features (3). British guidance for appendectomy in children also recommends that GPs assess similar features (9). However, although the Dutch and British guidelines include many of the same features, the Dutch gives greater weight to fewer alarm features. Irrespective of the nuanced differences, our results support both guidelines because they emphasise clinical signs over symptoms. We also showed that more young children were referred when the probability of appendicitis was low, but that the risk of perforation was high when appendicitis was present. This is consistent with the extra caution recommended by the British guidance for appendicitis in the young (9), based on the higher likelihood of atypical presentations and the higher risk of perforation (39).
CRP is a widely used biomarker in secondary care for children with acute abdominal pain (8), but its use is not recommended in primary care because its diagnostic value has not yet been evaluated in this setting (3). Although the test was performed in 10% of children presenting to a GP with acute abdominal pain in this study, our sample was too small to evaluate its diagnostic value. As such, there continues to be a lack of information about the optimal cut-off value for use in primary care. Additional studies are needed to evaluate if CRP adds value to the existing diagnostic process. Concerning its reduced use in cases with appendicitis, we can only speculate that CRP was not ordered in some children with severe appendicitis because they had a clearer clinical picture that warranted immediate referral.

Implications for research and practice
At the initial primary care consultation, GPs want to stratify the risk of appendicitis rather than necessarily making a final diagnosis. Omitting physical examination from the diagnostic process generally increases the risk of error (40), and consistent with this, we found that physical signs were better than symptoms at discriminating appendicitis from other conditions. Therefore, we emphasise the importance of a targeted physical examination to stratify the risk of appendicitis in children presenting with abdominal pain. Given that the residual risk of appendicitis was 1.1% in children not initially referred, and given that 19% of children with appendicitis initially went undetected, we also recommend that GPs always provide safety netting advice. This should include clear instructions to seek appropriate medical attention (e.g. call the GP) when in doubt or when symptoms increase (41). Planned reassessment may be a safe alternative to immediate referral in the young or in patients with alarm symptoms (3). However, further research is needed to establish whether such strategies prevent unnecessary referrals while minimising the risk of perforated appendicitis. Given that some clinical features from existing clinical decision rules have diagnostic value, GPs may benefit from structured guidance in the form of such a decision aid, possibly including CRP. We propose these as topics for further research.

Conclusion
In a low prevalence setting, such as primary care, we stress the importance of a targeted physical examination that focuses on signs with the best predictive values: guarding, rebound tenderness and RLQ tenderness. The GP must also anticipate that appendicitis is easy to miss, and therefore, should always provide safety netting advice.

Supplementary material
Supplementary material is available at Family Practice online.