Original Articles

Cohort profile: decoding the epidemiology of liver disease in Sweden (DELIVER)

Pages 978-983 | Received 22 Jan 2022, Accepted 04 Mar 2022, Published online: 16 Mar 2022

Abstract

Background and aims

Swedish nationwide registries can be used to identify patients with a wide range of diagnoses. This information can be used to construct cohorts useful to determine prognosis and identify risk factors for disease progression. Here, we describe a new register-based cohort of patients with a diverse set of chronic liver disease diagnoses in Sweden.

Methods

The DELIVER (DEcoding the epidemiology of LIVER disease in Sweden) cohort was constructed using extensive data linkages between different Swedish registers and includes patients diagnosed between 1964 and 2016. Patients in DELIVER are matched 1:10 to reference individuals from the general population on age, sex, municipality and calendar year of first liver disease diagnosis. Longitudinal cross-linked data from several registers allow for identification of outcomes occurring before or after liver disease diagnosis. Further, since July 2005 all dispensed drugs can be identified.

Results

In total, 307 768 unique individuals with a diagnosis of a chronic liver disease since 1964 were identified, and these were matched with 3 067 714 reference individuals from the general population. As examples, DELIVER contains data on 90 948 patients with a diagnosis of viral hepatitis; 50 593 patients with alcohol-related liver disease and 13 242 patients with non-alcoholic fatty liver disease.

Conclusions

The DELIVER cohort can be used to examine several important research questions. Long-term outcomes of chronic liver diseases, risk factors for disease progression, impact of dispensed drugs, disease panorama and time trends are examples. Here we describe the construction and data availability of DELIVER.

Introduction

Data on the natural history of diseases are important in determining prognosis and the clinical course. Obtaining accurate data on the prevalence, incidence and outcomes of diseases can, however, be challenging. Most countries lack data sources that can correctly identify specific diseases and ascertain outcomes with a low degree of bias. Common problems include selection bias, where only certain types of patients can be included, as in studies from tertiary academic hospitals or from insurance databases. Another frequent issue is insufficient follow-up time, which might induce further selection bias [Citation1] and is especially important in hepatology, given the often long disease course of chronic liver diseases [Citation2, Citation3]. Further, many countries lack systems to track patients when they change care providers or move within the country.

Here, we describe a new register-based cohort from Sweden with data on more than 300 000 unique patients with liver diseases diagnosed between 1964 and 2016. This cohort can be used for a wide variety of research questions.

Introduction to national registries in Sweden

The Nordic countries, including Sweden, have a long tradition of keeping registries covering aspects of their populations, starting in 1749 when the Swedish church was responsible for accurate classification of key causes of death [Citation4]. Several population-based registries are available with a diverse range of data that are generally accessible to Swedish researchers. Linkage of personal data across these registries is made possible by the Personal Identification Number (PIN). This is a unique 10-digit number that was introduced in 1947 and is given to all Swedish inhabitants at birth or immigration [Citation5]. The PIN does not change during life, with rare exceptions such as emigration and subsequent re-immigration. The PIN is used extensively in daily life, such as when applying for bank accounts or insurance, and is always used in contact with public institutions, such as when visiting health care.

Population-based data in Sweden are held by two major governmental institutions: the National Board of Health and Welfare (www.socialstyrelsen.se/en/) and Statistics Sweden (www.scb.se/en). The National Board of Health and Welfare is responsible for a wide range of health issues and holds several commonly used national registries accounted for here. The Causes of Death Register was initiated in 1911 and since 1952 holds electronic data that can be used for research [Citation4]. The register uses international classification of diseases (ICD) codes to classify underlying and contributing causes of death based on death certificate reporting from physicians. For instance, a person dying with hepatocellular carcinoma (HCC) and with coding for cirrhosis reported on the death certificate will have HCC recorded as the main cause of death and cirrhosis as a contributing cause. The register is more than 99% complete, although the validity might vary depending on factors such as age and cause of death [Citation4].

The National Patient Register was established for hospital care in 1964 in six Swedish regions, and full national coverage was obtained in 1987. The register initially covered only inpatient health care, corresponding to approximately 1.5 million hospital discharges per year [Citation6]. From 2001, the register also captures outpatient visits in specialized care, but does not include primary care. The register is based on ICD-coding made by the responsible physician and holds data on the primary cause of the hospitalization or outpatient visit, as well as contributing diagnoses. The responsible physician decides what to define as the primary diagnosis, which can lead to challenges. For instance, a patient with both diabetes and nonalcoholic fatty liver disease (NAFLD) might have the liver disease considered as the primary diagnosis during a visit to a hepatologist, but diabetes as the primary diagnosis during a visit to an endocrinologist. This can partly be accounted for as the register also holds data on which hospital and which type of healthcare practitioner reported the contact. Additionally, the register holds data on procedures, such as operations, endoscopies, and other interventions. The register has been externally validated and diagnoses are considered to have high positive predictive values (PPVs) [Citation6]. It has been specifically validated for diagnoses corresponding to HCC, cirrhosis or decompensated cirrhosis, with PPVs over 90% [Citation7].

The Swedish Cancer Register was founded in 1958. The register receives data on incident cancers from two sources. First, the reporting physician submits a report to one of six regional oncology centers. Second, a report is sent from the diagnosing pathologist, who also adds data on morphology and tumor characteristics. The register is highly complete [Citation8] but there are a few important limitations. In particular, HCC is often underreported because the diagnosis less often requires a biopsy, so data capture from other registers is also recommended [Citation9].

The Prescribed Drug Register is the most recently formed register, with data availability since July 2005 [Citation10]. The register is unique in the sense that it is not based on reporting from physicians, but on automated reports generated when prescribed drugs are dispensed at any Swedish pharmacy. Roughly 100 million transactions are reported each year. Drugs are classified according to the Anatomical Therapeutic Chemical (ATC) classification system (https://www.whocc.no/atc_ddd_index_and_guidelines/guidelines). The register does not, however, cover over-the-counter medications or medications given in in-hospital settings, such as TNF-alpha infusions, certain chemotherapies, or intravenous antibiotics. The register covers only dispensed prescriptions, not those that were prescribed but never collected, and it does not contain specific data on the reason for the prescription, which can be a limitation. A usual approach is to calculate Defined Daily Doses (DDD) of a drug to define exposure.
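As a minimal sketch of the DDD approach, the number of DDDs in a single dispensation can be taken as the total dispensed amount of active substance divided by the DDD assigned to that substance. The function name, the argument names and the example values below are hypothetical and are not variables from the register.

```python
def ddds_dispensed(packages: int, units_per_package: int,
                   strength_mg: float, ddd_mg: float) -> float:
    """Return the number of defined daily doses (DDDs) in one dispensation.

    All arguments are hypothetical inputs for illustration; the actual
    register variables may be named and structured differently.
    """
    if ddd_mg <= 0:
        raise ValueError("ddd_mg must be positive")
    # Total dispensed amount of active substance, in milligrams.
    total_mg = packages * units_per_package * strength_mg
    return total_mg / ddd_mg


# Hypothetical dispensation: one package of 100 tablets of 500 mg each,
# for a drug with an assumed DDD of 2000 mg.
example = ddds_dispensed(packages=1, units_per_package=100,
                         strength_mg=500, ddd_mg=2000)
```

Summing such values over all dispensations for a person within a time window gives one possible measure of cumulative exposure.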

Statistics Sweden is responsible for general statistics in Sweden on a wide range of issues such as education, income, and sick leave. Statistics Sweden holds the Total Population Register, which contains data on for example date of birth and death, country of birth, marital status, as well as migration [Citation11]. This register is often used to ascertain date of death since it is updated more frequently than the Causes of Death Register but can also be used for matching a cohort to reference individuals. A separate register is the Longitudinal Integrated Database for Health Insurance and Labour Market Studies (LISA) which contains data on socioeconomic parameters including education, income, and occupations [Citation12]. The LISA register can also be used to identify sick-leave and early retirement.

Figure 1 summarizes the available data from the different registers and highlights the type of information that can be obtained from each register.

Figure 1. Overview of the available registries and timeline for data availability.


Identification of patients with liver disease

In 2017, we requested data from the National Board of Health and Welfare on all patients with an ICD-code reported in the National Patient Register corresponding to the liver diseases listed in Table 1. The selection of these codes is in accordance with a recent expert panel consensus statement [Citation13]. The source population included the full Swedish population. The first date on which a diagnosis was reported was used as the index date and for matching to a reference population (described below). Thus, a person with a first diagnosis of viral hepatitis on 1998-01-01 and a later diagnosis of cirrhosis on 2005-01-01 will have reference individuals sampled on the first date. Identification of individuals with liver diseases was available from 1964 until the end of 2016.
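The index date rule can be sketched as taking, for each person, the earliest date of any qualifying liver diagnosis. This is an illustration only; the record layout and the identifiers `p1` and `p2` are hypothetical.

```python
from datetime import date


def index_dates(records):
    """Map each patient id to the earliest date of any liver diagnosis.

    `records` is an iterable of (patient_id, diagnosis_date) pairs,
    a hypothetical layout chosen for this sketch.
    """
    first = {}
    for patient_id, diagnosis_date in records:
        if patient_id not in first or diagnosis_date < first[patient_id]:
            first[patient_id] = diagnosis_date
    return first


records = [
    ("p1", date(2005, 1, 1)),   # later diagnosis of cirrhosis
    ("p1", date(1998, 1, 1)),   # first diagnosis of viral hepatitis
    ("p2", date(2010, 6, 15)),
]
baseline = index_dates(records)  # p1 gets 1998-01-01 as index date
```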

Table 1. ICD-codes used to define presence of liver disease and study baseline. Truncated codes contain data on all underlying codes. For instance, K70 also includes data on K703, etc.
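Matching on truncated codes can be sketched as a prefix comparison, so that K70 also captures K703. The helper below is a hypothetical illustration; only K70 is taken from the text, and I10 is used purely as an example of a code that should not match.

```python
def matches_truncated(code: str, truncated_codes) -> bool:
    """Return True if `code` falls under any truncated (prefix) code.

    Dots are removed and case is normalized so that 'K70.3' and 'k703'
    are treated alike; this normalization is an assumption of the sketch.
    """
    normalized = code.replace(".", "").upper()
    return any(normalized.startswith(prefix) for prefix in truncated_codes)


# Only K70 is named in the text; a real list would follow Table 1.
LIVER_PREFIXES = ("K70",)

is_liver = matches_truncated("K703", LIVER_PREFIXES)     # underlying code
is_not_liver = matches_truncated("I10", LIVER_PREFIXES)  # unrelated code
```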

Matching to reference individuals and ascertainment of outcomes

After identification of the study population, the data file was sent to Statistics Sweden for identification of a reference population. Each person with liver disease was matched with up to ten reference individuals. Matching factors included age, sex, and municipality at the year of the first occurrence of a liver-related diagnosis. Reference individuals were drawn without replacement. Persons with liver disease could theoretically be included as reference individuals up to the date when they were diagnosed with a liver disease. The full file was subsequently sent back to the National Board of Health and Welfare for additional cross-linkages to the registers presented above. All diagnoses, causes of death, cancers and dispensed drugs were recorded in the cohort where available. Causes of death, inpatient contacts (hospitalizations) and cancers were available from 1964. Outpatient visits were available from 2001, and dispensed drugs from July 2005.
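The matching step was performed by Statistics Sweden and the exact procedure is not described here. As a rough sketch under simplifying assumptions, up to ten reference individuals can be drawn without replacement from a pool sharing the same birth year, sex and municipality. The sketch ignores calendar year and the possibility that a reference individual later becomes a case; all field names and the municipality code are hypothetical.

```python
import random
from collections import defaultdict


def match_references(cases, population, ratio=10, seed=0):
    """Draw up to `ratio` reference individuals per case, without replacement.

    `cases` and `population` are lists of dicts with the hypothetical keys
    'id', 'birth_year', 'sex' and 'municipality'.
    """
    rng = random.Random(seed)
    pools = defaultdict(list)
    for person in population:
        key = (person["birth_year"], person["sex"], person["municipality"])
        pools[key].append(person["id"])
    for pool in pools.values():
        rng.shuffle(pool)  # random order within each matching stratum

    matched = {}
    for case in cases:
        key = (case["birth_year"], case["sex"], case["municipality"])
        pool = pools[key]
        take = min(ratio, len(pool))
        # Popping removes the drawn ids, so no one is drawn twice.
        matched[case["id"]] = [pool.pop() for _ in range(take)]
    return matched


stratum = {"birth_year": 1960, "sex": "F", "municipality": "0180"}
population = [{"id": f"r{i}", **stratum} for i in range(15)]
cases = [{"id": "c1", **stratum}, {"id": "c2", **stratum}]
matched = match_references(cases, population)
```

With only 15 eligible individuals in the stratum, the second case receives fewer than ten matches, which mirrors the "up to ten" wording above.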

Ethical considerations

The study and data collection process were approved by the Regional Ethics Committee in Stockholm, Sweden, reference number 2017/1019-31/1, prior to study initiation. All data were collected from centralized registers, and no variables that allow for identification of individual patients (such as the PIN) are distributed to the researchers. Thus, no individual patient informed consent was collected. This practice is commonly applied in register-based research [Citation14].

Available data

The DELIVER cohort allows for identification of all individuals with the diagnoses listed in Table 1, starting in 1964 and up until the end of 2016. It is possible to follow each individual with a liver disease and their respective reference individuals from the date of first diagnosis until emigration, death or the end of the study period. Outcomes of interest can thus be identified in populations of patients with liver disease and compared to those of reference individuals. It is also possible to compare persons with one specific liver disease to others, for instance by examining the risk of death in patients with NAFLD compared to patients with alcohol-related liver disease (ALD). Diagnoses made before the liver disease diagnosis are also available. This allows for case-control studies and provides historical data on possible confounders or risk factors for the examined outcome.
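The follow-up rule described above can be sketched as ending follow-up at the earliest of death, emigration and the end of the study period. The function and argument names are hypothetical, and the study end is assumed to be 31 December 2016.

```python
from datetime import date

STUDY_END = date(2016, 12, 31)  # assumed end of the study period


def follow_up(index_date, death_date=None, emigration_date=None,
              study_end=STUDY_END):
    """Return (days of follow-up, died during follow-up).

    Follow-up ends at the earliest of death, emigration and study end.
    """
    candidates = [study_end]
    if death_date is not None:
        candidates.append(death_date)
    if emigration_date is not None:
        candidates.append(emigration_date)
    end = min(candidates)
    # Death counts as an event only if it is what ended follow-up.
    died = death_date is not None and death_date == end
    return (end - index_date).days, died


days, died = follow_up(date(2010, 1, 1), death_date=date(2012, 1, 1))
```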

In total, 307 768 unique persons with liver disease and 3 067 714 reference individuals are available in DELIVER. Figure 2 presents an overview of the total number of healthcare contacts and unique patients per diagnosis for selected liver diseases in Sweden between 1964 and the end of 2016 in the Patient Register.

Figure 2. Number of registrations in DELIVER for selected diagnoses, presented both as number of total diagnoses (light blue), and individual patients (dark blue). Note that there is likely to be overlap, i.e., some patients might have coding for both HCC and NAFLD.


An update of DELIVER was recently approved and will allow for follow-up to be extended until the end of 2020 as well as obtaining additional data from Statistics Sweden on socioeconomic parameters.

Data that are not available include laboratory results, anthropometric data such as height and weight, radiology and biopsy results, and diagnoses from primary care. Thus, data on parameters such as Child-Pugh or MELD scores are not available.

Considerations for study designs

Prior to 1969, the ICD-7 version was used in Sweden. This was replaced with ICD-8 between 1969 and 1986, followed by ICD-9 until 1997, when the latest version (ICD-10) came into place (except for Region Skåne, where ICD-10 was introduced in 1998). As the different versions of the ICD system contain different definitions and are updated with new diseases or definitions, this has some implications that need to be considered. For instance, there was no ICD-code for autoimmune hepatitis (AIH) prior to ICD-10, so studies examining AIH might be best executed in the ICD-10 era. For some diseases such as NAFLD, clinicians might have become more familiar with the disease in the most recent years, and an increase in recorded incidence might reflect increased detection and not necessarily a true increase in incidence.
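When diagnoses from several decades are combined, it can help to make the ICD era explicit for each record. The following sketch maps a calendar year to the ICD version described above. It assumes that each change takes effect on 1 January of the stated year, which is a simplification, and the function name is hypothetical.

```python
def icd_version(year: int, region_skane: bool = False) -> str:
    """Return the ICD version assumed to be in use in Sweden in `year`.

    Assumes each transition happens at the start of the calendar year.
    """
    if year < 1969:
        return "ICD-7"
    if year <= 1986:
        return "ICD-8"
    # ICD-10 was introduced in 1997, and in 1998 in Region Skåne.
    icd10_start = 1998 if region_skane else 1997
    return "ICD-9" if year < icd10_start else "ICD-10"


version_1997 = icd_version(1997)                           # "ICD-10"
version_1997_skane = icd_version(1997, region_skane=True)  # "ICD-9"
```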

For practical reasons related to changes in the ICD-coding system, it is recommended not to use ICD-7 codes to define exposure status in the Patient Register, meaning that studies usually start from 1969 and onward. The Patient Register became nationally complete in 1987, so diagnoses recorded prior to that stem from the regions of Sweden that were the first adopters of the register. Selection bias due to this is possible. Another important consideration is that outpatient contacts became available in 2001, meaning that diagnoses recorded prior to that are based on inpatient visits. Hence, studies need to consider the data source for included patients. Figure 3 presents the year of first recorded diagnosis of a liver disease in the inpatient register (between 1969 and 2016) and the outpatient register (between 2001 and 2016), respectively. Competing causes of certain diagnoses also need to be considered. For instance, ascites can have non-hepatic causes, and coding for ascites needs to be combined with coding for chronic liver disease if hepatic ascites is the outcome in a study [Citation7]. Additionally, consideration of clinician coding patterns is important. For instance, an event of bleeding esophageal varices might not include coding for cirrhosis. Use of composite outcomes to ascertain progression of chronic liver disease to cirrhosis is therefore recommended [Citation13].
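The ascites example can be sketched as a rule that counts an ascites code as hepatic only when a chronic liver disease code is also recorded for the same person. The code lists below are illustrative assumptions: R18 is used as the ICD-10 code for ascites and K70 is the only liver code named in the text; a real study would use its own validated lists.

```python
ASCITES_PREFIXES = ("R18",)  # ICD-10 ascites; illustrative assumption
LIVER_PREFIXES = ("K70",)    # only K70 is named in the text


def has_code(codes, prefixes) -> bool:
    """Return True if any code starts with any of the given prefixes."""
    normalized = [c.replace(".", "").upper() for c in codes]
    return any(c.startswith(p) for c in normalized for p in prefixes)


def is_hepatic_ascites(event_codes, history_codes) -> bool:
    """Ascites at the event, plus a liver code at the event or earlier."""
    all_codes = list(event_codes) + list(history_codes)
    return (has_code(event_codes, ASCITES_PREFIXES)
            and has_code(all_codes, LIVER_PREFIXES))


hepatic = is_hepatic_ascites(["R18"], ["K703"])    # liver code in history
non_hepatic = is_hepatic_ascites(["R18"], ["I10"])  # no liver code
```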

Figure 3. Number of unique patients with a first diagnosis of any liver disease in the inpatient and outpatient register, respectively. Note that a patient could be recorded in both registries. The high number of patients in the outpatient register in 2001 is due to lag since this was the initiation year of the outpatient register.


Additionally, outcomes that are captured to a high degree in the registers used are likely to have higher validity than outcomes that are not. For instance, investigating bleeding esophageal varices as an outcome is likely to capture most events, since this to a high degree would lead to contact with hospital-based healthcare and thereby be recorded in the inpatient register. We have recently validated diagnoses associated with cirrhosis and found that these have high PPVs [Citation7]. On the other hand, investigating hypertension as an outcome will not record diagnoses made in primary care, leading to a falsely low incidence. One way to account for this is to use the Prescribed Drug Register, allowing for capture of medications associated with hypertension, although this must be restricted to the period after July 2005 when this register became active.

Potential for research

A wide range of studies can be performed in DELIVER. Survival analyses can be done to investigate the clinical course of liver diseases and capture not only instances of mortality, but also diverse outcomes based on any set of ICD codes, such as fractures, cardiovascular disease, cancers, and suicides. The Prescribed Drug Register can be used to examine the effects of pharmacotherapies but needs careful planning of the study design. For instance, use of aspirin in patients with viral hepatitis was recently shown to be associated with a lower incidence of liver cancer and a lower risk of liver-related mortality [Citation15]. The Prescribed Drug Register can also be used to define outcomes based on ATC-codes for medications. As an example, dispensed prescriptions for metformin can be used to define occurrence of type 2 diabetes, thereby also identifying patients only followed in primary care. Other examples are health economic evaluations, examination of risk factors for progression to outcomes such as HCC, studies of regional differences in access to healthcare such as liver transplantation, and other socio-economic studies. A first paper based on this cohort was recently published, examining rates of cancer development in patients with NAFLD compared to matched reference individuals [Citation16].
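An ATC-based outcome definition such as the metformin example can be sketched as the date of the first dispensation with a matching ATC code, restricted to the period when the Prescribed Drug Register is available. The record layout and function name are hypothetical; A10BA02 is the ATC code for metformin, and the other code is included only as a non-matching example.

```python
from datetime import date

REGISTER_START = date(2005, 7, 1)  # Prescribed Drug Register availability
METFORMIN_ATC = "A10BA02"


def first_dispensation(dispensations, atc_prefix, not_before=REGISTER_START):
    """Return the earliest dispensation date for an ATC prefix, or None.

    `dispensations` is an iterable of (atc_code, dispensation_date) pairs,
    a hypothetical layout chosen for this sketch.
    """
    dates = [d for atc, d in dispensations
             if atc.startswith(atc_prefix) and d >= not_before]
    return min(dates) if dates else None


dispensations = [
    ("A10BA02", date(2008, 3, 1)),
    ("A10BA02", date(2007, 5, 10)),
    ("C09AA02", date(2006, 1, 1)),  # unrelated drug, should be ignored
]
diabetes_date = first_dispensation(dispensations, METFORMIN_ATC)
```

Because the register starts in July 2005, a first dispensation shortly after that date may reflect prevalent rather than incident disease, which is one reason such definitions need careful planning.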

Strengths and limitations

DELIVER has several major strengths. First, the design allows for capture of all diagnoses of interest in the available registries, giving a low risk of selection bias. Since healthcare is tax-funded and essentially free in Sweden, patients from all socio-economic groups can be identified. The size of the cohort allows for high statistical precision, and the extensive follow-up is important since it ensures sufficient time from exposure to outcome, which is of particular interest in liver disease epidemiology. The low loss to follow-up is of high importance since it reduces selection and misclassification bias. Long and accurate follow-up also enables investigation of time trends in disease epidemiology and evaluation of changes in practice or available pharmacotherapies. Importantly, DELIVER also holds data on matched reference individuals. This enables comparisons with the general population, which is important for determining relative risk estimates and drawing accurate conclusions regarding excess risk in the examined population.

Limitations are discussed above and relate to the design of the registers used. In general, study designs need to reflect the data source for the exposure (inpatient vs. outpatient part of the Patient Register) and for the outcomes (severe outcomes are captured to a higher degree).

Conclusion

Here we present the Swedish DELIVER cohort and the registers from which the cohort is constructed. We highlight strengths and limitations as well as considerations for study design. In essence, DELIVER allows for important additions to liver disease epidemiology.

Author contributions

Study conception and design: HH; Acquisition of data: HH; Statistical analysis: LW; Analysis and interpretation of data: All; Drafting of manuscript: HH; Critical revision: All; Guarantor of article: HH; All authors approved the final version of the article, including the authorship list; Writing assistance: None.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

HH was supported by grants from Region Stockholm, The Swedish Cancer Society, Radiumhemmet Research Foundation, the Åke Wiberg Foundation and the Swedish Research Council.

References

  1. Howe CJ, Cole SR, Lau B, et al. Selection bias due to loss to follow up in cohort studies. Epidemiology. 2016;27:91–97.
  2. Hagstrom H, Nasr P, Ekstedt M, et al. Fibrosis stage but not NASH predicts mortality and time to development of severe liver disease in biopsy-proven NAFLD. J Hepatol. 2017;67(6):1265–1273.
  3. Poynard T, Mathurin P, Lai CL, PANFIBROSIS Group, et al. A comparison of fibrosis progression in chronic liver diseases. J Hepatol. 2003;38(3):257–265.
  4. Brooke HL, Talbäck M, Hörnblad J, et al. The Swedish cause of death register. Eur J Epidemiol. 2017;32(9):765–773.
  5. Ludvigsson JF, Otterblad-Olausson P, Pettersson BU, et al. The Swedish personal identity number: possibilities and pitfalls in healthcare and medical research. Eur J Epidemiol. 2009;24(11):659–667.
  6. Ludvigsson JF, Andersson E, Ekbom A, et al. External review and validation of the Swedish national inpatient register. BMC Public Health. 2011;11:450.
  7. Bengtsson B, Askling J, Ludvigsson JF, et al. Validity of administrative codes associated with cirrhosis in Sweden. Scand J Gastroenterol. 2020;55(10):1205–1206.
  8. Barlow L, Westergren K, Holmberg L, et al. The completeness of the Swedish cancer register: a sample survey for year 1998. Acta Oncol. 2009;48(1):27–33.
  9. Torner A, Stokkeland K, Svensson A, et al. The underreporting of hepatocellular carcinoma to the cancer register and a log-linear model to estimate a more correct incidence. Hepatology. 2017;65(3):885–892.
  10. Wettermark B, Hammar N, Fored CM, et al. The new Swedish prescribed drug Register-opportunities for pharmacoepidemiological research and experience from the first six months. Pharmacoepidemiol Drug Saf. 2007;16(7):726–735.
  11. Laugesen K, Ludvigsson JF, Schmidt M, et al. Nordic health Registry-Based research: a review of health care systems and key registries. Clin Epidemiol. 2021;13:533–554.
  12. Ludvigsson JF, Svedberg P, Olén O, et al. The longitudinal integrated database for health insurance and labour market studies (LISA) and its use in medical research. Eur J Epidemiol. 2019;34(4):423–437.
  13. Hagström H, Adams LA, Allen AM, et al. Administrative coding in electronic health care record-based research of NAFLD: an expert panel consensus statement. Hepatology. 2021;74(1):474–482.
  14. Ludvigsson JF, Håberg SE, Knudsen GP, et al. Ethical aspects of registry-based research in the Nordic countries. Clin Epidemiol. 2015;7:491–508.
  15. Simon TG, Duberg AS, Aleman S, et al. Association of aspirin with hepatocellular carcinoma and Liver-Related mortality. N Engl J Med. 2020;382(11):1018–1028.
  16. Björkström K, Widman L, Hagström H. Risk of hepatic and extrahepatic cancer in NAFLD: a population-based cohort study. Liver International. 2022. DOI: https://doi.org/10.1111/liv.15195