A multifactorial study on duration of temporary disabilities in Spain

Pages 328-335 | Received 04 May 2016, Accepted 25 Oct 2016, Published online: 29 Nov 2016

ABSTRACT

The extreme variability in the duration of temporary disability has a deep effect on public health. We tried to understand which factors the duration of disability depends on. In a cohort study of temporary disabilities recorded by Ibermutuamur from 2008 to 2012, we applied multivariate statistical methods. The most reliable and convenient algorithm for predicting duration was a classification tree that distinguished between brief and long disabilities, taking into account both medical-biological and socioeconomic factors. The influence of socioeconomic factors on the disability process made numeric predictive models insufficiently accurate. Some of these socioeconomic factors were isolated and their influence was quantified. In particular, the factor we named unemployment could explain a large increase in duration for certain common diagnoses such as anxiety, low back pain, headache, and depression.

Many efforts have been made in different countries to understand which factors explain the duration of disability [1–3]. In this work, we analyze data on temporary disability (TD) due to common disease. These data were collected in Spain from 2008 to 2012 by Ibermutuamur, one of the main and most widespread companies (Mutuas) that cooperate with the National Institute of Social Security in managing workers' compensation insurance and medical care in cases of temporary disability, both strictly work related (due to a work accident or a typical occupational disease) and not work related (due to a common disease or an accident not related to work). About 78% of temporary disabilities in Spain are managed by Mutuas, so the data we studied can be considered representative for understanding which factors affect the duration of temporary disabilities in Spain. Besides biological and medical factors, we analyzed other factors included in the data (according to Ibermutuamur's data-gathering protocol). Some of them, such as socioeconomic and occupational factors, were considered relevant in previous papers [4,5]. In Spain, recent studies have been based on small samples [6–9]. In this study, we tried to take advantage of the large size of the data set and of powerful data mining methods to isolate the relevant factors as clearly as possible.

The study of this issue obviously depends on the country in question, since its specific rules must be taken into account. In Spain [7], employees receive 60% of their salary from the 4th to the 20th day of TD and 75% for the rest of TD. Moreover, the benefit is paid by the company during the first 15 days and (directly or indirectly) by the Mutua from then on. It is therefore important to discriminate whether TD lasts more than 15 days. Self-employed workers receive the same benefit, but it is paid directly by the Mutua from the fourth day until the end of TD.
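As a minimal sketch of the payment rules just described, the following Python function lists, for each day of a TD episode, the fraction of salary received and who pays it. The function names, the payer labels, and the treatment of days 1 to 3 as carrying no benefit are assumptions made for illustration; the text only states the rules from the fourth day on.

```python
def td_benefit_schedule(duration_days, self_employed=False):
    """Return a list of (salary_fraction, payer) tuples, one per day of TD."""
    schedule = []
    for day in range(1, duration_days + 1):
        if day < 4:
            # Assumption: no benefit is modeled before the fourth day.
            schedule.append((0.0, None))
            continue
        # 60% of salary from day 4 to day 20, 75% afterwards.
        fraction = 0.60 if day <= 20 else 0.75
        # Employees: company pays through day 15, Mutua afterwards.
        # Self-employed: Mutua pays directly from day 4.
        if self_employed or day > 15:
            payer = "mutua"
        else:
            payer = "company"
        schedule.append((fraction, payer))
    return schedule


def days_paid_by(schedule, payer):
    """Count the days of the episode paid by the given payer label."""
    return sum(1 for _, who in schedule if who == payer)


employee = td_benefit_schedule(25)
print(days_paid_by(employee, "company"))  # days 4 to 15
print(days_paid_by(employee, "mutua"))    # days 16 to 25
```

The change of payer after day 15 is what motivates the 15-day threshold used for the categorical response later in the study.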

Methods

The original data comprised N = 1,085,824 entries, each consisting of 40 values (most of them categorical) corresponding to a specific person who received medical release after a temporary medical leave that occurred between January 1, 2008 and December 31, 2012. The duration of TD (total number of days of medical leave) was the response variable, so every entry consisted of one response variable and 39 predictors. Duration was also transformed into a binary categorical variable indicating whether it exceeded 15 days. We distinguished two main types of predictors: medical-biological and socioeconomic. Before searching for a suitable statistical model, the data were simplified by considering just the most common diagnoses in workers according to the Pareto rule (see Figure 1), which reduced the total to N = 370,076. Moreover, some original predictors were removed after applying an ANOVA model (as explained in the following), so that we eventually considered just 10 final predictors: gender, age, occupation, geographical zone (we distinguished 4 zones, chosen to be as heterogeneous as possible), main diagnosis (we considered just the 29 most common diagnoses), existence of codiagnosis, type of employment regime (employee or self-employed), payment (indirect if Ibermutuamur pays through the company, direct if Ibermutuamur pays the worker directly), civil status, and children (having them or not). Tables 1 and 2 give further details about these variables. Variables not considered in the study included, for example, number of different episodes during TD, smoking status, drinking status, medical background, town, type of contract, beginning date, and ending date.
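The two preprocessing steps mentioned above (keeping only the most common diagnoses and recoding duration as a binary variable) might look roughly like the sketch below. The text does not state the exact cutoff used for the Pareto rule, so the `coverage` parameter, the function names, and the toy diagnosis labels are all hypothetical.

```python
from collections import Counter


def most_common_diagnoses(diagnoses, coverage=0.8):
    """Keep the most frequent diagnoses until `coverage` of entries is reached.

    `coverage` is a hypothetical cutoff; the study does not report the one used.
    """
    counts = Counter(diagnoses)
    total = sum(counts.values())
    selected, covered = [], 0
    for diagnosis, n in counts.most_common():
        if covered / total >= coverage:
            break
        selected.append(diagnosis)
        covered += n
    return selected


def is_long_td(duration_days, threshold=15):
    """Binary response: True when TD lasts more than `threshold` days."""
    return duration_days > threshold


# Toy data, invented for illustration only.
toy = ["low back pain"] * 6 + ["anxiety"] * 3 + ["migraine"] * 1
print(most_common_diagnoses(toy))
print(is_long_td(15), is_long_td(16))
```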

Figure 1. Distribution of the main diagnoses.

Table 1. Duration of temporary disability by diagnosis.

Table 2. Duration of temporary disability by the remaining relevant variables.

Statistical analysis

The statistical software SPSS 19.0 (Chicago, IL, USA) and R 3.2.2 (R Foundation for Statistical Computing) were used to analyze the data. Because of the large sample size, the main decisions throughout the statistical process were based on the performance of the different models on a random validation subsample comprising about 30% of the data. In other words, 258,711 entries were used as a training set to fit the different models, and 111,365 entries were used as a validation set.
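A random split of this kind can be sketched as follows. This is an illustration only, not the procedure actually run in SPSS or R; the function name, the seed, and the exact-fraction rule are assumptions. The reported sizes add up to the 370,076 entries retained, with a validation share of about 30.1%.

```python
import random


def train_validation_split(n_entries, validation_fraction=0.30, seed=0):
    """Shuffle entry indices and split them into training and validation sets."""
    indices = list(range(n_entries))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    n_validation = round(n_entries * validation_fraction)
    return indices[n_validation:], indices[:n_validation]


train, validation = train_validation_split(1000)
print(len(train), len(validation))

# Sizes reported in the text.
reported_train, reported_validation = 258_711, 111_365
print(reported_train + reported_validation)
print(round(reported_validation / (reported_train + reported_validation), 3))
```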

For quantitative prediction, a weighted least squares (WLS) multifactor ANOVA [10] and a gamma generalized linear model [11] (with a logarithmic link function) were fitted to log-duration and duration, respectively, using the categorical predictors and the training sample. A logarithmic transformation was used in both models because of the strong skewness of duration. Skewness also led us to consider the median instead of the mean in some parts of the study.

Sources of variability in the ANOVA model (variables as well as interactions between them) were carefully chosen according to adjusted R², so that we could explain the maximum percentage of variance with the minimum degrees of freedom. In addition, the influence of each source of variability was measured by the partial eta-squared coefficient (η²), and sources with η² under 0.001 were not considered in any case. The sources of variability removed from the ANOVA model were no longer considered in the rest of the study. The remaining variables are described in Table 2, where they appear sorted by their respective η² (when the interaction of a variable with diagnosis was large enough, the η² of the interaction was added to that of the variable itself in Figure 2).
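For reference, partial eta-squared is usually defined as the sum of squares of the effect divided by the sum of squares of the effect plus the error sum of squares. The sketch below applies that usual definition together with the 0.001 threshold mentioned above; the sums of squares and the name `minor_source` are invented for illustration and are not taken from the study.

```python
def partial_eta_squared(ss_effect, ss_error):
    """Usual definition: SS_effect / (SS_effect + SS_error)."""
    return ss_effect / (ss_effect + ss_error)


def retained_sources(ss_by_source, ss_error, threshold=0.001):
    """Keep the sources of variability whose partial eta-squared reaches the threshold."""
    kept = {}
    for name, ss_effect in ss_by_source.items():
        eta2 = partial_eta_squared(ss_effect, ss_error)
        if eta2 >= threshold:
            kept[name] = eta2
    return kept


# Hypothetical sums of squares, for illustration only.
example = {"diagnosis": 500.0, "payment": 40.0, "minor_source": 0.5}
print(retained_sources(example, ss_error=1000.0))
```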

Figure 2. Importance of main variables in categorical and numerical models.

For categorical prediction, we carried out a logistic regression [12] and a classification tree [13], for which we used the CRT method, the Gini index as the measure of error, a pruning coefficient of 0.5, and a maximum depth of 10 levels in the growth of the tree. The influence of each variable on categorical prediction was measured by its capability to reduce the Gini index. We also tried to improve the classification tree with AdaBoost [14], the most popular boosting algorithm [13], as implemented in the R package adabag [15]. We considered that the other possible improvement of the classification tree, random forest [16], was not appropriate in this case because of the particular conditions of our predictors [13]. Finally, a neural network algorithm [17] was applied.
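To make the Gini criterion concrete, the following sketch computes the Gini impurity of a set of brief/long labels and the reduction obtained by a candidate split. It illustrates the general technique only and is not the SPSS CRT implementation; the function names and the toy labels are assumptions.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_reduction(parent, left, right):
    """Impurity of the parent node minus the size-weighted impurity of its children."""
    n = len(parent)
    weighted = (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)
    return gini_impurity(parent) - weighted


# Toy node with five brief and five long episodes, split perfectly in two.
parent = ["brief"] * 5 + ["long"] * 5
print(gini_impurity(parent))
print(gini_reduction(parent, ["brief"] * 5, ["long"] * 5))
```

A variable that often yields large reductions of this kind across the tree is ranked as more important for the categorical prediction.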

To illustrate the joint association between all the categorical variables involved in categorical prediction, we applied a multiple correspondence analysis (MCA) [18], which produced Figure 3. There, each category is identified by a symbol specific to its variable (circles for the two categories of duration, stars for diagnoses, pentagons for occupations, etc.). Proximity between categories may be roughly understood as a direct statistical association if they belong to different variables and as similar statistical behavior if they belong to the same variable.

Results

Quantitative prediction

Duration of TD was strongly skewed, with a median of 7 days. In other words, temporary disabilities were usually short, although many of them were long or very long. Our challenge was to explain the behavior of the data with a suitable statistical model. ANOVA was applied to fit log-duration from the categorical predictors using just the training sample. An adjusted R² of 0.635 was reached with the 10 variables explained above and some interactions between them (112 degrees of freedom in all). The behavior of the model residuals turned out to be appropriate. On the validation sample, ANOVA predicted on average 10.8 days less than the real duration, with a standard deviation (SD) of 43.8. However, its error was close to 0 in terms of the median. The main factor was clearly diagnosis, followed by payment. The interaction between diagnosis and some other variables (especially payment) also turned out to be relevant. Figure 2 shows the influence of each variable according to its η² coefficient (numerical model). A gamma GLM was also fitted to duration with the same sources of variability, but it performed quite similarly to ANOVA.
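One possible reading of the gap between the mean and median errors, offered here as an assumption rather than as the authors' explanation, is the following: a model fitted to log-duration and back-transformed by exponentiation estimates something closer to a geometric mean than to an arithmetic mean, and for right-skewed data the geometric mean lies below the arithmetic mean. The durations below are invented for illustration.

```python
import math
import statistics

# Hypothetical right-skewed durations (days): mostly short, a few very long.
durations = [2, 3, 5, 7, 7, 10, 15, 30, 90, 365]


def back_transformed_log_mean(values):
    """Exponential of the mean of the logs, i.e. the geometric mean."""
    return math.exp(statistics.mean([math.log(v) for v in values]))


arithmetic_mean = statistics.mean(durations)
median = statistics.median(durations)
geometric_mean = back_transformed_log_mean(durations)

# The back-transformed value underestimates the arithmetic mean
# while staying much closer to the median.
print(arithmetic_mean, median, round(geometric_mean, 1))
```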

Categorical prediction

To predict whether TD would last more than 15 days, we first tried the logistic regression model, taking as predictors the sources of variability included in the ANOVA model and, as response, the categorized duration of TD. We obtained a Nagelkerke R² of 0.660 with the training sample and a misclassification error rate of 17.7% with the validation sample (12.0% for brief TD and 19.2% for long TD). Second, we applied the method that is, from our point of view, the most natural and convenient for this problem: the decision tree, with the variables selected in ANOVA as predictors. It led to a misclassification error rate of 14.2% with the validation sample (9.5% for brief TD and 24.9% for long TD). The importance of the variables according to their ability to reduce the Gini index is also shown in Figure 2 (categorical model) together with η² (the Gini index expresses the importance of a variable for predicting whether a TD will last more than 15 days, while η² expresses the capability of a variable to predict an exact duration). The AdaBoost algorithm did not improve on the classification tree, since the misclassification error rate was almost the same as that of the simple tree, and the same happened with the neural network algorithm.
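The overall and per-class misclassification rates quoted above can be computed from true and predicted labels as in this sketch. The function name and the toy labels are hypothetical and do not reproduce the study's validation data.

```python
def error_rates(true_labels, predicted_labels):
    """Return the overall misclassification rate and the rate within each true class."""
    totals, errors = {}, {}
    for truth, guess in zip(true_labels, predicted_labels):
        totals[truth] = totals.get(truth, 0) + 1
        errors[truth] = errors.get(truth, 0) + (truth != guess)
    overall = sum(errors.values()) / sum(totals.values())
    per_class = {label: errors[label] / totals[label] for label in totals}
    return overall, per_class


# Toy validation set, invented for illustration only.
truth = ["brief", "brief", "brief", "brief", "long", "long"]
guess = ["brief", "brief", "brief", "long", "long", "brief"]
overall, per_class = error_rates(truth, guess)
print(overall, per_class)
```

The overall rate is the average of the per-class rates weighted by class size, so it always lies between them.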

The main variables

The different models considered above are strongly consistent, since all of them led to similar conclusions. Because of its better interpretability, we would use the simple classification tree for categorical predictions and the ANOVA model for numerical predictions.

Diagnosis was the most important variable in both cases, as described in Table 1. The other relevant variables were, sorted according to their influence in the classification tree, payment, regime, civil status, children, codiagnosis, occupation, age, geographical zone, and gender. They are described in Table 2. The purpose of Figure 4 is to illustrate the data from a numeric point of view as simply as possible. There we can check the medians of duration depending on the two main factors in the numeric model: diagnosis and payment. We should clarify that regime also showed an important influence on duration, although it is strongly correlated with payment (namely, direct payment corresponded to 82.1% of the self-employed and just 3.8% of employees). To summarize the data from a categorical point of view, all the relevant categories are represented by MCA in Figure 3.

Figure 3. Categorical relationships between the relevant variables.

Figure 4. Median of duration by diagnosis and payment (direct or indirect).

Comment

While categorical prediction (classification tree) achieved an acceptable fit, we think that quantitative prediction turned out to be too ambitious in this case, since neither the ANOVA nor the gamma model is accurate enough to make reliable predictions. We also think that, in general and for very large samples like ours, the lack of accuracy is explained not by the statistical model chosen but by the lack of relevant information (variables) in the data. Let us look at the variables we have (although we prefer to talk about factors).

Diagnosis

Diagnosis was by far the most important factor. Figure 3 shows which diagnoses are in general associated with longer TD, namely, those to the right of the straight line. In fact, the first step in the classification tree roughly consists of knowing on which side of Figure 3 the diagnoses are located. Besides the large variability between diagnoses, we also found a huge variability within each diagnosis that could be partially explained by the other factors.

Payment regime

Payment was the next most important factor. It is quite surprising how much duration increased with direct payment, especially for some specific diagnoses (see Figure 4) such as depression (a median of one month for indirect and four months for direct payment), anxiety, threatened abortion, low back pain, headache, and migraine. Nevertheless, we found a smaller difference for other diagnoses such as inguinal hernia or carpal tunnel syndrome. Payment should be studied together with regime, since the two were statistically linked. In addition, the huge standard deviation of duration for direct payment suggests heterogeneous behavior in this group. This fact has been pointed out in previous studies based on smaller samples [6,9]. We will try to explain this subject better in the section devoted to unemployment and other irregularities.

Age and familial responsibilities

Civil status showed an important influence in the model, since medians ranged from 6 days for singles to 26 for widows and widowers. We can also see in Table 2 that duration of TD was greater for people with children. Moreover, duration of TD clearly increased with age. We can see in Figure 3 that these three variables were correlated, in the sense that youth was associated with single status and lack of children, and these circumstances led to brief TD. This could therefore be explained as the influence on duration of two correlated factors: a biological factor, namely age, and a sociological factor that we could name “familial responsibilities.”

Codiagnosis

As expected, the existence of a codiagnosis is clearly associated with longer TD, as we can see in Table 2. However, its influence in the model is only moderate (see Figure 2) because there were very few entries with a codiagnosis. That is why there is no specific figure for this variable.

Occupation

The influence of occupation in the model was smaller than expected. Although the longest median TD corresponds by far to managers and armed forces (see Table 2), these two occupations also had by far the largest percentages of direct payment (59% and 53%, respectively). Thus, we could regard the payment regime as the real source of this influence. We did not find a strong correlation between diagnosis and occupation, except in the case of inguinal hernia and musculoskeletal pathologies, which are associated with craft, agriculture, forestry, fishery, and plant and machine operators. Some of these statements are similar to conclusions found in previous studies [9].

Geographic location

Zone played a more important role in the numerical model than in the categorical model (see Figure 2). We can also see in Table 2 that Zone 1, corresponding mainly to the west and south of Spain, had a median of 13 days of TD, while Zone 4, composed mainly of the most socioeconomically developed areas in north and central Spain, had a median of just 5 days.

Gender

Gender turned out to be relevant in the model because of its moderate interaction with diagnosis and with age. For example, low back pain, which, according to Figure 1, is the most common diagnosis, lasted a median of 11 days for men and 22 days for women. Something similar happened with sciatica.

Differences between genders were most pronounced among workers under 42 years old. Since there was little interaction between gender and children in the ANOVA model, we cannot explain these differences in terms of familial responsibility. Therefore, according to the data, we are inclined to think that young women are more sensitive than young men to lumbar problems.

Some other variables not considered

When we say that the 10 variables studied were the most relevant, we are not saying that the predictors removed at the beginning of the study had no correlation with duration. We just mean that they explained nothing beyond what the 10 main variables explain. In the case of smoking status and drinking status, we did not even find a simple correlation with duration. Nevertheless, this conclusion is not reliable, since unfortunately these two variables were recorded only for a small portion of special patients (3.2% and 2.7%, respectively). We feel these variables could have improved the model if they had been recorded better. The same problem (missing data) affected, to a lesser degree, some other variables, such as children, civil status, and occupation.

Unemployment and other irregularities

What do payment and employment regime really depend on? Although these strongly correlated variables were briefly explained previously, we add three remarks about them. First, it must be taken into account that 90.7% of workers were employees, and only 3.8% of them received direct payment (34,483 patients in all). Employees were paid indirectly through their company until the end of TD unless they were fired or their labor contract ended during TD. In that case, payment was assumed directly by Ibermutuamur from that moment.

From this point of view, long disabilities could be understood as a refuge against unemployment. So, while many employees probably tried to hide their sickness or to shorten their TD for fear of being fired [4], others probably tried to prolong TD because they were about to be fired or had already been fired (direct payment). This could explain the extremely large skewness of TD duration that has been pointed out in previous studies [8]. We are even inclined to think of a vicious circle of disability and dismissal: many employees were probably fired because of disability, and they prolonged disability because of the dismissal.

Second, 61.3% of direct payment corresponded to the self-employed (managers). According to our experience, it is possible that many of these workers kept charging Ibermutuamur while their companies were in fact bankrupt, so that TD could be considered a refuge against unemployment as well. It is also very likely that other managers kept running their small businesses irregularly while receiving direct payment from Ibermutuamur for a long time, as has been described in recent studies [8].

Third, as we can see in Figure 4, these kinds of unexpectedly long TD are associated with typical diagnoses that are hard to verify. In these cases, the median for direct payment is about 4 times (or more) the median for indirect payment. Thus, there is a not strictly medical component in the diagnosis. All these facts help us understand the huge importance of nonmedical factors in TD.

Limitations of the study

Although temporary disability is a global subject, we found it difficult to compare this study with others in different countries because of the sociological differences and obvious variations between legislations. In addition, we have taken into account a few recent studies in Spain dealing with this issue that were not published in English, so it could be hard to compare our conclusions with the statements of previous studies. Nevertheless, the most contentious point in the study (unemployment) has been explained according to the professional experience of the authors.

Moreover, despite the large sample size, several circumstances could undermine some conclusions. First, we did not control data gathering. As a result, some important variables, such as smoking and drinking status, were not properly recorded. Furthermore, medical diagnoses should have been less specific in order to analyze the whole data set. Second, since Ibermutuamur is a widespread company in Spain, we think that our sample is representative, but we cannot guarantee this. Third, the sample was not stable over the years.

Conclusions

We have obtained an algorithm to distinguish between brief (15 days at most) and long (over 15 days) TD, depending on some biological and medical factors (diagnosis, codiagnosis, age, and gender) and on some socioeconomic factors (unemployment, familial responsibility, geographic location, and others). The influence of the second kind of factor is large and difficult to control. We think this is the reason numerical predictions are not accurate enough. Nevertheless, we have studied them as deeply as possible.

Funding

This work was supported by the Junta de Extremadura (Autonomous Government of Extremadura, Spain) under the project GR15013.

References

  1. Cheadle A, Franklin G, Wolfhagen C, Savarino J, Liu PY, Salley C, Weaver M. Factors influencing the duration of work-related disability: a population-based study of Washington State workers' compensation. Am J Public Health. 1994;84:190–196.
  2. Álvarez-Theurer E, Llergo-Muñoz A, Vaquero-Abellán M. Modelo predictivo de la duración de incapacidad temporal por lumbalgia. Factores determinantes. Atención Primaria. 2005;14:10–17.
  3. Álvarez-Theurer E, Llergo-Muñoz A, Vaquero-Abellán M. Análisis de la duración de los periodos de incapacidad temporal de los procesos de Andalucía. Factores asociados. Atención Primaria. 2009;41:387–393.
  4. Virtanen M, Kivimäki M, Elovainio M, Sund R, Virtanen P, Ferrie JE. Sickness absence as a risk factor for job termination, unemployment, and disability pension among temporary and permanent employees. Occup Environ Med. 2006;63:212–217.
  5. Benavides FG. Occupational categories and sickness absence certified as attributable to common disease. Eur J Public Health. 2013;13:51–55.
  6. Dominguez A, López R, Gordillo F, Pérez-Nieto MA, Gómez A, De la Fuente JL. Distorsión clínica y simulación en la incapacidad temporal. Psicopatología Clínica Legal y Forense. 2013;13:29–45.
  7. Llordén S. El despido durante la incapacidad temporal [monografía en internet]. Madrid: Universidad Pontificia de Comillas; 2014. https://repositorio.comillas.edu/jspui/bitstream/11531/600/1/TFG000525.pdf.
  8. Pérez MA. Análisis del resultado en el proceso de incapacidad temporal en el área sanitaria de Albacete. Influencia del estado de salud, factores sociodemográficos, satisfacción laboral y locus de control [thesis]. Albacete: Universidad de Castilla La Mancha, Facultad de Medicina; 2014.
  9. Ruiz-Moraga M, Catalina-Romero C, Martínez-Muñoz P, et al. Periodo prequirúrgico y duración de la incapacidad temporal por contingencias comunes en la hernia inguinal. Cirugía Española. 2014;92:269–276.
  10. Searle SR. Linear Models. New York: Wiley; 1971.
  11. Ahn H. Log-gamma regression modeling through regression tree. Commun Stat Theory Methods. 1996;25:295–311.
  12. Dobson AJ. An Introduction to Generalized Linear Models. New York: Chapman & Hall; 1990.
  13. Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning. Amsterdam: Elsevier; 2008.
  14. Freund Y, Schapire R. A decision-theoretic generalization of online learning and an application to boosting. J Comput Syst Sci. 1997;55:119–139.
  15. Alfaro E, Gámez M, García N. Adabag: an R package for classification with boosting and bagging. J Stat Softw. 2013;54:1–35.
  16. Breiman L. Random forests. Machine Learning. 2001;45:5–32.
  17. Ripley BD. Pattern Recognition and Neural Networks. Cambridge: Cambridge University Press; 1996.
  18. Greenacre MJ. Theory and Applications of Correspondence Analysis. Amsterdam: Elsevier; 1984.