Original Articles

Reliability of observers' subjective impressions of families: A generalizability theory approach

Pages 448-463 | Received 14 Feb 2012, Accepted 21 Sep 2012, Published online: 15 Oct 2012
 

Abstract

Parenting was observed in videotaped interactions in 30 families referred for child conduct problems. Generalizability coefficients and the impact of varying numbers of raters were estimated. Two measurement designs were compared: all raters observed all families (“crossed” design) and a different rater observed each family (“nested” design). The crossed design provided higher generalizability coefficients than the nested design, implying inflated generalizability estimates if a crossed estimation model is used for a nested data collection. Three and four raters were needed to obtain generalizability coefficients in the .70–.80 range for monitoring and discipline, respectively. One rater was sufficient for a corresponding estimate for positive involvement and for an estimate in the .80–.90 range for problem-solving. Estimates for skill encouragement were not acceptable.

Se observaron las actitudes y conductas parentales (parenting) a través de video-filmaciones de 30 familias cuyos hijos habían sido derivados a consulta por problemas de conducta. Se estimaron los coeficientes de generalizabilidad y el impacto que tenía variar el número de evaluadores. Se compararon dos diseños de medición: a) todos los observadores evaluaron a todas las familias (diseño “cruzado”), o bien b) un observador fue asignado a cada familia (diseño “en nido”). El diseño cruzado mostró mejores coeficientes de generalizabilidad que el diseño en nido, lo cual implica que se verían artificialmente aumentadas las estimaciones si se aplican índices del modelo cruzado a un modelo en nido. Se necesitaron 3 y 4 evaluadores para alcanzar un coeficiente con un rango entre .70 o .80 para evaluar monitorización y disciplina, respectivamente. Un solo evaluador fue suficiente para lograr un rango similar de generalizabilidad para la categoría “involucramiento positivo” y dos para un nivel similar en el ítem resolución de problemas. Las estimaciones no fueron aceptables para el ítem estímulo de habilidades.

Das Erziehungsverhalten von 30 Familien, die aufgrund von Verhaltensproblemen der Kinder in Behandlung waren, wurde anhand von Videoaufnahmen beobachtet. Generalisierungskoeffizienten und der Einfluss unterschiedlicher Anzahl von Ratern wurden bestimmt. Zwei Messdesigns wurden verglichen: alle Rater beobachteten alle Familien (“crossed” design) und unterschiedliche Rater beobachteten unterschiedliche Familien (“nested” design). Das “crossed”-Design erbrachte höhere Generalisierungskoeffizienten als das “nested”-Design, was impliziert, dass Generalisierungseinschätzungen überhöht werden, wenn Kreuzbeurteilungsmodelle auf “genestete” Datenerhebungen angewandt werden. 3 und 4 Rater wären nötig, um eine Generalisierbarkeit zwischen .70 – .80 für Monitoring (Überwachen) beziehungsweise Maßregelung zu erhalten. Ein Rater war ausreichend für korrespondierende Einschätzungen bezüglich Positives Beteiligtsein und für eine Einschätzung zwischen .80 – .90 bezüglich Problemlösung. Einschätzungen bezüglich Fähig- bzw. Fertigkeitsanregungen waren nicht akzeptabel.

Il Parenting è stato osservato nelle interazioni videoregistrate in 30 famiglie che riferivano problemi della condotta del figlio. Sono stati valutati i coefficienti di generalizzabilità e l'impatto relativo al numero dei valutatori. Sono stati confrontati due disegni di ricerca. In uno tutti i valutatori hanno osservato tutte le famiglie (disegno di tipo “crossed”) e nell'altro un valutatore diverso ha osservato ciascuna famiglia (disegno di tipo “nested”). Il disegno crossed ha fornito coefficienti di generalizzabilità più alti rispetto a quello nested: ciò implica stime di generalizzabilità gonfiate se un modello di valutazione crossed è usato per una raccolta di dati nested. Tre e quattro valutatori erano necessari per ottenere coefficienti di generalizzabilità in un range di .70–.80 rispettivamente per il monitoraggio e la disciplina. Un valutatore era sufficiente per una stima di .70–.80 del coinvolgimento positivo e per una stima di .80–.90 per il problem solving. Le valutazioni dell'incoraggiamento dell'abilità non erano accettabili.
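The contrast between the crossed and nested designs can be illustrated with a minimal sketch of a D-study calculation. It assumes a simplified one-facet design (families by raters) and uses the standard relative generalizability coefficient from generalizability theory. The variance components below are hypothetical values chosen for illustration, not estimates from this study, and the function names are likewise illustrative. In the nested design the rater main effect cannot be separated from the residual, so it adds to the error term and lowers the coefficient.

```python
def g_coefficient_crossed(var_family, var_residual, n_raters):
    """Relative G coefficient when all raters observe all families.

    Only the family-by-rater interaction (confounded with residual
    error) contributes to relative error variance.
    """
    return var_family / (var_family + var_residual / n_raters)


def g_coefficient_nested(var_family, var_rater, var_residual, n_raters):
    """Relative G coefficient when different raters observe each family.

    The rater main effect is confounded with the residual, so both
    components enter the error variance.
    """
    return var_family / (var_family + (var_rater + var_residual) / n_raters)


def raters_needed(coefficient_fn, target, max_raters=20):
    """Smallest number of raters whose coefficient reaches the target."""
    for n in range(1, max_raters + 1):
        if coefficient_fn(n) >= target:
            return n
    return None  # target not reachable within max_raters


# Hypothetical variance components (NOT taken from the article).
VAR_FAMILY, VAR_RATER, VAR_RESIDUAL = 0.50, 0.10, 0.40

crossed_n = raters_needed(
    lambda n: g_coefficient_crossed(VAR_FAMILY, VAR_RESIDUAL, n), 0.70)
nested_n = raters_needed(
    lambda n: g_coefficient_nested(VAR_FAMILY, VAR_RATER, VAR_RESIDUAL, n), 0.70)

for n in range(1, 5):
    crossed = g_coefficient_crossed(VAR_FAMILY, VAR_RESIDUAL, n)
    nested = g_coefficient_nested(VAR_FAMILY, VAR_RATER, VAR_RESIDUAL, n)
    print(f"raters={n}  crossed={crossed:.3f}  nested={nested:.3f}")
print(f"raters needed for .70: crossed={crossed_n}, nested={nested_n}")
```

With these illustrative components, the crossed coefficient exceeds the nested one for every number of raters, which is why applying a crossed estimation model to data collected under a nested design would overstate generalizability.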

Notes

1. The estimates of the variance components for the main effect of the father facet may be biased. The urGENOVA statistical program (Brennan, 2001b), which is based on a completely random model, was applied to estimate the G-study variance components for the crossed design. The pooling procedure was based on the sampling variance for the components estimated in each sub-sample of mothers, as described above. Because the father facet is considered to be a fixed facet, the present weighting procedure in pooling the components may have produced a biased variance component for the father facet. However, our research aim was not to assess the relative importance of the different sources of variation but to compare the generalizability coefficients derived from different D-study designs. D-study estimations assumed the father facet to be fixed, although the sampling status of the father facet does not affect the estimated generalizability coefficients.