REVIEW ARTICLE

True or false: All genes are rhythmic

Pages 1-12 | Received 10 Apr 2010, Accepted 01 Nov 2010, Published online: 08 Dec 2010

Abstract

Multiple microarray studies have documented the importance of circadian regulation of gene expression in different species under many experimental conditions. However, these reports often differ with respect to the identity and total number of oscillating genes. This review explores the interrelated questions of: How many genes are oscillating within individual tissues or systems? What are the forces that drive these oscillations? What are the methodological sources contributing to the discrepancy between estimates of gene oscillation? And finally, what are the physiological and systemic implications of oscillatory gene expression with respect to circadian molecular biology? Since this remains an evolving area of investigation, this hypothetical and speculative review also highlights the potential limitations faced by the current data in the literature relating to the novel paradigm(s) proposed.

Key message

  • Life is structured along the time dimension, and this structure is manifested in rhythmic gene expression.

When the development of scientific methodology in modern times opened life science to quantitative observations, diurnal rhythms were among the first phenomena to attract attention. In 1729 Jean-Jacques d'Ortous de Mairan, astronomer and geophysicist, published his observations that plant leaves moved in the dark in anticipation of sunrise, as if the plant had received a signal from a clock. The daily rhythms in animal physiology and behavior that persist in the absence of changing environmental cues were reported in subsequent scientific publications that heralded the new era of circadian biology (see (Citation1) for the historic review) (Citation2). The term ‘circadian’ or ‘approximately daily’ was introduced to the literature by Halberg (Citation3). Throughout the last third of the twentieth century circadian biology has gained new prominence as an independent scientific discipline. It became apparent that the circadian clock was encoded within the genomes of multiple phyla and composed of a number of interacting genes. The first discovery of the Per gene in Drosophila in 1971 (Citation4) has been followed by multiple studies characterizing components of the molecular clock through artificial mutagenesis (see (Citation5) for historic overview). The following decades brought to light the details of the molecular clock mechanism in mammals, insects, plants, and bacteria as well as multiple observations of interactions between the molecular clock and genes implicated in energy metabolism, development, signal transduction, sleep, and related physiological functions. Based on the known genetic components Hardin, Hall, and Rosbash proposed the transcription negative feedback loop model of the molecular clock (Citation6). With some additions and revisions this model became the template for studies of the circadian molecular clock in eukaryotes. 
Application of microarray technology starting a decade ago allowed large-scale observations of expression activity of most known genes, if not the entire genome (Citation7–10). Multiple publications concur that as many as 10% of all genes are influenced by the circadian molecular clock. This number varies depending on the tissue under investigation, the experimental conditions, and whether the authors’ calculations included all genes represented on the microarray or only those that were ‘actively expressed’. Nevertheless, the value of 10% is now the canonical estimate cited in reviews and textbooks. This review considers the origins of this value and explores alternative methodologies for estimating the number of oscillating genes.

Many circadian studies retain the general experimental design first employed in 1729. The subject (a plant or an animal) is entrained to a normal periodic change in environment—typically 12 hours of daylight and 12 hours of darkness—with constant ambient temperature. Afterwards, the subject is placed in constant darkness and observations are made using state-of-the-art technologies. While sketches were used in the 1700s, and physiological measurements were added in the 1900s, the introduction of ‘omics approaches in the late 1990s has opened a new era in circadian biology where researchers monitor the activity of thousands of different genes in each sample. These tools have made it possible to address the question: How many genes display a circadian oscillation?

Certain technical elements of the microarray studies influence their outcome and accuracy. First, these techniques are labor-intensive and expensive. Second, to estimate gene activity, a sample of biological material has to be taken for RNA extraction, an invasive procedure in which the animal or plant is biopsied or sacrificed. These factors curtail the number of samples that can be collected and processed in a single series of observations. Consequently, researchers face a serious challenge when planning experiments where both the length of observation and sampling rate are critically important. Ideally, the observations should cover at least two periods (i.e. 2 days) so that the cyclic process is replicated. A low sampling rate can limit the ability to detect oscillating genes or to distinguish the relative phase shift between oscillating genes. The limitations follow from the Nyquist theorem, which states that the number of distinguishable frequencies (or rhythms with different period lengths) in a series of observations cannot exceed half the number of samples taken. The majority of published microarray studies employ 12 microarrays in each series of observations, with samples taken every 4 hours, spaced equally over a 2-day interval (Citation10–13). Only a few studies have sampled more frequently: every third hour (Citation14,Citation15), every second hour (Citation16), or every hour (Citation17).
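The Nyquist constraint can be made concrete with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and it simply lists the periods of the Fourier frequencies available to an evenly sampled series, assuming the typical design of 12 samples taken every 4 hours.

```python
def resolvable_periods(n_samples, interval_hours):
    """Periods (in hours) of the Fourier frequencies of an evenly sampled series."""
    duration = n_samples * interval_hours
    # Harmonics k = 1 .. n_samples // 2; the last one is the Nyquist limit,
    # whose period equals twice the sampling interval.
    return [duration / k for k in range(1, n_samples // 2 + 1)]


periods = resolvable_periods(12, 4)
print(periods)  # six periods, from 48 h down to 8 h
```

With this design only six frequencies are available, the 24-hour rhythm is one of them, and nothing faster than an 8-hour rhythm can be resolved.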

A large amount of data must be analyzed to identify rhythmically expressed genes in microarray experiments. Detection of periodic patterns in noisy profiles of microarray probe intensities can be a challenge. The classic algorithmic approach to detect periodicity was developed prior to the introduction of ‘omics technology to circadian biology (Citation18,Citation19). Cosinor and related algorithms, such as COSCORR, COSOPT, etc., are based on fitting cosine curves to the time series. The significance of the oscillatory pattern is estimated through the goodness of fit. Although this approach pre-dates the advent of molecular biology methods, it remains popular among researchers (Citation8,Citation9,Citation20–22).
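The cosine-fitting idea can be sketched in a few lines. This is a minimal single-component cosinor under stated assumptions (a fixed 24-hour period, ordinary least squares, goodness of fit summarized as R squared); it is not the COSOPT or COSCORR implementation, and the function names are hypothetical. A full analysis would convert the fit into a P value, which is omitted here.

```python
import math


def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def cosinor_fit(times, values, period=24.0):
    """Fit y = mesor + a*cos(wt) + b*sin(wt) by least squares.

    Returns (mesor, amplitude, acrophase in hours, R squared).
    """
    w = 2.0 * math.pi / period
    rows = [(1.0, math.cos(w * t), math.sin(w * t)) for t in times]
    # Normal equations (X^T X) beta = X^T y for the three coefficients.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    mesor, a, b = solve_linear(xtx, xty)
    amplitude = math.hypot(a, b)
    acrophase = (math.atan2(b, a) / w) % period  # time of the fitted peak
    fitted = [mesor + a * r[1] + b * r[2] for r in rows]
    mean = sum(values) / len(values)
    ss_res = sum((y - f) ** 2 for y, f in zip(values, fitted))
    ss_tot = sum((y - mean) ** 2 for y in values)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return mesor, amplitude, acrophase, r_squared


# Synthetic, noise-free profile: 12 samples every 4 hours, peak at hour 6.
times = list(range(0, 48, 4))
values = [10 + 3 * math.cos(2 * math.pi * (t - 6) / 24) for t in times]
mesor, amplitude, acrophase, r_squared = cosinor_fit(times, values)
```

On real microarray profiles the R squared value would fall well below 1, and the decision about rhythmicity rests on how that fit compares with what noise alone would produce.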

Variations on a second approach are based on Fourier transformation followed by an estimation of the significance for particular peaks in a periodogram. The principles of statistical tests for the significance of a periodic pattern in time series were outlined by Fisher as early as 1929 (Citation23). These have been adapted to short time series with low sampling rates as exemplified by microarray studies. A general overview of this approach is given by Wichert et al. (Citation24).
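A minimal sketch of the periodogram approach follows. It computes Fisher's g statistic (the largest periodogram ordinate divided by the sum of all ordinates) and the exact P value from Fisher's formula under a Gaussian white-noise null. The function names are hypothetical, and the choice to exclude the zero and Nyquist frequencies is an assumption of this sketch rather than something specified in the text.

```python
import cmath
import math


def periodogram(values):
    """Periodogram ordinates at harmonics 1 .. (n - 1) // 2 of a centered series."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    ordinates = []
    for k in range(1, (n - 1) // 2 + 1):
        coef = sum(c * cmath.exp(-2j * math.pi * k * t / n)
                   for t, c in enumerate(centered))
        ordinates.append(abs(coef) ** 2 / n)
    return ordinates


def fisher_g_test(values):
    """Return (g, P value, index of the dominant harmonic)."""
    ordinates = periodogram(values)
    total = sum(ordinates)
    if total == 0:
        raise ValueError("constant series: periodogram is empty of power")
    m = len(ordinates)
    g = max(ordinates) / total
    # Fisher's exact formula: P(G > g) = sum_j (-1)^(j-1) C(m, j) (1 - j g)^(m-1)
    upper = min(m, int(math.floor(1.0 / g)))
    p = sum((-1) ** (j - 1) * math.comb(m, j) * (1 - j * g) ** (m - 1)
            for j in range(1, upper + 1))
    harmonic = ordinates.index(max(ordinates)) + 1
    return g, min(max(p, 0.0), 1.0), harmonic


# Noise-free 24-hour cosine sampled every 4 hours for 2 days.
profile = [math.cos(2 * math.pi * t / 24) for t in range(0, 48, 4)]
g, p_value, harmonic = fisher_g_test(profile)
```

For 12 samples only five ordinates enter the test, which is one way to see why its power is limited for short series.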

A third approach uses autocorrelation to detect periodicity. The significance of correlation is used to estimate the significance of a periodic pattern. However, in spite of its simplicity, this approach is rarely used in practice. Much more common are algorithms based on a permutation procedure. Unlike Fisher's g-test, these methods derive an expected noise level for the test from the time series by permuting the order of observations. A number of different algorithms using permutation as an element have been published in recent years (Citation10,Citation25–27).
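The permutation idea can be illustrated generically. The sketch below is not the algorithm of any of the cited publications: it uses the spectral power at a 24-hour period as the test statistic (an assumption), shuffles the order of observations to build the null distribution, and uses a fixed seed so the result is reproducible. All names are hypothetical.

```python
import math
import random


def circadian_power(values, times, period=24.0):
    """Squared amplitude of the projection onto a sinusoid of the given period."""
    w = 2.0 * math.pi / period
    mean = sum(values) / len(values)
    c = sum((v - mean) * math.cos(w * t) for v, t in zip(values, times))
    s = sum((v - mean) * math.sin(w * t) for v, t in zip(values, times))
    return c * c + s * s


def permutation_p_value(values, times, n_perm=2000, seed=0):
    """Fraction of shuffled series whose circadian power reaches the observed one."""
    rng = random.Random(seed)
    observed = circadian_power(values, times)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # destroys temporal order, keeps the value distribution
        if circadian_power(shuffled, times) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


times = list(range(0, 48, 4))
# Synthetic rhythmic profile with a small deterministic irregularity.
values = [math.cos(2 * math.pi * t / 24) + 0.02 * math.sin(1.7 * i)
          for i, t in enumerate(times)]
p_value = permutation_p_value(values, times)
```

The smallest P value this procedure can return is 1 / (n_perm + 1), so the number of permutations sets the resolution of the test.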

All of the tests mentioned above rely on a null hypothesis that no difference exists between the expected random and the observed patterns in the time series. Each test estimates the significance of the rhythmic pattern using P = 0.05 or 0.01 as the cut-off value (Citation16). Most studies report the exact number of oscillating genes, often along with the percentage of the transcriptome that was found to be oscillating. The body of genes to which this fraction is related is not always the entire number of genes or probes on a microarray but the subset of ‘actively expressed genes’, i.e. either the number of genes expressed above a preselected intensity level (Citation16) or those detected in more than half of the samples in a time-line (Citation10). Most studies employ a single method (or algorithm, see (Citation28) for recent review); however, a comparison between similar tests and experimental conditions can lead to a puzzling diversity of numbers. Therefore, how many genes are circadian, i.e. reported oscillating with periods spanning approximately one day?

One approach to this question is to avoid arbitrary cut-offs in the analysis of microarray data. In this review, the percentages of cycling genes shown have been recalculated in relation to the total number of interrogated transcripts (probe sets) in the microarray used in a particular study. Figure 1 shows the general overview of surveyed transcriptomes and fractions of oscillating genes reported in different studies (a more detailed explanation of the algorithm generating transcriptome heat-maps is given in Supplementary Figure 1). The initial report by Panda et al. of microarray analysis of circadian time series in mice identified several hundred oscillating transcripts (Citation8). Related to the total number of transcripts (probe sets) represented on the specific microarray and considered in the analysis, the cycling fraction of the transcriptome was 1.5% in liver and 0.8% in hypothalamus (Citation8). Using the same type of microarray and the same murine tissue, Ueda et al. reported 3.1% in liver and 0.8% in hypothalamus (Citation7). Storch et al. described a larger cycling fraction of 4.6% in the murine liver transcriptome in a similar experiment using an alternative test for periodicity (Citation10). Ptitsyn and colleagues, using a permutation-based algorithmic approach along with the more traditional Fisher's g-test and autocorrelation, reported up to 23.2% cycling genes in murine liver, up to 18.3% and up to 19.6% in white and brown fat (Citation12), and 26% in murine calvarial bone tissue (Citation11). More recent studies of the murine transcriptome in heart have revealed 1.6% in wild-type cardiomyocytes from the atria and 0.6% in those from the ventricles (Citation6). In murine skeletal muscle, circadian oscillation has been attributed to 0.6% of genes (Citation29).

Figure 1. Overview of the expression pattern and reported circadian fraction of transcriptome. The heat-map diagrams show higher (red) and lower (green) levels of gene expression for two (four in case of murine heart study) circadian periods for the entire set of transcripts interrogated in the analysis. The algorithm generating heat-maps of time-line gene expression is explained in Supplementary Figure 1. For the original reports marked (*) there is available reanalysis reporting circadian oscillation in nearly 100% of corresponding transcriptome.

Studies on other species produce numbers of circadian oscillating genes on the same scale. Experiments in Drosophila melanogaster report 1.2% cycling genes in the study by Boothroyd et al. (Citation14), 1% in McDonald and Rosbash (Citation30), 0.8% in Ceriani et al. (Citation21), and 0.5% in Lin et al. (Citation22). Likewise, studies of circadian rhythm in plants (Arabidopsis thaliana) document 15.4% circadian cycling genes in the study by Edwards et al. (Citation31) and 7% as reported by Covington and Harmer (Citation13). Because human gene expression has been harder to study, few complete circadian time series are available. Nevertheless, in human skeletal muscle, Zambon et al. identified only 0.3% of genes as putatively circadian (Citation32). Similar analysis of the human milk fat globule transcriptome identified a rhythmic pattern in approximately 4.6% of genes (Citation15).

An overall comparison between the studies above reveals significant discrepancies in quantitative estimations of the oscillating fraction of the transcriptome, even when studies are performed in the same tissue and species. The overlap between lists of circadian oscillating genes from different murine tissues is small despite the same experimental design and analysis strategy (Citation8). The overlap between lists of circadian genes from studies using different microarrays and variations in experimental design is even smaller. While a small group of genes that includes components of the molecular clock is identified consistently, other genes reported to be under circadian control differ in almost every experiment. A critically important factor determining the number of circadian genes may be the algorithm employed to identify periodicity. Nevertheless, a meta-analysis of circadian expression in different murine tissues has led some to conclude that up to 100% of all genes could be circadian, i.e. they are found oscillating in at least one study (Citation16). Similarly, meta-analysis of independently acquired data sets resulted in the conclusion that 30%–40% of plant gene expression is modulated by the circadian clock (Citation33). Likewise, meta-analysis across different conditions in A. thaliana led to the conclusion that up to 89% of genes could be controlled by the circadian clock (Citation34). Together, these meta-analyses suggest that the environmental or experimental conditions decide whether a particular gene will be oscillating in a given study.

Our series of papers in murine models using a permutation algorithm suggests that more than 90% of all expressed genes oscillate, a conclusion consistent with the meta-analyses (Citation11,Citation12,Citation26,Citation27,Citation38). This leads to the question: How does the choice of algorithm/methodology account for such a difference in estimating the number of oscillating genes?

An answer to this question can be found in the underlying mathematics and statistics. The major challenge to detection of periodicity in microarray expression profiles results from the combination of a short series of observations with a low sampling rate. In a recent publication, Liew et al. (Citation35) estimated the statistical power of the commonly used Fisher's g-test in an analysis of short time series. The authors came to the conclusion that this test cannot be reliable unless the time series reaches 40–50 time points. A much earlier publication by Lindström et al. (Citation36) analyzing the statistical power of periodicity tests based on autocorrelation estimated that an acceptable P = 0.05 confidence level could not be achieved even with 100 time points. Nevertheless, most published estimations of circadian oscillating genes are based on a time series reaching only 12 time points; a few studies include 18 time points, and the absolute maximum is 24 (Citation14–17). Horne and Baliunas (Citation37) estimated that the ‘false alarm rate’ for detection of a significant oscillation in a series of 12 samples would be 0.0462 under ideal noise-free conditions yielding a cosine curve; however, microarrays are notorious for generating noisy data. According to Horne and Baliunas's estimation for the ‘false alarm rate not exceeding 5%’ (equivalent to P < 0.05 cut-off) the required signal-to-noise ratio for a typical circadian microarray study with 12 time points would be 0.473. When applied to microarray data, the Horne–Baliunas estimation imposes limitations on 1) the amplitude of detectable oscillation; 2) the overall intensity of expression at which oscillation can be detected; 3) the quality of the multistage sample preparation, extraction, labeling, and hybridization, all of which affect the noise; and 4) the quality of design and fabrication of the probes, which also contributes to the noise in estimation of gene expression activity.
Nevertheless, Horne and Baliunas conclude on an optimistic note that with careful application periodogram analysis can detect even faint oscillations in a noisy data set. In their own simulation, while a signal-to-noise ratio of 0.313 was enough to obscure a periodic pattern beyond visual recognition, it did not render it undetectable computationally. What are the implications of using under-powered statistical tests for detection of periodicity (such as circadian rhythm) in a short, under-sampled and noisy time series?

To answer this question one would have to sort out those parts of the transcriptome that do not pass periodicity tests in their straightforward application. Genes that do pass the test can be assumed periodic beyond reasonable doubt. The reasonable doubt is usually defined by an arbitrary cut-off of P = 0.05 (P = 0.1 in some studies). For each of the genes tested there is a certain probability of type I and type II error. In a conventional sense, this means that, if all genes tested for periodicity are ranked in order of ascending P value, the last gene accepted as periodic has a chance of being false-positive (up to 5%, resulting from type I error, i.e. inappropriate rejection of the null hypothesis in the test). Likewise, the first gene rejected as non-oscillating will have the highest probability of type II error, i.e. of being false-negative. We can be almost certain that this first rejected non-oscillating gene is in fact oscillating. Further down the list the probability of type II error is lower, but it remains to be seen how much lower and how fast this rate is dropping. A simulation experiment indicates that the in vivo murine transcriptome has estimated type I and type II errors consistent with only a small fraction of non-oscillating genes (Citation38). Selecting a few truly rhythmic genes from the top controls only type I error and leaves open the question: How many oscillating genes are left behind? Thus, straightforward application of under-powered statistical tests may under-estimate how many genes are circadian.

Most current models employ a null hypothesis that presumes a gene to be non-oscillating. If the P value estimated by the periodicity test exceeds a particular threshold (typically 0.05), the default assumption of non-rhythmic expression is accepted. Consequently, the gene in question is deemed to have failed the test for periodicity and is excluded from further consideration. However, when under-powered testing is employed it is not the gene that fails, but the test itself. The test could be balanced by considering a differently formulated null hypothesis, i.e. assuming the presence of circadian rhythm in every gene and then testing the genes for the absence of oscillation. The fraction of genes for which the new hypothesis is rejected would be considered non-oscillating while the rest would be considered to retain an oscillatory potential. While we know of no studies designed exactly this way, Ptitsyn et al. attempted to estimate the presence and size of a non-oscillating, steady line fraction of the transcriptome (Citation38). This paper estimated that such a non-oscillating fraction, if present, could represent less than 5% of all genes. In addition to the conventionally oscillating genes, the oscillating fraction of the transcriptome includes the majority of transcripts with uncertain oscillatory status. These estimates contrast with traditional methods concluding that approximately 20% of all the expression profiles are oscillating. Just as the traditional methods may under-estimate the number of oscillating genes, a limitation of the Ptitsyn approach is that it may over-estimate this number.

Regardless, direct application of under-powered tests for periodicity can identify a small fraction of genes that are periodic beyond reasonable doubt. Still, for the bulk of the transcriptome the answer is neither positive nor negative since the absolute majority of transcripts fall into a gray zone for which conventional tests are ineffective. Elucidating this shadow zone demands significant changes in methodology. There are a few conceivable ways to tackle this challenge. One way to accomplish this goal would be to reproduce the data at a higher sampling rate—at least 48 time points per day (sampled every 30 minutes) for a period of at least 2 days. Collection and processing of such data might be possible in the near future, but at present it remains unrealistic for most laboratories due to cost.

The second approach is to improve the sensitivity of the periodicity test itself. Multiple groups have proposed improved algorithms for periodicity detection based on the general idea of permuting time points to estimate the noise when calculating the signal-to-noise ratio; Storch et al. were the first to apply an algorithm that included an element of permutation to transcriptome-scale analysis of periodicity (Citation10). This robust and sensitive algorithm may explain why Storch et al. reported more oscillating genes compared to similar microarray studies published within a year (Citation7,Citation8). Later reanalysis of the same data with the Pt-test (Citation26) confirmed the original Storch et al. count with marginal improvement and yielded approximately the same numbers of oscillating genes (exceeding the original reports) from the other experiments. Selected transcripts reported as rhythmic by the Pt-test and missed by other tests have been validated by RT-PCR (Citation26). So far the Pt-test reports the largest number of oscillating transcripts (also independently validated) compared to other tests. A more sophisticated permutation-based test has been proposed by Ahdesmaki et al. (Citation25). Another version of the permutation test is described in the methods section of an analysis of metabolic rhythms in a yeast model (Citation39), which also tends to report more rhythmically expressed genes. Nevertheless, the improvement is often marginal, and the major issue of statistical power could not be resolved. Thus, the algorithm alone cannot be invoked to explain the discrepancies in oscillating gene estimates.

The third alternative approach would be to increase the power of the statistical test for periodicity by use of a ‘phase continuum’ strategy (Citation24) which can be combined with many different tests for periodicity. In this approach the statistical power is increased at the price of uncertainty regarding the oscillatory status of up to a few dozen genes. Compared to the uncertainty left by traditional analysis for the majority of genes, this trade-off seems to be fair. In ‘phase continuum’ all genes are separated into groups potentially oscillating in the same phase. The phase can be tentatively assigned by correlating each separate gene expression profile to an artificial cosine curve generated with all possible phase shifts. In each group expression profiles are in synchrony with each other, whether truly or falsely oscillating. All expression profiles are tested for periodicity, then sorted and concatenated in order of decreasing signal-to-noise ratio. The result of this operation is a continuous signal, separate for each phase group (‘phase continuum’), that starts with the most clearly rhythmic profiles and gradually deteriorates into stochastic noise. In order to finalize the identification of oscillating genes, one would need to set a gate along the continuous signal and find the point at which the signal is lost in noise beyond a reasonable significance threshold. A similar gating approach is routinely employed in the analyses of flow cytometry data for the expression of surface antigens detected with fluorescent-labeled antibodies. The phase continuum approach can accommodate the same tests for periodicity already used for gene-by-gene testing. However, because a number of genes are tested simultaneously, the number of points in a time series ceases to be an issue.
For example, testing a group of five genes with a typical sampling rate (12 time points over 2 days) estimates the number of oscillating genes plus or minus five (times the number of same-phase continuums tested separately), but provides 60 time points, thereby increasing the reliability of Fisher's g-test. The additional benefit of the phase continuum approach comes from an effective application of low-pass digital filters that can significantly improve the signal-to-noise ratio. From the same data sets that yield approximately 5%–20% of oscillating genes in traditional gene-by-gene tests, the phase continuum approach identifies oscillation in more than 90% of genes. The idea of testing genes in groups rather than individually is consistent with the biological concept of focusing on linked pathways. Genes represented by microarray probe sets are rarely independent in terms of their function; why not extrapolate this linked relationship to their oscillation behavior? Tests for periodicity focus on one factor and one frequency only: circadian, with a complete period of approximately 1 day. Regardless of the source of modulation (such as transcriptional control by molecular clock and/or environmental cues) these factors are closely correlated. Oscillating genes may peak at different times, but they all are driven by the same interdependent group of factors and have the same frequency. Therefore, testing a few synchronized genes at a time does not introduce any bias. In contrast, adjusting the number of circadian genes for potential false discovery rate (FDR) is inappropriate regardless of the choice of algorithm since none provides multiple tests of independent hypotheses. This is analogous to an electrical circuit where testing one electric outlet having an oscillating current at 60 Hz is accurately reflective of all other outlets on the same power grid.
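The grouping and concatenation steps of the phase continuum strategy can be sketched as follows. This is an illustration under several assumptions, not the published implementation: phases are scanned on an hourly grid, each profile is standardized before concatenation, and the correlation with the best-matching cosine template stands in for the signal-to-noise ranking. All function names and the two synthetic profiles are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def assign_phase(profile, times, period=24.0, step=1.0):
    """Best-matching phase shift (hours) and its correlation with a cosine template."""
    w = 2.0 * math.pi / period
    best_shift, best_r = 0.0, -2.0
    for k in range(int(period / step)):
        shift = k * step
        template = [math.cos(w * (t - shift)) for t in times]
        r = pearson(profile, template)
        if r > best_r:
            best_shift, best_r = shift, r
    return best_shift, best_r


def standardize(profile):
    """Center and scale a profile so that genes of different intensity can be joined."""
    n = len(profile)
    mean = sum(profile) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in profile) / n)
    return [(v - mean) / sd if sd > 0 else 0.0 for v in profile]


def phase_continuum(profiles, times, period=24.0):
    """Group profiles by phase, rank them, and concatenate each group into one signal."""
    groups = {}
    for name, profile in profiles.items():
        shift, r = assign_phase(profile, times, period)
        groups.setdefault(shift, []).append((r, name, standardize(profile)))
    continuum = {}
    for shift, members in groups.items():
        members.sort(key=lambda m: m[0], reverse=True)  # clearest rhythm first
        continuum[shift] = {
            "genes": [name for _, name, _ in members],
            "signal": [v for _, _, prof in members for v in prof],
        }
    return continuum


times = list(range(0, 48, 4))
w = 2.0 * math.pi / 24.0
profiles = {
    "clean": [5 + 2 * math.cos(w * (t - 8)) for t in times],
    "noisy": [1 + math.cos(w * (t - 8)) + 0.3 * (-1) ** i
              for i, t in enumerate(times)],
}
continuum = phase_continuum(profiles, times)
```

Because each profile covers a whole number of 24-hour periods, the concatenated signal keeps a consistent phase across gene boundaries, and its length (here 24 points from two genes) is what a periodicity test would then operate on.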

Finally, the fourth approach to identification of true oscillation among short and noisy under-sampled time series observations is based on gene interaction. To some extent this approach is a further development of the third approach which considers groups of genes together. Expression patterns of genes interconnected by their biological function can be considered in the context of biological pathways. For example, elements of the same signaling pathway (Citation11,Citation40) could have expression profiles that fall short of the P = 0.05 cut-off if considered independently. Since we know that these genes are co-regulated transcriptionally and that their protein products form a functional unit, should we consider their correlated patterns of expression, whether higher or lower, as mutual confirmation of their co-ordinated rhythmic expression?

There are precedents to the concept that the majority of genes display a circadian base-line oscillating component. Prior to the discovery of the genes encoding the circadian clock, observations and modeling of biological pathways had identified multiple cellular oscillators that included both oxidative phosphorylation and glycolysis (Citation41). Indeed, most if not all cellular circuits with a feedback loop are configured to function as natural oscillators. However, these intrinsic oscillators have flexible periods and may function out of synchronization with each other. In a more recent publication, Aon et al. highlighted the multilevel organization of rhythms in a living cell (yeast and cardiomyocytes in culture) (Citation42). The bioenergetic dynamics in both systems showed fractal scaling over at least three orders of magnitude, and this scaling obeyed an inverse power law. Presumably, these scale-free patterns were synchronized by environmental cues. Consistent with these early models, the first report of circadian gene expression in cyanobacteria concluded that the absolute majority of genes were under a form of circadian control (Citation43). While this statement was radical, it was plausible in a primitive single-cell organism strictly dependent on photosynthesis. In their discussion, Liu et al. (Citation43) noted that the involvement of oscillations permeating every cellular pathway in cyanobacteria did not necessarily extrapolate to eukaryotic systems, despite the fact that all organisms share common energy metabolism constraints.

In similar experiments on oxidative phosphorylation cycles in yeast, independent research groups from UT Southwestern (Citation44) and City of Hope (Citation39) reported distinct periods of oscillation. Each experiment applied a specific synchronization procedure to the yeast colony. Regardless of the oxidative phosphorylation cycle period length, both studies reported pervasive oscillating patterns affecting the majority of expressed genes, even when applying statistically under-powered gene-by-gene independent testing.

This interplay between the circadian clock and oxidative phosphorylation deserves particular scrutiny. Metabolic oscillations have been intensively studied using a yeast model (Citation39,Citation44). A colony of yeast synchronized by the oxidative phosphorylation cycle demonstrates a stable oscillatory pattern in the expression of the majority of genes (over 5,000) (Citation39). The period of oscillation is approximately 40 minutes. The same period of oscillation has been reported for the absolute majority of metabolites identified by gas chromatography–mass spectrometry studies (Citation45). The latter study also reported rhythmic changes in transcription for the majority of genes in synchrony with the rhythmic changes in metabolite abundance. This observation reflects an important property of cellular organization of life: compartmentalization of mutually exclusive processes (such as oxidative and reductive stages of oxidative phosphorylation) in time. The same idea has been proposed by McKnight and colleagues (Citation44). Analysis of the experiments conducted at UT Southwestern also reported oscillation in the absolute majority of actively expressed genes. However, the period of oscillation was drastically different: approximately 300 minutes. Pervasive oscillations following the main metabolic cycle permeate and command all cellular functions and modulate DNA replication (Citation46). Summarizing multiple observations and extending this logic to the entire cell has led Klevecz et al. to the radical conclusion that the entire cell is an oscillator (Citation47). This inevitably leads to a question: If the basic machinery of oxidative phosphorylation is shared between all organisms from yeast to human, can the yeast's state of non-stop pervasive oscillation be extrapolated to the human cell? Energy-driven oscillation has been reported in isolated cardiomyocytes (Citation42).
Mechanistic links between basic energy metabolism and the circadian molecular clock have been reported (Citation48,Citation49) as well as the rhythmic pattern of expression of major metabolism regulators (Citation12,Citation50), and a connection between circadian and metabolic rhythms through period duplication has been postulated explicitly (Citation47). However, the phenomena of circadian and metabolic oscillations are rarely considered in one study in spite of both oscillations being present and modulating gene expression. For example, we have extracted expression profiles for a few genes forming protein complexes on the oxidative (cytochrome C oxidase) and reductive (NADH dehydrogenase) parts of the oxidative phosphorylation pathways. The diagrams in Figure 2 show these genes in plant (A. thaliana, leaf), human (Homo sapiens, milk fat globule), mosquito (Aedes aegypti, heads) and mouse (Mus musculus, liver). None of the studies made any effort to synchronize the energy metabolism, and yet the genes involved in energy metabolism are among the most clearly oscillating genes. Not all microarray probes show perfectly rhythmic profiles, but overall, genes of the oxidative part tend to be expressed in the opposite phase to genes of the reductive part. We hypothesize that basic energy metabolism is actually the primary source of pervasive oscillations of the entire transcriptome, which may or may not be in synchrony with the molecular clock. These observations bring us one step closer to the realization that all genes may be expressed in a rhythmically oscillating manner and synchronized with each other by the co-ordinated actions of environmental cues and an internal molecular clock (Citation47).

Figure 2. Expression profiles of the genes forming oxidative and reductive parts of oxidative phosphorylation in plant (A), human (B), mosquito (C), and mouse (D).

The concept of pervasive circadian oscillation may explain previously puzzling findings. For example, the lists of periodic genes in previous publications and meta-analyses differ radically from tissue to tissue. Since conventional tests identify only the small fraction of truly oscillating genes for which the signal-to-noise ratio is most favorable, such lists are biased toward highly expressed genes associated with the specific metabolism and function of a particular tissue. The mystery of why genes under circadian control in one tissue appear not to be under such control in other tissues may simply reflect the fact that genes display different activity levels and amplitudes in distinct tissues.

Multiple reasons can contribute to the inability to detect a circadian base-line in gene expression. Expression at a constant ‘straight line’ pattern (i.e. independent of time, environment, and rhythmic expression of other genes) is only one among many possible causes. As already highlighted, a possible cause is a low sampling rate and a short observation period reducing the statistical power of periodicity tests. A high level of technical noise due to imperfect affinity of some probes and cross-hybridization issues can obscure oscillation in low-expressed genes. Asynchrony is another factor that obscures peaks and valleys in expression. Cell cultures have no centralized signaling, and the synchronizing action of external factors is often lost after a few cycles (Citation51). While samples taken from tissues and organs are better orchestrated, their synchronization is also far from perfect due to heterogeneity within their cellular composition.
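The effect of asynchrony can be illustrated with a minimal numerical sketch. It is not taken from any of the cited studies: the 24-h period, the unit-amplitude cosine model of a single cell, the 2-h sampling grid, and the population of 500 cells are all assumptions chosen only for illustration. The sketch compares the population-average profile of fully synchronized cells with that of cells whose phases have drifted apart completely.

```python
import math
import random

PERIOD_H = 24.0  # assumed circadian period in hours (illustrative)


def population_mean(times, phases):
    """Average expression over cells; each cell is modeled as
    1 + cos(...) with its own phase offset in hours (an assumption)."""
    return [
        sum(1.0 + math.cos(2 * math.pi * (t - p) / PERIOD_H) for p in phases)
        / len(phases)
        for t in times
    ]


def peak_to_trough(profile):
    """Difference between the highest and lowest sampled value."""
    return max(profile) - min(profile)


random.seed(0)                                   # reproducible illustration
times = [2.0 * i for i in range(24)]             # hypothetical 2-h sampling over 48 h
synced = [0.0] * 500                             # every cell shares one phase
drifted = [random.uniform(0.0, PERIOD_H) for _ in range(500)]  # phases scattered

sync_range = peak_to_trough(population_mean(times, synced))
drift_range = peak_to_trough(population_mean(times, drifted))

print(f"synchronized cells:  peak-to-trough = {sync_range:.3f}")
print(f"desynchronized cells: peak-to-trough = {drift_range:.3f}")
```

Under these assumptions every individual cell oscillates with the same amplitude in both populations, yet the averaged signal of the desynchronized population is nearly flat. A bulk measurement would then report little or no rhythm although no single cell has stopped oscillating.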

The obliteration of an observed rhythm in expression by asynchrony may come not only from cell heterogeneity, but also from mixed signals reflecting alternative transcripts (Citation52). Investigators stumbled on this when observing radically different behavior in alternative probe sets for the same gene on an Affymetrix GeneChip. Different probe sets were found to be oscillating in counter-phases to each other in a tissue-specific manner. For example, the leptin signaling pathway displayed counter-phase expression for transcripts of the SOCS3 gene in brown fat and JAK gene in white fat. These alternative probe sets represented alternative polyadenylated transcripts sharing the same coding sequences. Taken together, the summation of counter-phased expression of all transcripts resulted in a straight line or constant expression profile, without evidence of circadian oscillation. This may occur to ensure that the cell will have a constant abundance of transcripts available for protein translation regardless of the time of day. In the context of signal transduction via the leptin receptor, the cell can maintain a constant state of readiness by modulating the levels of JAK and/or the signal suppressor, SOCS3, depending on the tissue. From this perspective, the oscillation of every transcript is the default state while the maintenance of a constant rate of protein production is a special (albeit not rare) adaptation. These observations, of course, are dependent on the selection of probe sets for individual genes, and modification of commercial microarrays over the years has removed probes detecting many of these alternatively spliced mRNAs. Nevertheless, in addition to the ‘best’ probe set, many alternative probe sets can still be found mapping to the same gene that have been relegated to the bottom part of the annotation file. 
Considered separately in a time-line, they often show a clear oscillatory pattern in both primary and secondary probe sets, albeit with a phase shift. This is a frequent phenomenon for murine and plant (A. thaliana) genes that cannot be explained by cross-hybridization alone (unpublished observation). As a result, many oscillating genes may result in the circadian oscillations of their protein's production rate. Furthermore, these genes might appear non-oscillating if only the coding portion is sampled without consideration of the 5′ or 3′ untranslated regions (UTRs) and evaluation of the separate alternative transcripts.
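The cancellation of counter-phased transcripts can be sketched numerically. The example below is an illustration only, not a reconstruction of the cited analysis: the two hypothetical isoforms, their equal means and amplitudes, the exact 12-h phase difference, and the 4-h sampling grid are all assumed values.

```python
import math

PERIOD_H = 24.0  # assumed circadian period in hours (illustrative)


def transcript(t, mean, amplitude, phase_h):
    """Cosine model of one transcript's abundance at time t (hours)."""
    return mean + amplitude * math.cos(2 * math.pi * (t - phase_h) / PERIOD_H)


times = [4.0 * i for i in range(12)]  # hypothetical 4-h sampling over 48 h

# Two hypothetical alternative transcripts of one gene, 12 h out of phase.
isoform_a = [transcript(t, 100.0, 30.0, 0.0) for t in times]
isoform_b = [transcript(t, 100.0, 30.0, 12.0) for t in times]

# What a probe set (or assay) covering only the shared coding region would see.
gene_total = [a + b for a, b in zip(isoform_a, isoform_b)]

isoform_range = max(isoform_a) - min(isoform_a)
total_range = max(gene_total) - min(gene_total)

print(f"single isoform peak-to-trough: {isoform_range:.1f}")
print(f"summed signal peak-to-trough:  {total_range:.1e}")
```

In this idealized case each isoform varies by 60 units over the day while their sum stays constant. With unequal amplitudes or a phase difference other than half a period the sum would retain a damped rhythm rather than a perfectly flat line.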

The experimental design can confound gene synchronization patterns. For example, the classic design entrains the model organism (yeast, plant, or animal) with controlled rhythmic environmental cues. Subsequently, the environmental factors are excluded to spotlight the residual intrinsic oscillation. In the case of photic stimuli, samples are collected either in complete darkness or in constant dim light. This design assumes that all genes are expressed at a constant rate and retain this pattern unless under control of the molecular clock. The presence of multiple independent entraining or synchronizing cues confounds this study design. While removal of one entrainer may disrupt the phase of the molecular clock, it also initiates a cascade of adaptive events designed to resynchronize gene and pathway expression. Thus, samples collected during this time reflect a disturbed state, comparable to jet lag, and are compared to a hypothetical constant base-line which is often not actually monitored. Consequently, this experimental design may under-report the number of oscillating genes. Furthermore, the set of oscillating genes detected will be biased toward the small cohort linked directly to the central or local cellular molecular clock. While such studies accurately report rhythmic genes, they are far less conclusive about the rest of the transcriptome, and they do not necessarily reflect the normal organization of molecular mechanisms in a living organism. Some of the available circadian time-line experiments have collected samples under normal lighting conditions, with the same regular alternation of 12-h light and 12-h dark periods used in initial entraining (Citation11,Citation12). The original analysis, performed with single-gene testing, reported higher numbers of oscillating genes, but not dramatically higher than other studies using similar algorithms (Citation10).
On the other hand, reanalysis of available data sets, both published and unpublished, shows little difference in numbers of oscillating genes between experiments with and without altered lighting.

The statement that all genes have a circadian base-line does not necessarily distinguish ‘circadian’ from ‘diurnal’ rhythms. The former is presumed to be driven by the cellular molecular clock; the latter may be modulated by environmental cues and is presumed to dissipate in an unchanging environment such as constant darkness or constant dim light. Such a distinction can be evaluated experimentally by excluding external factors to cancel all oscillations except those directly connected to the molecular clock. Reanalysis of previously published circadian expression studies (partially published in (Citation26,Citation53)) shows that oscillations persist in the majority of genes even without changes in the ambient light and temperature cycles. These oscillations are not registered by under-powered statistical tests. Regardless of the driving force and the mechanism of synchronization to the daily cycle, whether molecular clock, oxidative phosphorylation, or another cellular circuit with a feedback loop, all rhythmic patterns are persistent and play a role in regulating biological processes. A critical limitation of the methods described in this review is their inability to distinguish between these two states. In humans, experiments using a constant routine protocol, whereby subjects remain constantly awake in a recumbent position with frequent caloric intake under constant indoor light, may minimize the impact of such environmental cues. Consequently, such studies may evaluate circadian gene oscillations more accurately and should be applied to future microarray approaches if possible (Citation54).

As outlined above, while there is quantitative evidence for the statement that all genes have a rhythmic base-line pattern in expression, this remains speculative due to limitations in the current database. Most expression estimates are noisy, statistically under-powered, and rarely detect verifiable changes in expression of less than 2-fold, thereby overlooking the amplitude range of many circadian oscillations. These low-amplitude rhythmic changes will become more important as detection methods improve. Currently, many gene expression experiments rely on RT-PCR techniques that require a reference gene. This method is complicated by the fact that many ‘housekeeping’ genes themselves display circadian oscillations. The levels of GAPDH in mice, for example, can vary almost 2-fold (Figure 3). This can be observed in microarray profiles because microarray probe signals are scaled to multiple controls selected with no regard to their circadian behavior. The pool of control probes with various phase shifts cancels all oscillations and sums up into a steady line. Using a single reference gene with rhythmic transcript abundance can be reliable only when used with genes oscillating in an identical phase and with approximately the same amplitude.
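The arithmetic behind the reference-gene problem can be sketched as follows. This is an illustration under assumed values, not an analysis of the GAPDH data: the cosine model, the relative amplitude of 0.3 (which gives a reference that varies a little less than 2-fold), the abundance levels, and the 4-h sampling grid are all hypothetical.

```python
import math

PERIOD_H = 24.0  # assumed circadian period in hours (illustrative)


def rhythm(t, mean, rel_amp, phase_h):
    """Abundance at time t: mean level modulated by a cosine of relative amplitude rel_amp."""
    return mean * (1.0 + rel_amp * math.cos(2 * math.pi * (t - phase_h) / PERIOD_H))


def normalize(target, ref):
    """Target-to-reference ratio at each time point, as in relative quantification."""
    return [x / r for x, r in zip(target, ref)]


def fold_range(profile):
    """Ratio of the highest to the lowest sampled value."""
    return max(profile) / min(profile)


times = [4.0 * i for i in range(6)]                   # hypothetical 4-h sampling over 24 h
reference = [rhythm(t, 1000.0, 0.3, 0.0) for t in times]   # oscillating 'housekeeping' gene
flat_target = [50.0 for _ in times]                        # truly constant transcript
matched_target = [rhythm(t, 50.0, 0.3, 0.0) for t in times]  # same phase and relative amplitude

flat_fold = fold_range(normalize(flat_target, reference))
matched_fold = fold_range(normalize(matched_target, reference))

print(f"constant target after normalization: {flat_fold:.2f}-fold apparent change")
print(f"matched target after normalization:  {matched_fold:.2f}-fold apparent change")
```

In this sketch a constant transcript acquires an apparent rhythm of about 1.86-fold purely from the reference, while a target that shares the reference's phase and relative amplitude yields a time-independent ratio. Time-of-day effects cancel in the ratio only in the latter case, which is consistent with the restriction stated above.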

Figure 3. Expression profiles of GAPDH in different murine tissues in absolute intensity (A) and scaled relative abundance used for detection of periodic pattern (B). Data acquired in Zvonic et al. studies (Citation11,Citation12).

What are the consequences and implications of accepting the hypothesis that pervasive oscillations affect the majority and/or all expressed genes? Assumption of default oscillatory status (unless proved otherwise) allows better explanation of observed patterns of gene expression and provides the foundation for developing new methods in Systems Biology without altering the general sum of knowledge in Life Science. Current text-books would not a priori contradict the finding that all genes are expressed in a rhythmic manner instead of the previously estimated 10%–15%.

Most molecular biology text-books do not discuss gene expression oscillations and only briefly discuss circadian clocks. This approach instills future researchers with the notion that the non-oscillating pattern is the default gene expression profile. According to this model, the organism's energy source is typically viewed like a battery, generating a constant flow devoid of any phases of oxidative phosphorylation. Diagrams of gene interaction make no reservation for the timing of events, simply stating that ‘gene A activates transcription of gene B’. The factor of time in real life adds an additional layer of complexity. However, this fourth dimension is not beyond comprehension. The basic unit of this structure is a cycle. A living cell is a complex three-dimensional union of multiple cycling oscillators structured and orchestrated by internal and external rhythms. The circadian molecular clock plays a very critical role as a timekeeper and synchronizer, although it may not be the only generator of oscillations. From this point of view it is less important to catalog how many genes are under direct transcriptional control of the circadian clock. Rather, it is more important to know the oscillatory properties of gene expression and synchronization patterns among functionally related groups of genes (pathways). These patterns differ based on cell type, tissue, and organs, their stages of development as a function of their changing environment, and homeostasis. Whether the cycle is circadian, diurnal, or some other interval, multiple genes display oscillatory expression with consequences relevant to the pathophysiology of patient health (Citation55).

Supplemental material

Supplementary Figure 1

Acknowledgements

The authors would like to thank all colleagues who have contributed data and thoughts in discussing rhythmic patterns of gene expression. In particular, Dr Morey Haymond, Dr Martin Young, Dr Molly Bray, Dr Jon Carlson, Dr Erica Suchman, and Dr William Black IV have provided the experimental data for unpublished reanalysis referenced in this review.

Declaration of interest: The authors declare no conflict of interest and have received no payment in preparation of this manuscript.

References

  • Lemmer B. Discoveries of rhythms in human biological functions: a historical review. Chronobiol Int. 2009;26:1019–68.
  • Szymanski JS. Die Verteilung von Ruhe und Aktivitätsperioden bei einigen Tierarten. Pfluger Arch ges Physiol. 1918;172:430–48.
  • Halberg F. Some physiological and clinical aspects of 24-hour periodicity. J Lancet. 1953;73:20–32.
  • Konopka RJ, Benzer S. Clock mutants of Drosophila melanogaster. Proc Natl Acad Sci U S A. 1971;68:2112–6.
  • Wager-Smith K, Kay SA. Circadian rhythm genetics: from flies to mice to humans. Nat Genet. 2000;26:23–7.
  • Hardin PE, Hall JC, Rosbash M. Feedback of the Drosophila period gene product on circadian cycling of its messenger RNA levels. Nature. 1990;343(6258):536–40.
  • Ueda HR, Chen W, Adachi A, Wakamatsu H, Hayashi S, Takasugi T, et al. A transcription factor response element for gene expression during circadian night. Nature. 2002;418:534–9.
  • Panda S, Antoch MP, Miller BH, Su AI, Schook AB, Straume M, et al. Coordinated transcription of key pathways in the mouse by the circadian clock. Cell. 2002;109:307–20.
  • Duffield GE, Best JD, Meurers BH, Bittner A, Loros JJ, Dunlap JC. Circadian programs of transcriptional activation, signaling, and protein turnover revealed by microarray analysis of mammalian cells. Curr Biol. 2002;12:551–7.
  • Storch KF, Lipan O, Leykin I, Viswanathan N, Davis FC, Wong WH, et al. Extensive and divergent circadian gene expression in liver and heart. Nature. 2002;417:78–83.
  • Zvonic S, Ptitsyn AA, Kilroy G, Wu X, Conrad SA, Scott LK, et al. Circadian oscillation of gene expression in murine calvarial bone. J Bone Miner Res. 2007;22:357–65.
  • Zvonic S, Ptitsyn AA, Conrad SA, Scott LK, Floyd ZE, Kilroy G, et al. Characterization of peripheral circadian clocks in adipose tissues. Diabetes. 2006;55:962–70.
  • Covington MF, Harmer SL. The circadian clock regulates auxin signaling and responses in Arabidopsis. PLoS Biol. 2007;5:e222.
  • Boothroyd CE, Wijnen H, Naef F, Saez L, Young MW. Integration of light and temperature in the regulation of circadian gene expression in Drosophila. PLoS Genet. 2007;3:e54.
  • Maningat PD, Sen P, Rijnkels M, Sunehag AL, Hadsell DL, Bray M, et al. Gene expression in the human mammary epithelium during lactation: the milk fat globule transcriptome. Physiol Genomics. 2009;37:12–22.
  • Hogenesch JB, Panda S, Kay S, Takahashi JS. Circadian transcriptional output in the SCN and liver of the mouse. Novartis Found Symp. 2003;253:171–80; discussion 52–5, 102–9, 80–3 passim.
  • Hughes ME, DiTacchio L, Hayes KR, Vollmers C, Pulivarthy S, Baggs JE, et al. Harmonics of circadian gene transcription in mammals. PLoS Genet. 2009;5:e1000442.
  • Nelson W, Tong YL, Lee JK, Halberg F. Methods for cosinor-rhythmometry. Chronobiologia. 1979;6:305–23.
  • Bingham C, Arbogast B, Guillaume GC, Lee JK, Halberg F. Inferential statistical methods for estimating and comparing cosinor parameters. Chronobiologia. 1982;9:397–439.
  • Harmer SL, Hogenesch JB, Straume M, Chang HS, Han B, Zhu T, et al. Orchestrated transcription of key pathways in Arabidopsis by the circadian clock. Science. 2000;290:2110–3.
  • Ceriani MF, Hogenesch JB, Yanovsky M, Panda S, Straume M, Kay SA. Genome-wide expression analysis in Drosophila reveals genes controlling circadian behavior. J Neurosci. 2002;22:9305–19.
  • Lin Y, Han M, Shimada B, Wang L, Gibler TM, Amarakone A, et al. Influence of the period-dependent circadian clock on diurnal, circadian, and aperiodic gene expression in Drosophila melanogaster. Proc Natl Acad Sci U S A. 2002;99:9562–7.
  • Fisher RA. Tests of significance in harmonic analysis. Proc R Soc Lond A. 1929;125:54–9.
  • Wichert S, Fokianos K, Strimmer K. Identifying periodically expressed transcripts in microarray time series data. Bioinformatics. 2004;20:5–20.
  • Ahdesmaki M, Lahdesmaki H, Pearson R, Huttunen H, Yli-Harja O. Robust detection of periodic time series measured from biological systems. BMC Bioinformatics. 2005;6:117.
  • Ptitsyn AA, Zvonic S, Gimble JM. Permutation test for periodicity in short time series data. BMC Bioinformatics. 2006;7 Suppl 2:S10.
  • Ptitsyn AA, Zvonic S, Gimble JM. Digital signal processing reveals circadian baseline oscillation in majority of mammalian genes. PLoS Comput Biol. 2007;3:e120.
  • Doherty CJ, Kay SA. Circadian control of global gene expression patterns. Annu Rev Genet. 2010;44:419–44.
  • Bray MS, Shaw CA, Moore MW, Garcia RA, Zanquetta MM, Durgan DJ, et al. Disruption of the circadian clock within the cardiomyocyte influences myocardial contractile function, metabolism, and gene expression. Am J Physiol Heart Circ Physiol. 2008;294:H1036–47.
  • McCarthy JJ, Andrews JL, McDearmon EL, Campbell KS, Barber BK, Miller BH, et al. Identification of the circadian transcriptome in adult mouse skeletal muscle. Physiol Genomics. 2007;31:86–95.
  • McDonald MJ, Rosbash M. Microarray analysis and organization of circadian gene expression in Drosophila. Cell. 2001;107:567–78.
  • Edwards KD, Anderson PE, Hall A, Salathia NS, Locke JC, Lynn JR, et al. FLOWERING LOCUS C mediates natural variation in the high-temperature response of the Arabidopsis circadian clock. Plant Cell. 2006;18:639–50.
  • Zambon AC, McDearmon EL, Salomonis N, Vranizan KM, Johansen KL, Adey D, et al. Time- and exercise-dependent gene regulation in human skeletal muscle. Genome Biol. 2003;4:R61.
  • Covington MF, Maloof JN, Straume M, Kay SA, Harmer SL. Global transcriptome analysis reveals circadian regulation of key pathways in plant growth and development. Genome Biol. 2008;9:R130.
  • Michael TP, Mockler TC, Breton G, McEntee C, Byer A, Trout JD, et al. Network discovery pipeline elucidates conserved time-of-day-specific cis-regulatory modules. PLoS Genet. 2008;4:e14.
  • Ptitsyn AA, Zvonic S, Conrad SA, Scott LK, Mynatt RL, Gimble JM. Circadian clocks are resounding in peripheral tissues. PLoS Comput Biol. 2006;2(3):e16.
  • Liew AW, Law NF, Cao XQ, Yan H. Statistical power of Fisher test for the detection of short periodic gene expression profiles. Pattern Recognition. 2009;42:549–56.
  • Lindström J, Kokko H, Ranta E. Detecting periodicity in short and noisy time series data. Oikos. 1997;78:406–10.
  • Horne JH, Baliunas SL. A prescription for period analysis of unevenly sampled time series. Astrophys J. 1986;302:757–63.
  • Klevecz RR, Bolen J, Forrest G, Murray DB. A genomewide oscillation in transcription gates DNA replication and cell cycle. Proc Natl Acad Sci U S A. 2004;101:1200–5.
  • Ptitsyn A. Stochastic resonance reveals ‘pilot light’ expression in mammalian genes. PLoS ONE. 2008;3:e1842.
  • Selkov E. Stabilization of energy charge, generation of oscillation and multiple steady states in energy metabolism as a result of purely stoichiometric regulation. European Journal of Biochemistry. 1975;59:151–7.
  • Aon MA, Roussel MR, Cortassa S, O'Rourke B, Murray DB, Beckmann M, et al. The scale-free dynamics of eukaryotic cells. PLoS One. 2008;3:e3624.
  • Liu Y, Tsinoremas NF, Johnson CH, Lebedeva NV, Golden SS, Ishiura M, et al. Circadian orchestration of gene expression in cyanobacteria. Genes Dev. 1995;9:1469–78.
  • Tu BP, Kudlicki A, Rowicka M, McKnight SL. Logic of the yeast metabolic cycle: temporal compartmentalization of cellular processes. Science. 2005;310:1152–8.
  • Murray DB, Beckmann M, Kitano H. Regulation of yeast oscillatory dynamics. Proc Natl Acad Sci U S A. 2007;104:2241–6.
  • Chen Z, Odstrcil EA, Tu BP, McKnight SL. Restriction of DNA replication to the reductive phase of the metabolic cycle protects genome integrity. Science. 2007;316:1916–9.
  • Klevecz RR, Li CM, Marcus I, Frankel PH. Collective behavior in gene regulation: the cell is an oscillator, the cell cycle a developmental process. FEBS J. 2008;275:2372–84.
  • Lemberger T, Saladin R, Vazquez M, Assimacopoulos F, Staels B, Desvergne B, et al. Expression of the peroxisome proliferator-activated receptor alpha gene is stimulated by stress and follows a diurnal rhythm. J Biol Chem. 1996;271:1764–9.
  • Sato TK, Panda S, Miraglia LJ, Reyes TM, Rudic RD, McNamara P, et al. A functional genomics strategy reveals Rora as a component of the mammalian circadian clock. Neuron. 2004;43:527–37.
  • Stutz AM, Staszkiewicz J, Ptitsyn A, Argyropoulos G. Circadian expression of genes regulating food intake. Obesity (Silver Spring). 2007;15:607–15.
  • Refinetti R. Time for sex: nycthemeral distribution of human sexual behavior. J Circadian Rhythms. 2005;3:4.
  • Ptitsyn AA, Gimble JM. Analysis of circadian pattern reveals tissue-specific alternative transcription in leptin signaling pathway. BMC Bioinformatics. 2007;8 Suppl 7:S15.
  • Ptitsyn A. Comprehensive analysis of circadian periodic pattern in plant transcriptome. BMC Bioinformatics. 2008;9 Suppl 9:S18.
  • Brown EN, Choe Y, Luithardt H, Czeisler CA. A statistical model of the human core-temperature circadian rhythm. Am J Physiol Endocrinol Metab. 2000;279:E669–83.
  • Storch KF, Weitz CJ. Daily rhythms of food-anticipatory behavioral activity do not require the known circadian clock. Proc Natl Acad Sci U S A. 2009;106:6808–13.