Funnel testing in webpage optimisation: representation, design and analysis

Pages 3-14 | Received 01 Mar 2017, Accepted 17 May 2017, Published online: 12 Jul 2017

ABSTRACT

When optimizing webpages to achieve the best conversion rate, the traditional approach is to isolate and analyze them separately through a sequence of experiments. In this paper, we propose a new framework to study a system of webpages simultaneously through the use of a directed graph, a fractional factorial design and a simple optimization algorithm. The illustrative example shows that such complicated web systems can be easily studied and optimized by using the proposed methodology.

1. Introduction

In the information technology age, more business transactions are conducted on the Internet. Webpages have become an important source of revenue for many companies such as Amazon, Facebook, Walmart and eBay. How to design webpages to best serve the interests of business owners has become a hot research topic in e-commerce (Ash, 2009). The jobs of interest completed by visitors to the webpages are called conversions. Typical examples of conversions are purchases, newsletter or membership subscriptions, viewing of a page, etc. The percentage of visitors who complete the jobs of interest is called the conversion rate. A main goal of studying webpage design is to maximise the conversion rate. This is called conversion rate optimisation (abbreviated as CRO). CRO has been extremely important in large IT companies in the last decade. Through this practice, companies have seen a huge increase in their profits. Moreover, companies doing CRO for others, such as Webtrends and SiteTuners, have grown rapidly in the last ten years and are now very popular in the IT industry. Through CRO, they have helped their clients achieve greater business success.

Two methods are commonly used in CRO (Ash, 2009). The first is the A/B test. As the name indicates, this method compares two versions of a webpage: the original version and a proposed new version. A variation of this method is the A/Bn test, where multiple proposed versions of a webpage are compared with the original design in one experiment. Hypothesis testing is used to assess the difference, and the best version is chosen as the future design of the webpage. The second method is the multivariate test (abbreviated as MVT), where multiple factors, each with two or more levels, are studied in one experiment. MVT is usually implemented with fractional factorial designs, and models are fitted for the conversion rates with respect to the factors. The optimisation is done by the standard method of choosing optimal level settings in design of experiments, and the best combination of factor levels is used for future webpage design (Montgomery, 2012; Wu & Hamada, 2009).

The A/Bn test is more commonly used in CRO because it is easy to understand, implement and analyse. When companies start a web campaign, they usually have different designs for their campaign page. In order to maximise the revenue from the campaign, they use the beginning of the campaign to run an A/Bn test and use the best version as the campaign page for the rest of the campaign. On the other hand, when it comes to optimising a product page, which consists of multiple sections such as header, banner, text and pictures, MVT can be more efficient.

The most commonly studied page in CRO is the landing page, which is the first page visitors see when directed from other sources such as search engines or directly entered web addresses (Ash, 2009). Most landing pages only have general information about the company's products and services, and conversions usually do not take place there. For example, suppose the visitors want to buy products from the website. Before they make payments, they usually have to go through the product description, view the product pictures, check the product reviews, and enter the payment information. With so many other pages involved, studying just the landing page in order to maximise the conversion rate may be an oversimplification. The series of pages the visitors go through until a possible conversion is called the conversion funnel (Ozolins, 2012), or simply the funnel if there is no ambiguity in the context. In the last example, the conversion funnel consists of the landing page, the product description page, the product picture page, the product review page and the payment information page. Moreover, conversions may take place on different pages. For example, the visitors can subscribe to the newsletter on the landing page, the product page and even the payment confirmation page. The pages where conversions can possibly happen are called conversion points. If the landing page is studied with respect to only one conversion point, the results can hardly be conclusive. The set of pages associated with the conversion of interest is defined as the conversion system, or simply the system if there is no ambiguity in the context. A conversion system consists of the landing page, all the conversion points, and all other pages that link between them. For example, suppose we have three pages: the landing page, page 1 and page 2, and both page 1 and page 2 are conversion points. 
Suppose visitors can go from the landing page to either page 1 or page 2, and also from page 1 to page 2. Then, the conversion system consists of these three pages, and we refer to this example as the toy example. This example will be used as the primary illustrative tool for the framework proposed in this paper. In this example, the landing page and page 1 make a conversion funnel.

In the next section, we use a directed graph to represent a conversion system and use this graph representation to identify all the conversion funnels in a system. A fractional factorial design on all the pages in a system is used to conduct the experiment. In Section 3, we propose an analysis strategy to optimize the conversions in a system, and we illustrate its application with the toy example in Section 4. A simulated example is given in Section 5. Concluding remarks and future research are given in Section 6.

2. Representation and design

The idea of using a directed graph to study the Internet originates from a concept called the webgraph (Donato, Laura, Leonardi, & Millozzi, 2004), where webpages are viewed as vertices and the linkage relationships between pages are expressed with directed edges in the graph. In computer science, the webgraph being studied is usually very large (millions of pages or more), and researchers are interested in large-scale properties such as in/out distributions, connectivity and cyclic patterns. The results are used to identify communities and hubs, filter spam, rank pages and predict the growth of the Internet (Donato et al., 2004). In this work, we use this idea and represent a conversion system with a directed graph. Here we are most interested in how to identify all the conversion funnels from the graphical structure.

The representation is straightforward. All the webpages in a conversion system are viewed as vertices of the graph. If there is a hyperlink on page X referring to page Y, draw a directed edge from X to Y. For the toy example, the conversion system consists of three pages: the landing page, page 1 and page 2, and we denote these three vertices as vL, v1 and v2, respectively. On the landing page, there are links referring to both page 1 and page 2. Therefore, there are two edges starting at vL and pointing to v1 and v2, respectively. We denote them as e1 and e2 correspondingly. Furthermore, there is a link on page 1 referring to page 2. Draw another edge from v1 to v2, denoted by e3. There are no more links in this conversion system, so the toy example is represented by the graph in Figure 1. The conversion points are marked as solid dots in contrast to the others.

Figure 1. Directed graph representation of the toy example.


After representing the conversion system as a directed graph, the next step is to identify all the conversion funnels. By definition, a conversion funnel is a series of pages that a visitor has gone through before making a possible conversion. Since visitors always start with the landing page and make conversions on the conversion points, a conversion funnel is, in the graph, a path from the landing page to a conversion point. Identifying all the conversion funnels in the system amounts to finding all the paths connecting the landing page and all the conversion points. In the toy example, page 1 and page 2 are both conversion points. For page 2, there are two paths connecting it to the landing page: vL via e1 to v1 then via e3 to v2, and vL via e2 to v2. These two conversion funnels are referred to as CF1 and CF2. For page 1, there is only one path connecting it to the landing page: vL via e1 to v1. This conversion funnel is referred to as CF3. Since there are no more conversion points in this system, the toy example has three different conversion funnels.
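The path-finding step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the adjacency-list layout, the function name `conversion_funnels` and the cycle guard are assumptions made for the example; only the vertices, edges and conversion points come from the toy example.

```python
from typing import Dict, List, Set

# Toy conversion system: each page maps to the pages it links to.
TOY_GRAPH: Dict[str, List[str]] = {
    "vL": ["v1", "v2"],  # edges e1 and e2
    "v1": ["v2"],        # edge e3
    "v2": [],
}
CONVERSION_POINTS: Set[str] = {"v1", "v2"}


def conversion_funnels(graph, landing, conversion_points):
    """Return every simple path from the landing page to a conversion point."""
    funnels = []

    def walk(path):
        page = path[-1]
        if page in conversion_points:
            funnels.append(list(path))
        for nxt in graph.get(page, []):
            if nxt not in path:  # assumed guard: never revisit a page
                walk(path + [nxt])

    walk([landing])
    return funnels


funnels = conversion_funnels(TOY_GRAPH, "vL", CONVERSION_POINTS)
for funnel in funnels:
    print(" -> ".join(funnel))
```

For the toy graph the search returns the three paths identified in the text: vL to v1 (CF3), vL to v1 to v2 (CF1) and vL to v2 (CF2).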

Before this work, some researchers had recognised the concept of conversion funnels and had done experiments using this concept (Qualaroo, 2014). However, they used a one-page-at-a-time method and studied the pages in a funnel sequentially. Note that, even for a small conversion system like the toy example, there are already three different conversion funnels. The old method is extremely time-consuming and ignores any interactions between different pages. In this work, we study the conversion system as a whole and design one experiment for all the pages involved.

Since there are multiple pages involved in the experiment, an A/Bn test would be inefficient. How about the MVT? In MVT, the first step is to identify the factors being studied. Since this experiment considers all the pages in the conversion system at one time, the set of factors consists of the factors from all the pages. For example, in the toy example, suppose each page has two factors to be studied: A and B from the landing page, C and D from page 1 and E and F from page 2. The factors being studied in this experiment are A, B, C, D, E and F. After identifying the correct set of factors, the next step is to construct a fractional factorial design for them. For details on the choice of designs, the reader may refer to the book by Wu and Hamada (2009).

For ease of implementation, we assume each factor has two levels. In this case, a $2^{k-p}$ design is used, where k factors are being studied, each at two levels denoted by + and −. It is a $2^{-p}$ fraction of the $2^{k}$ full factorial design. In some other cases, mixed-level designs are also used.

The conversion data are collected as binary responses: 1 means a conversion was made and 0 means no conversion. The conversions for each funnel are recorded separately for ease of analysis. In the toy example, if a conversion is made on page 2 and the visitor comes directly from the landing page, this conversion is recorded under CF2.

3. Analysis

Define the total conversion rate as the weighted sum of the conversion rates of all the conversion funnels in the system, where the weights reflect the importance of the funnels to the business owners. For example, in studying the sales of shoes, the weights can be the price of shoes in different funnels. When considering the subscription rate of the newsletters, the weights can be set to be equal. The goal is then to maximize the total conversion rate.
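In symbols, the definition above can be written as follows. The notation is an assumption introduced here for illustration: m is the number of conversion funnels in the system, CR_i is the conversion rate of funnel i, and w_i is the non-negative weight chosen by the business owner.

```latex
\[
  CR_T \;=\; \sum_{i=1}^{m} w_i \, CR_i , \qquad w_i \ge 0 .
\]
```

Equal weights correspond to the newsletter case, and weights proportional to price correspond to the shoe-sales case.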

Take the toy example for illustration. We have 8 factors to be analyzed: A through F plus E′ and F′ for the additional variant of page 2 in conversion funnel 2. The reason for adding E′ and F′ is that the optimal settings for page 2 under conversion funnel 1 may not be optimal for conversion funnel 2. Suppose we want to design a three-page conversion system to maximize membership subscriptions. For simplicity, assume this system is the same as in the toy example. Then, let us check which page(s) can have multiple versions and why. The landing page appears in all three conversion funnels, but the visitors have to start with the landing page, so all three funnels share the same version of the landing page. Page 1 appears in CF1 and CF3, but visitors are always directed from the landing page to page 1. Therefore, page 1 is the same in the two funnels. Page 2 appears in CF1 and CF2, but before page 2, visitors have viewed different pages in these two funnels: for CF1, visitors have viewed the landing page as well as page 1 before page 2, whereas for CF2, visitors have only viewed the landing page before page 2. Since page 1 is not in CF2, visitors in the two funnels have been exposed to different information before page 2. Therefore, page 2 can have different versions in CF1 and CF2. This can be explained with the following example. Suppose page 1 shows the benefits of the membership and page 2 shows the price in the original design. Normal visitors check the benefits, compare them with the price on page 2 and determine whether to subscribe to the membership. If the visitor does not care about price, she/he may subscribe after viewing the benefits, and if the visitor is eager to know the price, she/he may jump to page 2 directly from the landing page. The last visitor does not have the information about the benefits, so the decision made might be biased. 
Suppose the membership price is high, but the membership also offers great benefits whose monetary value may exceed the cost. Normal visitors, after comparing the cost with the benefits, are likely to pay the membership price. But visitors who have skipped the benefits page might, just by looking at the price, think it is too high and decide not to go for it. These two decisions are made with different amounts of information, and the second one is biased. With the concept of the conversion funnel, it is immediately apparent that these two decisions are made in two different funnels. In order to correct the second situation, we simply show the visitors another version of page 2, which has the benefits as well as the price of the membership, if they come directly from the landing page.

The idea of the analysis is to study the conversion rate of a specific conversion funnel instead of specific pages, since a page can be related to multiple conversion funnels. For example, in the toy example, page 2 is the conversion point for CF1, but it is also the conversion point for CF2. For each conversion funnel, we identify its related factors from the webpages it consists of. In the toy example, CF3 consists of the landing page and page 1, so the model for CF3 would have A, B, C and D as the candidate factors. Similarly, we can identify the candidate factors for CF1 and CF2. After the candidate factors are identified for all conversion funnels, we can then build linear models individually for each of them. For information on the analysis of fractional factorial designs, the reader can consult the book by Wu and Hamada (2009).

The total conversion rate is obtained through a linear combination of the conversion rates for individual conversion funnels. The optimization is then done with respect to the 8 factors including E′ and F′ as discussed in the previous paragraphs.

4. Illustration of analysis with the toy example

In this section, we use the toy example and simulate a set of data to illustrate the analysis strategy. Recall that the conversion system in the toy example consists of three pages: the landing page, page 1 and page 2, and there are three conversion funnels in the system: CF1, CF2 and CF3. Recall also that each page is assumed to have two factors to be studied, so a $2^{6-2}$ design is used for the simulation, where each row of the design matrix represents a version of the conversion system. The design matrix is given in Table 1, and its defining relations are I = ABCE = BCDF = ADEF.
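A 16-run design satisfying these defining relations can be generated as sketched below. The sketch assumes the generators E = ABC and F = BCD, which follow from I = ABCE = BCDF, and codes the levels + and − as +1 and −1; the function name and the run order are illustrative and need not match the layout of Table 1.

```python
from itertools import product

BASE_FACTORS = ["A", "B", "C", "D"]
# Assumed generators implied by I = ABCE = BCDF: E = ABC and F = BCD.
GENERATORS = {"E": "ABC", "F": "BCD"}


def fractional_factorial(base, generators):
    """Build a two-level fractional factorial design as a list of runs.

    Each run is a dict mapping a factor name to its level (+1 or -1).
    Every generator word may only use letters from the base factors.
    """
    runs = []
    for levels in product([-1, 1], repeat=len(base)):
        run = dict(zip(base, levels))
        for new_factor, word in generators.items():
            value = 1
            for letter in word:
                value *= run[letter]
            run[new_factor] = value
        runs.append(run)
    return runs


design = fractional_factorial(BASE_FACTORS, GENERATORS)
for run in design:
    print(" ".join("+" if run[f] == 1 else "-" for f in "ABCDEF"))
```

Multiplying the columns A, B, C and E (or B, C, D and F, or A, D, E and F) gives +1 in every run, which is the sense in which each of these words equals the identity I.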

Table 1. Design matrix, toy example.

For each simulation, we first choose a version of the conversion system from the 16 candidates in the design table with equal probability, and then simulate the visitor’s behaviour in the chosen system. The visitor always starts with the landing page, and she/he then has three choices: go to page 1, go to page 2 or leave the system. The first step is to simulate the decision on the landing page. Suppose the visitor goes to page 1 or page 2 with probabilities t1 or t2, respectively, where t1 and t2 are functions of the factors related to the landing page, i.e., A and B. All the functions of decision probabilities used in this simulation are listed in Table 2. The decisions are made sequentially. First, we determine if the visitor goes to page 1. If she/he does not go to page 1, then we check if she/he goes to page 2. If the second decision is also negative, she/he leaves the system. So to be specific, t2 is the probability of the visitor going to page 2 given that she/he does not go to page 1. If the visitor chooses to go to page 1, then on page 1, there are still three choices: make a conversion, go to page 2 or leave. We then simulate his/her decision on page 1. Suppose she/he makes a conversion with probability c1. If she/he does not make a conversion, she/he then goes to page 2 with probability t3. If the decisions are both negative, she/he leaves the system. Note that, although the choices are made on page 1, the two probabilities c1 and t3 are functions of factors related to both the landing page and page 1, i.e., A, B, C and D, because it is believed that the information on the landing page will affect the visitors’ behaviour thereafter. Finally, visitors who land on page 2 are of two types: they come directly from the landing page, or they have visited page 1. These represent the two conversion funnels CF2 and CF1, respectively. Since page 2 has no link referring to other pages, the only choice is whether to make a conversion. The decisions are made as follows. 
If the visitor comes via CF2, she/he converts with probability c2; otherwise, she/he converts with probability c3. Note that CF2 consists of two pages. Therefore, c2 is a function of the factors related to those two pages, i.e., A, B, E and F. Similarly, c3 is a function of factors A, B, C, D, E and F. Each simulation terminates when the visitor either makes a conversion or leaves the system. We ran 10,000 simulations, and recorded the conversions of each funnel separately. The conversion rates are given in Table 3.
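The sequential decision procedure above can be sketched as follows. The control flow follows the description in the text, but the probability functions in `decision_probabilities` are hypothetical placeholders, not the functions of Table 2, and dividing each funnel's conversions by the number of visitors entering the system is likewise an assumption about how the rates are computed.

```python
import random


def clip(p):
    """Keep a computed probability inside [0, 1]."""
    return min(1.0, max(0.0, p))


def decision_probabilities(x):
    """HYPOTHETICAL stand-in for Table 2; x maps factor names to +1 or -1."""
    return {
        "t1": clip(0.30 + 0.05 * x["A"]),  # landing page -> page 1
        "t2": clip(0.20 + 0.05 * x["B"]),  # landing -> page 2, given not page 1
        "c1": clip(0.20 + 0.05 * x["C"]),  # convert on page 1 (CF3)
        "t3": clip(0.30 + 0.05 * x["D"]),  # page 1 -> page 2, given no conversion
        "c2": clip(0.15 + 0.05 * x["E"]),  # convert on page 2 via CF2
        "c3": clip(0.25 + 0.05 * x["F"]),  # convert on page 2 via CF1
    }


def simulate_visitor(p, rng):
    """Return the funnel in which the visitor converts, or None if she/he leaves."""
    if rng.random() < p["t1"]:                      # goes to page 1
        if rng.random() < p["c1"]:
            return "CF3"
        if rng.random() < p["t3"]:                  # continues to page 2
            return "CF1" if rng.random() < p["c3"] else None
        return None
    if rng.random() < p["t2"]:                      # goes straight to page 2
        return "CF2" if rng.random() < p["c2"] else None
    return None


def conversion_rates(x, n_visitors, seed=0):
    """Simulate n_visitors in one version x of the system; rates are per visitor."""
    rng = random.Random(seed)
    p = decision_probabilities(x)
    counts = {"CF1": 0, "CF2": 0, "CF3": 0}
    for _ in range(n_visitors):
        funnel = simulate_visitor(p, rng)
        if funnel is not None:
            counts[funnel] += 1
    return {name: count / n_visitors for name, count in counts.items()}


rates = conversion_rates({f: 1 for f in "ABCDEF"}, n_visitors=10_000)
print(rates)
```

Recording the funnel name rather than the page is what lets a conversion on page 2 be attributed to CF1 or CF2 depending on the route taken.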

Table 2. Functions for decision probabilities.

Table 3. Conversion rates for different funnels, toy example.

For simplicity, assume all the conversion funnels have equal weights in the toy example. Denote the total conversion rate as CRT, and the conversion rates for CF1, CF2 and CF3 as CR1, CR2 and CR3, respectively. The objective function for optimisation is then the equally weighted sum of CR1, CR2 and CR3.

Then we build models for CR1, CR2 and CR3 separately. Recall that the first step in modelling the conversion rate of a funnel is to identify the related factors. For CR1, since CF1 consists of three pages, the landing page, page 1 and page 2, the factors considered in this model are A, B, C, D, E, and F. Similarly, for CR2 and CR3, the factors considered are A, B, E and F and A, B, C and D, respectively.

The model building is then straightforward. For CR1, it is a function of all the six factors. The corresponding design matrix and responses considered in this model are shown in Table 4. The first step of modelling is to make a half-normal plot to identify significant effects. In Figure 2, it is clearly seen that main effects F and D and two-factor interaction (abbreviated as 2fi hereafter) AE are the most significant. Because these three terms have the same magnitude, we denote them as group 1. They are followed by main effects E and A and 2fi's AF and D, whose magnitudes are the same, and denoted as group 2. The other effects are not significant. Therefore, the model for CR1 has seven terms. The $R^2$ value for this model is 99.12% and the p values for the significant effects in group 1 and group 2 are 1.91e−7 and 0.00133, respectively. The explicit expression of the model is (1)

Table 4. Design matrix and response data, conversion rates for CF1.

Figure 2. Half-normal plot, conversion rate for CF1.
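The quantities behind a half-normal plot can be computed as sketched below. The response values are synthetic and chosen only so that the code runs; they are not the data of Table 4. The generators E = ABC and F = BCD, the list of terms and the plotting positions Φ⁻¹(0.5 + 0.5(i − 0.5)/m) are assumptions of this sketch, the last being a common textbook convention.

```python
from itertools import product
from statistics import NormalDist, mean


def design_2_6_2():
    """16-run design with assumed generators E = ABC and F = BCD."""
    return [
        {"A": a, "B": b, "C": c, "D": d, "E": a * b * c, "F": b * c * d}
        for a, b, c, d in product([-1, 1], repeat=4)
    ]


def contrast(run, term):
    """Level (+1 or -1) of a main effect or interaction, e.g. 'F' or 'AE'."""
    value = 1
    for letter in term:
        value *= run[letter]
    return value


def effect(design, y, term):
    """Factorial effect: mean response at + minus mean response at -."""
    plus = [yi for run, yi in zip(design, y) if contrast(run, term) == 1]
    minus = [yi for run, yi in zip(design, y) if contrast(run, term) == -1]
    return mean(plus) - mean(minus)


def half_normal_points(effects):
    """Pair the sorted absolute effects with half-normal quantiles."""
    ordered = sorted(effects.items(), key=lambda item: abs(item[1]))
    m = len(ordered)
    normal = NormalDist()
    return [
        (normal.inv_cdf(0.5 + 0.5 * (i - 0.5) / m), term, abs(estimate))
        for i, (term, estimate) in enumerate(ordered, start=1)
    ]


design = design_2_6_2()
# SYNTHETIC responses: only F and D are active in this made-up example.
y = [0.10 + 0.02 * run["F"] + 0.01 * run["D"] for run in design]
terms = ["A", "B", "C", "D", "E", "F", "AE", "AF"]  # illustrative subset
effects = {term: effect(design, y, term) for term in terms}
points = half_normal_points(effects)
for quantile, term, size in points:
    print(f"{term:>2}  quantile={quantile:.3f}  |effect|={size:.4f}")
```

Effects that fall well above the line through the small effects are the candidates for the model. In a 16-run fraction each 2fi is aliased with others (for example AE = BC = DF under these generators), so only one term per alias set should be plotted.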


For CR2, the corresponding funnel consists of two pages: the landing page and page 2. Therefore, only factors A, B, E and F are considered. The corresponding design matrix and responses used in this model are shown in Table 5. We start the analysis by drawing a half-normal plot to identify significant effects. In Figure 3, it is clearly seen that the main effect E is the most significant. It is followed by B and BE, and then A and AE. The other effects are not significant. Therefore, the model for CR2 has five terms. The $R^2$ value for this model is 97.28% and the p values for the significant effects are 4.11e−8, 6.71e−5, 6.71e−5, 0.0377% and 0.0377%, respectively. The explicit model is written below: (2)

Table 5. Design matrix and response data, conversion rates for CF2.

Figure 3. Half-normal plot, conversion rate for CF2.


Finally, for CR3, the corresponding conversion funnel CF3 consists of two pages: the landing page and page 1. Therefore, factors A, B, C and D should be considered. The corresponding design matrix and responses used in this model are shown in Table 6. We again start the analysis by drawing a half-normal plot to identify significant effects. In Figure 4, it is seen that the main effect A is the most significant. It is followed by C, BD, B, AC and AB. The other effects are not significant. Therefore, the model for CR3 has six terms. The $R^2$ value for this model is 98.3% and the p values for the significant effects are 1.23e−7, 1.16e−6, 7.72e−6, 4.47e−5, 1.51% and 6.42%, respectively. The explicit model expression is given below: (3)

Table 6. Design matrix and response data, conversion rates for CF3.

Figure 4. Half-normal plot, conversion rate for CF3.


Recall that page 2 appears in both CF1 and CF2 and can have two different versions. Replace the factors E and F in Equation (2) with E′ and F′, respectively. Putting all three models together, we have the total conversion rate expressed as a function of all eight factors:

To maximise CRT, we find the optimal level settings of the eight factors. By checking all the possible combinations of the eight factors, it is seen that by setting A, E and F to −, and B, C, D and E′ to +, we have the maximal expected conversion rate of this system, which is 26.12%. Because F′ does not appear in any significant terms, we can choose either setting according to other considerations.
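The exhaustive search over level settings can be sketched as below. The three fitted models in this sketch are hypothetical: their coefficients are made up for illustration and are not Equations (1) to (3). Representing a model as a dictionary from terms to coefficients, and using weights of 1 for equal weighting, are also assumptions of the sketch.

```python
from itertools import product

FACTORS = ["A", "B", "C", "D", "E", "F", "E'", "F'"]

# HYPOTHETICAL fitted models; the empty tuple holds the intercept.
CR1 = {(): 0.05, ("F",): -0.010, ("D",): 0.010, ("A", "E"): 0.010}
CR2 = {(): 0.08, ("E'",): 0.020, ("B",): 0.010, ("B", "E'"): 0.010}
CR3 = {(): 0.09, ("A",): -0.020, ("C",): 0.015}
MODELS = [CR1, CR2, CR3]
WEIGHTS = [1.0, 1.0, 1.0]  # equal weights (assumed scale)


def predict(model, x):
    """Evaluate a linear model with main effects and interactions at setting x."""
    total = 0.0
    for term, coefficient in model.items():
        value = coefficient
        for factor in term:
            value *= x[factor]
        total += value
    return total


def total_rate(x):
    """Weighted sum of the predicted funnel conversion rates."""
    return sum(w * predict(m, x) for w, m in zip(WEIGHTS, MODELS))


def best_setting():
    """Check all 2^8 level combinations and keep the best one found first."""
    best_x, best_value = None, float("-inf")
    for levels in product([-1, 1], repeat=len(FACTORS)):
        x = dict(zip(FACTORS, levels))
        value = total_rate(x)
        if value > best_value:
            best_x, best_value = x, value
    return best_x, best_value


best_x, best_value = best_setting()
print(best_x, round(best_value, 4))
```

With only 256 combinations, enumeration is cheaper and more transparent than a numerical optimiser. A factor that appears in no model term, such as F′ here, does not change the objective, so the search simply returns the first level it tried for that factor.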

Since the objective of this methodology is to maximize the overall conversion rate, one might suggest using the overall conversion rate as the response and fitting the model with all available factors. At first, this approach may seem simple and obvious when there are only a few webpages. However, as the size of the web system increases, the number of factors can grow exponentially. Fitting a model with so many variables may be infeasible. Instead, if we study each conversion funnel separately, the difficulty of model fitting within the conversion funnel can be well managed. In addition, this bottom-up approach gives the user more flexibility: should the weight of a conversion funnel change as the user's priorities change, the proposed method can easily be adapted by changing the corresponding weight in the total conversion rate formula instead of fitting a new model. The user may also gain a deeper understanding of each conversion funnel: the user would be able to identify the factors that affect a particular conversion funnel the most and study those factors further for this conversion funnel if needed.

5. Simulated example

In this section, we will demonstrate the idea of funnel testing with a more complicated example. Consider a conversion system that consists of six pages. The first page is the landing page (vL). All customers start visiting the conversion system with this page. The second page is the individual page (vI). It is the page showing information for individual customers. The third page is the business page (vB). It is the page showing information for business customers. There are three more pages, called Product 1 page (v1), Product 2 page (v2) and Product 3 page (v3), respectively, for customers to make conversions for three different kinds of products. The linkage relationship between pages can be described as follows. The customers always start with the landing page, where they have three choices: go to the individual page, go to the business page, or leave the system. If the customer goes to the individual page, she/he then has three choices: go to Product 1 page, go to Product 2 page, or leave the system. Similarly, if the customer goes to the business page, she/he then also has three choices: go to Product 3 page, go to Product 2 page, or leave the system. Customers can convert on any of the Product pages, or leave the system. Customers on Product 1 page or Product 3 page have one more choice: to go to Product 2 page.

We start our analysis by representing this conversion system with the directed graph in Figure 5. The six pages are viewed as six vertices vL, vI, vB, v1, v2 and v3, respectively, and the linkage relationships are viewed as eight directed edges denoted by e1, …, e8. The vertices v1, v2 and v3 are marked as solid dots in contrast to the others, meaning that they are conversion points.

Figure 5. Directed graph representation of the simulated example.


The next step is to identify all the conversion funnels from the directed graph representation. By definition, a conversion funnel is a path from the landing page to a conversion point. For the conversion point v1, only the path vL → vI → v1 connects vL to it. We denote this conversion funnel by CF1. Similarly, only one path vL → vB → v3 connects vL to v3, which is denoted by CF3. For the last conversion point v2, there are four paths connecting vL to it: vL → vI → v2, vL → vB → v2, vL → vI → v1 → v2, and vL → vB → v3 → v2, which are represented by CF2I, CF2B, CF21 and CF23, respectively.
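The funnel counts stated above can be checked by counting paths rather than listing them. The sketch below assumes the graph is acyclic, as it is here, and uses a memoised recursion over incoming links; the dictionary layout and function name are illustrative.

```python
from functools import lru_cache

# Links of the simulated example: page -> pages it links to (eight edges).
GRAPH = {
    "vL": ("vI", "vB"),
    "vI": ("v1", "v2"),
    "vB": ("v3", "v2"),
    "v1": ("v2",),
    "v3": ("v2",),
    "v2": (),
}
CONVERSION_POINTS = ("v1", "v2", "v3")


@lru_cache(maxsize=None)
def paths_from_landing(page):
    """Number of distinct paths from vL to the page (graph assumed acyclic)."""
    if page == "vL":
        return 1
    return sum(
        paths_from_landing(source)
        for source, targets in GRAPH.items()
        if page in targets
    )


funnel_counts = {page: paths_from_landing(page) for page in CONVERSION_POINTS}
print(funnel_counts, "total:", sum(funnel_counts.values()))
```

The counts are one funnel ending at v1, four at v2 and one at v3, six in total, which matches CF1, CF2I, CF2B, CF21, CF23 and CF3.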

Suppose each page has two factors to be investigated (A and B for the landing page, C and D for the individual page, E and F for the business page, G and H for Product 1 page, I and J for Product 2 page, and K and L for Product 3 page), and each factor has two levels. A $2^{12-6}_{IV}$ design is used for the experiment. The design matrix is shown in Table 7. The defining relations of this design are I = ABCG = ABDH = ACDEI = ACDFJ = ABEFK = BCEDFL. Each row of the matrix represents a version of the conversion system to be studied.
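The resolution of the design can be checked from its defining words, as sketched below. The check multiplies every non-empty subset of the six words, cancelling squared letters, to obtain the defining contrast subgroup; the resolution is the length of its shortest word. The function name is illustrative, and the letter I in the word list denotes the factor I, not the identity.

```python
from itertools import combinations

# Defining words as given in the text; here "I" is the factor I.
WORDS = ["ABCG", "ABDH", "ACDEI", "ACDFJ", "ABEFK", "BCEDFL"]


def defining_contrast_subgroup(words):
    """All products of the defining words, with squared letters cancelled."""
    subgroup = set()
    for size in range(1, len(words) + 1):
        for combo in combinations(words, size):
            letters = frozenset()
            for word in combo:
                letters = letters ^ frozenset(word)  # symmetric difference
            subgroup.add("".join(sorted(letters)))
    return subgroup


subgroup = defining_contrast_subgroup(WORDS)
resolution = min(len(word) for word in subgroup)
print(len(subgroup), "words; shortest has length", resolution)
```

The shortest words, such as ABCG and CDGH, have four letters, consistent with a resolution IV design in which main effects are not aliased with 2fi's but some 2fi's are aliased with each other.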

Table 7. Design matrix, simulated example.

Take the landing page and its corresponding factors A and B for example. Factor A may represent the choice of the header, which has two candidates with version 1 denoted by level + and version 2 denoted by −. Similarly, B can be the choice of the main picture on the landing page, with + being version 1 and − being version 2. Other factors of other pages can be interpreted similarly. In general, each factor represents one element of its corresponding page that we want to investigate. In this example, each element has two candidate versions and we want to decide which one is better. As discussed in Section 1, elements can be headers, banners, texts, pictures, etc.

The experiment is done by simulations. For each simulation, we first choose a version of the conversion system from the 64 candidates in the design table with equal probability, and then simulate customers’ behaviour in the chosen system. The customers always start with the landing page. Recall that, on the landing page, they have three choices: go to the individual page, go to the business page, or leave the system. We simulate their decisions in the following way: they go to the individual page with probability tLI, where tLI represents the transition probability from the landing page to the individual page; they go to the business page with probability tLB. If the simulated decisions of the above two statements are both negative, customers leave the system. If the simulated decisions are both positive, the customers go to either page with equal probability. For customers on the individual page, we simulate their decisions among the three choices in a similar way: they go to Product 1 page with probability tI1, and go to Product 2 page with probability tI2. Similarly, for customers on the business page, we suppose they go to Product 3 page with probability tB3, and go to Product 2 page with probability tB2. Customers on either Product 1 page or Product 3 page make conversions with probability c1 or c3, respectively. If they do not make conversions, they can further go to Product 2 page with probability t12 or t32, respectively. Otherwise, they leave the system. Finally, customers on Product 2 page can either make conversions or leave the system. The probabilities for them to make conversions are c2I, c2B, c21 and c23, respectively, depending on the conversion funnels they are from. The simulation is terminated when the customers leave the system or a conversion is made. All the decision probabilities used in the simulations are listed in Table 8. If the calculated probabilities are less than zero, we set them to zero in the simulation. 
The choice of the decision probabilities is somewhat arbitrary but has the following rationale. Take tLI for example: we have tLI = 0.5 + 0.1A, which is a function of the main effect A of the landing page. According to the assumption, A has two levels denoted by + and −. Therefore, the transition probability from the landing page to the individual page is 0.6 (= 0.5 + 0.1) if factor A is set to level +, and 0.4 (= 0.5 − 0.1) if A is set to −. Similarly, for t12, we have t12 = 0.05 − 0.05DH + 0.05C − 0.05AG. It is a function of the main effect C and 2fi's DH and AG, which involve five factors A, C, D, G and H. Among the five factors, A is from the landing page, C and D are from the individual page, and G and H are from Product 1 page. Note that t12 is the transition probability from Product 1 page to Product 2 page, which is part of CF21. Recall that CF21 is the path vL → vI → v1 → v2. Before making this decision, customers have gone through the landing page, the individual page and Product 1 page along the path. Therefore, it is assumed that only factors on these three pages can affect this decision. The 2fi DH in t12 can be interpreted as follows: factor D on the individual page and factor H on Product 1 page will jointly affect the customers’ decision as to whether to go to Product 2 page from Product 1 page. Other 2fi's in Table 8 can be interpreted similarly. By changing the level settings of the five factors, t12 can be as high as 0.2 or as low as 0.
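The arithmetic in this paragraph can be verified by evaluating the two probability functions at every level setting. The formulas for tLI and t12 and the truncation at zero are taken from the text; coding the levels + and − as +1 and −1 and the function names are assumptions of the sketch.

```python
from itertools import product


def t_LI(A):
    """Transition probability from the landing page to the individual page."""
    return 0.5 + 0.1 * A


def t_12(A, C, D, G, H):
    """Transition probability from Product 1 page to Product 2 page.

    Negative values are truncated to zero, as stated in the text.
    """
    raw = 0.05 - 0.05 * D * H + 0.05 * C - 0.05 * A * G
    return max(0.0, raw)


# Evaluate t12 at all 2^5 level settings of A, C, D, G and H.
values = [t_12(*levels) for levels in product([-1, 1], repeat=5)]
print("tLI at +:", t_LI(1), " tLI at -:", t_LI(-1))
print("t12 range:", min(values), "to", round(max(values), 10))
```

The largest value arises when C is at + and both DH and AG are negative; the untruncated minimum would be −0.1, which is why the truncation at zero is needed.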

Table 8. Functions for decision probabilities.

We repeat the simulation 10,000,000 times and record the conversions for each funnel separately. The calculated conversion rates are given in Table 9. Note that CF21 and CF23 both have no conversions in the simulated results.

Table 9. Conversion rates for different funnels, simulated example.

The total conversion rate is assumed to be

We model the conversion rate for each conversion funnel separately. Recall that the first step in modelling the conversion rate is to identify its related factors. CR1 is the conversion rate for CF1, which consists of three pages: the landing page, the individual page and Product 1 page. Therefore, the six factors (A, B, C, D, G and H) related to these three pages are considered in modelling CR1. Similarly, we can find the related factors for the other conversion rates, all of which are listed in Table 10.

Table 10. Conversion rates and related factors, simulated example.

The model for each conversion rate is then built with respect to its related factors listed in Table 10. We consider the linear regression model with only the main effects and 2fi's and use the same method as in the toy example. The fitted model for each conversion rate is shown in Table 11. CR21 and CR23 are both zero because CF21 and CF23 have no conversions for any of the 64 versions of the conversion system in the simulated results.

Table 11. Fitted models for each conversion rate, simulated example.
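A minimal sketch of fitting a model with main effects and 2fi's is given below. It assumes that the ±1 columns of the model are mutually orthogonal over the runs (as in a full factorial, or a regular fractional factorial in which these effects are not aliased with one another), so that each least-squares coefficient is simply the average of the response multiplied by the corresponding column. The three factors and the response values are hypothetical, and the sketch does not reproduce the selection method of the toy example.

```python
from itertools import combinations, product

def term_value(term, run):
    """Value of a model column (product of factor levels) in one run."""
    value = 1
    for factor in term:
        value *= run[factor]
    return value

def fit_main_and_2fi(factors, runs, responses):
    """Coefficients for the intercept, main effects and 2fi's.

    Valid as least-squares estimates only when the +/-1 model columns
    are mutually orthogonal over the runs (assumption of this sketch).
    """
    terms = [()] + [(f,) for f in factors] + list(combinations(factors, 2))
    n = len(runs)
    return {
        term: sum(term_value(term, r) * y for r, y in zip(runs, responses)) / n
        for term in terms
    }

# Hypothetical data: a full 2^3 factorial with a noise-free response.
factors = ["A", "B", "G"]
runs = [dict(zip(factors, levels)) for levels in product((-1, 1), repeat=3)]
responses = [0.05 + 0.01 * r["A"] - 0.005 * r["B"] * r["G"] for r in runs]

coefs = fit_main_and_2fi(factors, runs, responses)
for term, value in coefs.items():
    print("*".join(term) or "intercept", round(value, 6))
```

With real data the responses are noisy, so small estimated coefficients would still need to be screened before the model is reported.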

In theory, the Product 2 page can have four different versions, which would give us six more factors I′, J′, I″, J″, I′″ and J′″ in the optimisation procedure. This is because v2 appears in four different funnels, and the pages before v2 are different for each funnel. However, the factors related to v2 do not appear in any of the models in Table 11. Therefore, such consideration becomes unnecessary. Finally, by putting the functions in Table 11 together, we have the model for the total conversion rate to be

To maximise CRT, we try all possible combinations of the nine factors involved in the above model. The maximal CRT value is achieved by setting A, E, K and L to +, and B, C and F to −. The level settings of D and G do not affect the value of CRT if the remaining seven factors are chosen as above. The other three factors that do not appear in the above model can be set based on other considerations.
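The exhaustive search over the nine factors can be sketched as follows. The function cr_total is a hypothetical placeholder and its coefficients are not those of the fitted model; it is constructed only so that the search has something to maximise and so that two factors drop out at the optimum. The enumeration of all 2^9 = 512 level settings is the part that corresponds to the text.

```python
from itertools import product

FACTORS = ["A", "B", "C", "D", "E", "F", "G", "K", "L"]

def cr_total(x):
    """Hypothetical stand-in for the fitted CRT model (not the paper's)."""
    return (0.100
            + 0.020 * x["A"] - 0.010 * x["B"] - 0.010 * x["C"]
            + 0.008 * x["E"] - 0.006 * x["F"]
            + 0.004 * x["K"] + 0.003 * x["L"]
            + 0.002 * x["D"] * (1 + x["C"])   # D and CD terms
            + 0.002 * x["G"] * (1 + x["B"]))  # G and BG terms

def exhaustive_max(model, factors, tol=1e-12):
    """Evaluate the model at every +/-1 setting and return all maximisers."""
    settings = [dict(zip(factors, levels))
                for levels in product((-1, 1), repeat=len(factors))]
    values = [model(s) for s in settings]
    best = max(values)
    maximisers = [s for s, v in zip(settings, values) if best - v <= tol]
    return best, maximisers

best, maximisers = exhaustive_max(cr_total, FACTORS)

# A factor is "fixed" if every maximiser uses the same level for it,
# and "free" if the maximum is attained at both of its levels.
fixed = {f: maximisers[0][f] for f in FACTORS
         if len({m[f] for m in maximisers}) == 1}
free = [f for f in FACTORS if f not in fixed]
print(best, fixed, free)
```

Returning every maximiser, rather than only the first one found, is what reveals factors whose level does not matter at the optimum.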

6. Concluding remarks

In this work, we propose a new framework and approach to analyse a system of pages that relate to the conversion of interest, based on the concept of conversion funnels. A directed graph is used to represent the system and identify all the conversion funnels. A fractional factorial design is used to conduct the experiment. The analysis strategy consists of modelling the conversion rate of each funnel separately and then combining the fitted models into the total conversion rate for optimisation.

So far the analysis strategy has been developed in the context of a specific example, but the underlying ideas are general. We plan to develop this framework further into a general methodology.

Though the general idea of this methodology may seem intuitive and straightforward, as illustrated by the examples provided, its implementation can be challenging as the size of the web system increases. For example, suppose the number of webpages to be studied is 20, which is quite typical for a real-world web system, that each page has only 2 factors, each with 2 levels, and that there are only 20 conversion funnels in the system. Two challenges immediately face the user: how to find a feasible design and how to optimise the overall conversion rate. In this case, the total number of factors that need to be included in the design is 40, and the total number of factors that need to be considered in the optimisation can easily reach the hundreds, because of the complex structure of the conversion funnels and because some factors will need multiple versions when they appear in different conversion funnels. For the design problem, it is not easy to find a 2-level fractional factorial design with 40 factors that is good according to the maximum resolution or minimum aberration criterion and whose run size is feasible for web system experimenters. Therefore, one might need to consider nonregular designs such as the Plackett-Burman designs or other orthogonal arrays; see the book by Wu and Hamada (2009) for details. In fact, since only a limited number of interactions may be important to the users, we can also use designs that have relatively low resolution but a few clear 2fi's to study the web system. For the second challenge, as the number of factors in the optimisation problem increases, exhaustive search might become infeasible, so the user might want to try other optimisation approaches such as response surface methods or even integer programming. The design problem becomes more daunting if the factors can have different numbers of levels, because there is only a small dictionary of mixed-level orthogonal arrays with a large number of factors.

Acknowledgments

This research is supported in part by ARO grant W911NF-17-1-0007. The authors are grateful to the Associate Editor and a referee for helpful comments.

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

ARO [grant number W911NF-17-1-0007].

Notes on contributors

Heng Su

Heng Su is a quantitative associate at Wells Fargo Bank.

C. F. Jeff Wu

C. F. Jeff Wu is the Coca-Cola Chair in Engineering Statistics at the School of Industrial and Systems Engineering, Georgia Institute of Technology.
