Abstract
This paper studies the behaviour of several non-nested hypothesis testing procedures, namely the J and JA tests and their bootstrap-adjusted counterparts, using graphical methods: the P value plot, the P value discrepancy plot and the size-power graph. The size and power of the four tests are compared over all possible nominal sizes, not only at 1% or 5%. The bootstrap-adjusted J test is found to perform best: its size is close to the nominal size regardless of the nominal size chosen, whilst its power is higher than that of the JA test.
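
As an illustration of the three graphical tools, the following is a minimal sketch of how the plotted quantities could be computed, assuming that P values for a given test have already been simulated under the null and under an alternative. The P value plot is taken here to be the empirical distribution function (EDF) of the null P values against the nominal size, the discrepancy plot to be that EDF minus the nominal size, and the size-power graph to be the EDF under the alternative against the EDF under the null. The function names and the NumPy-based implementation are hypothetical and are not taken from the paper.

```python
import numpy as np


def edf(p_values, grid):
    """Empirical distribution function of p_values, evaluated at each grid point."""
    p_sorted = np.sort(np.asarray(p_values, dtype=float))
    # Count how many simulated P values are <= each nominal size in the grid.
    return np.searchsorted(p_sorted, grid, side="right") / p_sorted.size


def p_value_plot(p_null, grid):
    """Actual size (rejection frequency under the null) at each nominal size."""
    return edf(p_null, grid)


def p_value_discrepancy(p_null, grid):
    """Actual size minus nominal size; zero everywhere for an exact test."""
    return edf(p_null, grid) - np.asarray(grid, dtype=float)


def size_power(p_null, p_alt, grid):
    """Pairs (actual size, power) traced out as the nominal size varies."""
    return edf(p_null, grid), edf(p_alt, grid)


if __name__ == "__main__":
    # Placeholder P values only: uniform under the null, skewed towards zero
    # under the alternative. Real inputs would come from Monte Carlo
    # replications of the test being studied.
    rng = np.random.default_rng(0)
    p_null = rng.uniform(size=5000)
    p_alt = rng.beta(0.3, 1.0, size=5000)
    grid = np.linspace(0.0, 1.0, 101)
    size, power = size_power(p_null, p_alt, grid)
    print(np.max(np.abs(p_value_discrepancy(p_null, grid))))
```

Plotting `p_value_discrepancy` against `grid` for each of the four tests would show at a glance which test over- or under-rejects at every nominal size, rather than only at 1% or 5%.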