Abstract: Though genetically modified (GM) crops have been rapidly adopted in world agriculture, concern has been expressed about the environmental risks that they may involve. A tiered approach has been proposed to identify and measure the effects on non-target organisms (NTOs) of transgenic traits of crops such as Bt corn, which has been designed to kill target species. When an effect is detected in laboratory or semi-field steps, it is measured in subsequent higher-tier steps of increasing complexity and realism under field conditions. In some cases this sequential testing scheme is not applicable because potential effects are not measurable in simple laboratory conditions, and the testing procedure has to be initiated with field trials. This is the case, for example, when indirect effects are expressed through the food web or when the main expected effect is a consequence of changes in agricultural technology. However, field trials have many disadvantages and, whenever possible, should be conducted after basic data have been obtained in lower-tier steps. After 13 years of conducting field trials of Bt and herbicide-tolerant maize in Spain, the authors review the results to discuss how trials aimed at measuring the effects of GM crops on NTOs can be improved. Two main aspects are considered: (i) statistical power, that is, the probability that a null hypothesis (no difference between a transgenic variety and a non-transgenic comparator in the effects on an NTO) is rejected when it is not true; (ii) species or species assemblages that may be used as surrogates for testing effects on NTOs in the field, according to the power obtained for each one, calculated in a number of field trials selected from those conducted in recent years. The selection aimed to cover the maximum variation in trial characteristics: multiyear vs. one-year trials, few varieties vs. many varieties, single vs. stacked transgenic traits, and finally plot size.
Results are discussed in order to make recommendations on selecting those surrogates with a power above 0.7 to detect 50% or 25% differences between the GM variety and a non-GM comparator. A classification of the suitability of non-target arthropods as surrogates for ERA trials is proposed, based on the number of replications required to reach a power of at least 0.7.
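The link between detectable difference, variability, and number of replications can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a two-sided, two-sample comparison of means with a normal approximation, equal replication in both treatments, and a known coefficient of variation (CV) for the surrogate taxon. The function names `approx_power` and `min_replications`, the 5% significance level, and the CV values in the usage example are all hypothetical choices made for illustration.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def approx_power(rel_diff, cv, n_reps, alpha=0.05):
    """Approximate power of a two-sided, two-sample comparison of means.

    rel_diff : difference to detect, as a fraction of the comparator mean
               (0.5 for a 50% difference, 0.25 for a 25% difference).
    cv       : assumed coefficient of variation (sd / mean) of the counts.
    n_reps   : replications per treatment (assumed equal in both groups).
    alpha    : significance level (0.05 is an assumption, not from the source).
    """
    if n_reps < 2:
        raise ValueError("at least 2 replications per treatment are required")
    if cv <= 0:
        raise ValueError("cv must be positive")
    effect_size = rel_diff / cv            # standardized difference
    ncp = effect_size * sqrt(n_reps / 2)   # shift of the test statistic
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Probability that the statistic falls in either rejection tail.
    return _STD_NORMAL.cdf(ncp - z_crit) + _STD_NORMAL.cdf(-ncp - z_crit)


def min_replications(rel_diff, cv, target_power=0.7, alpha=0.05, max_reps=10_000):
    """Smallest number of replications per treatment reaching target_power.

    Returns None if the target is not reached within max_reps.
    """
    for n_reps in range(2, max_reps + 1):
        if approx_power(rel_diff, cv, n_reps, alpha) >= target_power:
            return n_reps
    return None


if __name__ == "__main__":
    # Hypothetical CV values for three surrogate taxa (illustration only).
    for cv in (0.3, 0.5, 1.0):
        for rel_diff in (0.5, 0.25):
            n = min_replications(rel_diff, cv)
            print(f"CV={cv:.1f}, difference={rel_diff:.0%}: {n} replications")
```

Under these assumptions, halving the difference to be detected roughly quadruples the replications needed, which is why a taxon may qualify as a surrogate for 50% differences yet be impractical for 25% differences. A normal approximation is slightly optimistic at small replication numbers; an exact calculation would use the noncentral t distribution.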