ICARDA Caravan 6

Crop trials are not simple. Multilocation trials, covering different climates, soil types, management practices and pests and diseases, generate so much information that exploiting it needs a special tool. ICARDA has developed one.

By Bijan Chakraborty and Mike Robbins

ICARDA's Computer and Biometric Services Unit, or CBSU, does not restrict itself to maintaining the Center's computing facilities. It must also develop a range of software tools for scientists who are performing a series of complex tasks. One of CBSU's operations is biometrics--the treatment of statistics to see what they are actually telling you (see Caravan No. 4). Another related, but separate, task is to give scientists decision-support tools in the form of information management and statistical computation for the diverse types of data collected from their experiments. It was the lack of sufficiently powerful and user-friendly software for this that led ICARDA to develop the Trials Management System--TMS.
        To understand why TMS is needed, and what its development involves, it is necessary to describe how and why trials throw up so much data. To the layman, a crop experiment must seem simple enough. One takes seeds of the variety one wishes to test, plants them in the appropriate environment and sees how they perform. But it is not that simple. The environments in the ICARDA region are diverse. A crop will face a huge variety of climatic conditions and pests and diseases from site to site--to say nothing of different soil types, fertilizer and pesticide use (or lack of them). Lines for possible release to farmers must be tested for all of these.
        So ICARDA collaborates with national scientists all over the West Asia and North Africa region and beyond to carry out multi-location, multi-year testing of advanced lines to identify improved germplasm. The heart of this cooperation is the international nursery system, which not only distributes ICARDA's improved germplasm but also functions as a cooperative testing vehicle. Candidate lines are evaluated at many key sites with stresses such as drought, heat, cold, salt, disease and insects. Data returned from cooperators provides valuable information on the performance and adaptation of test genotypes. Such efforts are an integral part of ICARDA's collaboration with the national programs.
        Every year, ICARDA's crop improvement program receives data from an average of 30-35 locations around the region for around 80 yield and stress nurseries with an average of 40 test lines. The information received includes performance data such as seed yield, total biomass and 100-seed weight, days to maturity and plant height. But it also includes data which could have a bearing on the test results--the fortnightly meteorological data, agronomic information such as the amount of irrigation and the types and quantities of pesticides, herbicides and fertilizer used, and damage due to pests and diseases, drought or cold.
        There is no suitable commercial software available that can be readily used to first manage the complex distribution system, and then process and produce the reports based on such a mass of information, correlating all the factors involved. Line X performed well at 14 sites out of 16 but was lousy at two. Why? The scientist's comparison must incorporate all relevant factors. Line Y incorporates good yield and stress-resistance characteristics from several previous lines, but was a mediocre performer. Why? What's missing? Attempt to answer these questions from piles of computer printouts, and you will never have time to do anything else. And you may still overlook the answers. A tool was needed not only for statistical analysis, but also for administration of the collaborative testing system.
        Now CBSU has developed a powerful, user-friendly software tool called the Trials Management System (TMS) to automate the various functions of the international nursery system. TMS uses a Relational Database Management System (RDBMS) to manage and report on relational data: test lines and their parentage, the experimental design and field plan, site-specific information including meteorological data and agronomic management, and the observed attributes. Statistical analyses of the stored data are performed using the commercial SAS (Statistical Analysis System) software.
        TMS correlates all the information gathered by the scientist in the field (often using a palmtop computer). It can report on a given line, or a group of lines, at a given location or across all locations and years, and it can present this information graphically if required.
        TMS has other analytical tasks besides comparing lines under different conditions. For example, in a large number of field experiments there is bound to be the odd gap; a specific test line was not planted at one site, or the plants of a variety in a plot within a replication were damaged when a flock of sheep got loose in the field. Statistical analysis of such missing data requires special treatment (a statistically expected mean, with the precision of that mean adjusted accordingly). This is difficult; a standard spreadsheet could "fill the gaps" by working out a simple average, but for the scientist this is not good enough. The standard margin of error from the other figures must be incorporated.
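The article does not say which estimator TMS and SAS apply to a missing plot. Purely as an illustration, the sketch below uses the classical Yates missing-plot formula for a randomized complete block design, which estimates the lost value from the totals of its own line and its own replication rather than from a plain average. The function name and the yield figures are invented for the example.

```python
def yates_missing_plot(yields, missing):
    """Estimate one missing plot in a randomized complete block design.

    yields  -- table of plot values, one row per test line and one column
               per replication (block); the missing plot holds None
    missing -- (line_index, block_index) of the missing plot

    Yates formula: x = (r*B + t*T - G) / ((r - 1) * (t - 1)), where
    t = number of lines, r = number of blocks, T = total of the observed
    plots of the affected line, B = total of the observed plots in the
    affected block, G = total of all observed plots.
    """
    t = len(yields)
    r = len(yields[0])
    i, j = missing
    line_total = sum(v for v in yields[i] if v is not None)
    block_total = sum(row[j] for row in yields if row[j] is not None)
    grand_total = sum(v for row in yields for v in row if v is not None)
    return (r * block_total + t * line_total - grand_total) / ((r - 1) * (t - 1))


# Hypothetical yields: three lines (rows) in three replications (columns).
# The plot of the second line in the third replication was lost.
plots = [
    [10.0, 11.0, 12.0],
    [20.0, 21.0, None],
    [30.0, 31.0, 32.0],
]

estimate = yates_missing_plot(plots, (1, 2))

# The "spreadsheet" alternative: a plain average of every observed plot.
observed = [v for row in plots for v in row if v is not None]
simple_average = sum(observed) / len(observed)

print(estimate, simple_average)  # 22.0 versus 20.875
```

In a full analysis the error degrees of freedom are also reduced by one for each estimated plot, so the estimate is not treated as if it had actually been observed.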
        It is also easy to analyze the genotype by environment (G x E) interaction of the lines tested, using data stored in the TMS database. This is the extent to which the line will or will not adapt to different environments (see "Three among the millions" in Caravan No. 4). As part of its biometrics function, CBSU has developed a method for indexing the inter-site transferability of lines from the experiment data (see "Yes, but will it grow in my field?" in Caravan No. 4).

Making sense of data: Suhaila Arslan of ICARDA's Legume International Nurseries (left) and programmer and analyst Sawra Bitar put TMS through its paces. Small picture: results of chickpea trials in Algeria in 1994.

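To make the idea of G x E interaction concrete, here is a minimal sketch that computes the textbook interaction effect for each line at each site from a table of mean yields: the cell mean, minus the line mean, minus the site mean, plus the grand mean. It is not CBSU's transferability index, which the article does not describe, and the function name and yields are invented.

```python
def interaction_effects(means):
    """Return the G x E interaction effect for every line/site cell.

    means[i][j] is the mean yield of line i at site j. A table in which
    every effect is zero is purely additive: the lines keep the same
    ranking and the same differences at every site.
    """
    n_lines = len(means)
    n_sites = len(means[0])
    grand = sum(sum(row) for row in means) / (n_lines * n_sites)
    line_means = [sum(row) / n_sites for row in means]
    site_means = [
        sum(means[i][j] for i in range(n_lines)) / n_lines
        for j in range(n_sites)
    ]
    return [
        [means[i][j] - line_means[i] - site_means[j] + grand
         for j in range(n_sites)]
        for i in range(n_lines)
    ]


# Hypothetical mean yields (t/ha) of two lines at two sites.
# The lines swap rank between sites: a crossover interaction.
crossover = interaction_effects([[2.0, 1.0],
                                 [1.0, 2.0]])

# Here the second line is better by the same margin at both sites,
# so there is no interaction at all.
additive = interaction_effects([[1.0, 2.0],
                                [2.0, 3.0]])

print(crossover)  # [[0.5, -0.5], [-0.5, 0.5]]
print(additive)   # [[0.0, 0.0], [0.0, 0.0]]
```

Large effects of opposite sign, as in the first table, are the pattern that tells a breeder a line's good performance at one site will not necessarily transfer to another.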
        But the extent of G x E interaction is not a simple arithmetical figure; before selecting lines for multilocation trials at different sites, the breeder must know over which environments that transferability extends. Agroecological characterization of a given zone involves not just climatic data, but soils, slopes and other factors. So TMS must take a range of site data and correlate it with that on, for example, pests and diseases. There is little point in sending a line for testing to a site where winter cold tolerance is required if that is the line's one weakness. TMS will eventually be able to warn the breeder. To strengthen this part of its capability, CBSU's engineers are working on links to other programs that ICARDA is developing and/or using. These include a germplasm bank database, a meteorological database and Geographical Information Systems (GIS).
        TMS produces and prints fieldbooks for the recording of data, and labels for the dispatch of international nurseries, together with the quarantine certificate. This can help in the summer, when ICARDA prepares its International Nurseries for dispatch to collaborating scientists in the national programs.
        This is a lynchpin of the Center's collaboration with the countries of the region. It is also a major task performed under considerable time pressure. The Dispatch subsystem in TMS provides the tool for preparing the list of nurseries, and of the test lines in each nursery, from which the cooperators make their nominations. Once the information from the cooperators is received, it calculates the total seed weight and number of boxes for all nurseries going to a cooperator, prepares quarantine and donation certificates and finally prepares the follow-up letters for the airlines and the cooperators.
         This is a system designed for the real world. The modern crop breeder is computer-literate, but his first business is plants, not software. So TMS runs under Windows, making it a user-friendly tool.
         To the same end, it is set up to receive data from commonly used formats such as commercial databases and ASCII files. It is this that has allowed scientists to take palmtops into the field, thus avoiding double inputting of data. Linkages exist not only for input but also for output; for example, to Excel for better graphical presentation of data and to WordPerfect for report formatting.
          The program has been developed in collaboration with ICARDA's Legumes International Nurseries Scientist, Dr R. S. Malhotra, who has been testing its application to the legumes international nurseries. He feels significant benefits may emerge. "Some reports--such as finding the five best lines across all locations, or comparison of performance of common lines over two consecutive years--have hitherto been very difficult to obtain," he says. "With TMS's relational database, it is very easy."
         Moreover, adds Dr Malhotra, TMS allows any query on a test line, sites, observed attributes and summary statistics. And once information is gathered into a single database, reports can be produced from it. Dr Malhotra is also pleased by the way in which TMS ensures data validation. It will not permit incorrect entry of figures.
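Why a relational database makes a report such as "the five best lines across all locations" easy can be shown in a few lines. The sketch below is not the TMS schema, which the article does not give; it assumes a single hypothetical table, observations, with made-up columns and yields, and uses SQLite from Python simply because it needs no installation.

```python
import sqlite3


def top_lines(rows, n=5):
    """Return the n lines with the highest mean seed yield over all sites.

    rows -- iterable of (line, site, year, seed_yield) tuples
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE observations "
        "(line TEXT, site TEXT, year INTEGER, seed_yield REAL)"
    )
    conn.executemany("INSERT INTO observations VALUES (?, ?, ?, ?)", rows)
    # One query replaces a manual search through printouts:
    # average each line over every site, rank, keep the best n.
    result = conn.execute(
        "SELECT line, AVG(seed_yield) AS mean_yield "
        "FROM observations "
        "GROUP BY line "
        "ORDER BY mean_yield DESC, line "
        "LIMIT ?",
        (n,),
    ).fetchall()
    conn.close()
    return result


# Invented yields (t/ha) for six lines at two sites in one season.
rows = [
    ("L1", "Site A", 1994, 1.0), ("L1", "Site B", 1994, 1.2),
    ("L2", "Site A", 1994, 2.0), ("L2", "Site B", 1994, 2.2),
    ("L3", "Site A", 1994, 1.5), ("L3", "Site B", 1994, 1.7),
    ("L4", "Site A", 1994, 0.8), ("L4", "Site B", 1994, 1.0),
    ("L5", "Site A", 1994, 2.5), ("L5", "Site B", 1994, 2.3),
    ("L6", "Site A", 1994, 1.9), ("L6", "Site B", 1994, 1.9),
]

best = top_lines(rows)
print([name for name, _ in best])  # ['L5', 'L2', 'L6', 'L3', 'L1']
```

Restricting the same query to one year, or joining it to a table of site characteristics, would be a matter of adding a WHERE clause or a JOIN rather than re-keying data.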
         And there are exciting possibilities for the future. Besides incorporating the links to other advanced applications mentioned above, ICARDA may arrange access to the system through the World Wide Web, allowing national scientists to gather and input data from their desktops. TMS will then become more than simply an advanced research tool. It will be a way to pull together plant breeders in dozens of countries--a scientific community working more closely than ever towards a common goal: more food.


Bijan Chakraborty is Scientific Applications Team Leader in ICARDA's Computing and Biometrics Support Unit. Mike Robbins is Science Writer/Editor, ICARDA.