M. Al-Gharabat, E. Campbell, D. Johnson, W. Nathan, J.R. Schoolar, F. Winters, and D. Mitra

An Expert System For Identifying Plants From Their Visible Features

In this paper we present the results of a project to develop an expert system for identifying plants from their visible features. The system is interactive: it asks a non-expert user questions and gradually narrows down the possibilities until it decides on the identity of the plant in question. Currently it identifies forty-eight plants of three types: tree, vine, or shrub. We discuss some possible uses of this program. We also describe our experience in developing the system, the reasons behind the choices made during the design, and our plans for future extensions of the software.


1. Introduction

One of the goals of Artificial Intelligence (AI) is to simulate and apply a human's cognitive thinking ability toward solving problems. Often, researchers incorporate artificial intelligence within other disciplines (e.g. engineering, physical and life sciences) in order to develop automated systems for solving problems in those domains. Such is the case with the project presented in this paper. With the huge array of vegetation in this country, it is extremely difficult for a person other than a botanist or other expert to identify a specific type of plant simply by its features. A botanist may not always be available, or may be too expensive to employ all the time for identifying plants. Keeping this problem in mind, we wanted to develop an automated system which would encode a botanist's knowledge of how to identify some plants using only their visible features, so that any non-expert can use the system and get reliable help in accomplishing this task.

Exactly how do we represent the discerning features of various types of plants in a computer system? To begin with, the aforementioned expert (the botanist in our group) formulates a set of keys that identify plants based on their distinguishing visible features. An example would be: a Sweetgum plant has broad, alternate, star-shaped leaves which are sometimes lobed, with toothed margins, and it bears a spiky ball fruit on a long stem. We have incorporated that knowledge into a knowledge-based system built on an underlying intelligent program, called the inference engine, which can reason with such knowledge. A program development environment, called the expert system shell, facilitates the creation of the knowledge-base on top of the intelligent program. The knowledge-base contains all of the relevant problem-solving data that the expert has supplied. The inference engine examines the current knowledge and combines it with accumulated facts to derive additional facts and, ultimately, the conclusion.
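As an illustration only, the Sweetgum entry above could be written down as a small set of feature-value facts and checked against what a user observes. The sketch below is in Python rather than in the expert system shell; the feature names, the `matches` helper, and the extra observation are assumptions made for this example and are not taken from the actual knowledge-base.

```python
# Hypothetical feature names; the values paraphrase the Sweetgum example in the text.
SWEETGUM = {
    "leaf-type": "broad",
    "leaf-arrangement": "alternate",
    "leaf-shape": "star-shaped",
    "leaf-margin": "toothed",
    "fruit": "spiky ball on a long stem",
}


def matches(profile, observed):
    """Return True when every feature in the profile agrees with the observation."""
    return all(observed.get(name) == value for name, value in profile.items())


# A user's observations recorded as the same kind of facts.
# The "habitat" entry is a made-up extra fact; extra facts do not prevent a match.
observed = dict(SWEETGUM, habitat="roadside")
is_sweetgum = matches(SWEETGUM, observed)
```

A single disagreeing feature (for example a smooth leaf margin) makes `matches` return False, which is the simplest form of the narrowing-down that the key performs.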

We discovered that the process of identifying plants could be compiled in terms of rules. For this reason we chose to represent the plant-identification knowledge with the rule-based knowledge representation scheme.

This project is mostly a student project with the involvement of a GIS expert and a biologist from the Bureau of Land Management, and an AI expert from the faculty of Jackson State University. The preliminary phase of the project, as reported in this paper, was designed mainly to demonstrate the concept. The paper contains seven more sections. The next section introduces the reader to the concept of rule-based reasoning and the motive behind our choice of a particular rule-based reasoning scheme. The third section describes the software tool we used for developing the system and the reasons behind its choice. The fourth section describes the development of the rules for identifying plants, and the fifth describes our experience in developing the program. Section six lists some immediate tasks in front of us for making the program more robust and useful. Section seven explores the possibilities of using the program in real life, which indicate the future direction of our work in this project. The last section concludes the paper.

2. Rule-Based Reasoning

How to represent knowledge within the computer is a major issue in achieving any automated intelligence. Rule-based reasoning is one of the knowledge representation schemes for artificial intelligence. Rules represent knowledge by using an IF-THEN format. The IF portion of the rule is the condition or premise (also called the antecedent), which tests the truth value of a set of facts at every stage of the reasoning process [Gonzalez et al 93]. The THEN portion of the rule, called the 'consequent' or 'action' part of the rule, describes what to do when the rule fires. There are two types of rule-based reasoning mechanisms, namely forward reasoning and backward reasoning.

Forward reasoning is the process of working from a set of 'facts' toward a conclusion that can be drawn from this data. Thus, in forward reasoning, the expert system produces the conclusion. Each potentially applicable rule is examined to see whether the premises contained within the rule are true or not. If the premises of a rule are true, then the facts in the action part of the rule are added to the list of facts in the 'fact-base' of the program (or some facts are deleted from the 'fact-base'). These facts are placed in the working memory. Thus a dynamically changing set of facts drives the rule firing process iteratively, until a conclusion is drawn for the problem.
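The forward reasoning loop described above can be sketched in a few lines. The following is a minimal Python illustration, not the matching algorithm used by any particular shell: rules are (premises, conclusion) pairs, only fact addition is modeled (no deletion), and the rule contents and fact names are hypothetical, loosely following a plant key.

```python
def forward_chain(rules, facts):
    """Fire rules until no rule can add a new fact; return the final fact set."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # A rule fires when all of its premises are in the fact-base.
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts


# Hypothetical rules: each pairs a set of premises with one derived fact.
RULES = [
    (frozenset({"shrub", "leaves alternate and simple", "margins toothed"}),
     "toothed shrub"),
    (frozenset({"toothed shrub", "margins with spines"}), "spiny shrub"),
    (frozenset({"spiny shrub", "margins wavy"}), "American Holly"),
]

initial = {"shrub", "leaves alternate and simple", "margins toothed",
           "margins with spines", "margins wavy"}
derived = forward_chain(RULES, initial)
```

Each pass over the rules may add facts that enable further rules, so the changing fact set itself drives the process until nothing new can be derived.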

Backward reasoning is another rule-based reasoning strategy; it is a goal-directed, or top-down, approach to reasoning. In this strategy a goal is first hypothesized and the system then attempts to prove it. If the goal cannot be proved from the initially given set of facts, it is broken down into subgoals in each phase of the reasoning process until the conclusion is proved or disproved. So, the direction of reasoning on a rule is from its action part to its premise part. A rule is fireable when its action part contains the 'current' goal the system is trying to prove. If the premises of the rule are currently present in the fact-base, then the rule is fired and the current goal is proved and removed from the goal stack. Otherwise, the premises become new subgoals and are added to the goal stack. Thus, backward reasoning traces the logic backwards from the few possible conclusions and tries to prove them from known or 'collected' facts (gathered through user interaction) until a conclusion is proved. It works most efficiently when the number of goals to be checked is small. It is also known to suit domains where more user interaction is required [Gonzalez et al 93]. From this latter angle our domain appears to be more amenable to the backward reasoning technique than to the forward reasoning technique, and backward reasoning is generally regarded as a favorable approach for applications involving diagnosis or identification. Hence, we tried this technique first to implement the program. Our current implementation, however, uses the forward reasoning approach. This is because the identification of a plant involves gradually narrowing down the possibilities by checking different discerning external features of plants. Thus the natural flow of control is in the forward direction, from facts to a conclusion. Refer to Section 5, which describes our experience, for more detail.
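For comparison, goal-directed reasoning can be sketched as a recursive proof procedure. This is a simplified Python illustration under stated assumptions: recursion plays the role of the goal stack, rules are (premises, conclusion) pairs, any goal that no rule concludes is treated as an observation to be collected through the hypothetical `ask` callback, and the rule contents are invented for the example.

```python
def prove(goal, rules, facts, ask):
    """Try to prove `goal`, asking the user only about goals no rule concludes."""
    if goal in facts:
        return True
    candidates = [premises for premises, conclusion in rules if conclusion == goal]
    if not candidates:
        # Primitive observation: collect it through user interaction.
        if ask(goal):
            facts.add(goal)
            return True
        return False
    for premises in candidates:
        # The premises of a rule that concludes the goal become subgoals.
        if all(prove(p, rules, facts, ask) for p in premises):
            facts.add(goal)
            return True
    return False


# Hypothetical rules, written as (premises, conclusion).
RULES = [
    (("spiny shrub", "margins wavy"), "American Holly"),
    (("spiny shrub", "fine black dots beneath"), "Gallberry"),
    (("margins toothed", "margins with spines"), "spiny shrub"),
]

# Scripted answers stand in for an interactive user.
ANSWERS = {"margins toothed": True, "margins with spines": True,
           "margins wavy": False, "fine black dots beneath": True}
asked = []


def scripted_ask(question):
    asked.append(question)
    return ANSWERS[question]


facts = set()
identified = next(
    (plant for plant in ("American Holly", "Gallberry")
     if prove(plant, RULES, facts, scripted_ask)),
    None,
)
```

Note that this sketch records only positive answers in the fact set, so a question answered "no" could be asked again if another hypothesis depended on it.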

3. The choice of the expert system shell: CLIPS

CLIPS is a forward-reasoning, pattern-matching knowledge-based system shell. It was developed by the Artificial Intelligence section of the Johnson Space Center of NASA. Its name stands for C Language Integrated Production System. CLIPS is extremely popular because it is highly portable, low cost, and easily integrated with external programs developed in C [Girratano et al 89]. As mentioned previously, its forward-reasoning ability made it a natural selection for this project, because the natural flow of control in our knowledge-base derives new facts, and eventually the conclusion, from the existing set of facts. CLIPS represents this type of forward flow of reasoning extremely well. With so many intended uses of this system, the low cost and low memory requirements of CLIPS also increase its marketability across machines. The involvement of a federal agency in this project is another reason for choosing CLIPS, as CLIPS is being promoted within federal agencies. The availability of an interface with C and its object-oriented programming capability are other major reasons for its choice.

4. Development of the rules for identifying plants

In this project we wanted to develop an expert system for identifying some of the commonly grown plants in the Mississippi area [Monaghan, Petrides 86]. Our objective was to provide a hierarchical, rule-like structure (a key) for the visible features of those plants. Our broadest classification is whether the plant in question is a tree, a vine, or a shrub. Most of the features used in this key are based on the leaves of the plants. We cover 34 trees, 6 shrubs, and 8 vines at this moment. The key needs information such as whether the leaves are needle-like or scale-like as opposed to broad; in those two cases the key puts the plant in different categories. As an example, the part of the key for shrubs is provided below:

1. With alternate and simple leaves (otherwise it is an unknown shrub)
   2. Leaf margins are toothed
      3. Leaf margins are with spines, feel prickly
         4. Margins wavy:                              American Holly
         4. Margins not wavy, fine black dots beneath: Gallberry
      3. Leaf margins are without spines
         4. Leaf margin coarsely toothed:              Yaupon Holly
         4. Leaf margins not coarsely toothed, look for twigs green
            on one side and reddish on the other and branchlets that
            zigzag:                                    Blueberry
   2. Leaf margins are smooth
      3. Leaf stalks short or lacking:                 Swamp Cyrilla
      3. Leaf stalks not short, look for pinkish-white clustered
         flowers on a sticky stalk:                    Mountain Laurel
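To make the structure of the shrub key concrete, it can be encoded as a small decision tree and walked with yes/no answers. The sketch below is in Python and is only an illustration, not the representation used in our program: the tuple layout, the question wording, and the `identify` function are assumptions, and the confirming features listed in the "not" branches (black dots, twig color, flowers) are omitted for brevity.

```python
# Each node is either a plant name (a string) or a tuple:
# (question, branch_if_yes, branch_if_no).
SHRUB_KEY = (
    "Are the leaves alternate and simple?",
    ("Are the leaf margins toothed?",
     ("Do the leaf margins have spines (feel prickly)?",
      ("Are the margins wavy?", "American Holly", "Gallberry"),
      ("Are the leaf margins coarsely toothed?", "Yaupon Holly", "Blueberry")),
     ("Are the leaf stalks short or lacking?", "Swamp Cyrilla", "Mountain Laurel")),
    "unknown shrub",
)


def identify(node, ask):
    """Walk the key from the root, following yes/no answers until a leaf is reached."""
    while not isinstance(node, str):
        question, if_yes, if_no = node
        node = if_yes if ask(question) else if_no
    return node
```

For example, answering yes, yes, no, yes in order leads to Yaupon Holly, and answering no to the first question yields "unknown shrub", matching the key above.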

We are currently working in collaboration with a botanist toward refining the key and including more plants in it.

5. Experience gained from developing the system

One of the first questions that comes to mind is why this program was not developed in a procedural language like C or Pascal. There are two answers to this question: the first is from an aesthetic point of view, and the second is from an engineering point of view. From the former angle, a rule-based program is closer to the way a human expert would reason in this domain. From the pragmatic angle, a rule-based program is more maintainable because of its modularity. In C, the knowledge would most naturally be written as a series of nested if-then-else statements. While updating or debugging the system, it is much more difficult to follow twenty nested if-then-else statements than to locate a section of code that is executed only if its conditional fact has been asserted, as in the following rule:

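          ; Fires for a shrub with toothed leaf margins that is not yet identified.
          ; yes-or-no-p is not a CLIPS built-in; it is assumed to be a helper
          ; deffunction defined elsewhere in the program.
          (defrule determine-leaf-margin1 ""
             (broad-class shrub)
             ?leaf <- (leaf-margin toothed)
             (not (plant-is))
             =>
             ; Replace the general fact with a more specific one.
             (retract ?leaf)
             (if (yes-or-no-p "Are leaf-margins with spines, feel prickly? (yes/no)? ")
                then
                 (assert (leaf-margin toothed with-spines))
                else
                 (assert (leaf-margin toothed without-spines))))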

As one can see, it is much simpler to find this section of code, see that it applies to a shrub with toothed leaf margins, and modify its characteristics than to filter through a group of nested if-then-else statements and still not be able to tell what characteristics the plant has (other than that it is a shrub). CLIPS was designed for this type of reasoning, and we therefore make full use of its potential. We also believe that the program written in CLIPS is smaller than an equivalent procedural program would be.

Our next question concerned the style of reasoning: the backward reasoning technique versus the forward reasoning technique. Backward reasoning fits well with a knowledge architecture that needs more user interaction [Gonzalez et al 93]. From that angle, a backward reasoning scheme appeared to us to be more suitable for this program, and we first implemented the key with such a scheme. However, for various reasons we had committed at the beginning to CLIPS as our language. In our attempt to develop the backward reasoning program, we discovered that it was very difficult to design, develop, and understand. One reason may be that CLIPS is naturally a forward reasoning production system, so doing backward reasoning in CLIPS is in reality a simulation. All the rules of the key were actually coded as facts in the fact-base of the program, while the CLIPS rules themselves were very general in nature and manipulated the key represented as facts.

A deeper analysis of the key convinced us that the natural flow of reasoning in this problem domain is from no facts toward more and more facts, which are used for finer and finer classifications and lead to the conclusion about the identity of the plant. Although the facts come through interaction with the user, a forward flow of control is natural here. Moreover, the forward acquisition of facts also ensures that a question is not asked twice (as the fact-base already contains the answer to that question), which might happen in the backward reasoning scheme. Asking the same question more than once is not only 'unintelligent' behavior, it is also an inefficient programming style. These reasons led us to believe that we should use the forward reasoning scheme, which is also natural to CLIPS. Our experience with this latter scheme was very positive. Coding was easy and followed more or less straightforwardly from the key. This will make maintenance of the program much easier as and when a human expert updates our key.
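The point about never asking a question twice can be illustrated with a small answer cache. This Python sketch is an assumption-laden illustration rather than a description of how CLIPS stores facts: the `FactBase` class, its `holds` method, and the `ask` callback are hypothetical names introduced for the example.

```python
class FactBase:
    """Remembers every answer so that each question reaches the user at most once."""

    def __init__(self, ask):
        self._ask = ask        # callback that actually queries the user
        self._answers = {}     # question -> recorded yes/no answer

    def holds(self, question):
        # Only consult the user when the answer is not already recorded.
        if question not in self._answers:
            self._answers[question] = self._ask(question)
        return self._answers[question]


# Scripted usage: count how many times the user is actually consulted.
calls = []


def scripted_ask(question):
    calls.append(question)
    return question == "Are the leaf margins toothed?"


fact_base = FactBase(scripted_ask)
first = fact_base.holds("Are the leaf margins toothed?")
second = fact_base.holds("Are the leaf margins toothed?")  # served from the cache
```

Both positive and negative answers are recorded, so any later rule that depends on the same feature reuses the stored answer.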

6. Future extensions and directions

All expert systems should have an embedded explanation feature. Presumably users of our program would be semi-experts, which means that the user may not understand many of the complex terms in the questions asked by the system. In that case the user should be able to ask the system what those terms mean. Currently the system expects a yes/no answer or a short answer to most of its questions. We plan to extend the system so that a user can answer 'explain' in response to a question asked by the system. The system would then provide a textual (and in future graphical or image-based) explanation before asking the same question again. This feature is essential for using the system in the area of education.
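One possible shape for the planned 'explain' answer is a question loop that repeats the question after showing an explanation. The sketch below is in Python and describes a planned feature, so everything in it is an assumption: the `ask_yes_no` function, its `read` and `write` parameters, and the sample explanation text are hypothetical.

```python
def ask_yes_no(question, explanation, read=input, write=print):
    """Ask until the user answers yes or no; 'explain' shows help and asks again."""
    while True:
        reply = read(question + " (yes/no/explain) ").strip().lower()
        if reply in ("yes", "y"):
            return True
        if reply in ("no", "n"):
            return False
        if reply == "explain":
            write(explanation)
        else:
            write("Please answer yes, no, or explain.")


# Scripted usage: the user first asks for an explanation, then answers yes.
replies = iter(["explain", "yes"])
shown = []
answer = ask_yes_no(
    "Are the leaf margins toothed?",
    "A toothed margin has small pointed notches along the edge of the leaf.",
    read=lambda prompt: next(replies),
    write=shown.append,
)
```

Passing `read` and `write` as parameters keeps the loop independent of the interface, so the same logic could later sit behind a graphical front end.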

Any useful system nowadays is expected to have a graphical user interface. We plan to enhance our program in this direction for ease of use.

One of our original goals in developing this program was to use it as an embedded program within a broader program for environmental decision-making purposes. For example, detecting whether a piece of land is wetland or not involves identifying the plants growing on that land. In such an application some part of the decision-making process could work in parallel in a distributed environment. We are exploring this aspect further.

The purpose of this project was to show the utility of developing an expert system for plant identification. Having made our point, our next step is to enhance the program for real-life use. This involves updating the key with more discerning features, so that it not only identifies the existing plants (those currently within the capability of our program) more accurately, but also identifies a larger set of plants, which would make the program much more acceptable. Such improvement could be done either by rebuilding the key from scratch or by letting real experts (botanists) test the system and report any plant that is misidentified or not identified at all. When such situations occur, we would update the key and then update the program accordingly. Incremental updating would benefit the system because there will probably never be a truly 'exhaustive' key for all possible plants. However, we expect the system to keep improving over time.

7. Utility of the program

A Geographical Information System (GIS) integrates data and presents it in a graphic format. A GIS can display the specific location of each type of tree or plant in relation to water, roads, buildings, or other items that can be used for reference. A GIS is not only a visualization tool, but one that can help in planning and in identifying current problems. GIS data can be gathered from satellites, aerial photographs, and survey maps [Miller 95, NASA 79]. Each method requires some human verification. Sometimes semi-experts need to go to the field to do this verification. School students, as part of their science classes, can act as local researchers to collect data and verify previously collected data, possibly gathered by satellite or other indirect means. Some schools' biology departments have adopted local wetlands for monitoring [Neal et al 95]. Here, the students study the effects of pollution on the surrounding area. With a computerized expert system that helps in identifying various plants, a student (or any non-expert) will be able to provide a listing of all of the specific types of plants and trees in their area. These observations (possibly collected on a predefined grid) can be entered into a GIS for direct use or for comparison with previously collected information. The results of this comparison could then be used later for a better interpretation of satellite (or other indirectly collected) data. Eventually one could think of a GIS in which such expert systems are embedded.

The availability of such an expert system in the hands of non-experts would also be beneficial in many other ways. For example, students will be able to determine the growth or decline of the plants under observation. They will be able to make inferences about the impact of man-made or natural changes on the development of those plants.

This type of program also has great value in informal education. By placing computers with an expert system in a museum, the general public can learn about many plants in their community and state. Museums can maintain samples of the leaves and fruits from the various plants in the area. Students as well as the public can bring leaves from plants around their homes and learn to identify the types of plants in their community. In a museum, the public can compare their methods of plant classification with the botanists' techniques coded in the system.

This program could also be embedded in other, broader programs for the purpose of environmental decision making. For instance, in order to identify whether an area is wetland or not, one needs to identify which plants grow there. It is very expensive to repeatedly send expert botanists to the field for this purpose. This expert system could do the job in the hands of a non-expert, with minimal involvement of a human expert. It could act as a stand-alone system within the wetland identification process, or be a component of a program which does the overall job.

8. Summary

Artificial Intelligence simulates and applies human cognitive thinking toward problem solving. In this project, we have developed a method for identifying various types of plants with the aid of a computer-based expert system. The knowledge used in designing this system was formulated by a botanist, based on a scheme for identifying various plants from their visible features.

We have identified some of the benefits and uses of this knowledge-based expert system, which include using it in conjunction with a GIS, using it for educational purposes, and using it as a part of an environmental decision-making process. Our current research is moving in these directions.


Acknowledgment

We are thankful to U.J. Parikh for coordinating between the Bureau of Land Management and Jackson State University on this project. This project is partly supported by the LBL-JSU-AGMUS Science Consortium, which is sponsored by the DOE. Jacqueline Moore implemented the backward reasoning program in this project; we are indebted to her for that.

References

[Esri 95] "Understanding GIS: The ArcInfo method," Environmental Systems Research Institute, Inc., 1995.

[Girratano et al 89] Girratano, J. and Riley, G., "Expert Systems: Principles and Programming," PWS-KENT Publishing Company, 1989.

[Gonzalez et al 93] Gonzalez, A.J. and Dankel, D.D., "The Engineering of Knowledge-based Systems: Theory and Practice," Prentice Hall, Inc., 1993.

[Miller 95] Miller, Laura, " Branches into Urban Forest Management," American City and Country, February, 1995.

[Monaghan] Monaghan, Thomas, "Know your trees," Publication 146, Mississippi State University.

[NASA 79] "Surface Observations for LANDSAT Data Collection," Volume VI, National Space Technology Laboratories, Earth Resources Laboratory, National Aeronautics and Space Administration. August, 1979.

[Neal et al 95] Neal, O., Lyman, H., "Using wetlands to teach ecology and environmental awareness in general biology," American Biology Teacher journal, Vol. 57, No. 3, 1995.

[Petrides 86] Petrides, G., "Trees and Shrubs," Houghton Mifflin Co., New York, N.Y., 1986.

Authors

Mohammed I. Al-Gharabat malg0996@stallion.jsums.edu
Ezell Campbell          ecam0596@stallion.jsums.edu
Demethria Johnson       djoh0596@stallion.jsums.edu
Willie J. Nathan, Jr.   wnat0596@stallion.jsums.edu
D. Mitra                dmitra@ccaix.jsums.edu
Jackson State University
Department of Computer Science
P.O. Box 18839
Jackson, MS 39217
Tel. 601-968-2105, Fax. 2478

J.R. Schooler and Faye Winters
Bureau of Land Management
Jackson District Office
411 Briarwood Dr., Suite 404
Jackson, MS 39206
Tel. 601-977-5400, Fax. 5440