A comparison between novice and experienced compiler users in a learning environment.

1. Introduction and previous work
2. Compiler modifications and performance
3. Comparison between different levels of users and error categories
4. Conclusion
References
Appendix A: Examples of compiler error messages with corrective hints
Appendix B: Categories of error

Stuart Lewis, Gaius Mulley
Department of Computer Studies, University of Glamorgan
Treforest, Mid Glamorgan, CF37 1DL, UK

ABSTRACT

A locally built Modula-2 compiler has been used for a number of years in our department. Over the last two years we have been improving the usefulness of its error and warning messages. The messages emitted by the compiler when run by first and final year students have been logged. This paper presents an analysis of these results.

Published in Annual Conference on the Teaching of Computing / Annual Conference on Integrating Technology into Computer Science Education - ITiCSE ’98, August 1998, Dublin, Ireland. 1998 ACM 0-89791-xxx/98/03

1. Introduction and previous work

It is our belief that production compilers do not always address the needs of the student engaged in either learning a language or learning to program. Indeed, in the debate about the choice of teaching language there is often confusion between the merits of a language's semantics and those of its implementation and environment. In this paper we concentrate on the language environment. It is also our belief that, just as one learns to sail on a reservoir before sailing on the high seas, so too should students learn to program in a helpful environment before venturing out into a commercial production environment.

Previous work has been performed in checking programming style(5) and warnings, notably by Johnson(2) with the C tool lint. As we reported last year(8), our work differs from Johnson’s in that lint does all its checking prior to the Portable C Compiler(4)(3) code optimization. We believe this to be a weakness, as some of the transformations that are commonly used to optimize code actually do so by understanding the behaviour of certain code boundaries. This knowledge can be used not only to improve code quality but also to check the user’s use of source constructs.

The approach presented in (8) is to perform some of the semantic checking after intermediate code optimization. Intermediate code optimization transformations will remove unreferenced sections, fold constants, propagate variables, remove common subexpressions, etc. Furthermore, the compiler will have built variable life lists which can be exploited to detect elementary infinite loops, manipulation of FOR indices and the use of indices outside the loop.

We chose to use a locally written Modula-2 compiler together with the Ceilidh(1) computer based learning package under the Linux operating system(6). The Ceilidh environment provides interactive guidance for students, Linux provides the security necessary for automated marking, and X windows and xemacs give the students a GUI environment. One of the reasons for using this compiler was that it could easily be modified to provide extra warnings appropriate for first year students.

In the previous year compiler warnings were added throughout the semester as required and, although easy to implement, were found to be very beneficial to students(7). These warnings include checks for identifiers which look like keywords, same-named variables in different visible scopes, variables and parameters declared but never used, and variables used before being initialized. We chose to terminate compilation only for this last warning classification, as it would result in an errant program at runtime. While the warning messages might be too strict for proficient users, they appeared to keep the students from producing sloppy code, and they were an attempt by the lecturing team to cut out some of the bad habits which can all too easily be "learned" during first year programming.

This process is still ongoing and feedback is being collected regarding the students' perceptions of the changes. Already it is obvious that compilers can have a greater role in encouraging interactive learning during the program development process. In an attempt to discover whether more sophisticated users (in this case final year degree students) require a different kind of feedback, or indeed tend to make different kinds of mistakes, further data was collected from the modified compiler.
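The "declared but never used" and "used before being initialized" checks mentioned above can be illustrated with a minimal Python sketch. This is not the compiler's implementation: the flat statement format, the name check_variables and the message wording are all assumptions made for illustration, and a real compiler would analyse control flow rather than a straight-line list of statements.

```python
def check_variables(declared, statements):
    """Return (warnings, errors) for a straight-line block.

    declared:   list of variable names declared in the block
    statements: list of (target, sources) pairs, each meaning
                "target := expression reading sources";
                target may be None for a statement that only reads.
    (Hypothetical representation, chosen only for this sketch.)
    """
    initialized = set()
    used = set()
    warnings, errors = [], []

    for target, sources in statements:
        # Reads happen before the assignment takes effect.
        for name in sources:
            used.add(name)
            if name in declared and name not in initialized:
                errors.append(f"{name} is used before being initialized")
        if target is not None:
            initialized.add(target)

    # A variable that is neither read nor assigned is never used.
    for name in declared:
        if name not in used and name not in initialized:
            warnings.append(f"{name} is declared but never used")
    return warnings, errors


# i := 0 ; j := i + k   (k is read without a value, spare is untouched)
warnings, errors = check_variables(
    ["i", "j", "k", "spare"],
    [("i", []), ("j", ["i", "k"])],
)
```

In keeping with the policy described above, a driver built on this sketch would stop compilation only when the errors list is non-empty and would merely report the warnings.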

Careful monitoring of both groups of students in tutorials, together with informal feedback from individual users, was used to gain a deeper insight into how students process the information returned from the compiler. The need for the compiler to give useful advice and guidance had already been proven; now we need to know how best to provide that assistance and in which circumstances it can successfully be delivered. The students' increasing reliance on automated tools for evaluation and feedback about their programs, particularly through the Ceilidh marking system, means that they view the compiler more as a tool to help them learn than, as in the traditional view, a tool to efficiently convert source code into an executable program. In the future we expect these students to be developing in a tool based environment and expect them to demand more facilities to assist learning as well as completing the task at hand.

2. Compiler modifications and performance

This year we continued to improve the error messages of the compiler. Specifically where appropriate we try to make the compiler hint at how to fix the problem. Appendix A shows four examples of hints generated by the compiler for fixing source code errors.

If the error involves a conflict between declaration and usage, then an error message containing both sections of conflicting source code is emitted, together with a description of what the compiler believes it has seen. Last year this was applied to checking procedure parameters. This year it has been applied to checking arrays, types, variable procedures, constants and expressions.

The semantic checking can be switched on with the -students flag. It requests that each variable is checked for possible confusion with same-named variables in different blocks, and that identifiers are compared case-insensitively against keywords. However, it does cause a 30% increase in compilation time. We believe that this functionality is worth the extra delay, as variable scope confusion is at best bad programming practice and at worst a runtime error. Thus it is enabled by default at Glamorgan.
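As a rough illustration of the two checks requested by this flag, the following Python sketch compares identifiers case-insensitively against keywords and reports names declared in both an enclosing and a nested block. The function names and the abbreviated keyword set are assumptions for illustration and do not reflect the compiler's internals.

```python
# A subset of the Modula-2 reserved words; the full list is longer.
KEYWORDS = {"BEGIN", "END", "IF", "THEN", "ELSE", "WHILE", "DO",
            "FOR", "TO", "VAR", "PROCEDURE", "RETURN", "MODULE"}


def looks_like_keyword(identifier):
    """True if identifier differs from a keyword only by letter case.

    Modula-2 is case sensitive, so 'end' is a legal identifier,
    but it is easily confused with the keyword END.
    """
    return identifier not in KEYWORDS and identifier.upper() in KEYWORDS


def shadowed_names(outer_scope, inner_scope):
    """Names declared in both an enclosing block and a nested block."""
    return sorted(set(outer_scope) & set(inner_scope))
```

For example, looks_like_keyword("end") holds while looks_like_keyword("count") does not, and shadowed_names(["i", "total"], ["i", "j"]) reports i as a candidate for scope confusion.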

3. Comparison between different levels of users and error categories

The first semester students are in most cases undertaking their first programming course. This consists of many small problem solving exercises, a few of which are formally marked. Typically students are familiarizing themselves with different looping constructs, array and record accessing. The second semester students are performing the same in addition to practicing functions, procedures and parameters. Semester 7 students are in their final year, having had a year in industry. During this semester they have been developing various modules associated with a microkernel. In all three student groups most exercises are undertaken by working on a given skeleton module.

All compilations are logged together with all error reports. We summarize the errors in this paper and compare the occurrences between the different student groups. We have grouped the errors into 19 categories (a full list is given in Appendix B). The differences between the years are summarized in Table 1. Last year we believed that the modifications made to the compiler would also aid experienced users. Our research shows this assumption to be largely wrong. Last year we reported that 22% of second semester students' errors were due to parameter mismatch. This prompted work on making the errors more informative, typically displaying the source declaration, the procedure call source, the offending parameter and the type expected. However, surprisingly few errors generated by semester 7 students were in this category, even though their work would obviously demand the services of microkernel procedures.
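The summarizing step described above amounts to counting logged errors per category and expressing each count as a percentage of the group's total. A minimal Python sketch follows; the function name summarize and the sample log are illustrative assumptions, not the scripts or data used for this paper.

```python
from collections import Counter


def summarize(logged_categories):
    """Map each error category to its share of all logged errors.

    logged_categories: one category name per logged error message.
    Returns percentages rounded to one decimal place.
    """
    counts = Counter(logged_categories)
    total = sum(counts.values())
    return {category: round(100.0 * n / total, 1)
            for category, n in counts.items()}


# Illustrative log only; not data from the study.
sample_log = ["parameter mismatch", "semicolon missing",
              "semicolon missing", "variable never read"]
shares = summarize(sample_log)
```

Computing the shares separately for each student group makes the groups comparable even though their total error counts differ widely.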

The final year students were declaring a very high number of variables which were never read. This was due to the number of functions called whose results were ignored. There are three solutions to this problem: the function should be rewritten as a procedure, the result of the function should be checked, or the language Modula-2 should be extended so that users can explicitly state that the function result is not required. For example, in C this is achieved through:

(void) functName();

The function in question was a microkernel service to turn interrupts on and off. The declaration for this function is given below:

(*
   TurnInterrupts - switches interrupts on or off depending
                    on Switch. It returns the old value.
*)


PROCEDURE TurnInterrupts (Switch: OnOrOff) : OnOrOff ;

With hindsight the Modula-2 microkernel should have contained two extra procedures: TurnInterruptsOn and TurnInterruptsOff.

Despite being relatively experienced the semester 7 students still had problems with missing semicolons. A more serious error is that they were still declaring variables and never initializing them.

It was rather pleasing to see that the work done last year in detecting identifiers which look like Modula-2 keywords accounted for 41% of all warning messages in this year's semester 1 group, compared to 6% in last year's semester 2 group. Not surprisingly, the experienced users did not fall into this trap!

[Table image not recovered]

Table 1

4. Conclusion

In conclusion we believe that a compiler can successfully be used as a learning tool for different levels of users. However, certain types of error manifest themselves differently depending upon the abilities of the students (variables never read, uninitialized variables, incorrect parameters), while some categories of error are prevalent at all levels (missing semicolons, record accessing using WITH statements). A possible explanation for this is that simple syntactical errors might be dismissed by the user on the basis that the compiler will always find them and that, once alerted, the student can apply an immediate fix, whereas the experienced users treat the complex problems with care. The problem of passing the correct parameters to a procedure cannot be picked up by the compiler should the parameters be of the same data type.
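The limitation noted in the last sentence above can be made concrete with a small Python sketch of positional type checking. The function name mismatched_positions and the example procedure signatures are hypothetical; the sketch only shows that swapping two actual parameters of the same type leaves nothing for a type-based check to report.

```python
def mismatched_positions(formal_types, actual_types):
    """Return the 1-based positions where the actual parameter type
    differs from the declared (formal) parameter type."""
    if len(formal_types) != len(actual_types):
        raise ValueError("wrong number of parameters")
    return [position
            for position, (formal, actual)
            in enumerate(zip(formal_types, actual_types), start=1)
            if formal != actual]


# Hypothetical PROCEDURE DrawBox (width, height: CARDINAL):
# calling DrawBox(height, width) by mistake still type checks.
swapped_same_type = mismatched_positions(
    ["CARDINAL", "CARDINAL"], ["CARDINAL", "CARDINAL"])

# Hypothetical PROCEDURE Fill (ch: CHAR; count: CARDINAL):
# calling Fill(count, ch) is caught at both positions.
swapped_different_type = mismatched_positions(
    ["CHAR", "CARDINAL"], ["CARDINAL", "CHAR"])
```

Only the second call yields a report, which is why mistakes of the first kind must be found by testing or by the student's own care rather than by the compiler.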

The more advanced students caused more semantic errors and warnings to be generated (variables never read and uninitialized variables).

We hope to improve the semantic knowledge of the compiler about the source code, possibly by examining optimization techniques, e.g. loop unrolling, and drawing knowledge from these transformations. In the future we will attempt to port the useful error messages and warnings that we have developed over the last two years to a different language environment.

References

1. S. Benford, E. Burke, E. Foxley, N. Gutteridge, A.M. Zin, The Ceilidh System: A General Overview, Learning Technology Research, Computer Science Department, Nottingham University (1994).

2. S.C. Johnson, “Lint, a C Program Checker,” Comp. Sci. Tech. Rep. No. 65 (1978). Updated version TM 78-1273-3.

3. S.C. Johnson, “A Portable Compiler: Theory and Practice,” Proc. 5th ACM Symp. on Principles of Programming Languages, pp. 97-104 (January 1978).

4. S.C. Johnson and D.M. Ritchie, “UNIX Time-Sharing System: Portability of C Programs and the UNIX System,” Bell Sys. Tech. J. 57(6), pp. 2021-2048 (1978).

5. B.W. Kernighan and P.J. Plauger, Elements of Programming Style, 2nd Edition, McGraw-Hill (1978).

6. S.F. Lewis, “Developing a Modula 2 course for Ceilidh,” CTI Computing 5th Annual Conference on Teaching of Computing, pp. 126-128, Dublin (1997).

7. S.F. Lewis, G.P.C. Mulley, “Experiences gained from producing a compiler to guide first year programming students,” CTI Computing 5th Annual Conference on Teaching of Computing, pp. 129-131, Dublin (1997).

8. G.P.C. Mulley, K. Verheyden, “Enhancing a Modula-2 compiler to help students learn interactively within the Ceilidh system,” Knowledge Transfer 97 (1997).

Appendix A: Examples of compiler error messages with corrective hints

[Image not recovered: four examples of compiler error messages with corrective hints]

Appendix B: Categories of error

[Error category tables: images not recovered]

found total of 2055 errors ( 100 )

found total of 15839 errors ( 100 )

found total of 502 errors ( 100 )

