Code Complete

(A detailed book review)

Book Information

Code Complete

A Practical Handbook of Software Construction

Microsoft Press, 1993, 857 pages

ISBN 1556154844

Summary

Code Complete is a programming classic. It is nearly 900 pages of intelligent and fascinating discussion about coding software.

Introduction

This book is not about how to program in a particular language, or how to use SSADM or other methodologies. This book is about improving the way that you work throughout the development cycle.

The author describes the development process as being everything from the technical, detailed design stage, right through to the integrated testing. This is the jurisdiction, or domain, of the programmer.

With an immense bibliography, the author has combined theories, common practices and hard data to make his points. Sometimes, however, he completely contradicts the commonplace practices. Here is a man who supports established practices, but backs his own convictions where they differ. Subjectivity surfaces on occasion with statements such as "If you come across one of these clowns, ask him the following". On the whole, there is the distinct impression that the author was motivated to write this book in order to help programmers, and related IT staff.

References to personal experience punctuate bibliographic references in order to put the point across. Clearly this man knows his stuff, and is not simply trying to peddle his own particular point of view that has never been proven. The author has developed his techniques over time, and has then decided to write a book on the subject. Where the purpose of his methodologies is nebulous, he backs them up with hard data.

Summary of Points

Here is a summary of the points that I believe are particularly of note.

1. Software Accretion Method. The author recommends an accretion approach to software construction. This means the method of initially building the most basic working system possible, and then adding on layers piece by piece.

2. Prerequisites. It seems obvious, but it is important to have all of the requirements in place before time is spent on detailed design or construction. Ensure that all the system prerequisites are outlined. The author also describes the "Myth of Stable Requirements". The important thing is to manage changes properly.

3. Use of PDL. I found this particularly fascinating. It appears in the section dealing with designing routines. PDL stands for Program Design Language. PDL is similar in purpose to pseudo code, but is more abstract, using statements that resemble spoken English. The method here is to design a routine in PDL and then use the statements to form the basis of the routine. The PDL statements become the comment lines in the routine, and the functionality is filled in between the comments. This method allows the design of the routine to remain a constant, and to be clearly visible when viewing the code.

4. Routines and Modules. The author spends a lot of time describing quality. This is particularly apparent with coding modules and routines. The author lists good and bad reasons for writing routines, and how to write quality routines and modules.

5. Abstraction and Naming. One of the most frequently recurring themes throughout the book is abstraction. This theme is prominent when the author discusses naming standards. For routines, modules, variables, constants and literals, abstract naming standards are extremely useful. They make code easier to read, and help to describe the program in terms of the problem domain.

6. Layout and Style. After mentioning quality in routine design and use, the author outlines the correct layout and styles of coding to use. Good and bad examples are shown, and the reasons for the choice of layout are mentioned. The author also notes how sensitive an area layout and programming style can be, including the hundred years war that is the GOTO debate.

7. Management. The book is not only focused on the programmer. There is a section that is for the attention of the Programming Manager. This deals with everything from planning and scheduling right through to managing the people. When discussing planning, the importance of measurement is also mentioned. Apart from progress, quality also needs to be measured. Several methods for measuring are suggested, including formal and informal reviews.

8. Testing. There are more opinions and suggestions for testing than just about anything else in software development, and this book is no exception. The author describes testing in sections on Unit Testing, Functional Testing, Integration Testing and Live Testing. Methodologies for all of these areas are outlined.

9. Optimization. There may be times when a program needs to become more streamlined. The author discusses the use of code tuning techniques, and the most important issue of when to optimize and when to leave it alone.

1. Understanding Software Construction

1.1. Metaphors

1.2. Writing Code

1.3. Summary

1.4. Forwarding Actions

2. Prerequisites to Construction

2.1. Importance

2.2. Problem Definition

2.3. Requirements

2.4. Architecture

2.5. Language

2.6. Programming Conventions

2.7. Time to spend on Pre-Requisites

2.8. Adapting Pre-Requisites

2.9. Summary

2.10. Forwarding Actions

3. Building a Routine

3.1. Summary of Steps

3.2. PDL for Pros

3.3. Design the Routine

3.4. Code the Routine

3.5. Formal Checking

3.6. Summary

3.7. Forwarding Actions

4. High Quality Routines

4.1. Valid Reasons to Create a Routine

4.2. Good Routine Names

4.3. Strong Cohesion

4.4. Loose Coupling

4.5. How Long Can a Routine be?

4.6. Defensive Programming

4.7. Use of Routine Parameters

4.8. Consider the use of Functions

4.9. Summary

4.10. Forwarding Actions

5. Modules

5.1. Modularity: Cohesion and Coupling

5.1.1. Cohesion

5.1.2. Coupling

5.2. Information Hiding

5.2.1. Secrets and the Right to Privacy

5.2.2. Common Secrets

5.2.3. Barriers to Information Hiding

5.3. Good Reasons to Create a Module

5.4. Summary

5.5. Forwarding Actions

6. High Level Design

6.1. Introduction to Software Design

6.2. Structured Design

6.2.1. Choosing Components to Modularize

6.3. Object-Oriented Design

6.3.1. Key Ideas

6.3.2. Design Steps

6.3.3. Typical Components

6.4. Comments on Popular Methodologies

6.4.1. When to use Structured Design

6.4.2. When to use Information Hiding

6.4.3. When to use Object Oriented Design

6.5. Round Trip Design

6.6. Design is a Heuristic

6.7. How to solve it

6.8. Summary

6.9. Forwarding Actions

7. Creating Data

7.1. Reasons to create your own Data Types

7.2. Guidelines for creating Data Types

7.3. Making Variable Declarations Easy

7.4. Guidelines for Initializing Data

7.5. Summary

7.6. Forwarding Actions

8. The Power of Data Names

8.1. Considerations in choosing good names

8.1.1. The Effect of Scope on Variable names

8.1.2. Computed-Value Qualifiers in Variable Names

8.2. Naming Specific Types of Data

8.3. The Power of Naming Conventions

8.4. Informal Naming Conventions

8.5. Hungarian Notation

8.6. Short Names

8.7. Kinds of Names to avoid

8.8. Summary

8.9. Forwarding Actions

9. General Issues in Using Variables

9.1. Scope

9.2. Persistence

9.3. Binding Time

9.3.1. Code Time Binding

9.3.2. Compiler Time Binding

9.3.3. Run Time Binding

9.4. Relationship between Data Structures and Control Structures

9.5. Use each Variable for one purpose only

9.6. Global Variables

9.6.1. Common Problems

9.6.2. Reasons to use Global Data

9.6.3. How to Reduce the Risks of Using Global Data

9.6.4. Use Access Routines instead of Global Data

9.7. Summary

9.8. Forwarding Actions

10. Fundamental Data Types

10.1. General Numbers

10.2. Integers

10.3. Floating Point Numbers

10.4. Characters and Strings

10.5. Booleans

10.6. Enumerations

10.7. Named Constants

10.8. Arrays

10.9. Summary

10.10. Forwarding Actions

11. Organizing Straight Line Code

11.1. Statements that must be in a specific order

11.2. Statements whose order does not matter

12. Using Conditions

12.1. IF statements

12.2. IF chains

12.3. CASE statements

12.4. Tips

13. Controlling Loops

13.1. Select the kind of Loop to use

13.2. Controlling the Loop

13.3. Creating Loops from the inside out

13.4. Correspondence between loops and arrays

14. Unusual Control Structures

14.1. GOTO

14.2. RETURN/EXIT

14.3. Recursion

15. General Control Issues

15.1. Boolean Expressions

15.2. Compound Statements

15.3. NULL Statements

15.4. Taming Dangerously Deep Nesting

15.5. The Power of Structured Programming

15.6. Control Structures and Complexity

16. Layout and Style

16.1. Fundamentals

16.2. Layout Techniques

16.2.1. White Space

16.2.2. Parentheses

16.3. Layout Styles

16.3.1. Pure Blocks

16.3.2. End-line Layout

16.3.3. BEGIN-END Block Boundaries

16.4. Laying Out Control Structures

16.4.1. Fine Points of Formatting Control Structures

16.4.2. Other Considerations

16.5. Laying Out Individual Statements

16.5.1. Statement Length

16.5.2. Using spaces for clarity

16.5.3. Aligning Related Statements

16.5.4. Format Continuation Lines

16.5.5. Use Only One Statement Per Line

16.5.6. Laying Out Data Declarations

16.6. Laying Out Comments

16.7. Laying Out Routines

16.8. Laying Out Files, Modules and Programs

16.9. Summary

16.10. Forwarding Actions

17. Self-Documenting Code

17.1. External Documentation

17.2. Programming Styles as Documentation

17.3. Commenting

17.3.1. Types of Comments

17.3.2. Commenting Efficiency

17.4. Commenting Techniques

17.4.1. Individual Lines

17.4.2. Commenting Paragraphs

17.4.3. Commenting Data Declarations

17.4.4. Commenting Control Structures

17.4.5. Commenting Routines

17.4.6. Commenting Files, Modules and Programs

17.4.7. Using the "Book" Paradigm for commenting

17.5. Summary

17.6. Forwarding Actions

18. Programming Tools

18.1. Design Tools

18.2. Source Code Tools

18.2.1. Editing

18.2.2. Browsing

18.2.3. Analyzing Code Quality

18.2.4. Restructuring Source Code

18.2.5. Version Control

18.2.6. Data Dictionaries

18.3. Executable Code Tools

18.3.1. Code Creation

18.3.2. Debugging

18.3.3. Testing

18.3.4. Code Tuning

18.4. Tool-Orientated environments

18.5. Building your own tools

18.6. Summary

18.7. Forwarding Actions

19. How Program size Affects Construction

19.1. Effect of Project Size on Development Activities

19.2. Effect of Project Size on Errors

19.3. Effect of Project Size on Productivity

20. Managing Construction

20.1. Encouraging Good Coding

20.1.1. Considerations in Setting Standards

20.1.2. Techniques

20.2. Configuration Management

20.2.1. What is configuration Management?

20.2.2. Software Design Changes

20.2.3. Software Code changes

20.3. Estimating a Construction Schedule

20.3.1. Approaches

20.3.2. Establish Objectives

20.3.3. Influences on Schedule

20.3.4. Estimation vs. Control

20.3.5. What to do if you are behind

21. Software Metrics

21.1. Treating Programmers as people

21.2. Variations in performance and quality

21.3. Religious Issues

21.4. Physical Environment

21.5. Summary

22. The Software Quality Landscape

22.1. Characteristics of Software Quality

22.2. Techniques for improving Software Quality

22.3. Relative Effectiveness of techniques

22.3.1. Percentage of Errors found

22.3.2. Cost of finding defects

22.3.3. Cost of fixing defects

22.4. When to do a QA

22.5. General Principle of Software Quality

22.6. Summary

22.7. Forwarding Actions

23. Reviews

23.1. The role of reviews

23.1.1. Reviews complement other QA techniques

23.1.2. Reviews remove corporate structure

23.1.3. Reviews assess Quality and Progress

23.1.4. Reviews also apply before construction

23.2. Inspections

23.2.1. Roles During Inspections

23.2.2. Procedure for Inspections

23.3. Other kinds of reviews

23.3.1. Walkthroughs

23.3.2. Code Reading

23.3.3. "Dog and Pony shows"

24. Unit Testing

24.1. The Role of Unit Testing

24.2. Testing During Construction

24.3. The Testing Bag of Tricks

24.3.1. Incomplete Testing

24.3.2. Structured Basis Testing

24.3.3. Data Flow Testing

24.3.4. Equivalence Partitioning

24.3.5. Error Guessing

24.3.6. Boundary Analysis

24.3.7. Classes of Bad Data

24.3.8. Classes of Good Data

24.3.9. Use test cases that allow easy manual checks

24.4. Typical Errors

24.4.1. Which routines contain the most errors?

24.4.2. Errors by Classification

24.4.3. Proportion of Errors Resulting from faulty construction

24.4.4. How many errors should you expect to find

24.4.5. Testing itself

24.5. Test Support Tools

24.5.1. Scaffolding

24.5.2. Results Comparators

24.5.3. Test Data Generators

24.5.4. Coverage Monitors

24.5.5. Symbolic Debuggers

24.5.6. System Perturbers

24.5.7. Error Databases

24.6. Improving Testing

24.7. Planning to test

24.7.1. Re-testing

24.8. Keeping Test Records

24.9. Summary

24.10. Forwarding Actions

25. Debugging

3.1 Overview of Issues

25.0.1. Role of Debugging

25.0.2. Variations in Debugging Performance

25.0.3. Errors as Opportunities

25.0.4. An Ineffective approach

25.0.5. Debugging by superstition

25.1. Finding an Error

25.1.1. Use a Scientific Method

25.1.2. Tips on Finding Errors

25.1.3. Syntax Errors

25.2. Fixing an Error

25.3. Psychological Considerations

25.4. Debugging Tools

25.5. Summary

26. System Integration

26.1. Importance of the Integration Method

26.2. Phased vs. Incremental Integration

26.2.1. Phased Integration

26.2.2. Incremental Integration

26.2.3. Benefits of Incremental Integration

26.3. Incremental Integration strategies

26.3.1. Top Down Integration

26.3.2. Bottom Up Integration

26.3.3. Sandwich Integration

26.3.4. Risk Orientated Integration

26.3.5. Feature Orientated Integration

26.4. Evolutionary Delivery

26.4.1. General Approach

26.4.2. Benefits

26.4.2. Relationship of Evolutionary Delivery to Prototyping

26.4.3. Limitations

26.5. Summary

26.6. Forwarding Actions

27. Code Tuning Strategies

27.1. Performance Overview

27.1.1. Quality Characteristics and Performance

27.1.2. Performance and Code Tuning

27.2. Introduction to Code Tuning

27.2.1. Old Wives' Tales

27.2.2. The Pareto Principle

27.2.3. Measurement

27.2.4. Compiler Optimizations

27.2.5. When to use Code Tuning

27.2.6. Iteration

27.3. Common Sources of Inefficiency

27.4. Summary of Approach to Code Tuning

27.5. Summary

28. Code Tuning Techniques

28.1. Loops

28.2. Logic

28.3. Data Transformation

28.4. Expressions

28.5. Routines

28.6. Re-Code in Assembler

29. Software Evolution

29.1. Guidelines

29.2. Making New Routines

30. Themes in Software Craftsmanship

30.1. Conquer Complexity

30.1.1. Ways to reduce complexity

30.1.2. Hierarchies and Complexity

30.1.3. Abstraction and Complexity

30.1.4. Pick your Process

30.2. Write programs for people first, and computers second

30.3. Focus your attention with the help of conventions

30.4. Programming in terms of the problem domain

30.5. Watch for "Falling Rocks"

30.6. Iterate

30.7. "Thou Shalt Render Religion and Software Asunder"

30.7.1. Software Oracles

30.7.2. Eclecticism

30.7.3. Experimentation

1. Understanding Software Construction

1.1. Metaphors

This section noted the importance of metaphors in software construction. Metaphors are a highly useful means of communicating ideas and concepts.

When technical people need to communicate with non-technical people, a language barrier typically separates them. This is because non-technical people do not have a deep understanding of the fundamentals of software construction, nor should they. It is important for technical people to be able to communicate with the non-technical.

A metaphor makes it possible to communicate an idea to another person by using something that both people understand.

If a technical person is trying to communicate an idea to a non-technical person, metaphors and analogies can be used to help establish a clear understanding of the concept.

The use of metaphors can be referred to as “Modeling”.

Many different types of metaphors are used when interpreting technical issues. It is important to remember that there is never a “perfect” metaphor, so there is no definitive “right” versus “wrong” metaphor. It is inevitable that, over time, new metaphors will arise that describe technical issues better than older ones. This does not mean that the old metaphor was wrong and the new one is right. One is simply “better” or “worse” than another at describing the issue.

Software metaphors are not algorithms. They are heuristics. To define:

An algorithm is a definite method that gives an answer.

A heuristic is an approach that helps to find an answer.

Many errors that occur in software construction are conceptual errors. They are problems with what is being achieved, not how it is being achieved.

1.2. Writing Code

Quotes:

A report issued in 1980 stated that on average, 50% of software development happens after the first release.
A book in 1975 stated that programmers should “plan to throw one away”.

These quotes imply that a lot of work will prove to be pointless, and is typically discarded and started again. This comes from the old school of programming.

A new approach to software construction should be considered. Take the metaphor of farming. Code is generated gradually, a little at a time. The author states: “If you buy into this idea, you will end up talking about Fertilizing the system plan, Thinning the detailed design, increasing code yield through effective land management and Harvesting code.”

This metaphor is useful, but a better one is that of Oyster Farming. An oyster makes a pearl by the gradual addition of material around an irritant. A computer program can be initially developed to perform the most basic function possible, as long as the full cycle of the program is followed. This is essentially a skeleton program. From here, more functionality can be added little by little. The whole process is incremental.

This can help identify any conceptual or design issues earlier than developing the entire system on a screen-by-screen, or module-by-module basis. This helps to check the integrity of the design.
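
As a rough illustration of the accretion approach, the sketch below shows a skeleton program that follows the full cycle (input, processing, output) while doing as little as possible at each stage. The sales-report scenario and every function name are hypothetical, chosen only for illustration; they do not come from the book.

```python
# Hypothetical skeleton of a small sales-report program.
# Every stage of the full cycle exists, but each one does the bare minimum.

def read_orders():
    """Stage 1: input. A hard-coded sample stands in for real file or database access."""
    return [("widget", 2, 3.50), ("gadget", 1, 10.00)]


def total_sales(orders):
    """Stage 2: processing. Only a grand total for now; other calculations come later."""
    return sum(quantity * unit_price for _name, quantity, unit_price in orders)


def format_report(total):
    """Stage 3: output. A single line for now; layout is added in a later layer."""
    return f"Total sales: {total:.2f}"


def main():
    # The whole cycle runs end to end from the very first version.
    return format_report(total_sales(read_orders()))


if __name__ == "__main__":
    print(main())
```

Each later layer would replace or extend one of these stages without changing the overall shape, so a flaw in the way the stages fit together shows up while the program is still small.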

1.3. Summary

Metaphors help break the technical to non-technical language barrier.
Metaphors explain one thing, using something else as a model.
There is no “Right” and “Wrong”, only “Worse” and “Better”.
Metaphors can help to educate non-technical people.
The Accretion approach to software development will help to identify any problems with the integrity of the design sooner than other methods.

1.4. Forwarding Actions

I do not believe that there is a place in a Functional Specification for metaphors.
It could be worth trying communication with non-technical people using metaphors.
If someone non-technical is trying to write down what they are being told, try to stop them, and get them to understand using a metaphor.
Use the Accretion approach to development.

2. Prerequisites to Construction

2.1. Importance

Why is software written? The answer should be “To solve a problem”. At the most basic level, this should be the goal of a software product. With this in mind, it is important to have a few essentials before development starts. At the very least, the reason for the software is required.

Prerequisites are a list of items that the developer really should have before coding starts. Even if the developer knows what the problem is, that does not always constitute everything they need before they start coding.

By identifying all of the prerequisites before coding starts, problems and issues can be resolved before they are encountered, so they do not hold up the development process.

There are two main contributors to prerequisites being skipped:

Developers. Developers have a tendency to want to start coding as soon as possible. This should be corrected with self-discipline to ensure that the prerequisites are met first.
Managers. Managers want to see Developers doing what they’re paid for – Developing. If developers are seen writing documents or doing other non-coding activities, managers can demand that they start coding.

Developers have had a tendency to think “Well, he must know what he’s talking about…” and then get the grief when problems arise later.

The author described an occasion when working on a US military project. The Project Manager was a large Army General. He arrived at the office one day and wanted to see some code. He was told that the project was in its requirements phase, and everyone was writing documents, and talking to customers. Nevertheless, the general wanted to see some code. He went around all 100 developers until he found one of them writing what looked like code. In fact, the developer was writing a document-formatting utility. However, because the general wanted to see some code, and found what looked like code, he went away happy.

If a manager demands that a developer starts coding, there are four options identified by the author:

Say “No”. Refuse to start coding before the prerequisites are in place. “This is dependent on your relationship with your manager, and the state of your bank account.”
Pretend. Dig out some old program listings and place them on your desk. Meanwhile do the prerequisites. Let the manager think that you’re coding.
Educate the manager. Explain the reasons for the prerequisites. “There are few enough managers that understand how things should be done, and why”.
Change Jobs. “There is a shortage of good programmers”.

A useful metaphor is to think of the development process as a food chain. It passes from requirements, through to architecture, and down to the developers. If the environment is polluted, that is to say, the requirements are erroneous, these problems pass down to the developer.

Problems detected in the requirements phase are much less costly to fix than ones discovered during development.

A report identified that if an error in requirements is found in development or maintenance, then it will cost 50-200 times more than if it is found during requirements or design.

IBM issued the following table to show the increase in cost of resolving errors during particular phases.

Time Detected	Introduced in Requirements	Introduced in Design	Introduced in Coding
Analysis	1x	-	-
Design	2x	1x	-
Testing (Passive)	5x	2x	1x
Testing (Structural)	15x	5x	2x
Testing (Functional)	25x	10x	5x

This table should communicate clearly the reason for fulfilling prerequisites before coding. It costs a lot less to start things correctly, than it does to constantly change things.

Another consideration is the time involved, and late nights that may arise. Getting it right at the start makes life a lot easier.

2.2. Problem Definition

As mentioned earlier, software is written to solve a problem. Ensure that the problem is clearly articulated. The problem definition should be just that, a definition of a problem. It should not be a statement of how something is to be improved. This is an entirely different statement.

The problem definition is the basic building block of software development: the foundation.

This ensures that the software does not solve a different problem to the one that is required.

2.3. Requirements

Once the problem is identified, the requirements of the solution can be identified. As mentioned earlier, it costs more to fix fundamental requirements problems further down the development cycle than it does to correct them at the requirements stage.

Getting the requirements right will reduce the number of changes later.

The following table describes the increases in cost of an error in requirements:

Stage	Cost Ratio
Requirements	1x
Design	5x
Coding	10x
Testing	50x
Maintenance	100x
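
To make the arithmetic concrete, the short sketch below applies the ratios from the table above to a baseline cost. The baseline figure of 100 and the names used are assumptions made purely for illustration.

```python
# Cost ratios copied from the table above; the baseline cost is a made-up figure.
COST_RATIO = {
    "Requirements": 1,
    "Design": 5,
    "Coding": 10,
    "Testing": 50,
    "Maintenance": 100,
}


def cost_to_fix(stage_found, baseline_cost):
    """Return the estimated cost of fixing a requirements error found at the given stage."""
    return COST_RATIO[stage_found] * baseline_cost


if __name__ == "__main__":
    for stage in COST_RATIO:
        print(f"{stage}: {cost_to_fix(stage, 100)}")
```

With a baseline of 100, an error that would cost 100 to fix during requirements costs 5000 if it is only found in testing.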

“Stable Requirements” are a myth, or at least, the developer’s equivalent of the Holy Grail. A situation where a customer never changes their mind is unheard of. A report concluded that, on an average system of 1 million lines of code, 25% of the code would be changed.

The trick is to manage the customer. The author says “When customers get an idea for a new feature, their blood thins, they become excited and giddy, and completely forget the many meetings that they have had before. The best way to deal with them is to say ‘Gee that sounds like a good idea, I’ll write a specification and change control for it, and provide you with a revised schedule and cost estimate’. The words ‘Schedule’ and ‘Cost’ are much more sobering than a cup of black coffee and a cold shower under those circumstances”.

When presented with these circumstances, there are several options available.

Implement Change Control Procedures.
Accommodate the changes by using short life cycles, and prototypes.
Dump the Project.

The third option should not always be considered. However, an imaginary state that would justify dumping the project must be identified. Work out how close things are to that state.

2.4. Architecture

The architecture provides a high level design of the system. This does not involve itself with such issues as screen layouts, report layouts and field definitions. The purpose of the architecture is to test the conceptual integrity of the system.

The main components of architecture are:

Program Organization. How does the system fit together? Where are the boundaries of the modules?
Change Strategy. This shows allowance for new features. How easy it will be to implement them, and what is likely to come up.
Buy vs. Build. What is to be developed, and what is to be bought from a third party. The biggest gains in software come from re-use of software.
Major Data Structures. This will determine what data is to be held. Alternatives are to be included, and justifications for the choices.
Major Objects. Containing Interactions, Hierarchies, States and Persistence. Include alternatives and justifications.
Key Algorithms. Again, specify alternatives and justification. Also state any assumptions made.
Generic Functionality. State which routines are modularized, which forms, menus and reports have the same look/feel and functionality.
Error Processing. Over engineering – make modules more robust than they need to be. Include Assertions – self-checking code, e.g. if a number will never be more than 256, assert an error if it is larger. Assumptions must be included.
Fault Tolerance. Detecting Errors and recovering. Choices are typically: Backing up through steps until no errors have occurred; Auxiliary code, to be used if an error occurs; Replace an erroneous value with a “safe” value that will not corrupt any other parts of the system.
Performance. List goals.
General Quality. Discuss modules and data. Don’t do things in a certain way “because we’ve always done it that way”. Make software environment independent. Don’t compromise time and quality in one area for another. Nothing about the system should make you uneasy; if it does, discuss it.
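
The assertion and “safe value” ideas from the Error Processing and Fault Tolerance items above can be sketched as follows. The limit of 256 comes from the example in the text; the function names, the idea of scaling a reading, and the choice of 0 as the safe value are assumptions for illustration only.

```python
MAX_VALUE = 256  # limit taken from the example in the text


def scale_reading(value):
    """Error processing: assert the assumption that value never exceeds MAX_VALUE."""
    assert value <= MAX_VALUE, f"value {value} exceeds {MAX_VALUE}"
    return value / MAX_VALUE


def safe_reading(value, safe_value=0):
    """Fault tolerance: replace an erroneous value with a 'safe' one instead of failing."""
    if value > MAX_VALUE:
        return safe_value
    return value


if __name__ == "__main__":
    print(scale_reading(128))
    print(safe_reading(300))
```

The two routines show different choices for the same bad input: the assertion stops the program so that the broken assumption is noticed, while the safe value lets the rest of the system carry on. Note that Python removes assert statements when run with the -O option, so assertions of this kind are a development aid rather than a substitute for error handling.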

2.5. Language

It is a good idea to define the language that is to be used to develop the software. Using the wrong tool for the job is an inefficient use of time. Use a language that will help to get the job done.

Using the right language increases productivity, as does using a familiar language. It has also been shown that programming in a high-level language is 5 times as productive as programming in a low-level language. This is due to the nature of the languages.

2.6. Programming Conventions

It is good practice to identify programming standards for a system. In-house development usually has its own set of standards. It is always easier to maintain a system that uses good programming standards than one that does not.

2.7. Time to spend on Pre-Requisites

Project planning should typically take 20%-30% of the time. This does not include detailed design, which is done as part of development.

It is important to allow for uncertainty when planning. It is impossible to schedule time for tasks that are uncertain.

2.8. Adapting Pre-Requisites

Every project is likely to be different, so it is up to the developer to decide how formal the pre-requisites should be, based on the current project.

2.9. Summary

Developers need to know what is required before development starts.
Correcting errors in design is much cheaper at the start of the development process than at the end.
Prerequisites help identify problems and issues before they arise in development.
Define the Problem.
Get the requirements right at the start. Correcting them later costs more.
Manage changes to requirements correctly.
Ensure the Architecture is sound.
Use the correct language.
Define Programming standards.

2.10. Forwarding Actions

Ensure that everything is specified fully and correctly.
Derive a list of requirements before coding.

3. Building a Routine

3.1. Summary of Steps

Begin
Design the routine
Check the design
Code the routine
Check the code
Done

3.2. PDL for Pros

PDL stands for Program Design Language. It is similar to pseudo code; however, PDL is further removed from the programming language level. When using PDL, use the following guidelines:

Use English statements.
Avoid Programming Language Syntax.
Write PDL at the level of intent. Describe the meaning of the approach, not how the approach will look in the target language.
Don’t make it too high-level. It should be simple to generate code from PDL in any language.
PDL statements should be easy to follow. A good test is to write a routine in PDL, then write the PDL statements as comments in the target language.

Benefits of using PDL:

Reviews are easier. It is easier to correct the approach in PDL than in code.
PDL designs can be refined again and again. It is easier to get the design right before coding starts.
It is easier to change PDL than code.
Commenting the final code is easier. PDL forms the basis of the comments.
PDL is easier to maintain than other forms of design documentation. If the comments in the code match the PDL statements, then the two are in line.

3.3. Design the Routine

Check the Prerequisites. Ensure the purpose is clearly defined.
Define the Problem. Architecture should provide this.
Name the routine. This should clearly describe what it does.
Decide how to test it.
Efficiency. If design is modular, then routines can be replaced later with faster low-level components.
Re-use code. This is efficient, and a great time saver.
Write the PDL.
Think about the data. If data manipulation is required, design the data structures.
Check the PDL. Regress. Ask someone else. Review.
Iterate. Review the design as many times as necessary. Investigate alternatives.

3.4. Code the Routine

Write the Procedure Declaration. Name and Parameters.
Comment introduction, description.
Use the PDL for comments.
Fill in the code between comments.
Informally (mentally) check the code.
Clean up leftovers. These include unused parameters, inaccurate variable names, infinite loops, off-by-one errors and missing documentation.
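The steps above can be sketched as follows. This is a minimal illustration in Python (my choice of language, the book does not prescribe one), and the routine name average_order_value and its parameter are hypothetical. Each comment is a PDL statement written first, with the code filled in underneath it afterwards.

```python
def average_order_value(order_totals):
    """Return the mean of the order totals, or 0.0 if there are none.

    Hypothetical routine, used only to illustrate PDL statements as comments.
    """
    # If there are no orders, report an average of zero
    if not order_totals:
        return 0.0

    # Add up the value of every order
    total_value = 0.0
    for order_total in order_totals:
        total_value += order_total

    # Divide the total by the number of orders
    return total_value / len(order_totals)
```

Note that the comments describe intent and would read the same in any target language.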

3.5. Formal Checking

Mentally check the routine for errors.
Compile the routine. This will highlight errors and warnings. The goal is to have no errors or warnings.
Use a debugger to test the routine.
Remove errors.

3.6. Summary

Design the routine outside of code (PDL).
Naming should be clear.
Re-use any available code.
Loop the design cycle as many times as necessary.
Code the routine when the design is sound.
When designing, regress, or ask someone else’s opinion.
Explore alternatives.
Check the routine informally, and formally.

3.7. Forwarding Actions

Experiment with PDL for design.
Use PDL statements as comments.
Get input from others before coding.
See how often the design is different to the resulting code.

4. High Quality Routines

4.1. Valid Reasons to Create a Routine

Reducing complexity.
Allows re-use of Code.
Avoids duplication of large chunks of code.
Limit the effect of change. Changing 20 lines of code 10 times takes longer than changing once.
Hiding Sequences. If possible hide the order in which events happen. Decide once, and place in a routine. This routine is then called from x places.
Performance. Code can be optimized in one place, rather than in 50.
Centralize Control Points. Reading or writing to a file can be controlled from a single point.
Hiding data structures. Complex data that is used only in one area will be contained.
Hiding global data. If the data format is changed, the change only needs to be implemented once.
Hiding pointer operations. These are difficult to follow. A routine would have a name that reflects the intent.
Planning for additions to the program. These additions can use the same routines.
Readability. Bundling code into a routine with a sensible name is more readable than a block of code.
Improves portability.
Isolates complex operations.
Isolates non-standard language functions.
Simplifies complicated Boolean tests.
Do not write a routine for the sake of it.
Even small routines are valid. Even 2 or 3 lines of code in a routine can be helpful.

4.2. Good Routine Names

Use a strong verb followed by an object.
For a function, use a description of the return value.
For Objects, just use the verb.
The length can be as long as necessary.
Establish naming conventions.

4.3. Strong Cohesion

This means that each routine does one thing, and does it well. If it does more than one thing, then it may need to be split into other routines. If the name can clearly define what the routine does, then it is OK.

Acceptable Cohesion:

Functional Cohesion. The routine does one thing.
Sequential Cohesion. Routines do things in a pre-defined order (OpenFile, ReadFile, CloseFile). If steps are required, then convert to a single routine (GetFileData – this does all three steps).
Communicational Cohesion. Two different types of operation are carried out on the same data. If the operations are often needed together, this is OK, although the routine does not do just one thing.
Temporal Cohesion. Things that happen at the same time (e.g. Startup). Convert to have Startup call other routines that are functionally cohesive.

Unacceptable Cohesion:

Procedural. Created to perform things in a certain order, for the sake of it.
Logical. Many different sections of code in the same routine, only one is run each time, based on a control parameter.
Coincidental Cohesion. No reason to be in a routine.
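To make the difference concrete, here is a small sketch in Python contrasting logical cohesion with functional cohesion. The routine names (process_report, count_lines, longest_line) are invented for illustration and do not come from the book.

```python
# Logical cohesion (unacceptable): a control parameter selects one of
# several unrelated pieces of code inside the same routine.
def process_report(action, lines):
    if action == "count":
        return len(lines)
    elif action == "longest":
        return max(lines, key=len, default="")
    raise ValueError(f"unknown action: {action}")


# Functional cohesion (preferred): each routine does one thing and its
# name says exactly what that is.
def count_lines(lines):
    return len(lines)


def longest_line(lines):
    return max(lines, key=len, default="")
```

With the second form, a caller never has to know which control values exist, and each routine can be tested and changed on its own.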

4.4. Loose Coupling

Coupling refers to the link between routines.

Good Coupling Criteria:

Size. Small is better. The fewer parameters a routine has, the weaker the connection between it and its callers.
Intimacy. Data passed as parameters, as opposed to global data.
Visibility. Point out the connections. Use a definite parameter list of values required.
Flexibility. If a routine needs two parts of a data structure, pass these parts as parameters. Don’t use the structure. This way, the routine can be used without the need for a data structure.

Levels of Coupling:

Simple Data Coupling. Also called “Normal Coupling”. Individual, unstructured values are passed as parameters. This is best.
Data Structure Coupling. Values are passed as a single parameter, which is a data structure. This is not as flexible. This is acceptable.
Control Coupling. This is when one routine passes data to another that tells the second routine what to do. This is bad.
Global Data Coupling. Data is passed via global data, not parameters. This is risky, because other parts of the system can access global data.
Pathological Coupling. One routine uses another routine’s internal data. This fails all requirements for good coupling. There is a definite connection between the two.
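The first two levels, and the Flexibility guideline above, can be sketched in Python as follows. The Employee record and both routine names are hypothetical, chosen only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    name: str
    hourly_rate: float
    hours_worked: float


# Data structure coupling: the routine receives the whole record even
# though it uses only two of its fields.
def gross_pay_from_record(employee):
    return employee.hourly_rate * employee.hours_worked


# Simple data coupling: only the two values actually needed are passed,
# so the routine can be used where no Employee record exists.
def gross_pay(hourly_rate, hours_worked):
    return hourly_rate * hours_worked
```

The second routine has a smaller, more visible interface and makes no assumption about how employee data is stored.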

4.5. How Long Can a Routine be?

This debate has been going on since routines were invented! The theoretical best maximum is often given as one or two printed pages, that is, 66 to 132 lines. IBM once limited routine size to 50 lines. It is said that smaller is better; however, some evidence in favour of larger routines is:

Number of errors in a routine is inversely proportional to its size.
Larger routines (65+ lines) are cheaper to develop, and have a lower fault rate.
Small routines (up to 145 lines) have 23% more errors than large ones.
A student comprehension study showed that, compared with programs with no routines, programs with routines of 10 lines showed no increase in comprehension. However, programs with routines of 25+ lines showed a 65% increase.
Routines of 100 to 150 lines are changed least.

4.6. Defensive Programming

This means that the routine will check any parameters that may be passed, to see if they are damaging. The routine will check, even though it is someone else’s fault that the data was bad.

Allow for bad data. Check it.

Use assertions. Check assumptions.
Take responsibility for checking data, parameters and return values.
Don’t crash if passed bad data.
Decide how to handle bad parameters.
Handle exceptions.
Anticipate changes. If change is likely, there should be little impact.
Remove debugging aids. Have debug and release versions.
Use Debugging aids early.
Contain Errors (Firewall). Hide information about routines, so that fewer assumptions are made about them and fewer errors can be produced. Define “safe” areas of the program.

How much defensive code to leave in a released version:

Leave checks for important errors.
Remove trivial error checks.
Remove code that causes program crash.
Leave “Gracefully” crashing code, e.g. code that saves everything first.
Ensure that error messages are helpful.

Be defensive about defensive programming. Otherwise programs will be slow.
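A minimal sketch of these ideas in Python follows. The routine percentage is hypothetical. I have assumed that the decided response to bad parameters is to raise a descriptive exception for the caller to handle; returning a default value would be another valid decision.

```python
def percentage(part, whole):
    """Return part as a percentage of whole."""
    # Bad data from outside the routine: check it, and handle it in the
    # way that was decided on (here, a descriptive exception).
    if not isinstance(part, (int, float)) or not isinstance(whole, (int, float)):
        raise TypeError("part and whole must be numbers")
    if whole == 0:
        raise ValueError("whole must not be zero")

    result = 100.0 * part / whole

    # Assertion: documents an assumption that should always hold if the
    # code above is correct. Python skips assert statements when run
    # with the -O option, which gives a simple debug/release split.
    assert isinstance(result, float)
    return result
```

The explicit checks stay in the released version; the assertion is a debugging aid that can be switched off.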

4.7. Use of Routine Parameters

Guidelines:

Make sure that local variables match parameter variables (e.g. Not: Parameter = integer, Local = byte).
Order parameters by Input, Changeable and Output.
For similar routines, use the same parameter order.
Don’t pass parameters that are never used.
Don’t work with parameter variables. Use local variables that are set to the parameters, and then used.
Pass Status and Error parameters last.
Document any assumptions about parameters.
Limit the number of parameters. For Psychological reasons, seven is the most to use. Use Data Structures if more are required.
Use a naming convention for parameters.
Pass only the parts of a data structure that a routine needs.
Don’t assume anything about the parameter passing mechanism.

4.8. Consider the use of Functions

Functions are routines that return a value. Include a check of the returning value as well as parameters.

4.9. Summary

Even though there are other types of cohesion, Functional Cohesion is the best to use. This performs one function, and is named accordingly.
Loose Coupling. Limit interface to only the required parameters.
Be defensive. Check parameters and return values.
Use parameters well.

4.10. Forwarding Actions

Routines and parameters should be named well.
Ensure Routines are Functionally Cohesive.
Ensure that parameters are used correctly.
Check routines that return values. They may be better as functions.

5. Modules

Modules are a collective bundle of data structures and routines. These are essentially “black boxes” containing large chunks of reusable code.

5.1. Modularity: Cohesion and Coupling

By following the rules of Cohesion and Coupling, “black box” routines can be created that are reusable, and reliable. If all similar operations within a large program use the same modular routines, then maintainability is increased. It has also been proven that system comprehension is made easier.

It is not always possible to create the perfect module, because there may be shared data between modules.

5.1.1. Cohesion

Module Cohesion is similar to Routine Cohesion. Essentially, the principle of Modular Cohesion is to place together routines and data that belong together.

5.1.2. Coupling

The ideal Module will allow clean interaction. If a module does not offer complete services, then the internal workings may need to be published. This turns the “black box” into a “white box” or a “glass box”.

Use routines to hide code, by using useful and appropriately abstract names.

5.2. Information Hiding

Information Hiding is one of the most useful practices to make use of. The simple rules to follow are:

Don’t publish internal data, make use of Routines.
Don’t use literal values, use named constants.

5.2.1. Secrets and the Right to Privacy

A well-written module will be like an iceberg: only a small amount is visible from the surface.

Hide complex calculations.
Hide volatile areas that are likely to change.
Only publish what is necessary.
Make use of Routines. Use abstract and useful naming.
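As an illustration of the iceberg idea, here is a small Python sketch. The EmployeeList class and the MAX_EMPLOYEES constant are invented for this example. The internal storage is the "secret"; callers see only three routines and one named constant.

```python
MAX_EMPLOYEES = 3  # named constant, instead of a literal repeated in code


class EmployeeList:
    """Hypothetical module: callers never touch the internal storage."""

    def __init__(self):
        # The secret: this could later become a dictionary or a file
        # without any change to the calling code.
        self._names = []

    def add(self, name):
        if len(self._names) >= MAX_EMPLOYEES:
            raise OverflowError("employee list is full")
        self._names.append(name)

    def count(self):
        return len(self._names)

    def contains(self, name):
        return name in self._names
```

If the size limit changes, only the constant changes; if the storage changes, only the class changes.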

5.2.2. Common Secrets

These are common examples of module “secrets”, and how to minimize the effects.

5.2.2.1. Volatile Areas that are likely to change

Accommodating change is always difficult. The goal is to isolate volatile areas so that only one module requires changes. Tips:

Identify. If requirements have been done well, a list of changes and their likelihood should be available.
Separate. Compartmentalize each volatile area into a separate module.
Isolate. Design the module interfaces so that the changes are contained.

Typical areas of change:

Hardware Dependencies.
Input and Output.
Non-standard Language Features.
Difficult Design and Implementation Areas.
Status Variables. Use access routines to check the states, not the variables directly. (Usually status variables are Boolean. Sometimes these may need more than two states. Best to use enumerations).
Data Size Constraints. (Maximum records or collections. Use named constants, as they require change in only one area).
Business Rules (e.g. Payroll calculations. These may change every year, but the goal of the calculation routine would be the same).

5.2.2.1.1. Complex Data

Complex data structures are likely to change. Accessing the data via routines will reduce the impact outside of the module.

5.2.2.1.2. Complex Logic

Similar to the business rules section above, complex logic, such as nested “ifs”, can be bundled into a single Boolean function. Use “black box” routines.
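For example, a nested test can be bundled into a single Boolean function like the Python sketch below. The discount rule and all names are made up purely for illustration.

```python
def is_eligible_for_discount(age, is_member, order_total):
    """Hypothetical business rule bundled into one Boolean function."""
    # Intermediate Boolean variables document each part of the test.
    is_concession_age = age < 18 or age >= 65
    is_large_order = order_total >= 100.0
    return is_member and (is_concession_age or is_large_order)
```

The calling code then reads as a single line, such as "if is_eligible_for_discount(age, is_member, order_total):", and the rule can change without touching the callers.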

5.2.2.1.3. Operations at the Programming Language Level

Convert several lines of code into a single function or routine. This will improve readability, and allow the section to be dealt with at a higher level of abstraction. Also, if error checking is required, the change only needs to be made in one place, rather than many more.

5.2.3. Barriers to Information Hiding

These are mostly habitual practices from using other methods.

Excessive Information Distribution. Such as frequent use of literals, instead of named constants.
Circular Dependencies. Where one routine refers to another, which refers to the first.
Module Data mistaken for Global Data. Data used only within the module is made global.
Perceived performance penalties. Apprehension about making central references.

5.3. Good Reasons to Create a Module

User Interface. Centralize these routines.
Hardware Dependencies.
Input and Output.
System Dependencies. These may change with each installation.
Data Management.
Creating “real world” objects and abstract data types.
Reusable Code
Related operations that are likely to change. (Damage Management).
Related Operations.

5.4. Summary

Modules should be “Black Boxes” of functionality.
Modules should minimize impact of changes.
Allow a clean interface.

5.5. Forwarding Actions

Ensure guidelines are followed.

6. High Level Design

6.1. Introduction to Software Design

Levels of Design:

Divide the System into Sub-Systems.
Divide the Sub-Systems into Modules. Complex Modules are to be divided further.
Divide Modules into Routines.
Internal Routine Design.

6.2. Structured Design

Consists of:

System Decomposition. Organization into “black boxes”.
Strategies for developing designs.
Criteria for evaluating designs.
A Clear statement of the problem to be solved.
Graphical and verbal tools for expressing designs. (Structure charts and PDL).

6.2.1. Choosing Components to Modularize

6.2.1.1. Top Down Decomposition

This methodology takes a system and builds it from the largest top levels, and expands into smaller sub-areas, using the old “Divide and Conquer” theme. The process is as follows:

Design top levels first.
Don’t indicate the target language.
Don’t bother with low-level details.
Formalize levels.
Verify each level.
Refine each level into lower levels.
Keep refining each level until it becomes easier to code a level than to decompose it further.

6.2.1.2. Bottom Up Composition

This methodology builds a system from the functionality of low levels into larger modules:

Build from known functionality.
“What do I know it needs to do?”
Identify low-level capabilities.
Group common low-level aspects.
Continue upwards, or try top-down.

6.2.1.3. Top Down vs. Bottom Up

TD – Easier to break up a problem.
BU – If top level is vague.
BU – Systems are not always hierarchical by nature.
TD – A system may require a single function at the top level.
BU – Identifies common routines early.
TD – BU can be hard to use.

The most important point is to use all methodologies, or parts of them, and not rely on one.

6.3. Object-Oriented Design

6.3.1. Key Ideas.

The main idea of OO design is to identify “Real World” and abstract objects. Replace these with Programming Language objects.

Ideas to bear in mind:

Abstraction. Don’t worry about small, low-level issues.
Encapsulation. Think of what data is public, and what is hidden.
Modularity. As with non-OO design.
Hierarchy and Inheritance.
Objects and Classes. Classes are programming concepts. Objects are dynamic, with data.

6.3.2. Design Steps

Identify Objects and attributes.
Determine methods (what can be done to the objects).
Determine what objects can do to the other objects.
Identify Public, Friend and Private properties/methods.
Define the Public Interfaces.

6.3.3. Typical Components

Problem Domain Component. This is the highest-level component.
User Interface Component.
Task Management Component. System functions.
Data Management Component.

6.4. Comments on Popular Methodologies

The most important thing to remember about methodologies is to never restrict yourself to one.

All methodologies have the following in common:

A way of decomposing a system.
A way of communicating this decomposition.

6.4.1. When to use Structured Design

When the system needs breaking down.
When there are lots of functions that don’t interact.
When the functionality is likely to change.

6.4.2. When to use Information Hiding

As often as possible.

6.4.3. When to use Object Oriented Design

When a higher level of abstraction is required.
Once the objects and interfaces have been designed, switch to Structured design.

6.5. Round Trip Design

This means that the design is iterated until a satisfactory design is reached. Make use of all design methodologies.

“Design is a Sloppy Process”. This is because the right answer is hard to distinguish from the wrong one. Ask 3 people to design a system, and you will get 3 different designs. These designs need to be refined again and again.

“Design is a Wicked Problem”. This is because it is sometimes necessary to solve the problem once in order to understand it fully, after which a second, better design can be done.

6.6. Design is a Heuristic

It is important to remember that the point of design is to help write the solution. This can be accomplished by using any number of methodologies. The following are useful heuristics:

Trial and Error.
“Brute Force”. This means a non-elegant solution that works, which is better than an elegant one that doesn't.
Pictures.
Don’t get stuck on one single approach.

6.7. How to solve it

Remember the following:

Understand the problem.
Devise a plan.
Carry out the plan.
Look back.

6.8. Summary

Many design methodologies are available.
Use a combination of methodologies and design tools.
Do not get stuck using one single design methodology.
Design is a Heuristic.
Iterate the design process.

6.9. Forwarding Actions

Discuss design methodologies used.

7. Creating Data

7.1. Reasons to create your own Data Types

Custom types are very powerful.
Changes can be made centrally.
Promotes Information Hiding by reducing Information Distribution.
Reliability.
Compensates for language weaknesses.

7.2. Guidelines for creating Data Types

Use Function-orientated names.
Avoid using pre-defined data types.
Don’t redefine a defined type.
Define substitute types for portability.
Create Types using other types.
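A rough Python sketch of function-oriented custom types is shown below. It uses typing.NewType from the standard library; the type names EmployeeId and Celsius and the routine describe are my own inventions. Note the assumption that a static type checker is in use: NewType is checked by such tools and is not enforced when the program runs.

```python
from typing import NewType

# Function-oriented names: the type says what the value is for, not how
# it is stored. The underlying type can be changed here, in one place.
EmployeeId = NewType("EmployeeId", int)
Celsius = NewType("Celsius", float)


def describe(employee_id: EmployeeId, office_temp: Celsius) -> str:
    return f"employee {employee_id} works at {office_temp:.1f} C"
```

A call then looks like describe(EmployeeId(42), Celsius(21.5)), which makes it harder to pass the two values in the wrong order unnoticed.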

7.3. Making Variable Declarations Easy

Use a template for variable declarations.
Turn off, and don’t use implicit declarations.
Declare all variables.
Use naming conventions.
Check variable names. Check for different variables for same function, and unused variables.

7.4. Guidelines for Initializing Data

Improper data initialization costs a great deal of debugging time. These are typical reasons for data initialization errors:

The variable has never been assigned a value.
The value is outdated.
Only part of the variable has been assigned.

Useful tips:

Check Input parameters.
Initialize variables near to declaration.
Pay special attention to counters and accumulators.
Check the need for re-initialization.
Initialize named constants once.
Initialize variables when they are defined.
Make use of compiler warnings.
Use compiler settings to auto-initialize variables.
Use memory-access checks to find bad pointers.
Initialize working variables at the start.
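The point about counters, accumulators and re-initialization is easiest to see in a nested loop. The following Python sketch uses a hypothetical routine, row_totals; moving the initialization of row_total above the outer loop would produce the classic "outdated value" error.

```python
def row_totals(table):
    """Return the sum of each row in a list of rows."""
    totals = []
    for row in table:
        # Accumulator is re-initialized for every row, right next to
        # the loop that uses it.
        row_total = 0
        for value in row:
            row_total += value
        totals.append(row_total)
    return totals
```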

7.5. Summary

Custom Types can be very useful.
Use Naming Conventions.
Ensure that variables are initialized.

7.6. Forwarding Actions

Use Naming Conventions.
Ensure that variables are initialized.

8. The Power of Data Names

8.1. Considerations in choosing good names

The main point of naming variables is that the name accurately describes the purpose.
The name should be problem-orientated. This means that the name is related to what is being achieved not how it is being achieved.

The optimum name length is suggested to be between 8 and 20 characters, typically about 16.

8.1.1. The Effect of Scope on Variable names

Typically, longer names are better than shorter ones. This is not always the case: if a variable is to be used as a loop counter, it is acceptable to call it “a”, or “i”. The reason for this is that the variable is a temporary variable only, which will not exist outside of the scope of the procedure in which it is used.

If a variable is to be used throughout a module, then it should be named more meaningfully.

Basically, the length of a variable's name should reflect its importance.

8.1.2. Computed-Value Qualifiers in Variable Names

Many programmers use prefixes to denote calculated values (e.g. Ttl, Sum, Avg). If using this approach, the important thing to remember is to be consistent.

8.2. Naming Specific Types of Data

Loop Indexes. Use simple names, except if they are used afterwards.
Booleans. The name should imply a true condition. Don’t use the word “Not” at the beginning, as this could be confusing: IF NOT (NotSomething).
Enumerations. Ensure that the names are similar.
Constants. These should always have an abstract name.

8.3. The Power of Naming Conventions

More can be taken for granted. If a good convention is used, then it is possible to make assumptions about variables.
Knowledge can be transferred more easily across projects, by using the same convention for each project.
Code can be learnt more quickly on a new project.
Naming proliferation is reduced.
Naming Conventions can compensate for language weaknesses. Named constants and enumerations can be emulated.
If non-structured data is used, naming conventions can emphasize relationships between variables (e.g. All Customer fields start with “Cus”).

Any naming convention is better than none at all.

8.4. Informal Naming Conventions

Here are some guidelines for creating a language-independent naming convention:

Identify Global Variables. Ensure that all Global Variables have a specific prefix.
Identify Module Variables. As above.
Identify Type Definitions.
Identify Named Constants.
Identify Enumerations.
Identify I/O variables in languages that don’t enforce them.
Format all names to enhance readability.

8.5. Hungarian Notation

This naming convention consists of naming the variable in three parts:

Variable Type.
Prefix.
Qualifier.

This convention is widely used in C.

The main disadvantage with this is that the variables will never have abstract names.

8.6. Short Names

When using short names, follow these guidelines:

Use standard abbreviations.
Remove all non-leading vowels.
Use the first (typically four) letters of each word.
Truncate words.
Use up to three significant words.
Remove useless suffixes.
Keep the most noticeable sound in each syllable.

It is noted that some programmers use phonetic names (before = b4), but I would not recommend this.

8.7. Kinds of Names to avoid

Misleading names and abbreviations.
Names with similar meanings.
Similar names with different meanings.
Similar sounding names.
Numerals.
Bad spelling.
Commonly misspelled words.
Don’t differentiate by using capitals.
Avoid Library routine names.
Don’t use irrelevant names.
Don’t use hard to read characters.

8.8. Summary

Establish a naming convention and stick to it.
Any Naming Convention is better than none at all.

8.9. Forwarding Actions

Review Naming Conventions if necessary.

9. General Issues in Using Variables

9.1. Scope

The key is to minimize the scope of variables. If variables are global, the program is likely to be easier to write, but if they are not, the program will be easier to read.

Ensure that variable references are kept together.

9.2. Persistence

Avoid misreading the persistence of variables.
Use Debugging code and assertions.
Use code that doesn’t make assumptions about variables.
Initialize Data just before use.

9.3. Binding Time

There are three types of data binding.

9.3.1. Code Time Binding

This refers to hard-coded variables that are assigned values in the source code.

9.3.2. Compiler Time Binding

This refers to variables that are assigned values from constants.

9.3.3. Run Time Binding

This occurs when the program is running, and variables are assigned dynamic values.

9.4. Relationship between Data Structures and Control Structures

Data Structured design: Modify the Input to get the Output.
Sequential Data. This refers to a list of sequential statements and actions.
Selective Data. This refers to IF statements.
Iteration. This refers to repeated actions, such as for-next, do-until.

9.5. Use each Variable for one purpose only

It is confusing to use a single variable for more than one purpose. It is possible to do so subtly, but it is not recommended.
Avoid variable names with hidden meanings. The meaning may be clear to the developer, but not to anybody else.
Declare all used variables and remove declarations of unused variables.

9.6. Global Variables

Global variables are tricky. They can be very useful, but also extremely risky to use.

9.6.1. Common Problems

Inadvertent changes to global data.
Aliasing. This is a strange situation, where a global variable is passed to a routine as a parameter, and the routine changes the global data.
Re-entrant code problems. These can occur when multiple threads of an application use the same global data.
Hinders code re-use. A routine can’t be plugged in if global data needs to be set up.
Can’t Modularize. If global data is used, the system can’t be separated into modules.

9.6.2. Reasons to use Global Data

Preserves Global Values.
Allows substitution of named constants in languages that do not support this.
Streamlines use of very commonly used data.
Reduces “Tramp Data”. This refers to data that is passed as parameters to a routine that are only passed to another routine within the first. They are not actually used within the first routine.

9.6.3. How to Reduce the Risks of Using Global Data

Only use Global Data if necessary.
Differentiate between global data and module data.
Use Naming Conventions.
Create a list of Global Variables.
Lock Global Variables when they are in use. Do this by implementing a status variable for each global variable.
Don’t simply create one single global structure and pass it everywhere. This is just pretending to not use Global Data.

9.6.4. Use Access Routines instead of Global Data

Advantages are:

Centralized control.
Ensures that References are firewalled.
Promotes Information Hiding.
Allows Abstract Naming.

Guidelines for using access routines:

Require all routines to go through the access routines.
Keep module data and global data separate.
Build a level of abstraction into the access routines.
Keep all accesses at the same level of abstraction.
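A minimal sketch of access routines in Python follows. The names _settings, set_current_user and current_user are hypothetical. The leading underscore is only a Python naming convention for "private to this module"; the language does not enforce it.

```python
# Module data: by convention, nothing outside this module touches it.
_settings = {"current_user": None}


def set_current_user(name):
    """The single place where the value is changed, so it can be checked."""
    if not name:
        raise ValueError("user name must not be empty")
    _settings["current_user"] = name


def current_user():
    """The single place where the value is read."""
    return _settings["current_user"]
```

Because every reference goes through these two routines, the storage could later move to a file or a database without changing any caller.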

9.7. Summary

Use Variables correctly, for one purpose only.
Name variables appropriately.

9.8. Forwarding Actions

None.

10. Fundamental Data Types

10.1. General Numbers

General things to remember:

Use Named Constants instead of literals.
Use 0s and 1s if necessary, for initialising loops and incrementing.
Anticipate divisions by zero.
Make type conversions obvious.
Don’t make comparisons between different data types.

10.2. Integers

Check when dividing. Remainders, and decimal values.
Check for overflow values.
Check for overflows in intermediate results.

10.3. Floating Point Numbers

Avoid + and – operations on numbers with vastly different magnitudes. (5,000,000.01 + 0.01)
Avoid exactly equal comparisons. Instead define an acceptable range of accuracy, such as + or – 0.01.
Check for rounding errors.
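These points can be demonstrated with a few lines of Python. The tolerance of 0.01 is taken from the tip above; the constant name TOLERANCE and the variable names are my own. math.isclose is part of the Python standard library.

```python
import math

total = 0.1 + 0.2

# An exactly-equal comparison fails because of binary rounding.
exactly_equal = (total == 0.3)

# Compare within an acceptable range of accuracy instead.
TOLERANCE = 0.01
close_enough = math.isclose(total, 0.3, abs_tol=TOLERANCE)

# Adding numbers of vastly different magnitudes: the small one is lost.
small_value_lost = (1e16 + 1.0 == 1e16)
```

Here exactly_equal is False, while close_enough and small_value_lost are both True.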

10.4. Characters and Strings

Use Named constants.

These allow for changes more easily.

Allow for changes in international versions.
Strings can take up a lot of memory. These problems are easier to solve if the string values are independent of the code.
Cryptic values can be easier to understand with abstract naming.

10.5. Booleans

Use these to document programs.
Use them to simplify complex evaluations.
They can be defined if needed.

10.6. Enumerations

These improve readability.
Also improve reliability.
Programs can be modified more easily.
These can be used as alternatives to booleans for system state variables. This is especially useful where a new state is introduced.
Invalid values can be checked for.
When using enumerations, ensure that the first value is invalid/unset.
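As a sketch of the last three points, the Python enumeration below reserves its first value for "unset". The ConnectionState type and the can_send routine are invented for illustration.

```python
from enum import Enum


class ConnectionState(Enum):
    UNSET = 0        # first value reserved for "invalid / not set"
    CONNECTING = 1
    CONNECTED = 2
    CLOSED = 3       # a new state can be added without touching callers


def can_send(state):
    # Invalid values can be checked for explicitly.
    if state is ConnectionState.UNSET:
        raise ValueError("connection state was never set")
    return state is ConnectionState.CONNECTED
```

With a Boolean "is_connected" flag, the CONNECTING and CLOSED states could not have been added without changing every test of the flag.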

10.7. Named Constants

Use them in Data declarations, to define lengths, array limits etc.
Avoids using literals.
These can be simulated with global variables if named constants are not supported.
Use them consistently.

10.8. Arrays

Ensure that all indexes are within the bounds of the array.
Consider the use of sequential data structures, such as stacks and queues if dynamic access is not required.
Check the end points of the array.
If the array is multidimensional, ensure that the subscripts are used in the correct order.
Ensure that the correct subscript value/variable is used.
Add an extra element at the end of the array.

10.9. Summary

Always ensure that the correct data type is used for the correct purpose.

10.10. Forwarding Actions

None.

11. Organizing Straight Line Code.

11.1. Statements that must be in a specific order

Organize the code so that dependencies are obvious.
Name routines so that dependencies are obvious.
Use routine parameters to make dependencies obvious.
Document unclear dependencies.

11.2. Statements whose order does not matter

Make Code read from top to bottom.
Localize variable references.
Keep variable references close to where they are used.
Keep variables live for as short a time as possible.
(Span = number of lines between successive references to a variable)
(Live time = number of lines between the first and last references to a variable)
Group Related Statements.

12. Using Conditions

12.1. IF statements

Write the normal program path, and then introduce conditions.
Branch correctly on equality.
Put the normal path in the IF condition, and the unusual conditions in the ELSE.
Follow an IF with a meaningful statement.
Consider what to put in the ELSE statement. Implement the ELSE with a comment if code is not required.
Check for inadvertent confusion between the IF and ELSE conditions.

12.2. IF chains

Simplify complex IF blocks with a boolean function.
Put the most common cases in the IFs and less common in the ELSE.
Ensure that all possible cases are covered.
Consider replacing with CASE statements.

12.3. CASE statements

Order cases alphabetically or numerically.
Place the normal case first.
Order cases by likely frequency.

12.4. Tips

Keep actions simple.
Don't make up fake variables just to use CASE. Use IFs if it is correct to do so.
Use the ELSE/Default clause to identify unexpected cases.
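The tips on ordering cases and on the default clause can be sketched as below. I have used an if/elif chain in Python to stand in for a CASE statement; the routine shipping_cost, the region names and the costs are all made up.

```python
def shipping_cost(region):
    """Hypothetical lookup showing case ordering and a default clause."""
    if region == "domestic":        # the normal, most frequent case first
        return 5.0
    elif region == "europe":
        return 12.0
    elif region == "worldwide":
        return 20.0
    else:
        # Default clause: identify anything unexpected rather than
        # silently treating it as one of the known cases.
        raise ValueError(f"unexpected region: {region!r}")
```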

13. Controlling Loops.

13.1. Select the kind of Loop to use.

If you know the number of times to loop, use FOR/NEXT.
If you do not know, use DO/WHILE.

When to use WHILE:

Always test the condition at the start of the loop.
Only test at the end of the loop if it is not possible to test at the start.
It is correct to test at the end of the loop when the loop will always require at least one pass.

When to use EXIT DO:

Sometimes it is necessary to conditionally exit a loop part way through the code block.
Put all EXIT DOs together in one place.
Use comments.
Don't perform a GOTO into the middle of a loop.

When to use FOR/NEXT:

When the number of times to loop is known in advance.

13.2. Controlling the Loop.

When building a loop code block, use the same principles for building a routine.

Simplify the code.

Entry points:

There should be one point of entry.
Initialize the loop variables directly before the start of the loop.
Don't use a FOR/NEXT loop when a WHILE loop is more appropriate.

Process the Middle of the Loop:

Use BEGIN/END, or {} (where language supports it).
Avoid having empty loops.
Perform loop housekeeping operations at either the start or end of the loop.
Each loop should perform only one function.

Exiting the Loop:

Be assured that the loop will terminate, and not run infinitely.
Make termination conditions obvious.
Don't fudge the index of a FOR/NEXT loop just to force it to exit.
Avoid code that relies on the final value of a loop counter.
Consider the use of safety counters.
Only use EXIT DOs with care.
Be wary of loops with lots of EXIT DOs
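A safety counter can be sketched in Python as follows. The example counts the steps of the Collatz sequence, chosen only because it is a loop whose termination is not obvious from the code. The routine name collatz_steps and the limit MAX_ITERATIONS are assumptions of mine, and the limit value is arbitrary.

```python
MAX_ITERATIONS = 1000  # safety limit; the value is an arbitrary assumption


def collatz_steps(start):
    """Count the steps for the Collatz sequence from start to reach 1."""
    if start < 1:
        raise ValueError("start must be a positive integer")
    value = start
    steps = 0
    while value != 1:  # the termination condition is stated in one place
        value = value // 2 if value % 2 == 0 else 3 * value + 1
        steps += 1
        # Safety counter: stop with a clear error instead of looping
        # forever if the termination condition is never met.
        if steps > MAX_ITERATIONS:
            raise RuntimeError("safety counter exceeded")
    return steps
```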

Checking End Points:

Manually check all exit points.
Check for errors.

Using Loop Variables:

Use enumerations for loop limits.
Use integers (whole number variables) only.
Use meaningful variable names when nesting loops.
Use meaningful variable names to avoid referencing the wrong variable in the wrong loop (Cross Talk).

How long should a Loop be?

Short enough to view on one page (printed or screen).
Avoid nesting more than three levels.
Make long loops especially clear.

13.3. Creating Loops from the inside out.

Code the Main condition.
Code other conditions.
Fill in the rest of the code.

13.4. Correspondence between loops and arrays

Use FOR EACH or other language functions where possible.

14. Unusual Control Structures

14.1. GOTO

Use GOTO only in non-structured languages.
Other use violates structured programming.
GOTOs can eliminate duplicate code.
ON ERROR GOTO use is acceptable.
It is best not to use GOTOs.

14.2. RETURN/EXIT

Minimize the use of RETURN and EXIT.
Use only if it enhances readability.

14.3. Recursion

These are routines that call themselves.
They can provide very elegant solutions.
These can fill up stack space.

Tips:

Ensure that recursion stops.
Use safety counters to prevent infinite recursion.
Limit recursion to one routine.
Watch the stack.
Don't use recursive routines for Factorial or Fibonacci numbers.
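
The tips above might look like this in Python. It is a sketch only: `MAX_DEPTH` and `nesting_depth` are invented for the example. Factorial is written as a plain loop, and the recursive routine carries a safety counter.

```python
def factorial(n):
    """Iterative factorial: no stack growth, no risk of runaway recursion."""
    result = 1
    for factor in range(2, n + 1):
        result *= factor
    return result


MAX_DEPTH = 50  # hypothetical safety limit for this example


def nesting_depth(item, depth=0):
    """Recursive depth of nested lists, guarded by a safety counter."""
    if depth > MAX_DEPTH:
        raise RecursionError("safety counter exceeded")
    if not isinstance(item, list):
        return depth  # recursion stops at a non-list item
    if not item:
        return depth + 1
    return max(nesting_depth(child, depth + 1) for child in item)
```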

15. General Control Issues

15.1. Boolean Expressions

Use "True" and "False" not "1" and "0".
Use implicit comparisons.

Simplifying Boolean Expressions:

Break them up into multiple tests.
Move complicated tests into boolean functions. These are easier to read.
Consider using lookup tables.
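
As an illustration, here is a Python sketch of a complicated test moved into a boolean function, and of a lookup table replacing a chain of IF tests. The discount rules and rates are invented for the example.

```python
def is_eligible_for_discount(age, is_member, order_total):
    """Complicated test moved into a well-named boolean function."""
    is_concession_age = age < 18 or age >= 65
    is_large_order = order_total >= 100
    return is_member and (is_concession_age or is_large_order)


# Lookup table in place of a chain of IF tests.
# Keys are (is_member, is_large_order); the rates are invented.
DISCOUNT_RATE = {
    (False, False): 0.00,
    (False, True): 0.05,
    (True, False): 0.10,
    (True, True): 0.15,
}


def discount_rate(is_member, order_total):
    """Look the rate up instead of testing each combination."""
    return DISCOUNT_RATE[(is_member, order_total >= 100)]
```

The calling code then reads `if is_eligible_for_discount(...)`, which says what is being tested rather than how.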

Formatting Boolean Expressions positively:

Write the IF statement to read as a true condition.
This can contradict writing the IF with the normal condition first. Consider what is best to use.
Convert: "IF NOT <A> OR NOT <B> THEN" to "IF ( NOT (<A> AND <B>) ) THEN".
Use parentheses to clarify expressions.
Write Numeric expressions in numerical value order (<MIN> < <X> AND <X> < <MAX>)
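
The conversion above is De Morgan's rule, and it can be checked by trying every combination of A and B. The Python sketch below does that, and also shows a range test written in number-line order. `MIN_AGE` and `MAX_AGE` are invented limits.

```python
from itertools import product

# De Morgan: (not A) or (not B) is the same as not (A and B).
for a, b in product([False, True], repeat=2):
    assert ((not a) or (not b)) == (not (a and b))

MIN_AGE = 18  # hypothetical limits for the example
MAX_AGE = 65


def is_working_age(age):
    # Written in number-line order: MIN < x and x < MAX.
    return MIN_AGE < age and age < MAX_AGE
```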

15.2. Compound Statements

Write BEGIN/END or {} first.
Fill in the code.

15.3. NULL Statements

These are conditional statements that do nothing.
Call attention to them.
Use comments.

15.4. Taming Dangerously Deep Nesting

If there is dangerous nesting, restructure the tests:

Simplify by re-testing conditions.
Convert nesting to IF/THEN/ELSE.
Consider replacing with CASE statements.
Factor Code into routines.
Simply redesign the tests!
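
One of these options, converting nesting to an IF/THEN/ELSE chain, can be sketched in Python as follows. The grading bands are invented. Both routines return the same result for every score, but the second is only one level deep.

```python
def grade_nested(score):
    """Deeply nested version, shown only for comparison."""
    if score < 90:
        if score < 70:
            if score < 50:
                return "fail"
            else:
                return "pass"
        else:
            return "merit"
    else:
        return "distinction"


def grade_flat(score):
    """Same logic as an IF/THEN/ELSE chain, one level deep."""
    if score >= 90:
        result = "distinction"
    elif score >= 70:
        result = "merit"
    elif score >= 50:
        result = "pass"
    else:
        result = "fail"
    return result
```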

15.5. The Power of Structured Programming

Have one entry point and one exit point.

Use three components:

Sequence.
Selection.
Iteration.

15.6. Control Structures and Complexity

Good design reduces complexity.

How to reduce complexity:

Measure complexity by using a count:

Start with 1 for the main path.
Add 1 for each IF, DO, FOR, AND, OR statement. (An ELSE does not add a new decision, so it is not counted.)
Add 1 for each case in a CASE statement.

Evaluate Complexity:

0-5: OK.
6-10: Consider simplifying code.
10+: Split the routine up into smaller routines.
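
The count can be worked through on a small routine. The Python sketch below is a rough counter for BASIC-style pseudocode, not a real metrics tool. The set of counted keywords is passed in, so it can be matched to whichever list you adopt, and CASE branches are left out to keep it short. The `ROUTINE` text is invented.

```python
import re


def decision_count(pseudocode, keywords):
    """Count 1 for the main path plus 1 for each decision keyword."""
    text = pseudocode.upper()
    # Drop block terminators such as END IF so the IF is not counted twice.
    text = re.sub(r"\bEND\s+\w+", "", text)
    words = re.findall(r"[A-Z_]+", text)
    count = 1  # the main path
    count += sum(1 for word in words if word in keywords)
    return count


ROUTINE = """
FOR each order
    IF order is paid AND order is in stock THEN
        ship order
    END IF
NEXT
"""

# 1 (main path) + FOR + IF + AND
complexity = decision_count(ROUTINE, {"IF", "DO", "FOR", "AND", "OR"})
```

Here the count is 4, well below the 10+ level at which the routine should be split.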

Consider other ways of measuring complexity:

The amount of data.
Nesting levels.

16. Layout and Style

16.1. Fundamentals

Layout should show the control structure.
Layout should be better than just "Pretty".
Make use of indentations and blank lines to separate code blocks.

Objectives of good layout:

Accurately represent the structure of the code.
Consistently represent the structure of the code.
Improve Readability.
Withstand modifications to code.

Layout preferences are largely subjective. Discuss styles with colleagues to come up with a standard.

16.2. Layout Techniques

16.2.1. White Space

These are spaces, tabs, blank lines and the like. Use these to group related statements.
Use blank lines to separate control blocks.
Align blocks of '=' in order to show that they are related.
Use tabs to indent control blocks.

16.2.2. Parentheses

Use parentheses in complicated calculations to show the order of processing. A calculation will follow the order of precedence, but use of parentheses will make the order more obvious to the reader.
Ensure that the calculation is not changed.

16.3. Layout Styles

16.3.1. Pure Blocks

These are where indentation is easily used to highlight the blocks:

IF ----- THEN
    ------
END IF

The IF appears leftmost, the conditional code is indented and the end of the condition is un-indented. Clear layout.

16.3.2. End-line Layout

This is a slightly different method:

IF ---- BEGIN

-----

END

This example shows what would happen if a language does not support explicit 'END IFs'.

Where the previous example clearly showed the bounds of the conditional code, this example shows the start of the conditional code, but does not clearly show the end.

It is possible to have a large number of statements within the conditional section. If some are separated by blank lines, then this method could be confusing.

Nested conditions and loops become confusing.

Pure blocks can be emulated:

IF ---- BEGIN
    -----
END

16.3.3. BEGIN-END Block Boundaries

An alternative to pure blocks is to use BEGIN-END pairs as the boundaries of blocks:

IF ---- THEN
    BEGIN
    -----
    END

16.4. Laying Out Control Structures

16.4.1. Fine Points of Formatting Control Structures

Avoid Un-indented BEGIN-ENDs.

Avoid Double Indentation:

IF --------- THEN
    BEGIN
        -----
    END
END IF

16.4.2. Other Considerations.

Use blank lines between paragraphs.
Format single-statement blocks consistently.
For complex expressions, separate conditions on separate lines.
Avoid using GOTOs.

16.5. Laying Out Individual Statements

16.5.1. Statement Length

Typically, 80 characters is said to be the maximum line length, for the following reasons:

Lines longer than 80 characters are hard to read and take in.
This discourages deep nesting. By indenting 5 characters for every nested control structure, the statement limit is shortened for deep nestings.

Also:

"It won't fit on 8.5" x 11" listing paper."
"Bigger paper is harder to file."

16.5.2. Using spaces for clarity

Use spaces between arithmetic operators, as this will make expressions more readable.
Array references will be more readable.
Passed Parameter Lists are more readable.

16.5.3. Aligning Related Statements

By aligning the '=' clause of a number of similar or related assignments, their relationship is implied.

16.5.4. Format Continuation Lines

Line continuations should be obvious to the reader. By having 'AND' or '+' as the last part of a line, the incompleteness is highlighted.
Keep closely related elements together. For example, function calls: don't make the function name appear on one line, and the parameters on the next.
Indent all continuations by the standard amount.
Make it easy to find the end of a continuation line.
Indent Control-Statement continuation lines by the standard amount.
Indent assignment continuation lines after the assignment operator.
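
A short Python sketch of these continuation rules, together with the aligned '=' from 16.5.3. The variable names and values are invented, and the alignment shown follows the advice in these notes rather than any Python style guide.

```python
unit_price    = 4.50
quantity      = 12
shipping_cost = 7.25
discount      = 5.00

# The trailing '+' or '-' shows that the statement is incomplete, and the
# continuation lines are indented to just after the assignment operator.
invoice_total = (unit_price * quantity +
                 shipping_cost -
                 discount)

# A control-statement continuation, ending in 'and' and indented by a
# standard amount so that it stands apart from the body of the IF.
qualifies_for_free_gift = False
if (quantity > 10 and
        invoice_total > 50.0):
    qualifies_for_free_gift = True
```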

16.5.5. Use Only One Statement Per Line

Don't use multiple statements per line, in languages that support it.

16.5.6. Laying Out Data Declarations

Use Alignment.
Use only one variable declaration per line.
Order declarations sensibly.

16.6. Laying Out Comments.

Indent comments accordingly.
Have a blank line above the comment.

16.7. Laying Out Routines.

Use blank lines to group and separate related parts of a routine.
Use standard indentation for routine arguments.

16.8. Laying Out Files, Modules and Programs.

Typically have only one Module in a file.
Separate Module Routines clearly.
Identify multiple modules clearly.
Sequence routines in alphabetic order.

16.9. Summary.

The purpose of layout is to make code reading easier. The reader may be another person, or yourself re-reading code after a long period of time. Code that is easy to read is easy to understand, and generally makes life easier for the reader.

16.10. Forwarding Actions

Establish layout guidelines and encourage adherence to them.

17. Self-Documenting Code.

17.1. External Documentation.

These are common types of documentation:

Unit Development Folder. This is an information repository of all documentation concerned with a project. This will also include notes by developers.
Detailed Design Document. This is the low-level design of the system, and includes all information for a programmer to build the program.

17.2. Programming Styles as Documentation.

With good Naming Standards and Commenting, code listings can be a useful form of documentation.

17.3. Commenting.

This provides a higher level of abstraction, to the point where just reading the comments should tell the reader what the routine accomplishes, without the need to understand the source code.

17.3.1. Types of Comments

Repetition of code. This is simply stating a code line in another way. Of no more use than the code itself.
Explanation of code. This is where comments are needed before a complex section of code. It is better to make the code clearer, than to try to explain it.
Code Markers. Used by programmers to mark a section specifically.
Code Summary. Summarize the purpose of a small number of lines. This is better than Explaining code, because the code should be largely self-explanatory.
Description of code intent. This is a more abstract comment, which focuses on the problem, not the solution.

17.3.2. Commenting Efficiency

Follow these guidelines to make commenting efficient:

Use styles that are easy to maintain.
Use the PDL to Code method as described during design.
Comment code as you go along. It is easier to add a few comments as you work on a routine, than to laboriously go through several thousand lines afterwards.

17.4. Commenting Techniques.

17.4.1. Individual Lines.

Avoid Self-Indulgent Comments. Anything that is not pertinent is better off not being there.

Using End-line Comments:

These are problematic, as they have to be aligned separately from the main code.
Avoid using them, as it can be difficult to comment for a single line of code.
It is even worse when using an end-line comment for multiple lines of code.

When to use single line comments:

Variable declarations.
Maintenance notes.
End Blocks.

17.4.2. Commenting Paragraphs

Use these guidelines when commenting blocks of code:

Write them at the level of intent.
Focus on why, not how.
Prepare the reader for the code that follows.
Make comments count. Don't fill up a routine with lots of unhelpful comments.
Document surprises. Anything that is out of the ordinary should be noted.
Avoid abbreviations. Comments are not as helpful when the reader has to wonder what abbreviations mean.
Differentiate between major and minor comments.
Document any error avoidance or use of an undocumented feature in an environment. Explain why something unusual is being done. This may be the only record of how and why.
Justify Violations of good practice.

Most importantly, don't spend time commenting bad code. Rewrite it.

17.4.3. Commenting Data Declarations

Comment units of numerical data.
Comment allowed value ranges.
Comment coded meanings, such as enumerations.
Comment limitations of input data. Expected results, etc.
Document the meanings of bit parts of binary numbers, or strings.
Place comments that relate to a variable next to the variable name.
Document global data. This is important, as global data is notoriously tricky.
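
A sketch of what such declaration comments might look like in Python. All names, units and ranges are invented for the example.

```python
from enum import Enum


class ValveState(Enum):
    """Coded meanings are spelled out next to each value."""
    CLOSED = 0   # no flow
    OPEN = 1     # full flow
    FAULT = 2    # sensor disagreement; treat as closed


MAX_TANK_LEVEL_CM = 250          # tank level in centimetres; valid range 0..250
pump_timeout_s = 30.0            # seconds to wait before reporting a pump fault
valve_state = ValveState.CLOSED  # current state; see ValveState for meanings

# Status byte layout: bit 0 = power on, bit 1 = alarm active, bits 2-7 unused.
STATUS_POWER_BIT = 0b0000_0001
STATUS_ALARM_BIT = 0b0000_0010
status_byte = STATUS_POWER_BIT | STATUS_ALARM_BIT
```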

17.4.4. Commenting Control Structures

Place comments before the control structures. Describe the intent of the block.
Comment the end of a control structure.

17.4.5. Commenting Routines

Keep comments close to the code that is being described.
Describe routines in the form of one or two sentences.
Document Input/Output variables where they are declared.
Differentiate between Input and Output data.
Document any Interface assumptions. Such as legal or expected values.
Track change history.
Comment on limitations of the routine.
Document any global effects.
Document the source of algorithms. Where they came from.
Use comments to mark parts of your program. Allows string searches for common comments.
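
As an illustration, a routine header along these lines might look like the Python sketch below. The routine and the header layout are invented. Python keeps the description in a docstring directly under the definition, which keeps it close to the code.

```python
def average_speed_kmh(distance_km, duration_h):
    """Return the average speed in kilometres per hour.

    Inputs:
        distance_km: distance travelled, in kilometres (must be >= 0)
        duration_h: time taken, in hours (must be > 0)
    Output:
        the average speed in km/h, as a float
    Limitations:
        raises ValueError for values outside the ranges above.
    Global effects:
        none.
    """
    if distance_km < 0 or duration_h <= 0:
        raise ValueError("distance must be >= 0 and duration must be > 0")
    return distance_km / duration_h
```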

17.4.6. Commenting Files, Modules and Programs

General guidelines:

Describe the purpose of the file.
If working in a large company, include your name and telephone extension. Other people may need to speak to you about your code.
Include a copyright statement if your company prefers to do so.

17.4.7. Using the "Book" Paradigm for commenting

As with a book, commented code can include the following equivalents:

Preface.
Table of Contents.
Sections.
Cross References.

17.5. Summary

Commenting Code is very useful as documentation. By reading the comments in a code listing, the purpose and functionality of the routines can be identified abstractly.

17.6. Forwarding Actions

Agree on a commenting methodology.

18. Programming Tools

18.1. Design Tools

The main types of design tools available are:

Graphical Tools. These are packages that make program design easier by way of graphical representation, for example when following a top-down methodology.
C.A.S.E. Tools. This stands for Computer Aided Software Engineering. These are applications that allow various parts of the design process to be done using software. I have had personal experience of Accelerator.

18.2. Source Code Tools

In the old days, computer programming consisted of writing code in a very basic text editor, submitting the source code to a compiler, going through the linking stages, and eventually arriving at a packaged executable application. Nowadays, the coding process is easier.

18.2.1. Editing

Editors now provide a wide range of useful features:

Compilation and Debugging.
Compressed views of programs. Such as viewing routine names without the code.
Interactive Help.
Automatic Begin/End matching.
Control Structure templates. For example, completing a FOR expression.
Smart indenting.
Macros.
List of variables.
Search and Replace. Including across files.
Multiple file edits.
Multi-level Undo.
File comparators. Comparing versions of code.
Code Beautifiers. These perform the job of good layout.
Templates. Allowing skeleton programs to be written, and used many times as a template.

18.2.2. Browsing

Browsers are utilities that can look through multiple files, and find references to a particular variable, or function call. Examples of these are:

Cross Reference Tools.
Call Structure Generators. These list information about how routines call each other.

18.2.3. Analyzing Code Quality

Syntax and Semantic checkers.

These look for common mistakes, such as detecting:

IF (X = Y)

Rather than

If (X == Y)

Metrics Reporters.

These produce a large amount of information to measure aspects of a program. From number of variable declarations, right through to which programmers produce the most errors.

18.2.4. Restructuring Source Code

Two main types of tools are available:

Restructurers. These can convert code that is full of GOTOs into structured code. However, if the design of the code is bad to start with, it will probably stay bad even if restructured.
Code Translators. These convert source code from one language to another. Again, though, if the code is bad, it will always be bad.

18.2.5. Version Control

Any software development that is being worked on by more than one person should be using a version control system. Even for a single developer, version control provides advantages:

Controls Source code. Only one person at a time can make a change.
Allows different versions to be compiled for different types of deployment.
Allows retrieval of old versions.

18.2.6. Data Dictionaries

These are databases of all variables, file definitions and other data relevant to a program.

18.3. Executable Code Tools

18.3.1. Code Creation.

Linkers. These allow modules from more than one language to be linked.
Code Libraries. The biggest saving in programming cost and time is in code re-use. If the code has not been written, buy it from someone who has.
Code Generators. These are reasonable to use when building prototypes. However, complex programs are better to write from scratch.
Macro Pre-Processors. Replaces named constants before compiling. These are also useful for building debug versions.

18.3.2. Debugging.

Compiler Warnings.
Scaffolding.
File version comparators.
Execution Profilers.
Interactive Debugging.

18.3.3. Testing.

Scaffolding.
Results comparators.
Auto Test Generators.
Test Case recording and playback.
Logic analysers.
Symbolic debuggers.
Memory checkers.
Databases of defects.

18.3.4. Code Tuning.

Execution Profilers. These show code usage and where hotspots are.
Disassemblers. These convert the machine code into assembly code.

18.4. Tool-Orientated environments.

UNIX. This is a programmer's operating system. It has lots of development tools.

CASE. These are promoted as being helpful for the development process. However, no single tool will support the complete development lifecycle.

APSE. Ada-Programming-Support-Environment. Consists of:

Code Creation Tools.
Code Analysis tools.
Code maintenance tools.
Project Support tools.

18.5. Building your own tools.

Ask the average programmer whether he would rather do a job by hand in 5 hours, or spend 4 3/4 hours writing a tool that could then do the job in 15 minutes. He would typically opt for the tool. It is better to write tools once that can perform repeated jobs. There are several kinds of tools that are generally written:

Project-Specific tools. These are tools that are unique to the project, and of no real use otherwise.
Scripts. These are batch files that perform usual sequences of commands.
Batch File Enhancers. These are tools that increase the functionality and power of batch files.

18.6. Summary

There are a large number of Programming Tools available. Modern development makes use of some of these.
If you decide to make use of a tool, be sure of the benefits and limitations. Ensure that you have the right tool for the job.

18.7. Forwarding Actions

CASPIAN makes use of development tools. I would be interested in seeing the output from a Disassembler, but this is purely an academic interest, and not of any practical use at the moment.

19. How Program size Affects Construction

19.1. Effect of Project Size on Development Activities

Where there are larger teams, there are more communication paths. This means that there are more possibilities for communications errors.
It is best to streamline communications, such as by using documents.
In smaller projects, the construction takes up a larger percentage of the time than in a larger project. Larger projects are managed more formally.
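
The growth in communication paths can be made concrete. If every team member may need to talk to every other, the number of paths is the number of distinct pairs, n(n-1)/2. The Python sketch below simply evaluates that formula. The function name is invented.

```python
def communication_paths(team_size):
    """Number of distinct pairs of people in a team of the given size."""
    return team_size * (team_size - 1) // 2
```

A team of 3 has 3 paths, a team of 10 has 45, and a team of 50 has 1225, so the paths grow much faster than the head count.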

19.2. Effect of Project Size on Errors

It is usually the case that the larger a project, in terms of lines of code, the greater the error density.

19.3. Effect of Project Size on Productivity

The key to productivity in a small team is usually down to the programming skills of the members. In larger teams, the key is more to do with the team itself, and the organization of the team.
Code output is also noticeably greater with smaller projects.

20. Managing Construction

20.1. Encouraging Good Coding.

20.1.1. Considerations in Setting Standards.

Don't! Programmers usually resent standards. Instead, set guidelines or suggestions.

20.1.2. Techniques.

Assign two people to each part of the project.
Review all code.
Require Code sign-offs.
Publish examples of good code. This will help communicate your objectives.
Emphasize that code listings are public assets. Programmers can get protective of their code.
Reward Good Code. It is important to remember to give the programmers something they want as a reward. Also ensure that only exceptionally good code is rewarded.
"I must be able to understand it." This will discourage tricky code.

20.2. Configuration Management

20.2.1. What is configuration Management?

Change control. If changes aren't managed well enough, then problems will occur.

20.2.2. Software Design Changes

Use the following methods:

Formal Change Control procedure.
Establish a Change Control review board.
Handle changes in groups, not one at a time.
Estimate the cost of each change.
Be wary of major changes.

20.2.3. Software Code changes

The best control of software changes is by way of a Version Control System. The advantages are:

File locking. Only one person can make a change at a time.
Get Latest Version.
Review old versions.
Personal Backups are not necessary.

20.3. Estimating a Construction Schedule

20.3.1. Approaches

Use Scheduling Software.
Use an algorithmic approach (Such as COCOMO).
Use external estimation experts.
Have walkthrough meetings for estimates.
Have individuals estimate on their own parts, and add together.
Estimate time for the whole project, and then divide into sections.
Use previous experience.
Keep old estimates, to see how accurate they were.

20.3.2. Establish Objectives

Allow time for estimation, don't rush it.
Formalize Requirements.
Estimate at a low, detailed level.
Use different estimation techniques and compare the results.
Periodically re-estimate.

20.3.3. Influences on Schedule

The largest influence is the size of the program.
Others are:

Team Motivation.
Management Quality.
Amount of Code that is reused.
Personnel Turnover.
High or Low Level languages.
Stability of requirements.
How good a relationship there is with the customer.
User participation in requirements.
Customer experience with the type of application.
How much programmers are involved with requirements.
Classified Security environment for hardware and software.
Amount of documentation.
Project objectives.

20.3.4. Estimation vs. Control

Once a project is estimated, you must be able to keep up with it.

20.3.5. What to do if you are behind

There are several options:

Hope that you'll catch up. You probably won't.
Expand the team. This typically has the opposite effect, and makes the project even later.
Reduce the scope. Separate the things that are "Must Haves", "Nice to haves" and "Optional".

21. Software Metrics.

The key is that any measurement is better than none at all. Not measuring means not knowing what is going on.

21.1. Treating Programmers as people

It is important to remember that programmers are not simply tools. Treating them well results in better performance. Programmers do not just spend their time coding.

21.2. Variations in performance and quality

Individual Performance. Experience does not always mean better performance and productivity.

Team Variation. It has been shown that good programmers work better together, than a mixture of good and bad.

21.3. Religious Issues

Programmers develop their own style, and trying to force compliance with certain areas could incite resentment.

Here are typical religious areas:

Use of GOTOs.
Use of language.
Indentation.
Placing of BEGIN/END keywords.
Choice of editor.
Commenting style.
Efficiency vs Reliability.
Choice of methodology.
Programming utilities.
Naming conventions.
Use of Global Variables.
Metrics, such as number of lines of code per day.

If you need to control programmers, consider the following:

Be aware that this is a sensitive area.
Use suggestions, not rules.
Rather than dictate how code is to be written, review it until it is "Clear".
Let programmers develop their own standards.

21.4. Physical Environment

This affects productivity as well. If people are uncomfortable or interrupted, the performance will be lower.

If the manager is being awkward, use the following approaches:

Refuse to do things the wrong way.
Pretend to do things his way.
Plant ideas, and wait for him to suggest them.
Educate the manager.
Find another job.

It is best to try to educate the manager.

21.5. Summary

Any measurement is better than none at all.

22. The Software Quality Landscape

22.1. Characteristics of Software Quality

System quality:

Correctness.
Usability.
Efficiency.
Reliability.
Integrity (security).
Adaptability.
Accuracy.
Robustness.

Programming Quality:

Maintainability.
Flexibility.
Portability, ability to run in more than one environment.
Reusability.
Readability.
Testability.
How easy it is to understand.

22.2. Techniques for improving Software Quality

Declare Objectives.
Set explicit QA activities.
Testing Strategy.
Software Engineering Guidelines.
Informal Technical Reviews.
External Audits.
Change Control.
Risk Management.
Measure results.
Use Prototyping.

Setting Objectives.

It is worth noting that of all the areas of quality, emphasizing one typically reduces others. It is difficult to ensure quality in all areas.
Remember to clearly define the quality objectives.

22.3. Relative Effectiveness of techniques

22.3.1. Percentage of Errors found

There are several methods to identify defects:

Personal checking of design.
Informal group design reviews.
Formal design inspections.
Formal code inspections.
Modeling and Prototyping.
Personal checking of code.
Unit Testing. Single modules.
Functional Testing. Related Modules.
Integrated Testing. Complete system.
Field Testing. Live.

It is very important to note that each of these methods can detect a certain number of defects. However, by combining any two methods, the number of defects found is increased dramatically. Hence, it is best to combine several methods of checking.
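
A back-of-envelope way to see why combining methods helps: if each method misses some fraction of the defects, a defect only escapes when every method misses it. The Python sketch below assumes the methods find defects independently of each other, which is a simplification, and the rates used are invented rather than taken from the book.

```python
def combined_detection_rate(rates):
    """Fraction of defects found by at least one method.

    Assumes each method finds defects independently of the others,
    which is a simplification made for this example.
    """
    missed = 1.0
    for rate in rates:
        missed *= (1.0 - rate)
    return 1.0 - missed
```

Under that assumption, two methods that each find half of the defects would find three quarters between them.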

22.3.2. Cost of finding defects

The cost of different methods is different. However, more costly methods can show proportionally more defects.
Efficiency of defect detection improves with a company's experience.

22.3.3. Cost of fixing defects

One-step techniques for defect detection are usually cheaper than two-step methods, such as testing, which require further analysis to find the reason for the error.
Apply error detection methods to all stages of construction.

22.4. When to do a QA

Ensure that QA is done at all stages of construction.

22.5. General Principle of Software Quality

Improving Quality reduces cost.

22.6. Summary

Quality is important.
Good quality code is cheaper overall.

22.7. Forwarding Actions

Establish quality standards.

23. Reviews

23.1. The role of reviews

23.1.1. Reviews complement other QA techniques

These help to find errors by incorporating other people's viewpoints, which can get round blind spots.
The purpose is to improve the quality of the code.
Reviews are more effective than testing at identifying errors, and they find different types of errors from those found during testing.

23.1.2. Reviews remove corporate structure.

This helps to guide young programmers with senior programmers' experience.

23.1.3. Reviews assess Quality and Progress.

From a managerial point of view, reviews show how far the work has progressed.

Also, from a technical point of view, the way things are being done can be assessed.

23.1.4. Reviews also apply before construction.

Use reviews when designing and planning as well.

23.2. Inspections.

These are different from reviews in several ways:

Checklists focus the reviewers' attention on previous problem areas.
The purpose is defect detection, not correction.
Lists of problems are prepared.
Roles are assigned to the participants.
The moderator is never the author.
The moderator has been trained in moderating.
Data is fed back to future inspections.
General management is not involved; the inspection is technical.

Inspections can typically expect to find up to 60% of defects.

23.2.1. Roles During Inspections

Moderator. Sets up the inspection, and controls it.
Author.
Reviewer. Never the author, typically a tester or architect.
Scribe. The secretary, takes notes.
Management. NO! This is a technical meeting; if management were involved, it would become political.

23.2.2. Procedure for Inspections

Planning. The moderator sets up the inspection.
Overview. The author describes the technical environment.
Preparation. The reviewer is familiarized with the code.
Meeting. Author reads and explains the code. Possible errors are discussed. When errors are agreed, the type and severity are noted. Typically, up to 500 lines of code can be reviewed per hour.
Inspection Report. Contains a list of errors and the time spent.
Reworking. The moderator assigns defects to be fixed.
Follow-up. If more than 5% of the design or code is changed, meet again.
3rd hour meeting. This is to discuss solutions. The main meeting is to find defects.
Fine Tuning. Modify the process if a better way is found. It is always a good idea to list common errors.

23.3. Other kinds of reviews

23.3.1. Walkthroughs.

These are hosted and moderated by the author.
These improve quality, not measure it.
All attendees prepare by reading the code beforehand.
Senior personnel pass on their experience to juniors.
Juniors can suggest new methodologies.
Walkthroughs last typically 30-60 mins.
These detect errors, not correct them.
Management does not attend.
This is a flexible process.

These are more informal than inspections, and can be preferable. However, the inspection typically pays off better, due to the formality.

23.3.2. Code Reading.

The author gives about 2-4000 lines of code to two separate people. These three then meet and discuss the errors. With this kind of review, most of the errors are found during the preparation, rather than in the meeting.

23.3.3. "Dog and Pony shows"

These are reviews that are for the benefit of management. These are not technical.

24. Unit Testing

24.1. The Role of Unit Testing

The intention is to break the software. Testing cannot prove the absence of errors.

Testing is not meant to improve the quality of software. It is used to measure it.

You must expect to find errors. If you are convinced that errors do not exist, then you may consciously or subconsciously avoid weak areas.

24.2. Testing During Construction

Testing in this stage should consist of:

Write the routine.
Mentally check the code.
Test the logic of the routine, by stepping through each line.

24.3. The Testing Bag of Tricks

24.3.1. Incomplete Testing

This means that the test cases are selected so that they don't produce duplicate results. This is done to make the most use of the time available.

Test likely errors. Decide what is most likely to produce errors, and test those areas.

24.3.2. Structured Basis Testing

This is like "White Box" testing. That is to say, the test cases are selected in order to test every part of the logic. As the logic is known to the tester, all logical paths through the system can be tested.
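
One way to choose such cases is to take one for the straight path and one more for each decision, reusing the count from section 15.6. The Python sketch below applies that to a made-up routine with two IF tests, giving three cases. The routine and the expected values are invented.

```python
def shipping_cost(weight_kg, is_express):
    """Routine under test: two decisions, so three basis cases are used."""
    cost = 5.0
    if weight_kg > 10:      # decision 1
        cost += 2.5
    if is_express:          # decision 2
        cost *= 2
    return cost


# One case for the straight path, plus one per decision.
BASIS_CASES = [
    ((1, False), 5.0),    # main path: both tests false
    ((20, False), 7.5),   # decision 1 true
    ((1, True), 10.0),    # decision 2 true
]

failures = [
    (args, expected, shipping_cost(*args))
    for args, expected in BASIS_CASES
    if shipping_cost(*args) != expected
]
```

An empty `failures` list means every basis case produced its expected result.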

24.3.3. Data Flow Testing

Data and variables can be defined as being in the following states:

Defined. Initialized, but not used.
Used.
Killed. Terminated or undefined.
Entered. A routine has been entered before a variable has been acted on.
Exited. Leaving a routine after a variable has been acted on.

Check combinations of these states to see whether, for example, a variable is being used before it is defined.
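
The kind of check meant here can be sketched in Python. The routine below walks a list of events for each variable and reports any use that happens while the variable is not defined. The event format is invented for the example, and a real data-flow analysis would also follow the separate paths through the code.

```python
def used_before_defined(events):
    """Return names that are used while not currently defined.

    `events` is a list of (action, name) pairs, where action is one of
    "define", "use" or "kill". The event format is invented for this sketch.
    """
    defined = set()
    suspicious = []
    for action, name in events:
        if action == "define":
            defined.add(name)
        elif action == "kill":
            defined.discard(name)
        elif action == "use" and name not in defined:
            suspicious.append(name)
    return suspicious
```

For example, a define-use-kill-use sequence on one variable reports the second use, because the variable has been killed by then.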

24.3.4. Equivalence Partitioning

If two test cases are going to produce exactly the same results, then only one of these is necessary.

24.3.5. Error Guessing

Use experience to guess at which areas are likely to contain errors. This is not advised for very inexperienced programmers.

24.3.6. Boundary Analysis

Write test cases that test the boundaries of variables. That is to say, test the maximum and minimum allowed values. Test disallowed values as well, and see what the results are. These tests identify 'off by one' errors.
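
A sketch of boundary cases in Python, testing just below, on, and just above each limit. The routine `is_valid_percent` and its range are invented for the example.

```python
MIN_PERCENT = 0    # hypothetical allowed range for the example
MAX_PERCENT = 100


def is_valid_percent(value):
    """Routine under test: accepts MIN_PERCENT..MAX_PERCENT inclusive."""
    return MIN_PERCENT <= value <= MAX_PERCENT


# Just below, on, and just above each boundary.
BOUNDARY_CASES = {
    MIN_PERCENT - 1: False,
    MIN_PERCENT: True,
    MIN_PERCENT + 1: True,
    MAX_PERCENT - 1: True,
    MAX_PERCENT: True,
    MAX_PERCENT + 1: False,
}

boundary_failures = [
    value for value, expected in BOUNDARY_CASES.items()
    if is_valid_percent(value) != expected
]
```

If the routine had been written with `<` instead of `<=`, the cases on the boundaries themselves would show up in `boundary_failures`.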

24.3.7. Classes of Bad Data

Bad data can be defined in the following ways:

Too little.
Too much.
Invalid.
Wrong size.
Uninitialized.

These can be used to test error control.

24.3.8. Classes of Good Data

These are nice values that test the normal functionality:

Nominal Cases.
Minimum.
Maximum.
Compatibility with old data.

24.3.9. Use test cases that allow easy manual checks.

If arithmetic operations are being performed, use cases that are easy to manually calculate as a check.

24.4. Typical Errors

24.4.1. Which routines contain the most errors?

Errors are not evenly distributed throughout the code. They usually collect together in certain places.

24.4.2. Errors by Classification

The scope is fairly limited. 83% of errors can be fixed by changing only one routine.
Many errors occur outside construction.
95% of implementation errors are the programmers' fault.
Clerical errors are common.
Some errors are down to misunderstanding the design. This is a recurring theme.
Avoid errors in assignment statements.
Most errors are easy to fix.
Measure your experiences with errors.

24.4.3. Proportion of Errors Resulting from faulty construction

Implementation Errors constitute up to 40% of all errors.

Construction Errors are costly to fix.

24.4.4. How many errors should you expect to find

Typically there are 15-50 errors per 1000 lines of code.
The main point is that it is cheaper to build less buggy software.

24.4.5. Testing itself

There are occasions when test results can leave you spending hours checking code and finding no errors. This is because the test cases themselves were erroneous.
Verify each test case manually.
Plan test cases as the software is being developed.
Keep old test cases for reference.

24.5. Test Support Tools

24.5.1. Scaffolding

Scaffolding can be a low-level routine that does very little, but is used to test a high-level routine. This allows you to take certain functionality for granted.
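
A minimal sketch of scaffolding, assuming a hypothetical high-level routine `order_total` that depends on a low-level price lookup. The stub `fetch_price_stub` and its fixed prices are invented for illustration:

```python
def fetch_price_stub(item_code):
    """Scaffolding: stands in for a real low-level lookup and returns fixed prices."""
    fixed_prices = {"A": 10, "B": 25}
    return fixed_prices[item_code]


def order_total(item_codes, fetch_price):
    """High-level routine under test: sums the price of every item ordered."""
    return sum(fetch_price(code) for code in item_codes)
```

Because the stub always returns known prices, any wrong total must come from `order_total` itself, for example `order_total(["A", "B", "A"], fetch_price_stub)` should give 45.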

24.5.2. Results Comparators

These are tools that compare text output files to see if they are identical.
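
A results comparator can be sketched with Python's standard `difflib` module. The function name `compare_outputs` is an assumption made for this example, and the sketch compares text held in memory rather than files:

```python
import difflib


def compare_outputs(expected_text, actual_text):
    """Return a list of diff lines; an empty list means the outputs match line for line."""
    expected_lines = expected_text.splitlines()
    actual_lines = actual_text.splitlines()
    return list(difflib.unified_diff(expected_lines, actual_lines,
                                     fromfile="expected", tofile="actual",
                                     lineterm=""))
```

An empty result means the new output matches the saved one. Otherwise the lines beginning with `-` and `+` show what changed.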

24.5.3. Test Data Generators

These tools are useful because they can create test cases that you would not normally think of.
They test more thoroughly.
They can be refined to test more thoroughly in specific areas.
A modular design is useful during testing.
These test drivers can be reused if the application code is changed.
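
The sketch below shows one simple form of test data generator. It assumes a hypothetical routine under test, `largest`, and checks it against Python's built-in `max`. The fixed seed is a design choice so that any failure can be reproduced:

```python
import random


def largest(values):
    """Routine under test (hypothetical): return the largest item in a non-empty list."""
    result = values[0]
    for value in values[1:]:
        if value > result:
            result = value
    return result


def generate_cases(count, seed=0):
    """Yield `count` random non-empty integer lists; the seed keeps runs repeatable."""
    rng = random.Random(seed)
    for _ in range(count):
        length = rng.randint(1, 20)
        yield [rng.randint(-1000, 1000) for _ in range(length)]


def failing_cases(count=200):
    """Return every generated case where largest() disagrees with the built-in max()."""
    return [case for case in generate_cases(count) if largest(case) != max(case)]
```

Narrowing the ranges passed to `randint` is one way to refine the generator so that it tests a specific area more thoroughly.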

24.5.4. Coverage Monitors

These can show what code is executed, and highlight the most used code.

24.5.5. Symbolic Debuggers

These allow a compiled program to be stepped through.

24.5.6. System Perturbers

These are tools such as:

Memory fillers.
Memory shakers.
Selective memory failers.
Bounds checkers.

24.5.7. Error Databases.

These are information repositories that are maintained by testers. Use them as knowledge bases.

24.6. Improving Testing

24.7. Planning to test

It is important to allow for testing when the project is being planned.

24.7.1. Re-testing

This means that when code is changed, the old test cases are re-run as well as the new ones.

24.8. Keeping Test Records

By precisely recording testing results and errors found, it can be seen whether or not changes damage the project.

24.9. Summary

Testing is an integral part of development.
It must be taken seriously, and allowed for during planning.

24.10. Forwarding Actions

Establish set guidelines for testing quality.
Allow more time for testing and test statements.

25. Debugging

25.0. Overview of Issues

Bugs. The term "Bug" dates from the early days of the first large-scale digital computer, when an error was traced to a moth in the circuitry.

25.0.1. Role of Debugging

This is a last resort. There should be no bugs in a properly designed, coded and tested system.

25.0.2. Variations in Debugging Performance

Variations can be measured by time spent, number of errors found, and number of errors caused. Not everybody can debug well.

25.0.3. Errors as Opportunities

Opportunities can arise such as:

Learning about the program.
Learning about different kinds of mistakes.
Learning about the quality of your code by reading it.
Learning about problem solving.
Learning how to fix errors.

25.0.4. An Ineffective approach

The "Devil's Guide to Debugging" is as follows:

Find the error by guessing.
Insert debugging Print statements randomly.
Keep changing code until it seems to work.
Don't back up the original copy.
Don't waste time understanding the problem, try to fix it.
Use the most obvious fix. Hard code exceptional values.

25.0.5. Debugging by superstition

The wrong attitude to take is the "Not my Fault" position. It is better to take responsibility for an error, until you can prove that it is someone else's fault.

25.1. Finding an Error

25.1.1. Use a Scientific Method

1. Gather data and get repeatable results.

2. Form a hypothesis.

3. Design an experiment to test the hypothesis.

4. Prove or disprove the hypothesis.

5. Repeat until a hypothesis is proven.

Alternatively:

1. Stabilize the error.

2. Locate the source of the error.

3. Fix the error.

4. Test the fix.

5. Look for similar errors.

25.1.2. Tips on Finding Errors

Use all available data to form a hypothesis. If data doesn't fit the model, then find out why.
Refine the test cases to find the error.
Reproduce the error in different ways.
Generate more test data.
Use results of negative tests.
Brainstorm with others in order to find a hypothesis.
Narrow suspect regions of code.
Be suspicious of previously corrected errors.
Check for recent changes.
Expand suspicious areas until the error is contained.
Integrate parts of the system incrementally.
Set a maximum time limit for a quick debug. After that, start analysing.
Check for common errors.

Talk about the problem.
Take a break.
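
Narrowing the suspect region and checking recent changes can be combined into a binary search. The sketch below is only an illustration: `first_bad_change` and the `is_broken` callback are hypothetical names, and it assumes that the changes are in order and that once the error appears it stays.

```python
def first_bad_change(changes, is_broken):
    """Return the index of the first change after which is_broken() reports True.

    Assumes the changes are in order and that once the error appears it stays.
    Returns None if the error never appears.
    """
    low, high = 0, len(changes)
    while low < high:
        middle = (low + high) // 2
        if is_broken(changes[: middle + 1]):
            high = middle        # error already present: look earlier
        else:
            low = middle + 1     # error not yet present: look later
    return low if low < len(changes) else None
```

Each test halves the suspect region, so even a long list of changes needs only a handful of runs.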

25.1.3. Syntax Errors

Don't trust compiler line numbers. The problem may be before the line that is listed.
Don't trust the error messages. The compiler may try to be clever.
Don't trust the second and subsequent error messages, because they could be a result of the first.
Divide and Conquer. Remove parts of the system and deal with them individually.

25.2. Fixing an Error

Understand the problem first.
Understand the program, not just the problem.
Confirm the error diagnostics.
Relax. Don't hurry a fix.
Save the original code.
Fix the problem, not just the symptom.
Change code only for good reasons.
Handle one change at a time.
Check your fix.
Look for similar errors.

25.3. Psychological Considerations

Paris in the the spring.

Did you notice the second "the" in the previous sentence? The lesson here is that you see what you expect to see, not what is really there.

25.4. Debugging Tools

Source Code comparators.
Compiler warning messages. Set the compiler options to the strictest, and remove all warnings. Standardize these settings.
Extended Syntax/Logic checkers. These check for common mistakes.
Execution Profilers. These show the code hotspots.
Scaffolding.
Debuggers.

25.5. Summary

Debugging ought not to be needed.
If it is required, ensure that a good methodology is used.

26. System Integration

26.1. Importance of the Integration Method

It is important to integrate in the right order. Systems can become unstable if components are added in the wrong order.

The system should be self-supporting at each stage of integration.

Integration is not just about testing.

26.2. Phased vs. Incremental Integration.

26.2.1. Phased Integration

Phased Integration happens like so:

1. Develop all the routines.

2. Unit Test all the routines.

3. Add all of the routines to the system.

4. Test and debug the whole system.

There are inherent problems with this approach, especially the fact that if errors arise, it is hard to identify which routine(s) contain the errors.

26.2.2. Incremental Integration

Incremental Integration happens differently:

1. Develop the minimal system that is able to function.

2. Develop and test a routine.

3. Add that one routine to the system, then test and debug the combination.

4. Repeat from step 2 until the system is complete.

26.2.3. Benefits of Incremental Integration

Error tracking is easier.
The system succeeds early on.
Units are tested more thoroughly.
The development schedule is shorter.

26.3. Incremental Integration strategies

26.3.1. Top Down Integration

This allows the high level routines to be developed first, before the low level details are finalized.
However, problems in low level routines can cause problems at the high level.
It is better to integrate in sections, or functional paths.

26.3.2. Bottom Up Integration

This allows the fiddly low-level routines to be tested first.
This method is incremental.
This requires the whole system to be designed before work begins.

26.3.3. Sandwich Integration

This method means that the low-level routines and the top level ones are written first.

Afterwards, the in-between routines are written.

26.3.4. Risk Orientated Integration

This method is where the hard part of the system is written first, the part that contains the most risks.

26.3.5. Feature Orientated Integration

This is similar to the top down method, but instead, the system is written in paths.

26.4. Evolutionary Delivery

26.4.1. General Approach

Evolutionary Delivery is all about delivering in stages, not waiting until the whole system is written.

This method allows for changes along the way, not all at once at the end.

26.4.2. Benefits

Integration takes care of itself.
There is a shorter gap in between deliveries.
Product shipping cycles are shorter.
Customer satisfaction is built in.
The project status is easier to assess.
Estimation Error is reduced.
Development and testing resources are more evenly distributed.
Morale improves.
Completion is more likely.
Implicit code quality.
You can see if the program supports changes.
The system needs less documentation.

26.4.3. Relationship of Evolutionary Delivery to Prototyping

Prototyping is an exploratory exercise.

26.4.4. Limitations

There are drawbacks:

More planning is required.
More technical overhead is required.
This methodology is sometimes used as an excuse for skimping on design, analysis and planning.

26.5. Summary

Incremental Integration is a much better approach than phased integration.

26.6. Forwarding Actions

Use the accretion method for integration, as mentioned much earlier.

27. Code Tuning Strategies

27.1. Performance Overview

27.1.1. Quality Characteristics and Performance

Performance does not always mean speed. There are other qualities to consider.

27.1.2. Performance and Code Tuning

Think about efficiency from these viewpoints:

Program Design.
Modular Routine Design.
O/S Interaction.
Compilation.
Hardware.
Code Tuning.

27.2. Introduction to Code Tuning

27.2.1. Old Wives' Tales

There are many false beliefs about performance:

Fewer lines = faster code.
Certain operations are faster than others.
Optimize the system as you go. Don't; the bottlenecks only appear at the end.
A fast program is as important as a correct one. No, it is better to have a working system than a fast one that does not work.

27.2.2. The Pareto Principle

80% of the system can be done with 20% of the effort. The same applies to completed systems: some areas of the code are used far more than others. It is these hotspots that you should concentrate on improving.

27.2.3. Measurement.

It is not possible to simply guess how efficient code is; it has to be measured.

Measurements must be precise.

Sometimes optimizations don't make code any faster, but make the code harder to read.
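
A minimal sketch of measuring rather than guessing, using Python's standard `timeit` module. The two string-building routines and the `measure` helper are hypothetical examples, not taken from the book:

```python
import timeit


def join_with_concat(words):
    """Build a string by repeated concatenation."""
    result = ""
    for word in words:
        result += word
    return result


def join_with_join(words):
    """Build the same string with str.join."""
    return "".join(words)


def measure(func, words, repeats=5, number=100):
    """Return the best time in seconds for `number` calls, over `repeats` runs."""
    return min(timeit.repeat(lambda: func(words), repeat=repeats, number=number))
```

Comparing `measure(join_with_concat, words)` with `measure(join_with_join, words)` on realistic data shows which version is actually faster on a given machine. The answer can differ between interpreters and input sizes, which is exactly why it has to be measured.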

27.2.4. Compiler Optimizations

Compiler settings can allow the compilers to optimize simple code.

27.2.5. When to use Code Tuning

Only optimize code if it is absolutely necessary, and do so at the end of the project.

27.2.6. Iteration

If a piece of code is optimized by a small amount, but is used again and again in the program, the cumulative effect will be greater.

27.3. Common Sources of Inefficiency

Input/Output operations.
Formatted Printing.
Floating Point operations.
Memory Paging.
System calls.

27.4. Summary of Approach to Code Tuning

1. Use highly modular designs that are easy to understand.

2. Find the hotspots in the code.

3. Find the cause of the weak performance. Is code tuning required?

4. Tune the bottleneck. If there is no improvement in performance, then undo the change.

5. Repeat from step 2.

27.5. Summary

Only tune code when it is absolutely necessary.

28. Code Tuning Techniques

28.1. Loops

Unswitching. Instead of having a loop with an IF statement inside it, convert it to an IF statement with a loop in each branch.
Jamming. Combine two looped operations into one loop, make them use the same loop counter, etc.
Unrolling. Reduce loop housekeeping, for example, use operations on the loop counter, and also on loop counter +1, and then increment the loop by 2. This will reduce the number of iterations.
Minimize the work inside the loop. Instead of performing the same calculation inside the loop, calculate a variable and then use this.
Sentinel Values. Instead of using a compound test at the start of the loop, place a sentinel value at the end of the search range so that a single test is enough.
Put the busiest loop on the inside. For nested loops, make the most iterated loop innermost.
Strength Reduction. The idea here is to replace multiplication operators with additions, by use of pre calculated variables.
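
Two of these loop techniques, unswitching and sentinel values, are sketched below. All function names are hypothetical, and whether either change is faster in a given language has to be measured. The sketch only shows the shape of each transformation:

```python
def sum_switched(values, use_squares):
    """Before: the test on use_squares is repeated on every iteration."""
    total = 0
    for value in values:
        if use_squares:
            total += value * value
        else:
            total += value
    return total


def sum_unswitched(values, use_squares):
    """After unswitching: the test is made once, with one loop in each branch."""
    total = 0
    if use_squares:
        for value in values:
            total += value * value
    else:
        for value in values:
            total += value
    return total


def find_with_sentinel(values, target):
    """Sentinel search: append the target so the loop needs only one test."""
    padded = values + [target]   # copy of the list with the sentinel at the end
    index = 0
    while padded[index] != target:
        index += 1
    return index if index < len(values) else -1
```

In the sentinel version the loop no longer has to check for the end of the list, because the search is guaranteed to stop at the sentinel. Finding the target only at the sentinel position means it was not in the original list.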

28.2. Logic

Stop testing when you know the answer. If you are running a loop to find a certain value, exit the loop once the answer is known.
Order condition tests by frequency.
Substitute table lookups for complicated expressions.
Use Lazy evaluation. Rather than populate 2K values at the start of the program, calculate them as needed, and cache them.

28.3. Data Transformation

Use integers instead of floating point variables wherever possible.
Use the fewest array dimensions possible.
Minimize array references. With nested loops, move any references that do not change out of the innermost loop.
Use supplementary indexes. Use the string length index as opposed to calculating the length.
Use caching. Use for the most common results.

28.4. Expressions.

Exploit algebraic identities. Not A and Not B => Not (A Or B)
Use strength reduction.
Initialize variables at compile time.
Be wary of system routines.
Use the correct constant types.
Precompute results.

Eliminate common sub expressions. Assign them to a variable, calculated once.
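
The algebraic identity and the common sub expression points can be illustrated as follows. The loan payment formula and all names are assumptions chosen for this sketch:

```python
from itertools import product


def identity_holds():
    """Check (not A and not B) == not (A or B) for every combination of truth values."""
    return all(((not a) and (not b)) == (not (a or b))
               for a, b in product([False, True], repeat=2))


def payment_repeated(principal, annual_rate, months):
    """Before: the sub expression annual_rate / 12 is written out twice."""
    return principal * (annual_rate / 12) / (1 - (1 + annual_rate / 12) ** -months)


def payment_shared(principal, annual_rate, months):
    """After: the common sub expression is calculated once and reused."""
    monthly_rate = annual_rate / 12
    return principal * monthly_rate / (1 - (1 + monthly_rate) ** -months)
```

The second version of the payment routine is also easier to read, because the sub expression now has a name.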

28.5. Routines

Rewrite the code in-line.

28.6. Re-Code in Assembler

Sometimes, the only way to dramatically improve performance is to rewrite the program in assembly code.

Follow these steps:

1. Write the program in a high level language.

2. Test the program and ensure that it is correct.

3. Identify the hotspots.

4. Re-code the hotspots in assembler.

This approach works well if the program follows a modular design.

29. Software Evolution

29.1. Guidelines

These will help to improve the quality of the software during its evolution:

Increase modularity, write smaller routines.
Reduce the use of global variables.
Improve your programming style.
Manage changes properly.
Review code changes.
Re-test the system.

The philosophy is to use your experience to improve the software.

29.2. Making New Routines

Create new routines in order to reduce the complexity.
Change routines to share and reuse code.

30. Themes in Software Craftsmanship

30.1. Conquer Complexity

30.1.1. Ways to reduce complexity

Divide the system into subsystems.

Move complex tests into Boolean functions.
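
As a small sketch of moving a complex test into a Boolean function, consider the leap year rule. The function names are chosen for this example only:

```python
def is_leap_year(year):
    """Boolean function that gives the complicated test a readable name."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def days_in_february(year):
    """The caller now reads as a plain statement of intent."""
    return 29 if is_leap_year(year) else 28
```

The reader of `days_in_february` does not have to work through the arithmetic. The detail is hidden one level down, where it can be checked on its own.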

30.1.2. Hierarchies and Complexity

It is natural for the brain to see hierarchies. They allow you to avoid worrying about all levels at once.

30.1.3. Abstraction and Complexity

Use abstract naming in order to achieve this.

30.1.4. Pick your Process

Define your own working methods and stick to them.

30.2. Write programs for people first, and computers second

Readability has positive effects on these aspects of a program:

Comprehensibility.
Reviewability.
Error Rate.

Debugging.
Modifiability.
Development Time.
External Quality.

30.3. Focus your attention with the help of conventions

The conventions will have been defined in order to produce quality. Adhering to these will help to produce high quality.

30.4. Programming in terms of the problem domain

Making use of abstraction can do this: work in terms of the problem, rather than in terms of the technical solution.

30.5. Watch for "Falling Rocks"

Look for anything that is unstable. Look for low quality routines and rewrite them.

30.6. Iterate

Iterate early on, by using prototypes.

This method costs less.

30.7. "Thou Shalt Render Religion and Software Asunder"

30.7.1. Software Oracles

These are people who claim that their way of doing things is better than any other. Don't believe that anything is better until it is proven to be so.

30.7.2. Eclecticism

Eclecticism is the opposite of clinging to a single way of doing things. A good programmer will build up an intellectual toolbox of different methodologies.

30.7.3. Experimentation

It is worth experimenting with new ideas all the way through the development process. This is necessary in order to expand. If you are not willing to change your beliefs after an experiment, then the experiment is pointless.

The important thing is to not be afraid of making mistakes.

Code Complete

Table of Contents

1. Understanding Software Construction

1.1. Metaphors

1.2. Writing Code

1.3. Summary

1.4. Forwarding Actions

2. Prerequisites to Construction

2.1. Importance

2.2. Problem Definition

2.3. Requirements

2.4. Architecture

2.5. Language

2.6. Programming Conventions

2.7. Time to spend on Pre-Requisites

2.8. Adapting Pre-Requisites

2.9. Summary

2.10. Forwarding Actions

3. Building a Routine

3.1. Summary of Steps

3.2. PDL for Pros

3.3. Design the Routine

3.4. Code the Routine

3.5. Formal Checking

3.6. Summary

3.7. Forwarding Actions

4. High Quality Routines

4.1. Valid Reasons to Create a Routine

4.2. Good Routine Names

4.3. Strong Cohesion

4.4. Loose Coupling

4.5. How Long Can a Routine be?

4.6. Defensive Programming

4.7. Use of Routine Parameters

4.8. Consider the use of Functions

4.9. Summary

4.10. Forwarding Actions

5. Modules

5.1. Modularity: Cohesion and Coupling

5.1.1. Cohesion

5.1.2. Coupling

5.2. Information Hiding

5.2.1. Secrets and the Right to Privacy

5.2.2. Common Secrets

5.2.2.1. Volatile Areas that are likely to change

5.2.3. Barriers to Information Hiding

5.3. Good Reasons to Create a Module

5.4. Summary

5.5. Forwarding Actions

6. High Level Design

6.1. Introduction to Software Design

6.2. Structured Design

6.2.1. Choosing Components to Modularize

6.3. Object-Oriented Design

6.3.1. Key Ideas.

6.3.2. Design Steps

6.3.3. Typical Components

6.4. Comments on Popular Methodologies

6.4.1. When to use Structured Design

6.4.2. When to use Information Hiding

6.4.3. When to use Object Oriented Design

6.5. Round Trip Design

6.6. Design is a Heuristic

6.7. How to solve it

6.8. Summary

6.9. Forwarding Actions

7. Creating Data

7.1. Reasons to create your own Data Types

7.2. Guidelines for creating Data Types

7.3. Making Variable Declarations Easy

7.4. Guidelines for Initializing Data

7.5. Summary

7.6. Forwarding Actions

8. The Power of Data Names

8.1. Considerations in choosing good names

8.1.1. The Effect of Scope on Variable names

8.1.2. Computed-Value Qualifiers in Variable Names

8.2. Naming Specific Types of Data

8.3. The Power of Naming Conventions

8.4. Informal Naming Conventions