Thursday, December 03, 2009

The Design of Future Things

Donald A. Norman is a big name in design, partly because of his earlier book, “The Design of Everyday Things.”
This time, Norman discusses how intelligent machines should interact with humans. The book is largely about UI, but I feel the concepts apply to API design as well. An API is, after all, the human interface of a lower-level software component.

The book is interesting and provides many examples of failed and successful interaction design. If you don’t have time to read the whole book, the author also provides a nice summary of his ideas:

Design Rules for Human Designers of Smart Machines
• Provide rich, complex, and natural signals
• Be predictable
• Provide good conceptual models
• Make the output understandable
• Provide continual awareness without annoyance
• Exploit natural mappings

Design Rules Developed by Machines to Improve their Interactions with People
• Keep things simple
• Give people a conceptual model
• Give reasons
• Make people think they are in control
• Continually reassure

Use finite state machines whenever possible

In the software engineering program I took at CMU, one of the core courses was Models of Software Systems. It was considered one of the hard ones, because it taught the techniques available to “prove” that software is correct or safe.

I won’t say the problem of proving software correctness is solved, but an important part of the course was about finite state machines and the tools that prove properties of them. A sequence of C or C++ statements is hard to simulate exhaustively, while a finite state machine, with its limited set of states and transitions, is much easier to handle. There has been a lot of research on state machines.

Even without the tools, state machines are far easier to trace and understand than code or flow charts, for anything nontrivial. Simply asking this question in a design review would save you a lot of headaches when debugging:

If something bad happened, how do I reset the state machine?
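To make the question concrete, here is a minimal sketch of a table-driven state machine in C. The states, events, and function names are all hypothetical, made up for illustration and not taken from any real API. The point is that with a transition table, “how do I reset?” becomes something you can read off one column, or check with a three-line loop.

```c
#include <assert.h>

/* Hypothetical states and events for a small connection-style API. */
typedef enum { ST_IDLE, ST_OPENING, ST_READY, ST_ERROR, ST_COUNT } State;
typedef enum { EV_OPEN, EV_OPENED, EV_FAIL, EV_CLOSE, EV_RESET, EV_COUNT } Event;

/* next_state[s][e] is the state reached from state s on event e.
 * Unexpected events lead to ST_ERROR; that policy is an assumption. */
static const State next_state[ST_COUNT][EV_COUNT] = {
    /*               OPEN        OPENED    FAIL      CLOSE     RESET   */
    /* IDLE    */ { ST_OPENING, ST_ERROR, ST_ERROR, ST_IDLE,  ST_IDLE },
    /* OPENING */ { ST_ERROR,   ST_READY, ST_ERROR, ST_IDLE,  ST_IDLE },
    /* READY   */ { ST_ERROR,   ST_ERROR, ST_ERROR, ST_IDLE,  ST_IDLE },
    /* ERROR   */ { ST_ERROR,   ST_ERROR, ST_ERROR, ST_ERROR, ST_IDLE },
};

/* Advance the machine by one event. */
State fsm_step(State s, Event e)
{
    assert((int)s >= 0 && (int)s < ST_COUNT);
    assert((int)e >= 0 && (int)e < EV_COUNT);
    return next_state[s][e];
}

/* Returns 1 if EV_RESET leads back to ST_IDLE from every state. */
int fsm_reset_always_works(void)
{
    for (int s = 0; s < ST_COUNT; s++) {
        if (next_state[s][EV_RESET] != ST_IDLE) {
            return 0;
        }
    }
    return 1;
}
```

In this sketch, ST_ERROR ignores everything except EV_RESET, so fsm_step(ST_ERROR, EV_CLOSE) stays in ST_ERROR while fsm_step(ST_ERROR, EV_RESET) returns ST_IDLE. Whether an error state should really swallow a close request is exactly the kind of thing a design review can argue about once the table is on the screen.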

Much of our API documentation focuses on describing parameters and return values. They are important, but the mindset, the state machine behind the API, is at least equally important. An API is not just a set of functions, but also a protocol between the client and the server. More often than not, we design and compose a larger system based on this protocol rather than on the detailed arguments. A design review should therefore check the correctness of the protocol first.

Academically, there are many questions we can ask about a state machine, but a few are simple enough to list here:

Safety: something (bad) will NEVER happen. For example, you can specify that this API (or set of APIs) shall never touch other parts of the system.
Liveness: something good will EVENTUALLY happen. You shall never get stuck.
Deadlock: is it possible for multiple state machines to end up waiting on each other, so that none of them can proceed?
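For a single small machine, the first two questions can be approximated by brute force. The sketch below is my own illustration, with a hypothetical four-state machine. It walks every transition and checks that a bad state is unreachable (safety) and that a good state stays reachable from wherever we end up (a “never stuck” check). The second check is only a necessary condition for liveness: real liveness is about what eventually happens on every run and needs a proper model checker. Deadlock between several machines is not covered here.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical machine, for illustration only. */
enum { IDLE, BUSY, DONE, CORRUPT, N_STATES };
enum { START, FINISH, RESET, N_EVENTS };

static const int trans[N_STATES][N_EVENTS] = {
    /*              START    FINISH   RESET   */
    /* IDLE    */ { BUSY,    IDLE,    IDLE    },
    /* BUSY    */ { BUSY,    DONE,    IDLE    },
    /* DONE    */ { DONE,    DONE,    IDLE    },
    /* CORRUPT */ { CORRUPT, CORRUPT, CORRUPT },
};

/* Marks in seen[] every state reachable from start (start included). */
void reachable_from(int start, int seen[N_STATES])
{
    int stack[N_STATES];   /* each state is pushed at most once */
    int top = 0;

    assert(start >= 0 && start < N_STATES);
    memset(seen, 0, sizeof(int) * N_STATES);
    seen[start] = 1;
    stack[top++] = start;
    while (top > 0) {
        int s = stack[--top];
        for (int e = 0; e < N_EVENTS; e++) {
            int t = trans[s][e];
            if (!seen[t]) {
                seen[t] = 1;
                stack[top++] = t;
            }
        }
    }
}

/* Safety: returns 1 if no sequence of events leads from start to bad. */
int never_reaches(int start, int bad)
{
    int seen[N_STATES];
    reachable_from(start, seen);
    return !seen[bad];
}

/* "Never stuck": returns 1 if goal is still reachable from every
 * state that can be reached from start. */
int can_always_reach(int start, int goal)
{
    int from_start[N_STATES];
    int from_here[N_STATES];

    reachable_from(start, from_start);
    for (int s = 0; s < N_STATES; s++) {
        if (!from_start[s]) {
            continue;
        }
        reachable_from(s, from_here);
        if (!from_here[goal]) {
            return 0;
        }
    }
    return 1;
}
```

For this table, never_reaches(IDLE, CORRUPT) returns 1 and can_always_reach(IDLE, DONE) returns 1. Starting from CORRUPT, can_always_reach(CORRUPT, DONE) returns 0, because that state has no way out. Doing the same over all possible paths through a C function is not practical, which is the argument for state machines in the first place.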

Development teams at Microsoft

I just spent a week in Redmond, supporting the integration of our new driver with the next Windows Mobile.

During that week, my counterpart at Microsoft told me what Microsoft does after a product is released: the team is disbanded and most people find jobs in other areas.

Take him for example: he was working on Visual Studio 2008 when he started with Microsoft 3 years ago. After that programming tool was released, he was re-assigned to work on Windows Mobile, in the GPS area.

This is something new to me. In all my past full-time positions, people stayed in the same domain for a very long time. Anyone who did not move into a management position, in particular, would be rather stuck in the same area. We do change what we work on, but usually you stay in the same part of the same product line.

When companies get larger, it seems natural for departments to grow into silos. Each department has its own way of doing things, and management thinking becomes fixed. If we rotated people into different groups every 2 to 4 years, they would get more exposure to the different people and technologies in the company, and thus have a better chance to grow into stronger developers and more experienced managers.

The way Microsoft does it seems to constantly bring in new management styles and thinking. In addition, it helps to spread knowledge and discipline widely within the company. All the code released by Microsoft looks similarly okay, and the managers I met have a strong sense of project.

However, I also heard that people avoid touching some difficult code bases because they don’t understand them well enough. I somehow feel they have less ownership of the component they are working on. If all you have to care about is this release, why plan for the future?

It’s difficult to say which model is better, but I feel a company in constant flow would be stronger, because the thoughts would be fresher.

Timesheet and checklist

The management side of software development can be simplified down to two tricks: timesheets and checklists.

A timesheet is not what we put in the QTime. That might be an accounting tool, but it is not a software engineering tool. Just like the expense list in a family, a timesheet should tell you how the most precious resource in many projects, people’s time, is spent on different activities. For example, if your timesheet tells you that the amount of time spent in meetings is higher than usual, it might be worth checking out.

Timesheets also provide the basis for process improvement and estimation. How do you estimate the amount of effort? We usually base it on “experience.” A timesheet gives you some evidence to back that experience.

The categories of your timesheet can be as simple as Meeting, Design, Coding, Testing. If you feel like it, you can make finer categories. I don’t recommend going too far, as people would get confused.

A checklist is a simple way to remind you what has to be there. For example, a checklist might say that the agenda and reference material must be sent out some amount of time before the meeting, and that a note taker must be assigned. A checklist could also remind you that a team dynamics survey is due every 2 months.

However, timesheets and checklists should not replace education and buy-in. If people don’t believe in them, then the extra time spent filling out forms or reading checklists is wasted. They are there to help, and should not impede progress.

Has anyone used real timesheets in a development project? How do you feel about them?

MISRA C and MISRA C++

MISRA - The Motor Industry Software Reliability Association, the name says it all. The current member list of MISRA is:

• AB Automotive Electronics
• Bentley Motor Cars
• Ford Motor Company
• Jaguar Cars
• Land Rover
• Lotus Engineering
• MIRA
• Ricardo UK
• TRW
• University of Leeds
• Visteon Engineering Services

MISRA C 2004 and MISRA C++ 2008 are sets of rules MISRA recommends to developers in the safety-critical business. MISRA C is quite famous if you are in the static code analyzer field, but MISRA C++ is new. I wrote a report on MISRA C 2004 for MediaTek when it first came out.

Basically, MISRA C and C++ limit the language features you can use, and each rule comes with a brief explanation and code examples. For example, MISRA C requires, in rule 16.10 (required), “If a function returns error information, then that error information shall be tested.” This sounds easy, but more often than not, people just assume system calls will succeed.
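As a rough illustration of the idea behind rule 16.10, here is a small sketch where every call that can report an error has its result tested. The helper read_first_byte is hypothetical, and the code makes no claim of compliance with the other MISRA rules (the standard also restricts a number of library features). It only shows what “testing the error information” looks like in practice.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads the first byte of the file at path.
 * Returns 0 on success and -1 on any failure. */
int read_first_byte(const char *path, unsigned char *out)
{
    FILE *fp;
    int status = 0;

    assert(path != NULL && out != NULL);

    fp = fopen(path, "rb");
    if (fp == NULL) {
        return -1;              /* open failed: nothing to clean up */
    }

    if (fread(out, 1, 1, fp) != 1) {
        status = -1;            /* empty file or read error */
    }

    if (fclose(fp) != 0) {
        status = -1;            /* even close can fail */
    }

    return status;
}
```

The fclose check is the one most people skip. The caller, of course, has to test the return value of read_first_byte as well, otherwise the error information is dropped one level up.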

The C++ version has many C++-specific restrictions that I am not familiar with, like Argument Dependent Lookup and unnamed namespaces. I concluded long ago that C++ is not safe for mere mortals, anyway.

Interestingly, MISRA C++ lifted MISRA C’s ban on “goto,” but still puts restrictions on how it can be used.

Personally, I believe one should study the MISRA standards before being allowed to work on mission-critical code. Your project may not want to be fully compliant, but it is better if you understand the reasoning behind the rules.

Why our design reviews don’t work

I have been to three design reviews in this group so far, and found them… not very effective.

Let’s define effectiveness first. Everyone may have a different idea, but my definition is that a review should ensure the design fulfills the external requirements, aligns with the current architecture, is achievable with the scheduled resources, and meets other implicit internal limitations or requirements.

The very first problem I see is that few attendees can say they understand the requirements. Since there are no written requirements anywhere, people who just received the meeting invitation and are seeing the design for the first time naturally cannot understand all aspects of it. In fact, several times we ended up discussing requirements in a design review. That is not a bad thing in itself, but it tells you the requirements are not solid enough. A requirement spec, written for understanding, should familiarize the reviewers, especially external ones, with the reasons behind the design. It should also list all the important use cases or conceptual call flows, so reviewers can “see” how the design would function at runtime.

Secondly, few understand the current architecture. We were doing Windows Mobile driver development, but none of the reviewers could be considered an expert in Windows Mobile. If we had an expert in driver design in the review meetings, I believe it would be easier for all of our drivers to reach the same level of quality.

Thirdly, no one played the “devil’s advocate” role in the reviews, challenging the design on the resources needed, on quality attributes like extensibility, stability, and performance, or on other internal requirements. A concise checklist of the important areas people should look at might help. For example, we should always highlight shared resources and synchronization. Security issues are also quite common in drivers for new OSes.

How are design reviews conducted in your teams, and how do you feel about them?