Friday, October 01, 2010

Think twice before modifying your process in the name of quality

I feel, more than ever, that I work in the same company Dilbert does. Maybe I do.

We just got another process handed down: a Software Quality Team tasked with improving our software quality. They provided a superb tool which searches for code snippets throughout all our Perforce repositories in no time, and bans any build that contains “known buggy code.”

This is not necessarily a bad thing, if you imagine how nice it would be if all known bugs were fixed before a release goes out. If every release contains every bug fix, it must be “good”. They are only going to apply this to bugs marked as highest priority, so the impact should be limited.
There was, as usual, no communication, no training, no documentation, just short emails sent to you as commands to follow.

I got 5 requests last week to provide fixes on different branches that we do not support. The integration teams create branches as they like and do not take new fixes from us from the moment they branch out. This is okay as long as their releases are stable and customers do not complain. It is not okay if “I” have to merge code for them while warning them I cannot even compile it for them. Each product has a slightly different build procedure and there is no way I can be an expert in all of them, let alone test for them. I don’t even have the hardware.

It’s bad to skip planning for bug fixes and then stop a build like this. Why? Because the pressure to “make a build” falls on the guys and gals at the bottom of the food chain, who usually have very short deadlines. All the requests I got read like, “I need a fix for this issue RIGHT NOW.” No test resources are allocated, and fixes are usually taken as orphan files instead of tested, packaged components. How much quality would you gain from this?

Maybe you’d be interested to know how critical that bug was. Like I said, our people are smart and they only do this on highest-priority issues. How many levels of priority do we have? Two. Almost every issue is marked as the highest these days, especially those that come from customers. I have to fix bugs specific to a certain customer on all branches, while we officially only support one branch. This is our concept of software product lines, I guess.

Monday, May 24, 2010

We need an issue tracking system that is more project based

It’s hard to imagine that a company as big as ours doesn’t have any issue tracking for software projects. We do have Microsoft Excel and OneNote, but we all know they are not project-centered, and there is no place to keep the discussion history.

For ages, we have had an internal tool used to track issues found while testing formal release builds. Its super slow performance aside, the system serves its purpose of identifying a specific build and describing the issue. An issue moves through stages toward resolution, and managers depend on the stage to monitor whether a specific build is free of known issues.

Note I said ‘free of known issues’ instead of ‘finished according to the test plan’ or even ‘completed per the development/delivery plan’. The system simply has no idea what’s in your development or test plan. Would you feel safe driving a car shipped with just a few known issues, when no one properly tracked what was supposed to be designed or tested?

As a development team, we need a tracking system which starts with a software project, lists all its features, and tracks whether each one has completed development first and then test. From time to time, we still need to report and track issues against target builds, since no one knows yet what caused that crash. But we should usually work within a project: implementing this feature, testing according to some plan, or fixing bugs for that feature.

Now you can see the duality here. We need a system which can track the status of plain issues, and also provide structure for planned features and milestones. It would be even better to have links to internal documentation and code changes, and there are systems like that. However, that’s a bit too much for a weekend project.

For this purpose, I chose Mantis, a popular issue tracking system I used back in my software engineering program. At my previous job we used FogBugz (yeah, interesting name), ClearQuest, and a heavily customized Bugzilla. All of them famous and powerful, but not something I can set up in a weekend.
I linked it to our internal SMTP server so it sends email on issue status changes, and even linked it to our LDAP system so it authenticates with the same credentials, though I later dropped that because of security concerns.

I asked around and learned that our hardware department just started using ClearCase and ClearQuest; IC design is actually closer to software design than most people would think these days. That’s really weird, because we software teams are on Perforce while the hardware teams switched from CVS to ClearCase. Maintaining yet another huge system takes time and money, especially when ClearCase is showing its age. I know a guy who worked for Rational, the company that created ClearCase. Guess what his team used internally for revision control…

Monday, May 17, 2010

A planned approach to interview developers

The way our team interviews new developers is very democratic. The resume and schedule are sent to several interviewers a week before the interview. On the day, the candidate is led to each interviewer in turn for a 45-minute session; the poor candidate usually goes through four to six sessions in one day. After a week or two, a short meeting is called and everyone reports what he or she thinks about the candidate. What we actually ask during the interview is totally up to each interviewer, and usually no one cares to communicate beforehand, or after.

I won’t say it doesn’t work, but we might have a better chance of improving over time if we plan ahead, execute, review, and adjust the plan for the next round. I feel the current approach has two obvious drawbacks.

First, there is simply not much time to actually understand the candidate. Without planning and communication, we all might have to start with introductions, what we do in this department, what the work environment is like, and going through the resume. A new grad who did a project in X would end up explaining the same project several times. An experienced developer would have to explain to every interviewer why he or she left the previous employer. This is not an efficient use of our time.

Second, we do not divide up the aspects of the candidate we probe. Since no one cares to say “I am going to test this aspect,” everyone has to test every aspect. The result is that we each probe a little bit of the aspects we care about most, with no depth and no better overall coverage.

For example, a good candidate for us needs to understand the C language, so every interviewer asks simple C questions. However, more and more of our work is done in C++ and Java, and no one has addressed that.

Another example would be how well a senior candidate can document requirements, design a system, plan a project, or improve quality. I am not sure how much of that you can cover within a 45-minute session.

I believe we should have templates for interviews. For new grads, the first session would be simple tests, an introduction to the company/department/position, and going through the resume for general understanding. I don’t know whether psychological tests are legal in the States, but based on past experience I believe they work to some extent.

After that would come a programming language or capability test, which the candidate can do alone for an hour and a half. During the interview lunch, the rest of us can review the answer sheet and decide whether to send the candidate home or push really hard to see how he or she communicates under pressure.

In the afternoon, we should work on two or three aspects of the ideal candidate we want, like OS-specific knowledge, fixed-point arithmetic, DSP, debugging other people’s code, real-time multi-threading, and coding practices.

If we have a plan, execute according to it, and review the results, we should have a better chance to improve over time.

Thursday, December 03, 2009

The Design of Future Things

Donald A. Norman is big in design, partly because of his work on “The Design of Everyday Things.”
This time, Norman puts his effort into discussing how intelligent machines should interact with humans. The book is largely about UI, but I feel the concepts apply to API design as well. An API is the human interface of a lower-level software component, anyway.

The book is interesting, and provides many examples of failed and successful interaction design. If you don’t have time to read the whole book, the author provides a nice summary of his ideas:

Design Rules for Human Designers of Smart Machines
• Provide rich, complex, and natural signals
• Be predictable
• Provide good conceptual models
• Make the output understandable
• Provide continual awareness without annoyance
• Exploit natural mappings

Design Rules Developed by Machines to Improve their Interactions with People
• Keep things simple
• Give people a conceptual model
• Give reasons
• Make people think they are in control
• Continually reassure

Use finite state machines whenever possible

In the software engineering program I took at CMU, one of the core courses was Models of Software Systems. It was considered one of the hard ones, because it taught the techniques available out there to “prove” software correct or safe.

I won’t say the problem of proving software correctness is solved, but one important part of the course was about modeling systems as finite state machines and using tools to prove their correctness. A sequence of C or C++ statements is hard to simulate exhaustively, but finite state machines are much easier to handle, and there has been a lot of research on them.

Even without the tools, state machines are far easier to trace and understand than code or flow charts, for anything nontrivial. Simply asking this question in a design review would save you a lot of headaches in debugging:

If something bad happened, how do I reset the state machine?

Much of our API documentation focuses on describing parameters and return values. Those are important, but the mindset, the state machine behind the API, is at least equally important. An API is not just a set of functions but a protocol between client and server. More often than not, we design and compose a larger system based on this protocol rather than on the detailed arguments. A design review should check the correctness of the protocol first.
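
To make this concrete, here is a minimal sketch in C. The device, the states, and the function names are all invented for illustration; the point is only that the legal call order is an explicit, reviewable state machine, with one well-defined way out of the error state:

    /* A hypothetical device API whose legal call order is an explicit
     * state machine: idle -> start -> (work...) -> stop -> idle.
     * Any misuse or internal failure lands in ST_ERROR, and dev_reset()
     * is the one well-defined way out. */
    typedef enum {
        ST_IDLE,     /* after init or reset      */
        ST_RUNNING,  /* after a successful start */
        ST_ERROR     /* something bad happened   */
    } dev_state_t;

    static dev_state_t state = ST_IDLE;

    int dev_start(void)
    {
        if (state != ST_IDLE) {
            state = ST_ERROR;    /* illegal transition: fail loudly */
            return -1;
        }
        state = ST_RUNNING;
        return 0;
    }

    int dev_stop(void)
    {
        if (state != ST_RUNNING) {
            state = ST_ERROR;
            return -1;
        }
        state = ST_IDLE;
        return 0;
    }

    void dev_reset(void)
    {
        /* Reachable from EVERY state, including ST_ERROR. This is the
         * design answer to "how do I reset the state machine?" */
        state = ST_IDLE;
    }

Because the transitions are enumerable, a reviewer or a model checker can walk every path and confirm there is no state from which dev_reset() cannot recover the machine.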

Academically, we can ask many questions of a state machine, but some are simple enough to list here:

Safety: something (bad) will NEVER happen. You can specify that this (set of) APIs shall never touch other parts of the system.
Liveness: something good will EVENTUALLY happen. You shall never get stuck.
Deadlock: is it possible for multiple state machines to lock each other up?

Development teams in Microsoft

I just spent a week in Redmond, supporting the integration of our new driver with the next Windows Mobile.

During that week, my counterpart at Microsoft told me what Microsoft does after a product is released: the team is disbanded and most people find jobs in other areas.

Take him for example: he was working on Visual Studio 2008 when he started with Microsoft 3 years ago. After that programming tool was released, he was re-assigned to work on Windows Mobile, in the GPS area.

This is new to me. In all my past full-time positions, we let a person work in the same domain for a very long time. Especially if he or she does not move into management, the person gets rather stuck in the same area. We do change aspects of the work, but usually you stay in the same part of the same product line.


When companies get larger, it seems natural that departments grow into silos. Each department has its own way of doing things, and the management thinking becomes fixed. If we rotate people into different groups every 2 to 4 years, they get more exposure to the different people and technologies in the company, and thus a better chance to grow into stronger developers and experienced managers.



The way Microsoft does it seems to bring in new management styles and thinking constantly. In addition, it helps spread knowledge and discipline widely within the company. All code released by Microsoft looks similarly okay, and the managers I met have a strong sense of project management.

However, I also heard that people avoid touching some difficult code bases because they don’t understand them well enough. I somehow feel they have less ownership of the components they work on. If all you have to do is ship this release, why plan for the future?

It’s difficult to say which model is better, but I feel a company in constant flow would be stronger, because the thoughts would be fresher.

Timesheet and checklist

The management side of software development can be simplified down to two tricks: timesheet and checklist.

A timesheet is not what we put into QTime. That might be an accounting tool, but it is not a software engineering tool. Just like a family’s expense list, a timesheet should tell you how the most precious resource in most projects, people’s time, is spent across different activities. For example, if your timesheet tells you the amount of time spent in meetings is higher than usual, it might be worth checking out.

Timesheets also provide the basis for process improvement and estimation. How do you estimate the amount of effort? Usually based on “experience.” A timesheet provides some evidence behind that experience.

The categories of your timesheet can be as simple as Meeting, Design, Coding, and Testing. If you feel like it, you can make finer categories, but I don’t recommend going too far, as people get confused.
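
As an illustration, with invented numbers, a week of such a timesheet might read:

    Week 42 (40 hours total)
      Meeting   16 h   (40%)
      Design     4 h   (10%)
      Coding    12 h   (30%)
      Testing    8 h   (20%)

Forty percent of the week in meetings would be exactly the kind of number worth checking out.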

A checklist is a simple way to remind you of what has to be there. For example, a checklist might say the agenda and reference material must be sent out some amount of time before a meeting, and a note taker must be assigned. A checklist would also remind you that a team dynamics survey is due every two months.

However, timesheets and checklists cannot replace education and buy-in. If people don’t believe in them, the extra time spent filling out forms or reading checklists becomes waste. They are there to help, and should not impede progress.

Has anyone used real timesheets in your development projects? How do you feel about them?

MISRA C and MISRA C++

MISRA, the Motor Industry Software Reliability Association: the name says it all. The current MISRA member list is:

· AB Automotive Electronics

· Bentley Motor Cars

· Ford Motor Company

· Jaguar Cars

· Land Rover

· Lotus Engineering

· MIRA

· Ricardo UK

· TRW

· University of Leeds

· Visteon Engineering Services

MISRA C 2004 and MISRA C++ 2008 are sets of rules MISRA recommends to developers in safety-critical businesses. MISRA C is quite famous if you are in the static code analyzer field, but MISRA C++ is new. I wrote a report on MISRA C 2004 for MediaTek when it first came out.

Basically, MISRA C and C++ limit the language features you can use, with brief explanations and code examples. For example, MISRA C requires, in rule 16.10 (required), “If a function returns error information, then that error information shall be tested.” This sounds easy, but more often than not, people just assume system calls will succeed.
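
As a sketch of what rule 16.10 asks for (my own example, not one from the standard):

    #include <stdio.h>

    /* Non-compliant: the error information from fopen() is never tested,
     * so fprintf() may dereference NULL. */
    void save_log_bad(const char *text)
    {
        FILE *fp = fopen("log.txt", "a");
        fprintf(fp, "%s\n", text);
        (void)fclose(fp);
    }

    /* Compliant with the spirit of the rule: every returned error
     * indication is tested. */
    int save_log_good(const char *text)
    {
        FILE *fp = fopen("log.txt", "a");
        if (fp == NULL) {
            return -1;
        }
        if (fprintf(fp, "%s\n", text) < 0) {
            (void)fclose(fp);
            return -1;
        }
        return (fclose(fp) == 0) ? 0 : -1;
    }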

The C++ version has many C++-specific restrictions that I am not familiar with, like Argument Dependent Lookup and unnamed namespaces. I concluded long ago that C++ is not safe for mere mortals, anyway.

Interestingly, MISRA C++ lifted MISRA C’s ban on “goto”, but still puts restrictions on how it can be used.
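
As I understand the restrictions, what remains allowed is essentially the classic forward-only jump to a single cleanup label. A sketch of that pattern, again my own example rather than one from the standard:

    #include <stdlib.h>

    int setup_buffers(void)
    {
        char *buf_a = NULL;
        char *buf_b = NULL;
        int rc = -1;

        buf_a = malloc(64);
        if (buf_a == NULL) {
            goto cleanup;        /* forward jump, single cleanup label */
        }
        buf_b = malloc(64);
        if (buf_b == NULL) {
            goto cleanup;
        }

        /* ... real work would go here ... */
        rc = 0;

    cleanup:
        free(buf_b);             /* free(NULL) is safe */
        free(buf_a);
        return rc;
    }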

Personally, I believe one should study the MISRA standards before being allowed to work on mission-critical code. Your project may not want to be fully compliant, but it is better to understand the reasoning.

Why our design reviews don’t work

I have been to three design reviews in this group so far, and find them… not very effective.

Let’s define effectiveness first. Everyone can have different ideas, but my pick is: ensuring the design fulfills the external requirements, aligns with the current architecture, is achievable within the scheduled resources, and respects other implicit internal limitations or requirements.

The very first problem I can see is that few attendees can say they understand the requirements. Since there are no written requirements listed anywhere, someone who just received the meeting invitation and saw the design for the first time naturally cannot understand all aspects of it. Actually, several times we began discussing requirements in the middle of a design review, which is not a bad thing in itself; it just reminds you the requirements are not solid enough. A requirement spec, written for understanding, should familiarize the reviewers, especially the external ones, with the reasons behind the design. It should also list all the important use cases or conceptual call flows, so reviewers can “see” how the design would function at runtime.

Secondly, few understand the current architecture. We were doing Windows Mobile driver development, but none of the reviewers could be considered a Windows Mobile expert. If we had driver design experts in the review meetings, I believe it would be easier for all of our drivers to reach the same level of quality.

Thirdly, no one played the “devil’s advocate” role in the reviews, challenging the design on the resources needed, on quality attributes like extensibility, stability, and performance, or on other internal requirements. It might help to have a concise checklist of the important areas people should look at. For example, we should always highlight shared resources and synchronization, and security issues are quite common in drivers for new OSes.
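
A first draft of such a checklist, based on the gaps above, might be as short as:

• Are the requirements and key use cases written down and attached to the invitation?
• Is a domain expert (for the target OS, say) in the room?
• Shared resources and synchronization: what is locked, by whom, and in what order?
• Error paths: how does each component reset after a failure?
• Security: which inputs cross a trust boundary?
• Is the design achievable within the scheduled resources?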

How are design reviews conducted in your teams, and how do you feel about them?

Thursday, October 22, 2009

What is a defect?

In a software shop, we used to think of defects as software bugs, the kind you can blame on the poor lowly coders, so that fixing defects only ever involved code changes.


If we take a broader view, defects can reside in many places. If your requirements failed to capture customer needs, there is a defect. If your design does not fulfill the entire requirement, there is a defect in the design. Similarly, there can be defects in other phases or documents, depending on your process.


How can we look back and say, “There is a defect in the design stage”? We need some evidence of the design, some architecture document or design spec. For the same reason, we need a requirements document.


This could be a little heavier than we would like, and many agile processes do not encourage documents. However, without any evidence of the design, we will never know what happened in design-related activities, or what we can do to improve them.


The divide-and-conquer strategy is used a lot in software engineering. If we can cut the long, complex process of software production into pieces, we have a better chance of debugging each piece. For example, how do we know whether our design review is working? One answer: by seeing how many design-related defects are still reported down the line.

Introduction to scrum (software development)

Scrum is a lightweight, or agile, process for project management. Due to its simplicity, it’s quite popular among software teams. It’s so lightweight that you can very easily say “we are doing scrum” when asked by management. However, it may not help you much if your team is not disciplined to start with.


Scrum is simple, as described here (wiki). You just need to remember three things:


1) Sprint: a short time frame, usually a month, ending with a project review and a decision on what to do in the next sprint


2) Backlog: the list of task items, with estimates, you have to complete for the whole project, as well as for this sprint


3) Daily scrum meeting: 15 minutes max, the team’s daily status report


See, you just need a list of tasks with estimated effort attached, chopped into monthly milestones, and a meeting every day to collect status. Sounds simple, doesn’t it?
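
For example, with invented tasks and numbers, a backlog is nothing fancier than:

    Task                           Estimate   Sprint
    Parse configuration file       2 days     May
    Driver skeleton and makefiles  3 days     May
    Interrupt handling             5 days     June
    Stress test harness            3 days     (unassigned)

Estimates and sprint assignments are the only structure it needs to start with.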


Scrum even provides a template for the daily meeting. Everyone on the team has to answer three questions:


1) What have you done since yesterday?


2) What are you planning to do today?


3) Do you have any problems preventing you from accomplishing your goal?


It’s really not that simple when you think about it again. How do I identify and decompose the tasks that need to be done for the project? How do I estimate them? How do I pick the items for the next sprint? What if the estimates are off by 2x, 3x, or 10x?


Creating the software architecture can help you decompose the tasks; timesheets and function points can help you estimate effort; risk management should help in picking the right tasks; and if the estimates are off by too much, it might be better to start re-negotiating with your client or management early in the project. All of these need discipline to generate reliable outcomes.


Even if you don’t do any of the above, a short daily meeting with your team members can still reveal the situation early, i.e., increase the visibility of your project.


Quality, Fitness, and Quality Attributes

We always say we shall provide high-quality products to our customers. What does quality mean, anyway?


If you ask a programmer, I bet the answer would be “no bugs.” Is that it? Would you please define “bug”? Say, if I deliver a stone to you, and tell you there is not a single bug in it, does that count as high-quality software?


Quality is a vague concept, better defined as “fitness for purpose.” If the software doesn’t help our customers achieve their purpose, how bug-free it is doesn’t even matter.


What is our customers’ purpose? Selling phones, yes, but through the right functionality, smooth execution, and robust performance. The functionality, or feature set, can be elicited through several proven methods, which we might discuss later. The other two lead us to “Quality Attributes.”


Quality Attributes cover all those vague items we usually use to describe a software system. Some of the common ones are listed here:

  • Extensibility: how difficult it is to extend the feature set
  • Maintainability: how difficult it is to fix a bug in it
  • Performance: how fast it is
  • Robustness: how strong it is when abused
  • Usability: how easy it is to use the system
  • Capacity: how many requests it can handle

One major goal of the requirement elicitation phase of a project is to nail down use cases for these “-ilities.” For example, usability can be defined as how much time, or how many keystrokes, it takes a novice user to accomplish a certain task.


How important are these Quality Attributes, or the requirements describing them? I’d say very important, because our architecture will be decided by them.

Faster, Better, Cheaper: Pick Two (1)

This is (the first part of) my submission to this year's QTech.

Introduction to Software Requirements and Architecture

1. Introduction

In software development, we have the unsolvable problem of pursuing shorter development time (faster), a better fit to ever-changing needs (better), and lower development cost (cheaper). It has been said that you can only pick two of the three. By hiring more people, you might get better quality and shorter development time at the price of higher cost.
That is only the theory. Most of the time, managers don’t even have sound evidence of what their decisions may lead to. For example, spending more money won’t necessarily produce a better product if the requirements fail to address the customers’ current and future needs properly. Similarly, extending development time may not yield less buggy software if the design is flawed or your team members are stressed by too many concurrent tasks.
A common misunderstanding of software engineering is that it’s all about process: as if, by following some heavy process, all our quality or schedule issues will go away.
No, not like that. Software engineering is about continuously optimizing the things we do, and providing evidence of it, according to our own or our customers’ needs. There is no perfect process or framework that solves everyone’s problems. What matters is how systematically we measure and improve what we are doing, from team management, requirements acquisition, design/test methodology, and project/risk management to quality control. Everything that happens in the lifecycle of a software-intensive system has been a topic of interest, and we might find something useful in the research.
Among the major topics of software engineering, requirements and architecture might appear the least useful to firmware developers. The common response is that our requirements are decided by the operating system or the client, and our design is really simple. However, I often see very old code that was designed years back; further extensions and a growing customer base make it more and more difficult to change. If a design is going to live for more than five years or fifteen products, it might be worth the time to think carefully before we first put it there.
While coding conventions and code review [2] oversee the quality of the implementation, software architecture, or high-level design as some readings call it, is the key to how the system can be maintained and extended. Many other implicit considerations, like robustness or security, cannot be expressed at the code or API level.
In this paper, I try to show how requirements and architecture might help firmware development through improved communication across the whole development life cycle. Most of the references come from the Carnegie Mellon University Software Engineering Institute, or SEI, which is famous for its contribution to CMM/CMMI [1].

10. REFERENCES
[1] http://www.sei.cmu.edu/cmmi/
[2] Go home early: How to improve code quality while reducing development and maintenance time. QTech forum 2007
[3] Paul Clements, Felix Bachmann, Len Bass, David Garlan, James Ivers, Reed Little, Robert Nord, and Judith Stafford, Documenting Software Architectures: Views and Beyond, http://www.sei.cmu.edu/publications/books/engineering/documenting-sw-arch.html
[4] Attribute-Driven Design (ADD), Version 2.0, http://www.sei.cmu.edu/publications/documents/06.reports/06tr023.html
[5] Quality Attribute Workshops (QAWs), Third Edition, http://www.sei.cmu.edu/publications/documents/03.reports/03tr016.html
[6] Len Bass, Paul Clements, and Rick Kazman, Software Architecture in Practice, Second Edition, http://www.sei.cmu.edu/publications/books/engineering/sw-arch-practice-second-edition.html
[7] Architecture Documentation, Views and Beyond (V&B) Approach, http://www.sei.cmu.edu/architecture/arch_doc.html
[8] Frank J. van der Linden, Klaus Schmid, and Eelco Rommes, Software Product Lines in Action

How do you know?

Everything I learned from the 16-month software engineering program I attended can be summarized in one question: "How do you know?"

From the beginning of the 16-month project, we had to answer this question in public God knows how many times, under ruthless criticism from software researchers and professors.

  1. This is the problem our client needs us to solve. (How do you know?)

  2. We will finish requirement elicitation in Nov. (How do you know?)

  3. These are the requirements. (How do you know?)

  4. This design could satisfy our customer's needs. (How do you know?)

  5. Our team dynamics are good and improving. (How do you know?)

  6. Our meetings are efficient. (How do you know?)

  7. CMMI/Agile/Scrum/TSP process will lead us through this chaos. (How do you know?)

  8. Our client is usually happy about our results. (How do you know?)

  9. We spent too much overhead in meetings. (How do you know?)

  10. The main risks are X, Y, and Z. (How do you know?)

  11. Our estimation and plan makes sense. (How do you know?)

  12. The quality of our software is good. (How do you know?)


One of the major goals of software engineering, as opposed to programming skill or computer science, is to provide evidence and tools for management.

Visibility is something we need but seldom have in a software project. As my favorite professor always said: if you cannot tell where you are, a map is not going to help you much. If you don't know how far you are from the destination, how do you know when you are going to finish?

Software architecture is another major pillar of software engineering. One important use of architecture is to facilitate requirement elicitation and project management. We can talk about that later.

If you are also interested in these topics, please leave a message. At least I would know somebody is reading :-)

Tuesday, February 03, 2009

I could not get a car loan

Everyone tells me I should take out a car loan to build up credit in the US. They didn't tell me it's hard to get a car loan without a credit history in the first place. I had already agreed to a ridiculously high interest rate of 22%, but the car dealer's financial specialist still couldn't find a bank for me.

My wife and I didn't like the specialist at Mossy Nissan because of her high-pressure selling style, so we went out and tried ourselves. Since Bank of America didn't even want to issue me a credit card, we skipped that one. We walked into a big local banking institution named San Diego County Credit Union. The specialists there were very nice to us, and they said the interest rate should be much lower than 15% even for first-time buyers like us. However, they could not even start the application process without a formal driver's license, and since the nearest date I could book a road test was Feb 3rd, there was no hope of getting the car loan from them in time.

In the end, I just went to the car dealer and wrote a check to pay it off. I hope I don't have to buy anything from Mossy Nissan ever again.

The 2nd intelligent remote and key

The Nissan Sentra has a nice keyless feature, which allows you to unlock the car and start the engine without taking the key out of your pocket. The "key" actually consists of a remote controller with a metal key embedded in it; the metal key is the backup in case the remote fails.

When we bought this used Sentra from Mossy Nissan, they only gave us one key set. They never told us there should be TWO intelligent key sets for each car. I called them today and asked about it. The answer was exactly what I expected: "it is sold as is." The price of another intelligent key set is more than USD 250.

Don't go to Mossy Nissan unprepared. They will eat you alive…

Thursday, March 27, 2008

Attribute Driven Design

One of our mentors in the studio project, Felix, is an expert in Attribute-Driven Design. He suggested that we use this technique, and we found it a useful starting point.

The SEI's view of software architecture is that every architecture can satisfy any functional requirement, so you need the quality attributes to find the right design. Quality attributes are traditionally called non-functional requirements, but SEI people hate that name. First, we define scenarios for each quality attribute the stakeholders care about. For example, we may want the system to be "high performance" in the sense that it can handle 1000 requests per second. This simple scenario becomes a constraint you need to consider when designing the architecture, and the SEI provides some very general techniques, or architecture patterns, for high-performance designs.
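
In the SEI's terms, such a scenario is usually broken into six parts; here is my rough rendering of the performance example above:

• Source: a population of concurrent users
• Stimulus: requests arrive at 1000 per second
• Environment: normal operation
• Artifact: the system as a whole
• Response: all requests are processed
• Response measure: average latency stays under an agreed bound

Written out this way, the scenario becomes testable, which is what makes it usable as a design driver.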

The idea is to pick the few most important quality attribute scenarios, draft a design that fulfills each of them, and then try to combine the drafts. After this process, you have your first cut of the architecture. The next step is to enrich it further so it supports the other, minor quality attributes as well.

Monday, January 28, 2008

Challenges in building a design collaboration tool

The topic of our studio project is to create a much better collaboration tool, enhancing the shared whiteboard concept to be more UML-aware. The UML part is relatively easy with support from the Eclipse community. The whiteboard isn't.

Although there is a whole zoo of shared whiteboard software out there, nobody likes any of it. We wish to create a better whiteboard so that designers/architects/programmers at Bosch across the world can communicate better.

The other part of the tool is to warn when a design change, expressed in UML, violates rules set by the architects. This has to be much more high-level than the new OCL in UML; it must perform some kind of model checking.

I am not sure if it is the best topic one can get for a 16-month project, but I am happy to think about something that can help people to communicate better.

Is software architecture really more difficult?

When they talk about software architecture here at CMU/SEI, they always say documenting software is hard because you cannot see the real thing: you cannot draw software architecture the way you draw a 3D representation of a machine or a bridge. That is a myth. They say it because they do not understand mechanical or civil engineering at all.
The truth is, engineers in other disciplines also have to document their designs very carefully, in ways far more complex than the three orthogonal views you may have seen in a mechanical/civil engineering handbook.

You have to calculate/simulate the force flows through a beam using a force flow diagram, you have to calculate the gear structure of your gearbox using gear diagrams, and you have to calculate the flow of air/water/oil inside the system using yet another set of flow field diagrams.

You also need diagrams at different levels of detail to show how to manufacture or assemble your creation, and to what quality level, which can get very complicated.
Software is special because it is indeed so cheap and so easy to blow up. If we isolated every module or even function behind process boundaries, the way a failing component in the real world is usually contained, software quality could be much better.

In the real world, architects rarely use new or customized materials and structures. They buy tested, well-understood parts and assemble them using well-understood methods. Software is so cheap that everyone wants to invent his own version of sorting, database connections, etc.

The Internet, as a whole, is quite stable because all its components are isolated. A blown-up file server usually doesn't crash the intranet, let alone the whole Internet. If we can somehow make that level of isolation achievable in an economical manner, we can make everyday software much more stable than it is now.

Wednesday, January 03, 2007

CSS

It may seem mad to review CSS at this point. CSS is not bad, considering the sales it has driven over the years, and it should surprise no one that someone eventually broke it. CSS still works even on me, a lazy engineer in this field. What CSS will not do, nor will AACS, is stop professional pirates with proper tools.

Recently, in the DVD Forum and related organizations, people have been working hard to make "managed copy" possible for CSS-protected movies. The idea is all right, but there are already better protection schemes like CPRM, VCPS, or even AACS. The major problem is that different forces are trying to create multiple solutions for this simple idea.

I cannot openly discuss everything that is going on, but I think it is crazy to reinvent almost everything for this old format.

Wednesday, November 15, 2006

GPL fever

In general, people in my company know very little about open-source software. Our legal department also spends a lot of time keeping people away from it. It attracted a lot of attention when I introduced an open-source AES package into our reference firmware.

Recently, people have been talking about GNU/Linux because some of our customers want us (though not the department I work for) to switch to it from a commercial RTOS. My people are very worried that the reference firmware would be forced open-source when we release it with GNU/Linux.

The license terms, GPLv2 specifically, and the licensing model behind GNU/Linux create a very difficult situation for embedded system builders like us. In a low-end embedded system, from washing machines to cell phones, you link everything together into a single image as much as possible, and every part is essential to the distribution. Some GNU/Linux contributors might think the whole package should therefore be open-sourced. That is just not possible. We cannot afford to open up every source line we have written to our competitors.

The GNU/Linux licensing model also lets every contributor file a lawsuit anywhere. We, as a commercial company, cannot afford that potential cost. It would be much easier if there were a single licensing entity to negotiate with. The collective work of Linux is great, but it is a nightmare to deal with when it comes to legal issues.

I think it might be better if people authorized some single entity when they contribute code to GNU/Linux, just like MySQL or the BSDs do. It would still be free and open, but we would know whom to pay if we, for whatever reason, wanted to embed it into a toothbrush, closed-source.

Saturday, October 21, 2006

What Do You Get From A Co-Verification Tool?

We used SoC Designer to evaluate the performance of a CPU core with a pretty slow FLASH ROM chip for some new product. A suitable cache module would be sized to bridge the speed difference, too.
It should be a simple task to assemble this system, given that the tool provides so many CPU cores, buses, cache components, and memory chips; we just had to wire them together, like in LabVIEW or Lego. It was not too difficult, indeed, and the benchmark numbers were easily acquired using the software profiling feature of the simulated CPU core.
Things got a bit nasty when the bosses wanted to see benchmark results from yet another, cheaper core. To cut a long story short, we could not explain the enormous difference between the results from two ought-to-be-similar CPU cores.
Any purely numerical simulation project faces the same problem. We cannot say which result is closer to the truth without a real platform or really strong technical support from the vendor, especially when the behavior of a pipelined, cached CPU core is quite difficult to explain. It would have been easier if we had the source code for these components, but that was not possible.
Maybe the vendor should provide a large FPGA board with some preconfigured scenarios, so customers can learn when to "trust" the results. The CAE field in civil and mechanical engineering has been doing this for a very long time: new users learn the limits of simulation tools through experiments conducted in the real world. A digital system is much easier to understand, but that doesn't mean you can get a precise result without consulting your hardware designers about the detailed behavior of each component on the canvas.

Tuesday, October 17, 2006

AACS

Advanced Access Content System
One of my jobs is to implement the security system that fulfills the AACS requirements.
AACS is the more up-to-date version of CPRM, which was never very popular. The idea is to encrypt the content with very strong encryption and then embed the key on the disc. Only valid software plus a valid drive can retrieve the key and play back properly. If the software or drive is hacked to do anything AACS doesn't like, it gets revoked.
The software can be hacked to produce a perfect unencrypted AV stream. The drive can be hacked to produce or accept perfect copies.

The revocation mechanism is doomed to fail on an open platform like the PC: you have to know whom to revoke in the first place. They do have a forensic marking mechanism, but I do not believe it will work, either.

The problem with AACS is that the copy-protection part is too strong, while the online transaction part is too weakly addressed. People would have to replace their digital TVs and LCD monitors to watch HD content, yet to this day no player, software or hardware, supports any new transaction model.

SoC Designer

One of my recent jobs has been to pick a new microcontroller for our System-on-Chip, which has used the 8051 family for years. Since the ARM core is the new 8051 of SoC design, we contacted ARM for a performance evaluation tool or kit. Their answer was SoC Designer, a complete software emulation of the whole SoC.

A SoC is not just the microcontroller/processor. When we say System on Chip, it is really a system: various buses running different protocols at different clock rates; multiple processors, usually MCU plus DSP; FLASH, EEPROM, DRAM, and SRAM; and countless hardware components working concurrently to off-load the main processors. And we are talking about a humble CD-ROM drive.

Precise and meaningful performance evaluation is really a tough job, so we just focused on the processor, cache, and FLASH subsystem.

The nice thing about SoC Designer is that you can model the system completely in software before the hardware is available. You can also simulate bugs or exceptions again and again without spending hours reproducing the situation on real "embedded" hardware.

The problem with SoC Designer is that you have to model all, or at least a large part, of your system before you can really evaluate its performance with real firmware. To make things worse, the ready-to-use models for processors, caches, buses, and memory components are not so reliable when it comes to cycle accuracy. Without strong support, which we do not have, we cannot explain many of the performance differences and the micro-behavior of those models.

The idea is great, but we need more experience to trust this tool.