Oda (Gretchen) ([personal profile] oda) wrote, 2011-08-28 04:11 pm

[Dark Man] Geographically Distributed Coding

This is from my husband's comments on my post over on Diaspora. As you can see from the fact that I'm mirroring it and not him, he's not very into blogging, but he wanted to talk about the methods he uses to get a geographically distributed coding team on the same page. I feel this is highly applicable to many open source software projects, so I thought it would be useful to capture it as a top-level post.

Dark Man - This turned into a bit of a wall of text. Hopefully it'll be useful.

Hello, I'm Gretchen's husband, responding to her request a couple posts up to talk about this issue. For reference, I work for a multinational manufacturing company and tend to get put on the kinds of projects where, if something breaks, we lose a critical business function (can't take orders, or bill product, or ship product, or know what is in the warehouse, etc). I'm internal IT; I don't do shrinkwrap software, and while I have worked on ecommerce applications, they're small scale compared to something like what Amazon does. I know plenty of professionals who do those things, though, and when we talk shop, the stories are pretty similar for what works and what doesn't.

Your situation isn't identical. But hopefully some of this will be useful food for thought.

Around the late 90s, our organization within our company had started to get a reputation for stable code, drama-free rollouts, and the ability to do large, complex projects while maintaining some quality. Then, as our responsibilities increased, we needed to work more and more with people outside of our organization - some were internal IT groups, some were consultants from vendors, some were offshore programmers brought in for cost reasons. We needed to work with these people to accomplish the tasks our prior success had earned us, but their processes for organizing a project, and the code itself, were very uneven, very difficult to integrate with our own work, and difficult to bring up to the quality standard we'd established within our own group.

Reworking or refactoring everyone else's work to our standards would have been a tremendous waste of our time and resources, and frankly wasn't practical. Over time we evolved strategies to spread our own culture to the people we worked with. These approaches worked for us.

The most important thing you have to do is capture what it is about your own coding culture that produces good results, and communicate it to your partners. This is not a sexy process; it's usually called "writing coding standards". Essentially, if anyone wants to work with you, they have to agree up front to your standards. There are two major categories here (deployment standards are a whole extra conversation; this is just for the process from handing a requirement to a developer through receiving unit-tested code back from them for deployment on your test server).

1. Deliverables. These are things like requiring design documents, tkprof traces run on all SQL code, data dictionaries, API specs, release notes, readme files, maybe even minimal training materials to be delivered in addition to the raw code. The type of deliverable is in many ways more important than the precise format - have a format you prefer, since that gets people past the "blank page" problem, but be flexible if the partner usually provides something similar that serves the same purpose. (A sketch of one way to check for this mechanically follows after this list.)

2. Coding standards. These are specific to a particular language, but go beyond basics like commenting styles, variable declarations, and similar low-level code details. They include any restrictions you might have on things like outside libraries, compilers, or what hardware and software you're supposed to be compatible with (e.g., do you support IE, Firefox, Chrome? What versions?). They might include requiring the use of a particular development platform, possibly with debug traces on certain types of operations.
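To make that concrete, here is a minimal sketch of the kind of acceptance check this enables. This is not our actual tooling - the file names, directory layout, and header rule are invented for illustration - but once the deliverables and standards are written down, checking a drop against them can be this mechanical:

    #!/bin/ksh
    # accept_drop.sh - refuse a partner's code drop if the agreed deliverables
    # are missing. File names below are illustrative, not a real standard.

    if [ $# -ne 1 ]; then
        echo "usage: $0 drop_directory" >&2
        exit 2
    fi
    DROP_DIR=$1

    required="design_doc.pdf release_notes.txt README data_dictionary.xls tkprof_traces"
    missing=0

    # Every agreed deliverable has to be present in the drop.
    for item in $required; do
        if [ ! -e "$DROP_DIR/$item" ]; then
            echo "MISSING deliverable: $item"
            missing=1
        fi
    done

    # Spot-check one coding standard: every shell script carries a header
    # comment block stating its purpose.
    for script in "$DROP_DIR"/src/*.sh; do
        [ -e "$script" ] || continue
        if ! head -n 20 "$script" | grep -q "^# Purpose:"; then
            echo "STANDARDS violation: $script has no header comment block"
            missing=1
        fi
    done

    if [ "$missing" -ne 0 ]; then
        echo "Drop rejected: you aren't done yet."
        exit 1
    fi
    echo "All agreed deliverables present; promoting drop to the test environment."

The script itself doesn't matter; what matters is that the acceptance criteria are written down and applied the same way to every drop, so "you aren't done" is a mechanical statement rather than a judgment call.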

A simple test for whether something is important enough to capture in your coding standards or deliverables: can you remember a project that went sour for lack of it, or succeeded because of it, or a significant defect that escaped into production because somebody slacked on it? The strongest argument for commenting, API definitions and similar "busywork" is that the guy trying to bring the system back to life after it blew up in his face isn't going to remember what a developer was thinking 3 years ago, unless it's written down. This is true even in the rare case when the dev is the guy doing the troubleshooting.

If you have a story behind each standard, it's easier to convince your partners to stick to the discipline you've found valuable. Get them to agree to your standards up front; any push-back needs to be settled between you before the first line of code is written.

Then there is the other side. It isn't enough to ask your partners to work a certain way. You have to be firm enough to reject their code if it arrives without the expected deliverables, or if it isn't written to your standards. You kick it back to them and say, "You aren't done. Let us know when you're actually ready. Here is what we need you to do to meet our standards. Remember, you agreed to this at the beginning."

Being firm about your standards is the sign of a mature organization. It usually means you have to not just hold your partners accountable, but also go to your own stakeholders and explain why a giant hole just got knocked into the project plan, and possibly the expense account (if your partners aren't on a fixed bid or working as volunteers). If you have a track record of success and can articulate the risks of not working to the standard, these discussions are uncomfortable, but in the end they just make your reputation stronger. If by contrast you cave in, accept substandard code, and it blows up in production, you take the blame (or worse, your application gets thought of as buggy, unreliable crap - not something anyone wants, and usually much more serious than missing a marketing deadline).

Regarding security - I think it is more difficult to design in than people think. Adhering to best practices helps, but it isn't enough on its own and tends to lead to unwarranted confidence. Usually it is better to do your best, focus on usability, and then have a third party who was not involved in the project and doesn't know anything about how you built it try to attack it. Quite frankly, if it hasn't been beaten up by people trying to break in the way the bad guys actually do, you can't have any confidence that it is secure. Even then you need to check your ego at the door. If, after your best efforts at finding weaknesses through testing, somebody out there exposes a security vulnerability after you've gone live, you need to accept it, admit that it happened, and make the required changes.

Regarding comments - my opinion is that commenting on what the code is doing is not that useful unless the code is very obscure (some ksh regular expressions pack a lot of action into what looks like a jumble of random characters). You comment on WHY the code is doing something, or, in the case of a revision, why the revision was made and often when it was made. The goal should be that somebody troubleshooting who never knew the developer or the project in which the code was created or modified, and never saw any design docs or discussions of the requirements, can still understand why a piece of code is there and what it is trying to do - while under extreme stress and time pressure from a production-down situation.
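To show the difference, here is a made-up ksh fragment (the variable names, dates, and the firmware story are invented purely for illustration). The first comment just restates what the code does; the second is the kind of WHY/when comment that helps the troubleshooter at 3 a.m.:

    #!/bin/ksh
    line="AB-1234 X9"   # sample scanner input, invented for the example

    # Not useful - restates WHAT the code already says:
    # Remove non-alphanumeric characters from the line.
    part_no=$(print "$line" | sed 's/[^A-Z0-9]//g')

    # Useful - records WHY it exists and when/why it changed (details invented):
    # 2009-03-17: after the scanner firmware upgrade, warehouse labels started
    # embedding dashes and spaces in part numbers. Downstream billing only
    # accepts bare alphanumerics, so normalize here instead of rejecting orders.
    part_no=$(print "$line" | sed 's/[^A-Z0-9]//g')
    print "$part_no"    # -> AB1234X9

Both assignments do exactly the same thing; only the second comment is worth anything three years later.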
