October 30, 2009

Build vs Buy? Apply the WASABI principle: “We always should avoid building it” http://ping.fm/SkqDz


The WASABI Principle

October 30, 2009

Joel Spolsky got something right when he said: “If it’s a core business function – do it yourself, no matter what.”

Even as code generation technologies (think community projects like AppFuse and Grails) become mainstream, comprehensive and open source, some companies still find reasons not to adopt them for side projects that are not critical to the core business.  Some reasons that I have heard:

* Organizational Knowledge: “we don’t have anyone else who knows that open source project if you leave”

* They’ll break us: “what if they release a new, non-backward-compatible version?”

* Optimistic NIH Syndrome: “we have half the features implemented in house already – can’t we just build the rest ourselves?”

Now sometimes, these reasons for not adopting a side-technology may be relevant, if we think that the side technology may be a core technology in the future, or if it is a non-mainstream technology.

But I think many seasoned developers make the mistake of thinking “how am I going to build it” rather than “who has built this before” as a first line of thinking, especially for non business-critical projects.

So I’ll put a twist on Spolsky and YAGNI:

If it ain’t core business, think WASABI – “We always should avoid building it”.


Hidden-in-plain-sight cost of @Transactional

October 25, 2009

This week I discovered why @Transactional(readOnly=true) is not cost-free.
The whole concept of a read-only transaction is platform-dependent and confusing to start with.

A read-only transaction implemented with proxies or aspects will still grab a connection and install it in a thread local.  While this does what you asked for, it often will not be what you want – it leads to connection pool thrashing if you put it onto a ‘hot’ method.

In the case I saw this week, we had put the annotation on a cache-backed, database-free method in ‘sympathy’ while adding the annotation to a couple of new write-only API methods.  This cache method turned out to be called 14 times per page, and it thrashed the database pool so much that even with pool retries enabled we were getting pool errors and throughput losses.

So, my current thinking is not to annotate read-only transactions at all, and to annotate only the top-level transaction entry points, with proxies around all your DAOs to ensure you only get one transaction.  And never add annotations in sympathy without a better reason than ‘it can’t hurt’.  Because maybe it will!
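
To make the cost concrete, here is a minimal Python sketch of the effect. It is not Spring; it only imitates what a transaction proxy does. The names `CountingPool`, `transactional` and `lookup_label` are hypothetical, invented for this illustration. The point is simply that the wrapper checks out a connection before the method body runs, whether or not the body ever touches the database.

```python
import functools


class CountingPool:
    """Hypothetical stand-in for a connection pool that counts checkouts."""

    def __init__(self):
        self.checkouts = 0

    def acquire(self):
        self.checkouts += 1
        return object()  # placeholder for a real connection

    def release(self, conn):
        pass  # a real pool would take the connection back here


def transactional(pool):
    """Hypothetical decorator imitating a transaction proxy: it always
    acquires a connection before calling the wrapped function."""

    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            conn = pool.acquire()
            try:
                return func(*args, **kwargs)
            finally:
                pool.release(conn)

        return wrapper

    return decorate


pool = CountingPool()
cache = {"greeting": "hello"}


@transactional(pool)  # added 'in sympathy'; the body never uses the database
def lookup_label(key):
    return cache[key]


# Simulate one page render that calls the cached lookup 14 times.
for _ in range(14):
    lookup_label("greeting")

print(pool.checkouts)
```

Every one of the 14 cache lookups costs a pool checkout, even though none of them needs a connection.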


Slicehost firewalls

August 24, 2009

Installed Apache, configured your DNS but STILL having trouble accessing your domain at slicehost? You probably need to adjust your iptables (firewall) rules.

This article really helped me understand iptables at a very basic level. It’s been a couple of years since I really delved into Linux, and things have become really nice.

I also found out that slicehost correctly configures your firewall pretty tight, allowing new connections for ssl only. Wow, another thing done right at slicehost!

From: http://ping.fm/mIqx2


Using slicehost + DNS + MySQL + Subversion setup

August 23, 2009

I bought a slice on slicehost since I had heard good things.

This morning, within 2 hours, I had MySQL and a Subversion daemon running and had pointed one of my domain names to the slice. I used the Fedora 11 image, mainly because I think at work we will be going with that distribution, no other reason.

I learned that yum is a great package manager to use with Fedora.

Here are some links that got me going along the way.

MySQL link:
http://fedorasolved.org/Members/opsec/installing-configuring-mysql-server

Subversion + Apache links:
http://www.technize.com/2008/05/27/how-to-configure-svn-subversion-server-on-fedora-core-with-apache22-under-selinux/
http://www.tonyspencer.com/2007/03/02/setup-a-subversion-server-in-4-minutes/

Setting up DNS on Slicehost:
http://articles.slicehost.com/2007/10/24/creating-dns-records


Tool of the Trade 4: Standard Security Solutions

July 21, 2009

Customers will raise only the high-level security concerns (such as ‘store credit cards securely’).  They will not bring up the rest, but standard tools of the trade are able to help.

I want to concentrate on two high-level aspects of website security: authentication and authorization.  If you have authored a few web sites, you see the same problems arising over and over: how to identify customers, how to validate their data, and how to grant them rights to perform website actions or see different views.  These sorts of issues tend to become common QA test cases that can yield surprising failures over time in regression suites unless the foundation for the system is solid.

Authentication

This is the problem of how to identify a customer and their data when they make a request.  Typical solutions range from a plain-text cookie through to standard and then custom HTTP session management solutions.  In my mind, authentication also covers standard solutions for securing round-tripped page parameters in HTML hidden input elements.  Cross-site request forgery (XSRF) and the Top 10 list from OWASP include additional common security problems that are solved by nuances of authentication regimes.

Authorization

There are two main kinds of authorization regimes: role-based and privilege-based.  Over time, I have seen privilege-based systems scale well, where the code focuses on the privileges required for an activity rather than the roles that may hold those privileges.  The system is more naturally immune to unanticipated changes to roles’ privileges, and the code is more stable over time.  The upshot is to consider the privileges required for different system behavior first, and only then identify the roles of the system.  This is usually the reverse of the role-based analysis often touted by development methodologies.

There are usually four relations in these sorts of systems: User <-> Groups, Groups <-> Groups, Groups <-> Roles and Roles <-> Privileges.  The question of whether a user holds a privilege becomes a hierarchical search through their groups for roles containing the privilege.
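
As a rough sketch of that search, here is one way it could look in Python. The dictionaries, the sample users and the function names are all hypothetical, and I am assuming that Groups <-> Groups means a group inherits the roles of its parent groups; a real system may define that relation differently.

```python
def user_has_privilege(user, privilege, user_groups, group_parents,
                       group_roles, role_privileges):
    """Walk the user's groups (and their parent groups) looking for any
    role that contains the privilege."""
    seen = set()
    pending = list(user_groups.get(user, ()))
    while pending:
        group = pending.pop()
        if group in seen:
            continue  # guards against cycles in Groups <-> Groups
        seen.add(group)
        for role in group_roles.get(group, ()):
            if privilege in role_privileges.get(role, ()):
                return True
        pending.extend(group_parents.get(group, ()))
    return False


# Hypothetical sample data for the four relations.
user_groups = {"alice": {"support"}, "bob": {"guests"}}
group_parents = {"support": {"staff"}}  # assumption: support inherits from staff
group_roles = {"staff": {"order_viewer"}, "support": {"refunder"}}
role_privileges = {
    "order_viewer": {"order.view"},
    "refunder": {"order.refund"},
}


def can(user, privilege):
    """Convenience wrapper so calling code asks about privileges, not roles."""
    return user_has_privilege(user, privilege, user_groups, group_parents,
                              group_roles, role_privileges)


print(can("alice", "order.view"))
print(can("bob", "order.view"))
```

Note that the calling code only ever names a privilege such as `order.view`. Roles and groups can be reshuffled in the data without touching it.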


Tool of the Trade 3: Software Utilities

July 19, 2009

There is a whole class of software solutions, usually called ‘utilities’, that seem to be required wherever you work, in whatever runtime system you have.  For websites, I have a small list of utility classes that I usually develop or find equivalents for:

- Link generator: How does the website generate links, secured when needed?  Often a URL builder in one form or another that can deal with URI encoding is needed.

- Tracers, Counters, Logging Subsystem: It’s really important to have a standard set of abstractions to measure timings, log events and even just count how many times a code path is executed.  This will be invaluable performance measurement and debugging infrastructure that can grow as your site does.  Even the simplest site can benefit from logging.

- Secure tokens: This is a strange one that only recently became apparent as a utility to me – you will be generating ‘tokens’, sending them out into the real world, and when you get them back you will be tying them up to some authentication purpose.  Additional requirements are often that they can be versioned, are valid for a certain period of time and can be individually, by category or globally revoked. I’ve built and seen about 7 of these systems, and often you build it more than once within a company!

- Encryption libraries: You are going to need a hardware or software solution for storing sensitive data, and for digesting data.  Usually a combination of both hardware and software for the appropriately paranoid.  If you even think of storing an encrypted password, rather than a digest of it, you’ll probably want to protect it in memory using software and/or get it encrypted to disk as soon as possible.  Either way, I’m sure we’ve all written wrappers around core encryption libraries in the past – you’ll want this in your toolchest, along with a security policy that makes sense for the site.

- Admin Management Tools: A side application to be used for monitoring and/or customer support.
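
To illustrate the secure tokens item, here is a minimal Python sketch of a signed token that is versioned, expires, and can be revoked individually, by category or globally. It uses only the standard library (`hmac`, `hashlib`, `base64`, `json`). The token layout, the key table and the function names are assumptions made up for this example, and the demo key is obviously not something to deploy.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical key table: one signing key per token version.  Deleting a
# version's key acts as a global revocation of every token of that version.
SECRET_KEYS = {1: b"demo-key-not-for-production"}
REVOKED_IDS = set()         # individually revoked tokens
REVOKED_CATEGORIES = set()  # tokens revoked by category


def _sign(version, payload):
    return hmac.new(SECRET_KEYS[version], payload, hashlib.sha256).hexdigest()


def issue_token(token_id, category, ttl_seconds, version=1, now=None):
    """Build a token of the form '<version>.<payload>.<signature>'."""
    now = time.time() if now is None else now
    claims = {"id": token_id, "cat": category, "exp": now + ttl_seconds}
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode())
    return f"{version}.{payload.decode()}.{_sign(version, payload)}"


def verify_token(token, now=None):
    """Return the token's claims, or None if it is invalid, expired or revoked."""
    now = time.time() if now is None else now
    try:
        version_text, payload_text, signature = token.split(".")
        payload = payload_text.encode()
        expected = _sign(int(version_text), payload)
    except (ValueError, KeyError):
        return None  # malformed token, or a version whose key was removed
    # Check the signature before trusting anything inside the payload.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    claims = json.loads(base64.urlsafe_b64decode(payload))
    if claims["exp"] < now:
        return None
    if claims["id"] in REVOKED_IDS or claims["cat"] in REVOKED_CATEGORIES:
        return None
    return claims


# Usage: issue a one-hour token, then check it before and after expiry.
token = issue_token("t-1", "password-reset", ttl_seconds=3600, now=1000)
print(verify_token(token, now=2000))
print(verify_token(token, now=5000))
```

The payload here is signed but not encrypted, so anyone holding the token can read the claims. That is a deliberate simplification for the sketch.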

What other core utilities are there for website development?


Tool of the Trade 2: Software Design Guidelines

July 13, 2009

Design guidelines are critical for a successful software project.  No customer will ever demand them directly, or understand them.  I think software is designed at two levels: macro and micro.  Macro-level design has to do with technology choices and data flow.  Micro-level design has to do with code organization and dependencies within the macro bounds.  It’s really important for a team to agree on the feeling of a software system design at both these levels; that is, to share a basic idea that we are all painting a landscape and not a Cubist piece.  The conceptual foundation, and the feeling you get from the codebase in conjunction with the team’s feelings about it, really matters!  If the code design standards are not agreed upon, the codebase slips, and people’s attitudes slip with it.  At the other extreme, setting design in stone defeats the SOFT of software.  It takes a strong personality to motivate the engineers and the business owners to address the issues when the slippage happens.

I’ve seen a lot of different attitudes to design ‘guidelines’, mostly to do with the level of enforcement.  When guidelines are enforced, the resulting software can be beautifully elegant conceptually.  When guidelines are not enforced, the resulting software can be a dumpling soup of dependencies (that is, code that looks like it’s separated into distinct globs but all falls apart if you pick one thing up).  But enforcement is usually inversely proportional to happiness – if your company wants to scale for 10+ years, you need to be a carrot-giver rather than a stick-carrier.

So what is the key to a healthy codebase? If the goal is a happy team with a flexible codebase, then make it so that the organically easiest thing to do is to follow the patterns, at least well enough to preserve the macro picture of the product.  Relegate the painful bits of coding to side issues, but track them as well; always be flexible in admitting new patterns when there is a clear advantage, but FUND THE CODE HEALING necessary to make the codebase look like the change was there from the start.  If the funding is large, perhaps it is still a side issue.  DON’T make the change in 10% of the places where it is needed and then rely on other people to follow the new pattern – it WON’T HAPPEN.


Tool of the Trade 1: Testing Tools and Strategy

July 7, 2009

Is a customer ever going to tell you how to test your software?  No – they just assume you will test it and it will work.

A testing strategy is always worth considering first when building a website, especially if you are using a dynamic language as your server-side request handler.  Let’s face it – it is not sexy or cool to manually or automatically run a bunch of test cases against your system.  But considering the question of how to test effectively can tell you a lot about the nature of your system at a macro level, something that is good to be conscious of continuously but often is not.  A pedantic regime of regression tests only very rarely reveals real surprises; too pedantic and your large test suite is wastefully expensive, but too frivolous and you are going to need a customer support team – if you are lucky and don’t crash hard!

Here are some key decision points for developing a testing strategy and choosing the appropriate tools; the decision points really help you make a few tradeoffs from the start.

  • What test environments does the organization require? Many companies establish development, testing and production infrastructures, replicating the database and applications in each environment.  This allows development, testing and the live site to be used independently and concurrently.  Often this is accompanied by a regime of either recreating or refreshing the development and testing databases on a schedule.  All of this can be expensive and may grow over time as the site grows.
  • Do I require automated testing at all? It is surprising how far you can get with no automated testing regime at all.  It requires a level of attention not usually required of QA engineers, and perhaps a funded customer support channel as a backup, but it can be a legitimate business decision for the short to medium term.
  • What styles of automated testing are available to me? I have seen testing regimes that start with black-box testing, record/playback automated black box, intelligent UI-agnostic form-aware link-aware automated black box, in-container request/response testing, integration testing of a unified business layer, integration testing of a module collection, unit-tests for a module and unit tests for some classes.  I don’t think any of them scale particularly well.  Right now you are better off choosing those that match your design the most, remembering not to be too pedantic about testing.
  • Do I need a layered code design or is a flatter design acceptable? Often we start by building a typical 3-tier layered system for websites: (MVC) presentation, services and data access.  If it is a quick and dirty site, this may not be the most productive use of your time (you have to edit multiple files and follow a process for feature development).  Perhaps it’s OK to rely on refactoring when and if the site grows?  Until then, only share code when it is completely obvious.
  • How am I going to keep the test cycle fast and efficient? Whether you are performing automated testing or not, it is worth considering what the workflow of QA engineers will be.  Often they will be heavy users of a bug system and of an issue tracking system for support, and the good ones will be clients of the production databases for troubleshooting and ad-hoc queries.  The test cycle workflow you settle on will also affect what your release cycle can be.  Shorter is sweeter.  At a technical level, one can also consider using mock objects to make the tests run fast.
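
As a small illustration of the mock objects point in the last item, here is a Python sketch using the standard library’s `unittest.mock`. `SignupService` and its mailer are hypothetical, invented for the example; the idea is only that the slow collaborator is replaced by a mock so the tests never leave the process.

```python
import unittest
from unittest import mock


class SignupService:
    """Hypothetical service that depends on a slow collaborator (a mailer)."""

    def __init__(self, mailer):
        self.mailer = mailer

    def register(self, email):
        if "@" not in email:
            raise ValueError("invalid email address")
        self.mailer.send(email, "Welcome!")
        return True


class SignupServiceTest(unittest.TestCase):
    def test_register_sends_welcome_mail(self):
        mailer = mock.Mock()  # stands in for the real, slow mailer
        service = SignupService(mailer)

        self.assertTrue(service.register("a@example.com"))
        mailer.send.assert_called_once_with("a@example.com", "Welcome!")

    def test_register_rejects_bad_address(self):
        mailer = mock.Mock()
        service = SignupService(mailer)

        with self.assertRaises(ValueError):
            service.register("not-an-address")
        mailer.send.assert_not_called()


# Run the two tests programmatically and keep the result for inspection.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SignupServiceTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The trade-off is the usual one: these tests are fast because they say nothing about whether the real mailer works, so something slower still has to cover that.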

Testing strategies can help define how business resources are used, the tone of the company and the release cycle for your software.  The consideration of testing processes is a vital tool in the construction of websites (and software) – one which many business owners will implicitly require.


Top 5 Tools of the Website Trade You Won’t Hear About from Customers

July 7, 2009

What are the tools of the trade that you will never see in a product requirements document or feature request from an investor or client?

Clients often just expect that certain things are taken care of when building a web site.  These tools of the trade are important: they are ubiquitous across systems, and often we use the cheaper tools to start with and then switch over (re-tool) during the course of business growth.  Over the course of the next few posts, I’m going to briefly develop what I think are the Top 5 tools of the trade for a software engineer working on a website, from the point of view that your customers will never ask for them, but they are almost always required and even assumed.

I’m going to try to consider this in a language-neutral way and stick with general principles or ideas.  Stay tuned!