Friday, 5 December 2014

Lightweight web-based transactional systems

Reanimating a legacy system

I have a set of data from an old legacy application that must be resurrected and web-enabled. The data was extracted from a relational database so it is structured in a conventional third-normal form, but the original application was pensioned off a long time ago, and probably a good thing too.

The user interface and its business logic must be rebuilt as a web-based transactional application so that the data is accessible from a variety of devices. That means deciding on a technology stack, but how? There are dozens of database systems, hundreds of software languages, scores of web-development frameworks, and more opinions about the matter than there are trolls on 9GAG.

The formal approach to making a decision would be to consider all the various technologies at each layer of the application stack and then weigh up the pros and cons of each.

A less formal approach would be to ask someone who has done this sort of thing before, like a grizzled Unix veteran or an 31337 h4x0r. 

In the end I applied a divide and conquer algorithm to eliminate dozens of decision points in a few strokes.

MVC or not?

The first thing to decide is the core structure of the application. The application must be web-based, and most dynamic web-sites are based on a Model-View-Controller architecture, but is this pattern still relevant today? Is it applicable to my use case?

I have a data set that must be viewed and maintained because of its historical value, so the application must have a Model. It is not a utility program that has no persistent data. 

It must be a web-based system, accessible from various user interfaces like a browser or smartphone app, so it must have a View. This is not a command-line application. 

Lastly, the data set will be the subject of various views, some of which will be used to maintain and update the data, so the system must have a Controller to manipulate the Model.

So the MVC pattern is the most logical structure for a web-based transactional application.
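To make the three roles concrete, here is a minimal sketch in plain Java with no web framework involved. The class names (RecordModel, RecordView, RecordController) are hypothetical, the in-memory list merely stands in for the real database, and the view does no HTML escaping, so treat it as an illustration of the split rather than production code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Model: owns the data (an in-memory list stands in for the database).
class RecordModel {
    private final List<String> records = new ArrayList<>();

    void add(String record) {
        records.add(record);
    }

    List<String> all() {
        return Collections.unmodifiableList(records);
    }
}

// View: turns model data into markup for the client (no escaping, sketch only).
class RecordView {
    String render(List<String> records) {
        StringBuilder html = new StringBuilder("<ul>");
        for (String record : records) {
            html.append("<li>").append(record).append("</li>");
        }
        return html.append("</ul>").toString();
    }
}

// Controller: accepts a request, updates the model, then asks the view to render.
class RecordController {
    private final RecordModel model;
    private final RecordView view;

    RecordController(RecordModel model, RecordView view) {
        this.model = model;
        this.view = view;
    }

    String handleAdd(String record) {
        model.add(record);
        return view.render(model.all());
    }
}

class MvcSketch {
    public static void main(String[] args) {
        RecordController controller =
                new RecordController(new RecordModel(), new RecordView());
        System.out.println(controller.handleAdd("first entry"));
    }
}
```

The point of the split is that the view could be swapped for a JSON renderer, or the model for a real database, without touching the other two classes.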

There are alternatives, like Facebook's Flux, but they are usually just refinements of the MVC pattern, although Joëlle Coutaz introduces a hierarchy of MVC-style layers in her Presentation–Abstraction–Control pattern which is useful for applications that need complex client tiers.

One thing to bear in mind is that web applications use HTTP, which is a stateless protocol, so the system must have a Controller that is able to maintain state from one request to the next.
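A rough sketch of what "maintaining state" means in practice: each request arrives with no memory of the previous one, so the server hands out a session identifier and keeps the state itself, keyed by that identifier. The class and method names below (StatefulController, newSession, handleRequest) are made up for illustration, and the map is not thread-safe. In a servlet container this job is normally done by the container's own HttpSession rather than by hand.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical controller that remembers a per-session visit counter.
class StatefulController {
    // Server-side state, keyed by the session id the client sends back
    // with every request (typically in a cookie). Not thread-safe.
    private final Map<String, Integer> visitsBySession = new HashMap<>();

    // Called on the first request, when the client has no session id yet.
    String newSession() {
        String id = UUID.randomUUID().toString();
        visitsBySession.put(id, 0);
        return id;
    }

    // Each call models one independent HTTP request carrying a session id.
    int handleRequest(String sessionId) {
        Integer visits = visitsBySession.get(sessionId);
        if (visits == null) {
            throw new IllegalArgumentException("unknown session: " + sessionId);
        }
        int updated = visits + 1;
        visitsBySession.put(sessionId, updated);
        return updated;
    }
}

class SessionSketch {
    public static void main(String[] args) {
        StatefulController controller = new StatefulController();
        String alice = controller.newSession();
        String bob = controller.newSession();
        controller.handleRequest(alice);
        controller.handleRequest(alice);
        controller.handleRequest(bob);
        System.out.println("alice visits: " + controller.handleRequest(alice));
        System.out.println("bob visits: " + controller.handleRequest(bob));
    }
}
```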

Open or closed source software?

Your average punter may not care whether they have access to the source code of the software or not, but if you are a software developer, you certainly should care. A mechanic would not buy a car with an engine that was locked up, so why would an I.T. professional use proprietary software? I want to see inside, dammit. I might not know what I am looking at, but I want to poke around anyway (and yeah, I know that Blogger is not open source but I will leave that fight to Richard Stallman).

So goodbye to products from Microsoft, Oracle, Adobe, and IBM, amongst others.

Okay, this binary chop between open and closed software means that in terms of the operating system, I am down to Linux, a BSD variant, OpenSolaris, or something less well known like Minix, Darwin, Plan 9 and the rest.

Let's be sensible and stick to Linux. I am currently using Debian Wheezy but we can argue about the distro later. 

Relational or NoSQL?

The next binary chop involves the data layer. My raw dataset is already in a highly normalised relational format, so it makes no sense to convert it to one of the NoSQL databases. Just import it into a relational database. But which one? 

The major open source relational databases are MySQL, MariaDB (which was forked from MySQL), and PostgreSQL. I don't trust Oracle with the custodianship of open source software, so that eliminates MySQL, and I don't see compelling technical arguments for MariaDB over PostgreSQL, so PostgreSQL it is. 

Static or dynamic type-checking? Functional or procedural? Compiled or interpreted?

I need a programming language to manipulate the model. The application needs a state engine, mechanisms to provide transactional integrity and role-based access control, so a server-side template engine is not going to cut it. I have to carve code, but what sort of code?

Entering into a discussion about the merits and demerits of a programming language is like competing in a bog snorkelling championship. There are no winners, just a bunch of cold, wet, exhausted people, covered in mud and weeds.

We can nail down some basic requirements though, like readability. Nobody wants to maintain code that looks like it was written by someone with bits of toast stuck in the keyboard, so that excludes Turing tar-pits like Brainfuck or functional languages like Lisp.

Variable typing is another. I like a variable to be clear about its role in a piece of code, and so do compilers. A language with static types will have fewer surprises at runtime and a compiler can get cracking with its optimisations straightaway, so I decided to go for Java because I am creating a web site, not writing a device driver, and C++ makes me feel seasick.

Java might not be as elegant as Haskell and it has its detractors, but it has a just-in-time compiler, garbage collection, a raft of libraries for reuse, and Java 8 introduces lambda expressions which reduce bulky anonymous inner classes to single expressions. Not too shabby then.
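For anyone who has not seen the difference, here is a small comparison: the same sort-by-length comparator written first as an anonymous inner class and then as a Java 8 lambda. The class and method names are only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class LambdaSketch {
    // Pre-Java 8 style: an anonymous inner class implementing Comparator.
    static List<String> sortByLengthOldStyle(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        Collections.sort(copy, new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                return Integer.compare(a.length(), b.length());
            }
        });
        return copy;
    }

    // Java 8 style: the same comparator as a single lambda expression.
    static List<String> sortByLengthLambda(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        copy.sort((a, b) -> Integer.compare(a.length(), b.length()));
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("ccc", "a", "bb");
        System.out.println(sortByLengthOldStyle(words));
        System.out.println(sortByLengthLambda(words));
    }
}
```

Both methods return the same ordering; the lambda version just drops the ceremony.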

Bare bones or the full Monty? 

Okay, so now I need a Java application server that runs on Linux. The choices are Jetty, Geronimo, TomEE, GlassFish, Enhydra, Resin, JOnAS, JBoss EAP (now Red Hat) or WildFly.

But do I really need a full blown application server? I want a bare-bones solution, simple, but not too simple, so I decided to stick with Apache Tomcat because it is one of the most widely used application servers and, although it is just a servlet container, it can be beefed up by adding other components of the Java EE stack as required. Well, within reason, unless you want to recreate TomEE on your own.

As a bonus the Tomcat API documentation is full of terms like Valve, Filter, Container, Pipeline and Engine which is what you want to hear in a software workshop.

Web frameworks for the JVM

Unfortunately I could not find a cleavage plane to divide up the hardest decision: which Java web development framework to use.

Community-driven or standards-driven? Component-based or request-based? Rich responsive user interface or server-side rendering? Cutting-edge or tried-and-tested? 

Matt Raible has done excellent work sketching out the landscape, as have the guys at ZeroTurnaround, so based on their spade work, I see that if this was a Reality TV contest called Survivor JVM 2014, SpringMVC would be leading the pack, with JavaServer Faces hard on its heels. Also in the race are Grails, Vaadin, Google Web Toolkit, Play, and Struts.

There are no right or wrong decisions at this point, they will all do the job, but I have to choose one.

SpringMVC is a request-based MVC framework so it provides a lot of control of the client-side HTML, CSS and JavaScript, but it does not follow the Java EE standard. It did give J2EE a good kick in the pants though, which was ultimately a good thing for web development in general.

Grails makes good reuse of Spring and Hibernate but I really don't want to wrap my head around Groovy, even though it is very nearly a superset of Java and runs on the JVM.

Play looks interesting, and I can see the attraction of convention over configuration, but if I am not using Scala, what are the advantages of abandoning the servlet specification?

Struts paved the way for other Java frameworks and is still widely used as a consequence, but it has long since been overtaken by the others.

Vaadin uses GWT widgets and they both produce very rich user interfaces. Vaadin is particularly impressive and is from Finland, like gravlax and the reindeer it is named after, but, like gravlax, it is too rich to eat every day.

So that leaves JavaServer Faces.

Like all the other frameworks, JSF has its detractors, including James Gosling himself, but it provides a Controller with a well documented request-response life-cycle that can manage state, and it has had a new lease on life thanks to component libraries like PrimeFaces.

Changing landscape

Having settled on JSF, it is worth noting that the View layer of Java web applications is currently in a state of flux.

The Java EE 8 spec includes an action-based MVC, but server-side web application frameworks are coming under pressure from HTML5 and client-side JavaScript MVC frameworks like AngularJS and Backbone that communicate with the server using JSON over REST or WebSocket. In fact, there is a project called AngularFaces that tries to combine AngularJS with JSF. 

The key is to make sure that the web application has a very clean separation of concerns so that if the decision to use JSF gets overtaken by events it can be stripped out without leaving too much damage to the remaining parts of the stack.
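One way to picture that separation, as a sketch only: keep the business logic behind a plain interface that imports nothing from JSF or the servlet API, and let a thin presentation class be the only code that knows about the view technology. All the names here (LedgerService, InMemoryLedgerService, LedgerPageBean) are hypothetical, and the page bean is an ordinary class standing in for whatever a JSF backing bean would eventually be.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Framework-neutral boundary: nothing here depends on JSF, servlets or Tomcat.
interface LedgerService {
    void post(String account, long amountInCents);

    long balance(String account);
}

// Stand-in implementation; a real one would talk to the database.
class InMemoryLedgerService implements LedgerService {
    private final Map<String, Long> balances = new LinkedHashMap<>();

    @Override
    public void post(String account, long amountInCents) {
        balances.merge(account, amountInCents, Long::sum);
    }

    @Override
    public long balance(String account) {
        return balances.getOrDefault(account, 0L);
    }
}

// Thin presentation adapter: the only class that should need rewriting
// if the view technology is replaced.
class LedgerPageBean {
    private final LedgerService service;

    LedgerPageBean(LedgerService service) {
        this.service = service;
    }

    // Converts raw form input, delegates to the service, formats the result.
    String submit(String account, String amountText) {
        service.post(account, Long.parseLong(amountText));
        return "Balance for " + account + ": " + service.balance(account);
    }
}

class SeparationSketch {
    public static void main(String[] args) {
        LedgerPageBean page = new LedgerPageBean(new InMemoryLedgerService());
        page.submit("cash", "250");
        System.out.println(page.submit("cash", "-50"));
    }
}
```

If the presentation layer were swapped for a REST endpoint feeding a JavaScript client, only the adapter class would be thrown away; the interface and its implementation would carry over unchanged.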

So the technology stack looks like this: a View consisting of HTML, CSS and JavaScript running in a browser, a Controller consisting of JavaServer Faces in a Tomcat servlet container, and a Model provided by PostgreSQL. The server will run on Linux.