Here's a quick rundown of the hardware and software our systems run on, the applications we make and how they work together.
We're running Ubuntu Server 12.04, the same OS used locally for development. Keeping production on the same OS as development simplifies preparing the production environment and deploying to it.
Simply Testable Applications
I wrote about our system architecture some months ago; have a quick read of that to get up to speed.
Four applications make up the Simply Testable system. They're all PHP-based using the Symfony2 framework and all code is held in git repositories available on GitHub.
The core application is a job queuing and task management system. It takes requests for new full-site test jobs, breaks down each job into a collection of tasks, assigns tasks out to workers, receives results from workers and makes the results available to the web client.
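To make that breakdown concrete, here is a minimal sketch in Python (not the PHP the real application is written in). The class names, the worker hostnames and the round-robin assignment are all assumptions for illustration; the post does not say how the core application actually picks a worker.

```python
from dataclasses import dataclass, field
from itertools import cycle
from typing import List, Optional


@dataclass
class Task:
    url: str
    task_type: str
    worker: Optional[str] = None  # set once the task is assigned (hypothetical field)


@dataclass
class Job:
    site: str
    tasks: List[Task] = field(default_factory=list)


def prepare(job: Job, urls: List[str], task_types: List[str]) -> None:
    """Break a job down into one task per (URL, task type) pair."""
    job.tasks = [Task(url, task_type) for url in urls for task_type in task_types]


def assign(job: Job, workers: List[str]) -> None:
    """Hand unassigned tasks out to workers in round-robin order (an assumption)."""
    pool = cycle(workers)
    for task in job.tasks:
        if task.worker is None:
            task.worker = next(pool)


job = Job("http://example.com/")
prepare(job, ["http://example.com/", "http://example.com/about"], ["HTML validation"])
assign(job, ["worker-1", "worker-2"])
```

After this runs, the job holds two tasks, one per discovered URL, each tagged with the worker it was handed to.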
Workers carry out the actual tasks, such as an HTML validation task. They accept tasks from the core application, carry them out and report back the results.
As you can see from the list above, we have four workers at present. Although they currently all run on the same box, they can be deployed anywhere allowing us to distribute the workload and provide a spot of redundancy.
Scalability, cloud computing and related buzzwords spring to mind.
The web client is a human-friendly interface to the core application. You can start new full-site tests, watch the progress and view the results.
The web client merely reports back what is going on at any given moment. What is going on in the core application keeps on going on whether you're watching or not. If you start a test and then close your browser, the test keeps running and you can come back at any time.
This is what you see when you go to simplytestable.com.
Carrying out actual tests
The W3C HTML validator is basically one epic Perl script. The validator.nu HTML validator is a Java application.
Later I'll be adding local installations of the W3C CSS validator and JSLint.
The applications that actually carry out the tests are all quite different. Each worker has a collection of drivers to interact with the testing applications. The HTML validation driver, for example, tells a worker how to talk to the HTML validator and how to understand what comes back.
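A driver can be pictured roughly like this. It's a sketch in Python rather than the PHP the workers are written in, and the `Driver` interface, the `DRIVERS` registry and the JSON output shape are assumptions made for illustration. Only the "understand what comes back" half is shown; how a real driver invokes the validator is left out.

```python
import json
from abc import ABC, abstractmethod
from typing import Dict


class Driver(ABC):
    """What a worker needs to know about one testing application."""

    task_type: str

    @abstractmethod
    def parse(self, raw_output: str) -> Dict[str, int]:
        """Turn the application's raw output into a result the worker can report."""


class HtmlValidationDriver(Driver):
    task_type = "HTML validation"

    def parse(self, raw_output: str) -> Dict[str, int]:
        # Assumed output shape: {"messages": [{"type": "error"}, ...]}
        messages = json.loads(raw_output).get("messages", [])
        errors = sum(1 for message in messages if message.get("type") == "error")
        return {"error_count": errors, "message_count": len(messages)}


# A worker looks up the driver for a task by its type.
DRIVERS = {driver.task_type: driver for driver in (HtmlValidationDriver(),)}

sample = '{"messages": [{"type": "error"}, {"type": "info"}]}'
result = DRIVERS["HTML validation"].parse(sample)
```

The point of the split is that the worker itself never needs to know whether it's talking to a Perl script or a Java application; it only sees the driver.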
If you were to take the code for the core application and the workers, deploy them somewhere and start up a new full-site test, you might be surprised to notice that absolutely nothing useful happens.
The workflow of each application comprises a number of steps, each initialised either by an HTTP request or by a command line command.
Let's quickly look at the workflow for a new test job as an example:
php app/console simplytestable:job:prepare 1
php app/console simplytestable:task:assign 1
php app/console simplytestable:task:assign 2
php app/console simplytestable:task:assign N
Once a request to start a new job is received, it must be prepared (URLs to test discovered and a collection of tasks created) and then each task must be assigned out to a worker.
The whole process is a collection of distinct steps. Each step carries out a small unit of work and changes the state of the job. Most are initialised via command line commands.
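One way to picture "a small unit of work that changes the state of the job" is a little state machine. This is a Python sketch under stated assumptions: the state names and the `complete` step are made up for illustration and are not taken from the real applications.

```python
from enum import Enum


class State(Enum):
    NEW = "new"
    PREPARED = "prepared"
    IN_PROGRESS = "in-progress"
    COMPLETED = "completed"


# Each step may run from exactly one state and leaves the job in the next one.
# Step and state names are hypothetical.
STEPS = {
    "prepare": (State.NEW, State.PREPARED),
    "assign": (State.PREPARED, State.IN_PROGRESS),
    "complete": (State.IN_PROGRESS, State.COMPLETED),
}


class Job:
    def __init__(self) -> None:
        self.state = State.NEW


def run_step(job: Job, step: str) -> None:
    """Run one step, refusing if the job is not in the state that step expects."""
    required, resulting = STEPS[step]
    if job.state is not required:
        raise ValueError(f"cannot run {step!r} while job is {job.state.value}")
    job.state = resulting


job = Job()
run_step(job, "prepare")
run_step(job, "assign")
```

Because each step checks the state it starts from, running a step twice or out of order fails loudly instead of quietly corrupting the job.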
Distinct steps are easy to test. Distinct step failure is easy to handle. Distinct steps can happen later. Distinct steps can (often) be run in parallel.
Distinct steps do not, however, run themselves. When one is done, that's it. Something else has to kick off the next step and the one after that and so on.
When a step finishes, it pops a job in a queue to kick off the next step. When developing, we can ignore this queue and manually kick off steps as needed to make things initially work and to investigate bugs. The integration system can also ignore the queue and can instead sequentially kick off all steps and verify that everything works bit by bit as a full-site test progresses.
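The chaining idea fits in a few lines. This Python sketch uses a plain in-memory `deque` as a stand-in for the real workflow queue, and the two task IDs that `prepare` discovers are hard-coded assumptions; none of it is the production code.

```python
from collections import deque

queue = deque()  # stand-in for the real workflow queue
log = []         # records the order in which steps actually ran


def prepare(job_id: int) -> None:
    log.append(f"prepare {job_id}")
    # Assumption: preparation discovered two tasks. Queue an assign step for each.
    for task_id in (1, 2):
        queue.append((assign, task_id))


def assign(task_id: int) -> None:
    log.append(f"assign {task_id}")


def drain() -> None:
    """Keep running queued steps until none are left, as a queue processor would."""
    while queue:
        step, argument = queue.popleft()
        step(argument)


queue.append((prepare, 1))
drain()
```

In development you can skip `drain()` and call `prepare(1)` or `assign(2)` by hand, which is exactly the "ignore the queue" mode described above; the steps themselves don't care who kicked them off.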
The production system uses Resque to manage the workflow queue:
… a Redis-backed library for creating background jobs, placing those jobs on multiple queues, and processing them later.
Smashing, just what we need. It'd be awfully tedious to manually kick off each step in the production environment.
This automates the workflow for both the core application and the workers. The core application uses about 10 workflow job queues, all being serviced by about 100 Resque processes in total. Each worker uses about 3 workflow job queues, all being serviced by about 15 Resque processes in total.
I'm done with my quick rundown of what we use, what it's made from and how it manages to do what it does without constantly failing.
Send a message to @simplytestable if you want to know more.