Tutorial: How to make an offline HTML5 web app, FT style

9th November 2014 Update

This article is a little out of date but it still gets a lot of traffic. For more up-to-date tips and best practice see my offline web app workshop, published for free on GitHub, which includes a chapter on building an offline news app.

Most importantly, my advice on using local databases (WebSQL and IndexedDB) has evolved since writing this article. I now recommend only implementing IndexedDB integration and using the IndexedDB polyfill to support browsers that only provide WebSQL.

For the actual integration with IndexedDB I recommend using Dexie.js, a Promise-based library that makes integrating with IndexedDB easy and safe.

Why the world needs another Offline HTML5 App Tutorial

There are plenty of great resources already written for offline HTML5 websites, but just getting a website to work offline is not enough.

In this tutorial we will build two versions of an offline website in order to demonstrate how to add functionality to an existing offline website in such a way that existing users won’t get left behind using an old version.

Many existing tutorials tend to focus on a single technology at a time. This tutorial intentionally avoids going into detail on particular technologies and instead attempts to give a high level overview of how, with the fewest lines of code and in the shortest amount of time, various technologies can be brought together to create a real (and potentially useful) working web app that is structured in a way that makes future development easy.

Introduction

We are going to make a simple RSS feed reader capable of storing the latest news items for offline reading. The completed project is ready for forking on GitHub.

Requirements for the demo app

  • Users should be able to download the latest articles.
  • There should be an easy and reliable way to upgrade users’ cached versions of the demo app if we ever want to add functionality to or fix bugs in client side code.
  • Users should be able to view a list of headlines and be able to click or tap into any to read the full content.
  • It should work offline.
  • It will support the iPhone, iPad and iPod touch (and any other platforms you get for free with this support, which includes: Blackberry Playbook, Chrome for Android, Android Browser, Opera Mobile as well as Chrome, Opera and Safari on Desktop).

This demo will use PHP and jQuery because we want the best combination of ubiquity and brevity for demo purposes.

Introducing the application cache

The appcache can be used to enable websites to work offline by specifying a discrete, markup-less list of files that will be saved in case the user’s internet connection is lost.

However, as is widely documented on the internet, the app cache is a bit rubbish.

  • If you list one hundred files in your app cache manifest, the browser will download all one hundred of those files as quickly as it can, and the app will seem less responsive whilst the browser is doing all that downloading.
  • Added to that, if you change any one of those resources, even just a one line change in one CSS file the browser will then re-download every single resource in the manifest. It can’t do a single file update. It can only replace the entire contents in the cache.
  • And if any of the files fails to download for any reason it will get rid of all the files it’s successfully downloaded and revert to the previous version it had in cache.
  • So even if you made a one line change to one file and the browser successfully downloaded the updated file, if it didn’t manage to download one of your files that didn’t change the entire update would be lost.
  • So our policy on using the application cache is to put as few things in it as possible, and only things that are not going to change very often, such as:
    • Fonts
    • Sprites
    • Splash screen images
    • And one single bootstrap page (see below).
  • And we don’t use it for:-
    • The majority of our Javascript, HTML & CSS
    • Content (including images)

31/08/2012 Edit: Read about our efforts to fix app cache!

So this is what we do instead

We use the appcache to store just enough Javascript, CSS and HTML to get the web app started (we call this the bootstrap) then deliver the rest through an ajax request, eval() it, then store it in localStorage*.

This is a powerful approach as it means that if, for whatever reason, a mistake or corruption creeps into the Javascript code which prevents the app from starting then that broken Javascript will not be cached and the next time the user tries to launch the app their browser will attempt to get a fresh copy of the code from the server.

* This is controversial because localStorage is synchronous, which means nothing else can happen – the website will be completely frozen – whilst you save or retrieve data from it. But in our tests on the platforms we target it is also fast, much faster than WebSQL (the client side database available on platforms such as iOS and Blackberry, which can sometimes be slower than the network). When we come to storing and retrieving articles from our RSS feed, we will use client side database technology WebSQL.
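
To make the failure handling concrete, here is a minimal sketch of the "only store code that actually started" idea, written so that it can run outside a browser. It is an illustration rather than the tutorial’s real bootstrap (which follows in the next section): the launch function and the fakeStorage object are hypothetical names, a plain object stands in for localStorage, and new Function is used instead of eval() simply to keep the sketch’s scope tidy.

```javascript
// Sketch only: launch and fakeStorage are hypothetical names,
// not part of the tutorial's code.
function launch(resources, storage, shouldStore) {
    try {
        // Compile and run the downloaded code; a syntax error or a
        // runtime error both end up in the catch block
        var start = new Function(resources.js);
        start();
    } catch (e) {
        return false; // broken code is never written to storage
    }

    if (shouldStore) {
        storage.resources = JSON.stringify(resources);
    }
    return true;
}

// A plain object stands in for window.localStorage
var fakeStorage = {};

// First attempt: the server sent us something that cannot run
var brokenStarted = launch({ js: 'this is not valid javascript' }, fakeStorage, true);
var storedAfterBroken = fakeStorage.resources;

// Second attempt: working code starts and is then stored
var goodStarted = launch({ js: 'var answer = 6 * 7;' }, fakeStorage, true);
```

After the first call nothing has been stored, so the next launch would go back to the server for a fresh copy instead of reusing the broken code.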

1. The bootstrap

In order to make a simple bootstrapped Hello World web app, create the following files.

  • /index.html: The bootstrap HTML, Javascript & CSS
  • /api/resources/index.php: This will concatenate our Javascript & CSS source files together and send them as a JSON string.
  • /css/global.css
  • /source/application/applicationcontroller.js: Start off by making a single Javascript file for our application, & create others later
  • /jquery.min.js: Download the latest version from jquery.com
  • /offline.manifest.php: The app cache manifest file.

/index.html
Start by creating the bootstrap html file.

<!DOCTYPE html>
<html lang="en" manifest="offline.manifest.php">
    <head>
        <meta name="viewport" content="width=device-width,initial-scale=1.0,maximum-scale=1.0,minimum-scale=1.0,user-scalable=no" />
        <script type="text/javascript" src="jquery.min.js"></script>
        <script type="text/javascript">
            $(document).ready(function () {
              var APP_START_FAILED = "I'm sorry, the app can't start right now.";
              function startWithResources(resources, storeResources) {

                  // Try to execute the Javascript
                  try {
                      eval(resources.js);
                      APP.applicationController.start(resources, storeResources);

                  // If the Javascript fails to launch, stop execution!
                  } catch (e) {
                      alert(APP_START_FAILED);
                  }
              }

              function startWithOnlineResources(resources) {
                  startWithResources(resources, true);
              }

              function startWithOfflineResources() {
                  var resources;

                  // If we have resources saved from a previous visit, use them
                  if (localStorage && localStorage.resources) {
                      resources = JSON.parse(localStorage.resources);
                      startWithResources(resources, false);

                  // Otherwise, apologize and let the user know the app cannot start
                  } else {
                      alert(APP_START_FAILED);
                  }
              }

              // If we know the device is offline, don't try to load new resources
              if (navigator && navigator.onLine === false) {
                  startWithOfflineResources();

                  // Otherwise, download resources, eval them, if successful push them into local storage.
              } else {
                  $.ajax({
                      url: 'api/resources/',
                      success: startWithOnlineResources,
                      error: startWithOfflineResources,
                      dataType: 'json'
                  });
              }
          });
        </script>
        <title>News</title>
    </head>
<body>
    <div id="loading">Loading&hellip;</div>
</body>
</html>

To summarise, this file does the following:-

  • It tells the browser this website is capable of working offline by including a reference to the manifest file in its html tag: <html manifest="offline.manifest.php">
  • Unless the app knows it is offline (by using window.navigator.onLine), attempt to download the latest Javascript and CSS files.
  • If the app cannot get new resources, retrieve them from local storage instead.
  • Eval the Javascript.
  • Start the app by calling a function in the evaled code (which will be APP.applicationController.start() for the purposes of this tutorial)
  • If we have just downloaded new resources, save them to local storage.
  • And finally, if at any point the app fails to load, show a friendly error.
  • Whilst the app is loading, display a Loading… message to users.

/api/resources/index.php
Now to make the server side response to api/resources/ (which we requested on line 47 of the previous file, /index.html):-

<?php
// Concatenate the files in the /source/ directory
// This would be a sensible point to compress your Javascript
$js = '';
$js = $js . 'window.APP={}; (function (APP) {';
$js = $js . file_get_contents('../../source/application/applicationcontroller.js');
$js = $js . '}(APP));';
$output['js'] = $js;

// Concatenate the files in the /css/ directory
// This would be a sensible point to compress your css
$css = '';
$css = $css . file_get_contents('../../css/global.css');
$output['css'] = $css;

// Encode with JSON (PHP 5.2.0+) and output the resources
echo json_encode($output);
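
The two wrapper strings in this script are easy to miss: they create the APP object and then run every source file inside one function that receives APP as an argument. As a rough illustration of what the concatenated payload does once it is evaluated, here is the same wrapping done in JavaScript. The buildBundle function and the two tiny modules are hypothetical, var APP is used in place of window.APP so the sketch runs outside a browser, and new Function stands in for the bootstrap’s eval().

```javascript
// Sketch only: buildBundle and the two module strings are hypothetical.
function buildBundle(moduleSources) {
    // Same shape as the PHP output: create APP, then run every module
    // inside one function that receives APP as its argument. The
    // trailing "return APP" exists only so this sketch can inspect it.
    return 'var APP = {}; (function (APP) {' +
        moduleSources.join('\n') +
        '}(APP)); return APP;';
}

var bundle = buildBundle([
    'APP.greeter = (function () { return { hello: function () { return "Hello World!"; } }; }());',
    'APP.shouter = (function () { return { shout: function () { return APP.greeter.hello().toUpperCase(); } }; }());'
]);

// Evaluate the bundle and call into one of its modules
var APP = new Function(bundle)();
var shouted = APP.shouter.shout();
```

Because every module only attaches itself to APP when the bundle runs, and nothing calls into another module until later, the order in which the files are concatenated does not matter in this pattern.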

/css/global.css

At this stage, this is just a placeholder to demonstrate how we can deliver CSS.

body {
  background: #d6fab2; /* garish green */
}

/source/application/applicationcontroller.js

This will be expanded later, but for now this is the minimum Javascript required to inject our CSS resources, remove the loading screen and display a Hello World message instead.

APP.applicationController = (function () {
    'use strict';

    function start(resources, storeResources) {

        // Inject CSS into the DOM
        $("head").append("<style>" + resources.css + "</style>");

        // Create app elements
        $("body").html('<div id="window"><div id="header"><h1>My News</h1></div><div id="body">Hello World!</div></div>');

        // Remove our loading splash screen
        $("#loading").remove();

        if (storeResources) {
          localStorage.resources = JSON.stringify(resources);
        }

    }

    return {
        start: start
    };
}());

/offline.manifest.php
Finally, the appcache manifest.

This is where other tutorials will tell you to edit your Apache config file to add the content-type for *.appcache. You would be right to do this, but I want this demo web app to be as portable as possible and work by simply uploading the files to any standard PHP server without any .htaccess or server configuration file hassle, so I will give the file a *.php extension and set the content type by using the PHP header function instead. The *.appcache extension is a recommendation, not a requirement, so we will get away with doing this.

<?php
header("Content-Type: text/cache-manifest");
?>
CACHE MANIFEST
# 2012-07-14 v2
jquery.min.js
/
NETWORK:
*

As you can see, in line with our app cache usage recommendations we’ve only used the app cache to store the bare minimum to get the web app started:- jquery.min.js and / – which will store index.html.
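
Browsers only re-check the cached files when the text of the manifest itself changes, which is why the manifest carries a dated comment: changing that comment is enough to make browsers fetch everything listed again. As a rough sketch of generating the same manifest from a list of files (written in JavaScript purely for illustration; buildManifest and its option names are hypothetical and not part of the demo project):

```javascript
// Sketch only: buildManifest and its option names are hypothetical.
function buildManifest(options) {
    // The first line must be exactly "CACHE MANIFEST"; the comment on
    // the second line is the part we change to trigger an update
    var lines = ['CACHE MANIFEST', '# ' + options.version];

    // Files to cache, then the NETWORK section
    lines = lines.concat(options.cache);
    lines.push('NETWORK:');
    lines = lines.concat(options.network);

    return lines.join('\n') + '\n';
}

var manifestV2 = buildManifest({
    version: '2012-07-14 v2',
    cache: ['jquery.min.js', '/'],
    network: ['*']
});

// Same files, new comment: the text differs, so browsers would update
var manifestV3 = buildManifest({
    version: '2012-07-14 v3',
    cache: ['jquery.min.js', '/'],
    network: ['*']
});
```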

Upload these files to a standard PHP web server (all the files should go in a publicly accessible folder either in public_html (sometimes httpdocs) – or a subfolder of it) then load the app and it should work offline. Currently it doesn’t do anything more than say Hello World – and we needn’t have written a single line of Javascript if that were our aim.

What we’ve actually created is a web app capable of automatically upgrading itself – and we won’t need to worry about the app cache for the rest of the tutorial.

2. Building the actual app

So far we’ve kept the code very generic – at this point the app could feasibly go on to become a calculator, a list of train times, or even a game. We’re making a simple news app so we will need:-

  • A client side database to store articles downloaded from the RSS feed.
  • A way to update those articles.
  • A view listing all the articles.
  • A view to show each article on its own.

We’ll use a standard Model-View-Controller (MVC) approach to organise our code and try to keep it all as clean as possible. This will make testing and future development on it a lot easier.

With this in mind, we’ll make the following files:-

  • /source/database.js: Some simple functions to make using the client side (WebSQL) database easier.
  • /source/templates.js: The V in MVC. View logic will go in here.
  • /source/articles/article.js: The model for articles – in this case just some database functions.
  • /source/articles/articlescontroller.js: The controller for articles.
  • /api/articles/index.php: An API method for actually getting the news.

We will also need to make changes to api/resources/index.php and /source/application/applicationcontroller.js.

/source/database.js
The client side database technology which we will use to store article content will be WebSQL even though it is deprecated because its replacement, IndexedDB, is still not supported on iOS – our key target platform for the demo web app. We will cover how to support both IndexedDB and WebSQL in future posts.

APP.database = (function () {
    'use strict';

    var smallDatabase;

    function runQuery(query, data, successCallback) {
        var i, l, remaining;

        if (!(data[0] instanceof Array)) {
            data = [data];
        }

        remaining = data.length;

        function innerSuccessCallback(tx, rs) {
            var i, l, output = [];
            remaining = remaining - 1;
            if (!remaining) {

                // HACK Convert row object to an array to make our lives easier
                for (i = 0, l = rs.rows.length; i < l; i = i + 1) {
                    output.push(rs.rows.item(i));
                }
                if (successCallback) {
                    successCallback(output);
                }
            }
        }

        function errorCallback(tx, e) {
            alert("An error has occurred");
        }

        smallDatabase.transaction(function (tx) {
            for (i = 0, l = data.length; i < l; i = i + 1) {
                tx.executeSql(query, data[i], innerSuccessCallback, errorCallback);
            }
        });
    }

    function open(successCallback) {
        smallDatabase = openDatabase("APP", "1.0", "Not The FT Web App", (5 * 1024 * 1024));
        runQuery("CREATE TABLE IF NOT EXISTS articles(id INTEGER PRIMARY KEY ASC, date TIMESTAMP, author TEXT, headline TEXT, body TEXT)", [], successCallback);
    }

    return {
        open: open,
        runQuery: runQuery
    };
}());

This module has two functions that other modules can call:-

  • open will open (or create a new) 5MB* database and ensure a table called articles with some appropriate fields exists so that the app can store the articles for offline reading.
  • runQuery is just a simple helper method that makes running queries on the database a little simpler.

* See our article on offline storage for more details on database size limits.
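
Two steps inside runQuery are worth seeing in isolation: wrapping a flat parameter list into a batch of one, and copying the WebSQL result rows into an ordinary array. The sketch below pulls them out so they can run without a database. The names normalizeParams, rowsToArray and fakeResultSet are hypothetical, and Array.isArray is used where the module uses instanceof Array (the two behave the same for these inputs).

```javascript
// Sketch only: normalizeParams, rowsToArray and fakeResultSet are
// hypothetical names used to isolate two steps of runQuery.

// A flat parameter list becomes a batch containing one parameter list
function normalizeParams(data) {
    return Array.isArray(data[0]) ? data : [data];
}

// WebSQL result rows are read with rows.item(i); copy them into an array
function rowsToArray(resultSet) {
    var i, l, output = [];
    for (i = 0, l = resultSet.rows.length; i < l; i = i + 1) {
        output.push(resultSet.rows.item(i));
    }
    return output;
}

// Minimal stand-in for a WebSQL result set
function fakeResultSet(records) {
    return {
        rows: {
            length: records.length,
            item: function (i) { return records[i]; }
        }
    };
}

var noParams = normalizeParams([]);                 // [[]]: one statement, no parameters
var oneStatement = normalizeParams([42]);           // [[42]]: one statement, one parameter
var batch = normalizeParams([[1, 'a'], [2, 'b']]);  // unchanged: two statements
var rows = rowsToArray(fakeResultSet([{ id: 1 }, { id: 2 }]));
```

This is why a call such as runQuery("DELETE FROM articles", [], callback) still executes exactly one statement: the empty list is treated as a single, parameterless run of the query.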

/source/templates.js
We will keep all view or template type functions together here.

APP.templates = (function () {
    'use strict';

    function application() {
        return '<div id="window"><div id="header"><h1>FT Tech Blog</h1></div><div id="body"></div></div>';
    }

    function home() {
        return '<button id="refreshButton">Refresh the news!</button><div id="headlines"></div>';

    }

    function articleList(articles) {
        var i, l, output = '';

        if (!articles.length) {
            return '<p><i>No articles have been found, maybe you haven\'t <b>refreshed the news</b>?</i></p>';
        }
        for (i = 0, l = articles.length; i < l; i = i + 1) {
            output = output + '<li><a href="#' + articles[i].id + '"><b>' + articles[i].headline + '</b><br />By ' + articles[i].author + ' on ' + articles[i].date + '</a></li>';
        }
        return '<ul>' + output + '</ul>';
    }

    function article(articles) {

        // If the data is not in the right form, redirect to an error
        // and return early so we don't read properties of a missing article
        if (!articles[0]) {
            window.location = '#error';
            return '';
        }
        return '<a href="#">Go back home</a><h2>' + articles[0].headline + '</h2><h3>By ' + articles[0].author + ' on ' + articles[0].date + '</h3>' + articles[0].body;
    }

    function articleLoading() {
        return '<a href="#">Go back home</a><br /><br />Please wait&hellip;';
    }

    return {
        application: application,
        home: home,
        articleList: articleList,
        article: article,
        articleLoading: articleLoading
    };
}());

In this file we’ll just put some simple functions that (with as little logic as possible) generate HTML strings. The only slightly odd thing here is that the database.js runQuery function always returns an array of rows, even if you’re only expecting a single result. This means the APP.templates.article() function needs to accept an array that contains a single article to be compatible with that. A new method could easily be added to the database module which runs a query but only returns the first result, but for now this will do.
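
For what it’s worth, the single-result helper mentioned above might look something like the following. This is only a sketch: runQueryForFirstRow is a hypothetical name, and fakeRunQuery is a stand-in with the same callback shape as APP.database.runQuery so that the example runs without WebSQL.

```javascript
// Sketch only: runQueryForFirstRow and fakeRunQuery are hypothetical names.
function runQueryForFirstRow(runQuery, query, data, successCallback) {
    runQuery(query, data, function (rows) {
        // Pass the first row, or null when the query matched nothing
        successCallback(rows.length ? rows[0] : null);
    });
}

// Stand-in for APP.database.runQuery: "queries" a one-row table by id
function fakeRunQuery(query, data, successCallback) {
    var table = [{ id: 1, headline: 'Something has happened!' }];
    successCallback(table.filter(function (row) {
        return row.id === data[0];
    }));
}

var found, missing;
var sql = 'SELECT id, headline FROM articles WHERE id = ?';

runQueryForFirstRow(fakeRunQuery, sql, [1], function (row) { found = row; });
runQueryForFirstRow(fakeRunQuery, sql, [2], function (row) { missing = row; });
```

With a helper like this the article template could accept a single object (or null) instead of an array.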

As our app grows we might like to split this file up; the article functions could go into /source/articles/articlesview.js, for example.

/source/articles/article.js
This file will deal with communication between the article controller and the database.

APP.article = (function () {
    'use strict';

    function deleteArticles(successCallback) {
        APP.database.runQuery("DELETE FROM articles", [], successCallback);
    }

    function insertArticles(articles, successCallback) {
        var remaining = articles.length, i, l, data = [];

        if (remaining === 0) {
            return successCallback();
        }

        // Convert article array of objects to array of arrays
        for (i = 0, l = articles.length; i < l; i = i + 1) {
            data[i] = [articles[i].id, articles[i].date, articles[i].headline, articles[i].author, articles[i].body];
        }

        APP.database.runQuery("INSERT INTO articles (id, date, headline, author, body) VALUES (?, ?, ?, ?, ?);", data, successCallback);
    }

    function selectBasicArticles(successCallback) {
        APP.database.runQuery("SELECT id, headline, date, author FROM articles", [], successCallback);
    }

    function selectFullArticle(id, successCallback) {
        APP.database.runQuery("SELECT id, headline, date, author, body FROM articles WHERE id = ?", [id], successCallback);
    }

    return {
        insertArticles: insertArticles,
        selectBasicArticles: selectBasicArticles,
        selectFullArticle: selectFullArticle,
        deleteArticles: deleteArticles
    };
}());

There are complexities to be dealt with here:-

  • In this simple demo app, articles are passed around as objects (in the form var article = { headline: 'Something has happened!', author: 'Matt Andrews', … etc }). In order to insert an article of this form into WebSQL, it’ll need to be converted into an array – which is what happens on line 17.
  • Because WebSQL is really slow (sometimes even slower than the network), when we’re selecting all of the articles for listing on our app’s home page we don’t want to select the article body (as this is the largest part of each article) which is why there are two functions with different queries for selecting articles, selectBasicArticles (plural) and selectFullArticle.
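
The object-to-array conversion is the kind of code where the column order can quietly drift out of step with the SQL. One way to keep them together, sketched here under the assumption that a single list of column names drives both (COLUMNS, toRow and insertStatement are hypothetical names, and the article values are made-up sample data):

```javascript
// Sketch only: COLUMNS, toRow and insertStatement are hypothetical names.
var COLUMNS = ['id', 'date', 'headline', 'author', 'body'];

// Pick the values out of an article object in column order
function toRow(article) {
    return COLUMNS.map(function (column) {
        return article[column];
    });
}

// Build the matching INSERT statement with one placeholder per column
function insertStatement() {
    var placeholders = COLUMNS.map(function () { return '?'; });
    return 'INSERT INTO articles (' + COLUMNS.join(', ') + ') VALUES (' + placeholders.join(', ') + ');';
}

// The property order of the object does not matter; the row follows COLUMNS
var row = toRow({
    headline: 'Something has happened!',
    author: 'Matt Andrews',
    id: 1,
    date: 'Sat, 14 Jul 2012 09:00:00 +0000',
    body: '<p>Details to follow.</p>'
});
var statement = insertStatement();
```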

/source/articles/articlescontroller.js
Now create the articles’ controller.

APP.articlesController = (function () {
    'use strict';

    function showArticleList() {
        APP.article.selectBasicArticles(function (articles) {
            $("#headlines").html(APP.templates.articleList(articles));
        });
    }

    function showArticle(id) {
        APP.article.selectFullArticle(id, function (article) {
            $("#body").html(APP.templates.article(article));
        });
    }

    function synchronizeWithServer(failureCallback) {
        $.ajax({
            dataType: 'json',
            url: 'api/articles',
            success: function (articles) {
              APP.article.deleteArticles(function () {
                  APP.article.insertArticles(articles, function () {
                    /*
                     * Instead of the line below we *could* just run showArticleList() but since
                     * we already have the articles in scope we needn't make another call to the
                     * database and instead just render the articles straight away.
                     */
                    $("#headlines").html(APP.templates.articleList(articles));
                  });
              });
            },
            type: "GET",
            error: function () {
                if (failureCallback) {
                    failureCallback();
                }
            }
        });
    }

    return {
        synchronizeWithServer: synchronizeWithServer,
        showArticleList: showArticleList,
        showArticle: showArticle
    };
}());

The article controller will be responsible for:-

  • Instructing the model to pull article(s) out of the database, and for passing the returned data to the view so that it can be displayed on screen. (#4 and #10)
  • Synchronising the articles in the database with the latest articles from the RSS feed. This works by:-
    • Using jQuery’s .ajax method, it first downloads the latest articles from the RSS feed (formatted using JSON).
    • If that download successfully completes, it runs the APP.article.deleteArticles function to clear the database of any articles that are currently stored.
    • Then it uses the APP.article.insertArticles function to push the articles that have just been downloaded into the database.
    • Finally, it uses jQuery and a call to the templates module to display a list of those articles’ headlines.
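
The ordering of those steps matters: nothing is deleted until the download has succeeded, so a failed refresh leaves the previously stored articles alone. The sketch below replays the sequence with stand-ins so that the order of calls can be seen. The synchronize function, fakeModel and the two download functions are hypothetical; in the real controller the download is the $.ajax call and the model is APP.article.

```javascript
// Sketch only: synchronize, fakeModel and the download stubs are hypothetical.
function synchronize(download, model, render, failureCallback) {
    download(function (articles) {
        // Only touch the stored articles once the download has succeeded
        model.deleteArticles(function () {
            model.insertArticles(articles, function () {
                render(articles);
            });
        });
    }, failureCallback);
}

// Record every step so we can inspect the order afterwards
var log = [];
var fakeModel = {
    deleteArticles: function (done) { log.push('delete'); done(); },
    insertArticles: function (articles, done) { log.push('insert ' + articles.length); done(); }
};
function render(articles) { log.push('render ' + articles.length); }
function failed() { log.push('failed'); }

// A download that works...
synchronize(function (success) { success([{ id: 1 }, { id: 2 }]); }, fakeModel, render, failed);

// ...followed by one that fails: no delete, no insert, just the failure callback
synchronize(function (success, failure) { failure(); }, fakeModel, render, failed);
```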

/api/articles/index.php
This file will download then parse an RSS feed (using xpath). It will then strip out all the HTML tags from each article’s body (except for <p>’s and <br>’s) and output this information using json_encode.

<?php
// Convert RSS feed to JSON, stripping out all but basic HTML
$rss = new SimpleXMLElement(file_get_contents('http://feeds2.feedburner.com/ft/tech-blog'));
$xpath = '/rss/channel/item';
$items = $rss->xpath($xpath);

// Start with an empty list so we still output valid JSON ([]) if the feed has no items
$output = array();

if ($items) {
  foreach ($items as $id => $item) {

    // This will be encoded as an object, not an array, by json_encode
    $output[] = array(
      'id' => $id + 1,
      'headline' => strval($item->title),
      'date' => strval($item->pubDate),
      'body' => strval(strip_tags($item->description,'<p><br>')),
      'author' => strval($item->children('http://purl.org/dc/elements/1.1/')->creator)
    );
  }
}

echo json_encode($output);

Although we’ve finished adding all the new files we’re not quite done yet.

/api/resources/index.php
We need to update the resource compiler to let it know the locations of our newly added Javascript files, so /api/resources/index.php becomes:-

<?php
// Concatenate the files in the /source/ directory
// This would be a sensible point to compress your Javascript.
$js = '';
$js = $js . 'var APP={}; (function (APP) {';
$js = $js . file_get_contents('../../source/application/applicationcontroller.js');
$js = $js . file_get_contents('../../source/articles/articlescontroller.js');
$js = $js . file_get_contents('../../source/articles/article.js');
$js = $js . file_get_contents('../../source/database.js');
$js = $js . file_get_contents('../../source/templates.js');
$js = $js . '}(APP));';
$output['js'] = $js;

// Concatenate the files in the /css/ directory
// This would be a sensible point to compress your css
$css = '';
$css = $css . file_get_contents('../../css/global.css');
$output['css'] = $css;

// Encode with JSON (PHP 5.2.0+) & output the resources
echo json_encode($output);

/source/application/applicationcontroller.js
And finally we will need to update applicationcontroller.js so that all the new functions we’ve added can actually be used by our users.

APP.applicationController = (function () {
    'use strict';

    function offlineWarning() {
        alert("This feature is only available online.");
    }

    function pageNotFound() {
        alert("That page you were looking for cannot be found.");
    }

    function showHome() {
        $("#body").html(APP.templates.home());

        // Load up the last cached copy of the news
        APP.articlesController.showArticleList();

        $('#refreshButton').click(function () {

            // If the user is offline, don't bother trying to synchronize
            if (navigator && navigator.onLine === false) {
                offlineWarning();
            } else {
                APP.articlesController.synchronizeWithServer(offlineWarning);
            }
        });
    }

    function showArticle(id) {
        $("#body").html(APP.templates.articleLoading());
        APP.articlesController.showArticle(id);
    }

    function route() {
        var page = window.location.hash;
        if (page) {
            page = page.substring(1);
            if (parseInt(page, 10) > 0) {
                showArticle(page);
            } else {
                pageNotFound();
            }
        } else {
            showHome();
        }
    }


    // This is to our webapp what main() is to C, $(document).ready is to jQuery, etc
    function start(resources, storeResources) {
        APP.database.open(function () {

            // Listen to the hash tag changing
            $(window).bind("hashchange", route);

            // Inject CSS Into the DOM
            $("head").append("<style>" + resources.css + "</style>");

            // Create app elements
            $("body").html(APP.templates.application());

            // Remove our loading splash screen
            $("#loading").remove();

            route();
        });

        if (storeResources) {
          localStorage.resources = JSON.stringify(resources);
        }
    }

    return {
        start: start
    };
}());

(Working from bottom to top) this file will handle the following functionality:-

  • On APP.applicationController.start():-
    • Start listening for changes in the hash tag and when a change is detected, run the route function – more on this below.
    • Inject the CSS into the DOM, create initial app elements (as before, but we’ve moved the HTML string into the templates.js file).
    • Remove the loading splash screen, as before.
    • Run the route function.
  • The route function will get the current hash tag:-
    • If it is blank run the showHome function.
    • If it isn’t, remove the first character (as it will always be “#”) – then if it’s a positive integer assume it’s an article and try to load the article with that ID number by calling showArticle(id).
    • If it’s neither blank nor a positive integer, display a friendly Page not found message to the user.
  • Finally, showHome and showArticle(id) will put some basic HTML into the page and call the articlesController’s showArticleList and showArticle(id) functions, respectively. The showHome function also sets up the event listener so that the refresh button triggers the articlesController’s synchronizeWithServer [sic] method.
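
The routing rules above are easier to test when the decision is separated from the side effects. As a sketch (parseRoute is a hypothetical name; the real route function calls showHome, showArticle or pageNotFound directly instead of returning a value):

```javascript
// Sketch only: parseRoute is a hypothetical, side-effect-free version
// of the decision made inside route().
function parseRoute(hash) {
    var page;

    // No hash at all: show the home page
    if (!hash) {
        return { view: 'home' };
    }

    // Drop the leading "#", then apply the same test as route()
    page = hash.substring(1);
    if (parseInt(page, 10) > 0) {
        return { view: 'article', id: page };
    }

    return { view: 'notFound' };
}

var home = parseRoute('');
var article = parseRoute('#3');
var error = parseRoute('#error');
var zero = parseRoute('#0');
```

One thing the sketch makes visible: parseInt ignores trailing characters, so a hash like #12abc passes the test. A stricter check (for example a regular expression that only accepts digits) would be a reasonable refinement.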

Ideas for further development

  • We’ve broken the web – it won’t work without Javascript switched on.
  • We’ve broken search engines – there’s no crawlable content.
  • We’ve not considered accessibility.
  • We’re passing all the rendering of the page to be done on the client (potentially an antique mobile telephone) when we have an entire web server (and cache) at our disposal.
  • It doesn’t feel like an app. You may have noticed on certain touch devices, the links aren’t very responsive – there is a 300ms delay between tapping and anything happening. Nor is there any swiping, flicking or pinching.
  • It doesn’t look like an app either – e.g. it isn’t optimized to “my” device’s screen size…
  • Images currently won’t work offline.
  • Specific improvements we could make to the bootstrap:-
    • In this demo news app, we download all our CSS and Javascript and process that information (JSON decoding then re-encoding it, saving it to local storage) each time we load the app. This could be made more efficient by giving the resources we download a version number. If we did this, the demo app could first check if it has the latest version already and skip the download stage if nothing has changed.
    • This still forces the user to wait for the server to respond before the demo app will start when they are online. Instead, the demo app could boot with the code it has already – then only start using the new code the next time it is launched. This is how the FT web app works.
  • Currently the demo app doesn’t gracefully fail when iOS is in ‘private browsing’ mode – it just throws an error. This is caused by an iOS bug, which we could easily detect and work around.
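
The two bootstrap improvements can be combined: start straight away with the stored code, then ask the server whether anything newer exists and keep it for the next launch. The sketch below is one possible shape for that, not a description of how the FT web app implements it. bootStoredFirst, fetchLatest and the version field are all hypothetical, a plain object stands in for localStorage, and the fake server answers synchronously to keep the example short.

```javascript
// Sketch only: bootStoredFirst, fetchLatest and the version field
// are hypothetical names, not part of the demo app.
function bootStoredFirst(storage, fetchLatest, run) {
    var stored = storage.resources ? JSON.parse(storage.resources) : null;

    if (stored) {
        // Start immediately with the code we already have
        run(stored);
    }

    // Ask the server for anything newer than the stored version
    fetchLatest(stored ? stored.version : null, function (latest) {
        if (!latest) {
            return; // nothing newer: skip the download and the re-encoding
        }
        storage.resources = JSON.stringify(latest);
        if (!stored) {
            run(latest); // first ever launch: there was nothing to start with
        }
        // Otherwise the new code is picked up on the next launch
    });
}

// Stand-ins: the server has version 2, the device has version 1
var launches = [];
var fakeStorage = { resources: JSON.stringify({ version: 1, js: '' }) };

function fakeFetchLatest(currentVersion, callback) {
    var serverCopy = { version: 2, js: '' };
    callback(currentVersion === serverCopy.version ? null : serverCopy);
}

function recordLaunch(resources) {
    launches.push(resources.version);
}

bootStoredFirst(fakeStorage, fakeFetchLatest, recordLaunch); // runs version 1, stores version 2
bootStoredFirst(fakeStorage, fakeFetchLatest, recordLaunch); // runs version 2, nothing new to store
```

Note that this sketch stores new code before it has ever been run, so a real implementation would still need some way to discard a stored version that fails to start.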

Wrapping Up

Clearly our demo web app leaves a lot of room for improvement. However, by organising our code in a clean and structured way, we’ve created a platform that almost any kind of application could be built upon. And by using a short script (which we called the bootstrap) to download and eval the application’s code, we don’t need to worry about dealing with the app cache’s problems. This leaves us free to get on with building great web applications.

Finally, if you think you’d like to work on this sort of thing and live (or would like to live) in London, we’re hiring!

By Matt Andrews – @andrewsmatt on Twitter & Weibo.
Also available in Serbo-Croatian thanks to Jovana Milutinovich.

Continue to part 2 – Going cross platform with an FT style web app