HTTP/3, UDP and a faster, more seamless Internet

Why is this interesting?

The few readers who have hitherto wandered to this blog, plagued as it is by my obsessional interest in either the eye-wateringly technical details of software installation or else the equally obscure details of Celtic languages, may find it refreshing to hear that this post is going to explain why the new HTTP/3 standard is going to make the Internet much better – mostly by being a LOT faster.

By way of a taster, here is the first HTTP/3 response header that I received back from my own test server:

HTTP/3 200
server: h2o
content-length: 193
date: Fri, 01 Nov 2019 10:32:48 GMT
content-type: text/html
last-modified: Tue, 29 Oct 2019 09:15:07 GMT
etag: "5db8031b-c1"
accept-ranges: bytes
alt-svc: h3-23=":8081"

(For those who are interested in in-depth instructions on how to set up HTTP/3 on the h2o web server, please see the updated version of my previous post here.)

You might have heard that HTTP/2 came out in 2015; it contains a raft of optimisations and technical improvements that make it far faster than the aged HTTP/1.1 (1997, revised 1999), itself a revision of the now archaic HTTP/1.0 (1996) and, practically back at the dawn of time, the retrospectively named HTTP/0.9 (1991)*. The latter two are of largely historical interest by now, and many command-line tools don't even bother to support them: you would have to be a very pedantic (or specialist) person indeed to want to distinguish them from HTTP/1.1. So why, you may ask, would I make such a fuss about HTTP/3, just another improvement? Why should I care how or why it works, if it does work anyway?

Well, it is NOT "just another improvement". It fundamentally changes the way that the Internet works by doing something very unexpected. Up to now, most reliable transfers of data have been made using the ageing Transmission Control Protocol (TCP), which dates from the publication of RFC 675 back in 1974. That was a while ago, it must be said, particularly in computing terms: the UNIX Epoch, widely taken as the start of the era of modern computing, dates from 1 January 1970. (That date is often the default for files that haven't got one of their own or have lost it somehow, in case you were wondering where you might have seen it before.)

A view of the Chrome Canary Developer Tools showing HTTP/3 as “http2+quic/99”

(*HTTP/0.9 was essentially HTTP/1.0 without any headers. It had no version name at the time.)

What is HTTP/3, then?

Instead of TCP, HTTP/3 runs over the User Datagram Protocol (UDP), which is blazingly fast but has, until now, been quite unsuitable for sending information that must arrive complete and in the correct order. UDP is great for short messages, for streaming, for game assets and so forth. Its great failing is that it does not control the order in which packets (the discrete chunks of data that make up everything you or anybody else send over the Internet) arrive, and it does nothing to reassemble them afterwards. Meanwhile, since the browser makers decided to prefer HTTPS at the time HTTP/2 was adopted in 2015, something like 80% of web traffic is now encrypted using steadily improving versions of TLS (more commonly known by the name of its now defunct predecessor, SSL) and is thus a great deal more secure. If you thought it was a problem for plain HTTP that messages broken into little pieces and reassembled in the wrong order would create chaos (which is why we used TCP instead of UDP), how much worse would it be over HTTPS?

The answer is that it would be hopeless. If you encrypt something, you cannot guess at what its parts are until they are decrypted again, so you have to know the order of the packets or nothing works. On its own, UDP is a complete waste of time for things like the World Wide Web, email and nearly all the other applications people use on the Internet: it works for some important aspects of gaming, for example, but you still need TCP for anything that has to arrive organised and intact, and for the rest of the game (like any program) to work. Unassisted, UDP is disastrous for encryption or big data, for anything that really must not be mangled and rearranged at random in transit.

Into the breach steps QUIC. To cut a long story and a tedious list of technical descriptions short, it is in effect a completely re-engineered, TCP-like transport that runs over UDP and so benefits from its speed. It makes sure the packets arrive in order. It carries on seamlessly when you switch from your wi-fi connection to your mobile data connection and back again, where your phone or laptop would otherwise complain that the connection had been interrupted. HTTP/2 and earlier will be kept over TCP as fallbacks for the foreseeable future. HTTP/3 uses QUIC to make sure it behaves just as it would over TCP – only a LOT faster.
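If you want to see the hand-over for yourself, the mechanism is visible in ordinary response headers: a server that speaks HTTP/3 first advertises it over TCP with an Alt-Svc header, and a QUIC-capable client then retries the same origin over UDP. A minimal sketch with curl (the hostname is a placeholder and the advertised draft number will vary):

curl -sI https://example.com/ | grep -i alt-svc
# e.g. alt-svc: h3-23=":443"; ma=86400
# "h3-23" names the HTTP/3 draft the server supports; "ma" is how long, in
# seconds, the client may remember the advertisement before asking again.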

What are the benefits?

It may be convenient that your video keeps playing as you move from your wi-fi to your phone’s data connection. But we live in a world of big data. What does that mean?

The Internet services and sites that we depend on move a LOT of data. Optimising that by even a few per cent makes new technology possible; optimising it by, say, 20% would make things dramatically faster. I haven't run any benchmarks of my own, because HTTP/3 is so cutting-edge that there is almost no information on the Internet to date beyond complex instructions for compiling servers and tools, but some very clever programmers have, and it is that sort of substantial difference in speed and effectiveness that they report. Nor should that surprise anyone: our understanding of transmission protocols has moved on rather a long way between 1974 and 2019, some 45 years of development.

Imagine you are a national library. Imagine you are a physicist with a huge data set. Imagine you are Facebook or Google with massive amounts of our data to move about. Then imagine, as an ordinary user, how much BIGGER the amounts of data are that you (or your phone) could move in a shorter space of time: the things that make your videos work, that could make video conferencing far better and the quality of video calls close to perfect. The extent to which web technologies can take advantage of this to build apps and services that we have not hitherto thought practically achievable at scale should not be underestimated.

Will it be adopted?

Yes. HTTP/2 went from being unknown to being very widely deployed within months, faster than any previous major version had been adopted. There are fewer than a dozen known HTTP/3 test servers, yet Facebook already provides access to its entire site over it, despite the protocol being experimental and no browser having enabled it by default yet. Browsers get updated every other week, so this functionality WILL soon be in them. Cloudflare have enabled it on ALL their servers for ALL their customers. That is a LOT of customers. They drove HTTP/2 adoption and much more. They use Nginx (a major web server, arguably the most significant since Apache, responsible for a huge slice of the modern Internet), which already has test HTTP/3 functionality; in all likelihood this will be rolled out to server administrators within the next year or so.

Watch this space. HTTP/3 is a much, much bigger change than HTTP/2 was. I've got a test server because I am nerdy enough to want one, but soon everybody will be doing it. The big companies know it, because for them it means money. You will just see things get faster and perhaps forget why, but lots of new technology will become possible simply because we can move data around faster and more seamlessly.


Compiling and administering the h2o web server

Original version posted on 2015-10-25 10:26 GMT

Update: 2019-11-01 edited 2019-11-04

Although one does not now need to compile h2o for Debian or Ubuntu, it has recently come to my attention that test servers for experimental HTTP/3 support (over QUIC and UDP in place of TCP) have become available: among these is h2o but you currently need to compile the latest version rather than install the packaged one. Consequently, it seemed a good moment to make some updates and corrections to these instructions. There are two main ways to see HTTP/3 responses in action:

(1) Firstly, you can use the latest nightly developer version of Google Chrome Canary with the command-line arguments --enable-quic --quic-version=h3-23, as described by Cloudflare in their recent blog post, and then open the developer tools and add the Protocol column under the Network tab. For the moment this will masquerade as "http2+quic/99" but it is really HTTP/3.
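On macOS, for instance, the whole invocation looks something like this (a hedged sketch: the path is the standard Canary install location and the draft number must match whatever your server advertises):

"/Applications/Google Chrome Canary.app/Contents/MacOS/Google Chrome Canary" \
    --enable-quic --quic-version=h3-23
# Then open the Developer Tools, switch to the Network tab and add the
# Protocol column to see "http2+quic/99" against each request.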

A view of the Chrome Canary Developer Tools showing HTTP/3 as “http2+quic/99”

(2) Alternatively, for a clearer command-line response, you can compile cURL with Quiche and BoringSSL in order to make a request using HTTP/3. You can also compile it with ngtcp2, using nghttp3 and a patched version of OpenSSL, but I found this problematic, and I have not yet been able to get it working properly on ARM, e.g. the Raspberry Pi. Even so, an instruction similar to the following successfully returned my first HTTP/3 header:

$ curl --http3 https://myserver.net:4433/ -I -k
HTTP/3 200
server: h2o
content-length: 193
date: Fri, 01 Nov 2019 10:32:48 GMT
content-type: text/html
last-modified: Tue, 29 Oct 2019 09:15:07 GMT
etag: "5db8031b-c1"
accept-ranges: bytes
alt-svc: h3-23=":8081"

Note (2019-11-02): HTTP/3 test end point now available on h2o

Interestingly for the adoption of this new standard, Cloudflare are backing it and Facebook have enabled HTTP/3, alongside the other test servers listed here. You can also compile a patched version of Nginx following these instructions by Cloudflare (who use it to deliver their proxy service to customers), but I haven't yet tried that, because I use Nginx in production on all my servers and would rather not experiment there.

Update: 2018-10-18

There is no longer any need to compile h2o for Debian, since you can install the packages. I was using Debian 9 when I originally tried this; I have since also succeeded on Ubuntu 19.10.

There are now file includes using the !file directive, but these don't work with wildcards, so you can't include a whole folder like sites-enabled. The method that I outlined below, concatenating the enabled sites into a temporary file, is therefore still required as things stand if you want to administer h2o in the way you would Apache or Nginx.
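The work-around, set out in full in the init script further down, boils down to a single concatenation before each start-up:

# Stitch the global config and every enabled virtual host into one temporary
# file, then point h2o at the result (this is exactly what the init script does).
cat /etc/h2o/h2o.conf /etc/h2o/sites-enabled/* > /tmp/h2o.conf
/usr/local/bin/h2o -c /tmp/h2o.conf -m daemon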

Because you can’t define how .php files will be handled per directory in the way that you can in Nginx, there is no way to have HHVM fall back to php7.0-fpm when it falls over by catching 502 errors. All you can do in h2o is define custom error documents, and these cannot come from arbitrary locations on the server but must live somewhere in the web root or the folders below it.

Update: since HHVM ceased to support PHP, the struck-out section above is no longer especially relevant, although you could of course use the same approach to provide support for Hack. You can also provide support for Python, Perl and others, as with Nginx.

I created custom error pages using PHP so that I could produce output like the Nginx error page, including the server software header, but that is rather pointless in the case of 502 Bad Gateway, since that usually happens when PHP itself has fallen over, in which case you would see h2o's plain "Internal Server Error" response anyway! I did it like this:

<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center><?=$_SERVER['SERVER_SOFTWARE'];?></center>
</body>
</html>
<!-- a padding to disable MSIE and Chrome friendly error page -->
<!-- a padding to disable MSIE and Chrome friendly error page -->
<!-- a padding to disable MSIE and Chrome friendly error page -->
<!-- a padding to disable MSIE and Chrome friendly error page -->
<!-- a padding to disable MSIE and Chrome friendly error page -->
<!-- a padding to disable MSIE and Chrome friendly error page -->

Background

Installing h2o no longer (2018-10-18) requires compilation from source unless you wish to use HTTP/3 (2019-11-01). Compiling is probably not for the faint-hearted, but it is surprisingly achievable if you are comfortable researching software dependencies as problems arise. There are no guarantees, and you may need to trace errors caused by issues specific to the configuration or installed software on your server. I succeeded using Ubuntu 14.04 Trusty Tahr LTS, while I did not at first succeed on Debian 8 Jessie, although in principle it should be perfectly possible. My latest successful attempt (2019-11-01) was on Ubuntu 19.10 Eoan Ermine.

As previously with Nginx, it is comparatively easy to use h2o with HHVM or PHP-FPM via FastCGI in order to provide PHP support. You can use ps-watcher to increase the reliability of HHVM by bringing it back up if it falls over, as I described in my previous post about HHVM with Nginx. It is not possible, apparently, to provide automatic fallback to PHP-FPM. I’ve only been able to set up one or the other. However, I don’t think this is a major disadvantage. (See note above about HHVM no longer supporting PHP.)

What is a little more involved, though simple enough in principle, is setting up h2o to start up and operate in a standard way, using distributed config files for virtual hosts as packaged servers such as Apache and Nginx do. This is complicated by the fact that the language chosen for configuration, YAML, does not fully [edited 2018-10-21] support include statements: the custom !file directive is now available, but it still does not allow wildcard * includes. YAML is a programmer's choice rather than a good systems-administration choice if we are going to be purist in refusing include statements because they are not part of YAML. We can achieve a similar effect using /etc/init.d scripts, however, and I present a practical work-around that I have created to do so below, following the installation instructions.

Kazuho Oku’s amazing work on this new-generation HTTP/2 and now HTTP/3 server has produced a blazingly fast, efficient piece of software. Now is the time for work to make it more usable in practice.

Unless you want experimental HTTP/3 support (2019-11-01), jump on to “Configure h2o with distributed config files for virtual hosts” below, since the compiling steps are no longer required (2018-10-18) if you install the package normally in Debian or Ubuntu.

Installing h2o from source

Update: as of 2019-11-01 the packaged version of libuv-dev will do the job, so you can skip compiling it and simply install it with:

sudo apt install libuv-dev

First of all, we need to compile libuv 1.x, because the packaged libuv-dev does not currently meet the version that h2o requires. So we must first make sure that the package is uninstalled:

sudo apt-get remove libuv-dev

If you don’t already have the general compilation tools, install them now; we will need them for several other steps along the way as well:

sudo apt-get install libtool automake make

Now we can get on and do the job. (You’ll need unzip of course if it’s not installed.) Good luck!

wget https://github.com/libuv/libuv/archive/v1.x.zip
sudo apt-get install unzip
unzip v1.x.zip
cd libuv-1.x          # the archive unpacks into this directory
sh autogen.sh         # generates ./configure (uses the autotools installed above)
./configure
make
sudo make install
sudo ldconfig         # make the new library in /usr/local/lib visible to the linker
cd

If this has succeeded, we must now install wslay as follows:

INSTALL DEPENDENCIES

sudo apt install libcunit1 libcunit1-dev nettle-dev

THEN EITHER

wget https://github.com/tatsuhiro-t/wslay/archive/master.zip
unzip master.zip
cd wslay-master

OR

git clone https://github.com/tatsuhiro-t/wslay.git
cd wslay

THEN

autoreconf -i
automake
autoconf
./configure
make
sudo make install

Now, if you want to compile h2o with mruby, which is used for custom scripted processing of requests in the h2o configuration, we must compile that as well. It needs various additional tools first, as you'll notice in the first line:

INSTALL DEPENDENCIES

sudo apt-get install ruby gcc bison clang

THEN EITHER

wget https://github.com/mruby/mruby/archive/1.1.0.tar.gz
tar xvf 1.1.0.tar.gz
cd mruby-1.1.0

OR

git clone https://github.com/mruby/mruby.git
cd mruby

THEN

make
sudo make install
sudo cp build/host/lib/libmruby.a /usr/local/lib/
sudo cp build/host/lib/libmruby_core.a /usr/local/lib/
sudo cp -R include/mr* /usr/local/include/
cd

If you have got this far, it's time to compile h2o itself. You will also need cmake if it is not already installed (sudo apt-get install cmake):

EITHER

wget https://github.com/h2o/h2o/archive/v1.5.2.tar.gz
sudo tar xvf v1.5.2.tar.gz
cd h2o-1.5.2

OR

git clone https://github.com/h2o/h2o.git
cd h2o

THEN
cmake -DWITH_BUNDLED_SSL=on -DWITH_MRUBY=ON .
make
sudo make install
cd

If you don’t want mruby for any reason, you can leave out -DWITH_MRUBY=ON above and don’t need to compile it either.

I really hope that this has worked for you. Now you should have h2o installed. It’s time to set it up.
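A quick sanity check before moving on (when built from source, the binary lands in /usr/local/bin):

h2o --version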

Configure h2o with distributed config files for virtual hosts

First create /etc/h2o/h2o.conf as follows. The additions for HTTP/3 are the extra listen block (which reuses the SSL settings with type: quic) and the Alt-Svc header; they are marked with comments in the file below.

# H2O config file -- /etc/h2o/h2o.conf
# to find out the configuration commands, run: h2o --help

server-name: "h2o"
user: www-data
access-log: "|rotatelogs -l -f -L /var/log/h2o/access.log -p /usr/share/h2o/compress_logs /var/log/h2o/access.log.%Y-%m-%d 86400"
error-log: "|rotatelogs -l -f -L /var/log/h2o/error.log -p /usr/share/h2o/compress_logs /var/log/h2o/error.log.%Y-%m-%d 86400"
#error-log: /var/log/h2o/error.log
#access-log: /var/log/h2o/access.log
#access-log: /dev/stdout

pid-file: /tmp/h2o.pid # must match the pid location expected by the init.d script below
listen: 80
listen: &ssl_listen
  port: 443
  ssl:
    certificate-file: /etc/ssl/certs/server.crt
    key-file: /etc/ssl/private/server.key
    #cipher-suite: ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-ECDSA-AES256-SHA384
    minimum-version: TLSv1.2
    cipher-suite: "ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256"
    # Oldest compatible clients: Firefox 27, Chrome 30, IE 11 on Windows 7, Edge, Opera 17, Safari 9, Android 5.0, and Java 8
    # see: https://wiki.mozilla.org/Security/Server_Side_TLS
# The following three lines enable HTTP/3
listen:
  <<: *ssl_listen
  type: quic # Doesn't work in h2o 2.2.5 hence no http/3 available
# See security issue https://www.mozilla.org/en-US/security/advisories/mfsa2015-44/
header.set: "Alt-Svc: h3-23=\":443\""

expires: 1 year
file.dirlisting: off
file.send-gzip: on
limit-request-body: 1024
#num-threads: 4

file.mime.addtypes:
  application/atom+xml: .xml
  application/zip: .zip

header.set: "strict-transport-security: max-age=39420000; includeSubDomains; preload"
#header.set: "content-security-policy: default-src 'none';style-src 'unsafe-inline';img-src https://example.com data: ;"
header.set: "x-frame-options: deny"

file.custom-handler:                  # handle PHP scripts using php-cgi (FastCGI mode)
  extension: .php
  fastcgi.connect:
    #port: /var/run/hhvm/hhvm.sock
    #type: unix
    port: 9000
    type: tcp
    #port: /run/php/php7.3-fpm.sock
    #type: unix

file.index: [ 'index.php', 'index.html' ]

hosts:
  "0.0.0.0:80":     
    #enforce-https: on                                     
    paths:
      /:
        #file.dir: /usr/share/h2o/examples/doc_root.alternate
        file.dir: /var/www/default
      #/backend:
        #proxy.reverse.url: http://127.0.0.1:8080/
        #fail: 
    #access-log: /dev/stdout
  "0.0.0.0:443":
    #enforce-https: on
    listen:
      port: 443
      ssl:
        certificate-file: /etc/ssl/certs/server.crt
        key-file: /etc/ssl/private/server.key
    paths:
      /:
        #file.dir: /usr/share/h2o/examples/doc_root.alternate
        file.dir: /var/www/default
      #/backend:
        #proxy.reverse.url: http://127.0.0.1:8080/
    #access-log: /dev/stdout

We will create /var/www/default containing nothing at all, so that the server has something secure and predictable to fall back to if someone reaches it directly by IP address. You should always do this with any web server, including Nginx and Apache; it is good systems administration. It is better than using /var/www itself, because that directory contains the other web roots, and a fallback there could expose them if the paths are known.

sudo mkdir /var/www/default
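Optionally, give it an empty index page so that the fallback host returns a clean blank page rather than an error (an extra step of my own, not strictly required):

sudo sh -c 'echo > /var/www/default/index.html'
sudo chown -R www-data:www-data /var/www/default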

Now we are going to create /etc/h2o/sites-available and /etc/h2o/sites-enabled following the pattern for Apache and Nginx. We will use these in the next section.

sudo mkdir /etc/h2o/sites-available
sudo mkdir /etc/h2o/sites-enabled

Now we must create an example virtual host e.g. /etc/h2o/sites-available/example.com as follows:

  "example.com:80":
    #enforce-https: on
    paths:
      /:
        file.dir: /var/www/example.com
    #access-log: /dev/stdout
    header.set: "content-security-policy: default-src 'none';style-src 'unsafe-inline';img-src https://example.com data: ;"
  "example.com:443":
    #enforce-https: on
    listen:
      port: 443
      ssl:
        certificate-file: /etc/ssl/certs/server.crt
        key-file: /etc/ssl/private/server.key
    paths:
      /:
        file.dir: /var/www/example.com
    #access-log: /dev/stdout
    header.set: "content-security-policy: default-src 'none';style-src 'unsafe-inline';img-src https://example.com data: ;"

Now we also need to create the web root and link the config file into sites-enabled.

sudo mkdir /var/www/example.com
sudo ln -s /etc/h2o/sites-available/example.com /etc/h2o/sites-enabled/example.com

Configure start-up using init.d

I have not yet succeeded with upstart or systemd, so I have used the old-fashioned sysvinit system, which still exists in most Linux distributions and is what Ubuntu 14.04 relies on for major software packages, including web servers such as Apache and Nginx.

We will now create /etc/init.d/h2o as follows:

#!/bin/sh

### BEGIN INIT INFO
# Provides:          h2o
# Required-Start:    $local_fs $remote_fs $network $syslog
# Required-Stop:     $local_fs $remote_fs $network $syslog
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: starts the h2o web server
# Description:       starts h2o using start-stop-daemon
### END INIT INFO

PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
RUN_DIR=/tmp
DAEMON=/usr/local/bin/h2o
DAEMON_OPTS='-c /tmp/h2o.conf -m daemon'
NAME=h2o
DESC=h2o

# Include h2o defaults if available
if [ -f /etc/default/h2o ]; then
	. /etc/default/h2o
fi

test -x $DAEMON || exit 0

set -e

. /lib/lsb/init-functions

case "$1" in
	start)
		echo -n "Starting $DESC: "
		# Check if the ULIMIT is set in /etc/default/h2o
		if [ -n "$ULIMIT" ]; then
			# Set the ulimits
			ulimit $ULIMIT
		fi
	        rm -f $RUN_DIR/h2o.conf
        	cat /etc/h2o/h2o.conf /etc/h2o/sites-enabled/* > $RUN_DIR/h2o.conf
		$DAEMON $DAEMON_OPTS
		echo "$NAME."
		;;

	stop)
		echo -n "Stopping $DESC: "
		kill -TERM `cat $RUN_DIR/h2o.pid`
		echo "$NAME."
		;;

	restart|force-reload)
		echo -n "Restarting $DESC: "
		if [ -f $RUN_DIR/h2o.pid ]; then
			kill -TERM `cat $RUN_DIR/h2o.pid`
		fi
		sleep 1
		# Check if the ULIMIT is set in /etc/default/h2o
		if [ -n "$ULIMIT" ]; then
			# Set the ulimits
			ulimit $ULIMIT
		fi
	        rm -f $RUN_DIR/h2o.conf
        	cat /etc/h2o/h2o.conf /etc/h2o/sites-enabled/* > $RUN_DIR/h2o.conf
		$DAEMON $DAEMON_OPTS
		echo "$NAME."
		;;

        reload)
                echo -n "Reloading $DESC: "
                if [ -f $RUN_DIR/h2o.pid ]; then
                        kill -TERM `cat $RUN_DIR/h2o.pid`
                fi
                sleep 1
                # Check if the ULIMIT is set in /etc/default/h2o
                if [ -n "$ULIMIT" ]; then
                        # Set the ulimits
                        ulimit $ULIMIT
                fi
                rm -f $RUN_DIR/h2o.conf
                cat /etc/h2o/h2o.conf /etc/h2o/sites-enabled/* > $RUN_DIR/h2o.conf
                $DAEMON $DAEMON_OPTS
                echo "$NAME."
                ;;

	status)
		status_of_proc -p $RUN_DIR/$NAME.pid "$DAEMON" h2o && exit 0 || exit $?
		;;
	*)
		echo "Usage: $NAME {start|stop|restart|reload|force-reload|status|configtest}" >&2
		exit 1
		;;
esac

exit 0

Now we must change the permissions to enable this script:

sudo chmod +x /etc/init.d/h2o

Finally we must enable it on start-up and, if appropriate, stop and disable start-up of Nginx or Apache so that these don’t conflict. I will use Nginx as an example here but you can substitute Apache or another server if you are already running these:

sudo service nginx stop
sudo update-rc.d nginx disable

sudo chown root:root /etc/init.d/h2o
sudo update-rc.d h2o defaults
sudo update-rc.d h2o enable

It is now time to start up the service:

sudo service h2o start
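To check that it has come up, look for the listening sockets and request the default page (ss and curl are assumed to be available; -k is needed if your certificate is self-signed):

sudo ss -ltnp | grep h2o       # should show ports 80 and 443
curl -Ik https://localhost/    # expect "server: h2o" in the response headers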

Update: the next step is not necessary if you have installed from a package (2018-10-18).

Finally, there are a number of things to move into place:

cd ~/h2o-1.5.2
sudo mkdir /usr/share/doc/h2o
sudo cp -r doc/* /usr/share/doc/h2o
sudo cp LICENSE /usr/share/doc/h2o/
sudo cp README.md /usr/share/doc/h2o/
sudo cp Changes /usr/share/doc/h2o/
sudo mkdir /usr/share/h2o
sudo cp -r share/h2o/* /usr/share/h2o
sudo cp -r examples /usr/share/h2o

Hooray, if this has all worked, you now have h2o working in a way following standard systems administration methods for web servers.


Installing FreeSwitch on Raspbian 8

This is an updated version of some instructions that I found elsewhere. Thanks to Tom O’Connor. Some key details needed to be changed and dependencies met for it to work so I decided to document them in brief, to pass on the help that I received. I’m going to keep this short otherwise, as you can find out more from Tom’s experience.

Incidentally, FreeSwitch 1.6 won’t compile on Debian Wheezy, so you’ll need to stick with 1.4 there. Also, it may be worth knowing that 1.6 won’t compile on older 32-bit i386 machines and currently requires at least an i686-class or 64-bit (x86_64) processor. Neither point is relevant to ARM, and therefore to Raspberry Pi units, but it may be interesting to some readers anyway: if you’re interested in Raspbian you may well also use Debian proper on other machines.

Install the components. The build moaned about numerous missing dependencies, and I have added them all here. It is not always immediately obvious from the errors exactly what is missing, so I had to do some research online, add a package, run ./configure, try again, and so on. It took a lot of attempts to find them all, which was frustratingly slow.

sudo apt-get update
sudo apt-get install build-essential git-core autoconf automake libtool libncurses5 libncurses5-dev make libjpeg-dev pkg-config unixodbc unixodbc-dev zlib1g-dev libcurl4-openssl-dev libexpat1-dev libssl-dev screen libtool-bin sqlite3 libsqlite3-dev libpcre3 libpcre3-dev libspeex-dev libspeexdsp-dev libldns-dev libedit-dev liblua5.1-0-dev libopus-dev libsndfile-dev
screen -S compile

Now you are in a screen session. This is because it's a long job and you don't want it interrupted, forcing you to start all over again more times than you probably will anyway. The Git repository has moved, hence the change to the original instructions.

sudo -s
cd /usr/local/src
git clone https://freeswitch.org/stash/scm/fs/freeswitch.git freeswitch.git
cd freeswitch.git
./bootstrap.sh

In fact, for a basic installation on a Raspberry Pi, you probably don't want to alter modules.conf to include Flite (text-to-speech) because of the memory footprint.

./configure
make && make install && make all install cd-sounds-install cd-moh-install

That seems to be about all that's needed. It takes a very long time to configure everything and compile all the files on a Raspberry Pi: I used a Model B of the original version, and no doubt it will be faster for people using an RPi 2 or even an RPi 3. I hope you are feeling patient, but if you like Raspberry Pi projects then you must be! You might also want to look into cross-compiling this instead.
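Once the build eventually finishes, you can start the switch and attach a console to it; a hedged sketch, assuming the default source install location under /usr/local/freeswitch:

/usr/local/freeswitch/bin/freeswitch -nc   # start in the background (no console)
/usr/local/freeswitch/bin/fs_cli           # attach to the running switch; /exit to detach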

An alternative is Asterisk, with its GUI called FreePBX. I've heard that Asterisk is a bigger beast, so I've steered clear of it for now; I've had success with FreeSwitch in the past.


HHVM with PHP-FPM fallback on Nginx

HHVM is the Hip Hop Virtual Machine, developed under the PHP licence by Facebook:

HHVM is an open-source virtual machine designed for executing programs written in Hack and PHP. HHVM uses a just-in-time (JIT) compilation approach to achieve superior performance while maintaining the development flexibility that PHP provides. HHVM runs much of the world’s existing PHP. […]

Essentially, Facebook have provided a more strongly typed, better version of PHP called Hack, but they have also created a migration strategy for old code by running PHP itself using just-in-time (JIT) compilation on a virtual machine. Presumably people will continue to use PHP ad infinitum anyway.

Install HHVM (and PHP-FPM if you haven’t already)

In order to get HHVM working on Ubuntu 14.04 LTS (Trusty Tahr), I followed Digital Ocean’s instructions with reference to Bjørn Johansen’s basic instructions and his further instructions to use HHVM with PHP-FPM as a fallback in the event that HHVM should fail. I give them full credit for this, though I have adapted them slightly in the odd place or two below: Import the GnuPG public keys for the HHVM repository, install the repository, update the sources, then install HHVM:

sudo apt-key adv --recv-keys --keyserver hkp://keyserver.ubuntu.com:80 0x5a16e7281be7a449
sudo add-apt-repository "deb http://dl.hhvm.com/ubuntu $(lsb_release -sc) main"
sudo apt-get update
sudo apt-get install hhvm

Make sure HHVM starts when the system is booted:

sudo update-rc.d hhvm defaults

Optionally, replace php5-cli with HHVM for command line scripts:

sudo /usr/bin/update-alternatives --install /usr/bin/php php /usr/bin/hhvm 60

You could then uninstall php5-cli if you like. I will presume that you are using PHP-FPM but if not, for any reason:

sudo apt-get install php5-fpm

Configure HHVM on Nginx with PHP-FPM as fallback

We will assume here that you are using Nginx, but if not:

sudo apt-get install nginx

(You will need to prevent Nginx conflicting with ports used by other servers like Apache, by uninstalling them, choosing different ports, or disabling them; if you disable them, make sure they stay disabled when the server restarts, which is beyond our scope here.) Now add the config, which I chose to put in /etc/nginx/php-hhvm.conf and include from the site configs, though you could add it to each config file in /etc/nginx/sites-available instead:

        # pass the PHP scripts to FastCGI server
        #
        location ~ \.(hh|php)$ {
                fastcgi_intercept_errors on;
                error_page 502 = @fallback;

                try_files $uri $uri/ =404;
                fastcgi_split_path_info ^(.+\.php)(/.+)$;
                # NOTE: You should have "cgi.fix_pathinfo = 0;" in php.ini

                fastcgi_keep_conn on;

                # Using a port:
                fastcgi_pass 127.0.0.1:9000;
                fastcgi_param   SCRIPT_FILENAME $document_root$fastcgi_script_name;
                fastcgi_param   SERVER_NAME $host;
                # Using a web socket:
                ##fastcgi_pass unix:/var/run/hhvm.sock;
                fastcgi_index index.php;
                include fastcgi_params;
        }

        location @fallback {
                try_files $uri =404;
                fastcgi_split_path_info ^(.+\.php)(/.+)$;
                include         fastcgi_params;
                fastcgi_index   index.php;
                fastcgi_param   SCRIPT_FILENAME $document_root$fastcgi_script_name;
                fastcgi_param   SERVER_NAME $host;
                # Using a web socket:
                fastcgi_pass    unix:/var/run/php5-fpm.sock;
        }

Now, if you have decided not to repeat the block in every single config file, add the following include line to each of those config files instead:

	include /etc/nginx/php-hhvm.conf;

Now restart Nginx:

sudo service nginx restart

Test the fallback

You can now test it as follows (replacing the URL with your own):

curl -I https://example.com

If your x509 server certificate (HTTPS) is self-signed or otherwise gets refused, try:

curl -Ik https://example.com

You should now see this header (or similar) in the response:

X-Powered-By: HHVM/3.4.0

Now kill HHVM:

sudo service hhvm stop

Try again with cURL as before. This time you should get something like this:

X-Powered-By: PHP/5.5.9-1ubuntu4.5

However, you may get nothing if, like me, you have changed the settings in /etc/php5/fpm/php.ini not to expose the header:

; http://php.net/expose-php
expose_php = Off

Now I am going to do the same in /etc/hhvm/php.ini as well because it may foil some of the less competent hackers:

; Set 0 to hide the software and version or 1 to show off that we're using HHVM ;-)
expose_php = 0

Restart HHVM automatically

Now we need to make sure HHVM is restarted if it should ever fail, by installing ps-watcher as follows:

sudo apt-get install ps-watcher

Edit /etc/ps-watcher.conf (you might have to create it) and add the following lines:

[hhvm]
occurs = none
action = service hhvm restart

Now enable ps-watcher to start:

sudo sed -i -e 's/# startup=1/startup=1/g' /etc/default/ps-watcher

Start ps-watcher:

sudo service ps-watcher start

If you now kill HHVM manually as above (or it ever falls over), you should see it come back up within about 150 seconds. You can change ps-watcher's polling interval if you would like it to react faster or slower. With thanks to Bjørn Johansen and also to Digital Ocean, from whom I have adapted these instructions for my own needs.
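A quick way to convince yourself that the whole chain works, assuming you have not already switched expose_php off as described above (the hostname is a placeholder, as before):

sudo service hhvm stop                                  # simulate HHVM falling over
curl -Ik https://example.com/ | grep -i x-powered-by    # PHP-FPM answers via the fallback
sleep 180                                               # give ps-watcher time to notice
curl -Ik https://example.com/ | grep -i x-powered-by    # HHVM should be back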


HTTP/2 is here

In order to get HTTP/2 working, it's no longer necessary to use experimental servers like h2o as I did earlier this year, although it should be said that h2o is blazingly fast and is worth considering seriously, especially for projects that need to move large data requests fast. Apache now has the experimental mod_http2 module and Nginx has an HTTP/2 module, so you can do it fairly simply with mainstream servers too.

I followed some instructions on how to upgrade to Nginx 1.9.5. However, my version of Ubuntu (Trusty Tahr, 14.04 LTS) only offered Nginx 1.9.4, and the Nginx mainline (development) repository didn't seem to be updating, so I added Chris Lea's experimental repository instead.

In brief:

sudo add-apt-repository ppa:chris-lea/nginx-devel
sudo apt-get update
sudo apt-get install nginx

You’ll note that I have installed nginx-full, which is the standard version contained in the metapackage nginx, but you can choose any of the three versions nginx-light, nginx-full, nginx-extras according to what you need.

Then you need to change all the lines like the following, which enable SPDY (the predecessor of HTTP/2), in the config files in /etc/nginx/sites-available:

listen 443 ssl spdy;
listen [::]:443 ssl spdy;

These can quite simply be changed to the following. Unfortunately, I had tons of them on my server, so it took ages! I know, I should have done it with a find and replace; maybe next time I will learn my lesson.

listen 443 ssl http2;
listen [::]:443 ssl http2;
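For the record, the find and replace I should have used is a one-liner; a hedged sketch, assuming all the virtual hosts live in /etc/nginx/sites-available (the .bak files are your backups):

sudo sed -i.bak 's/ssl spdy;/ssl http2;/g' /etc/nginx/sites-available/*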

Lastly, I had added a header for SPDY in my custom /etc/tls.conf, which I include in these to avoid repeating code. Here is the relevant line that I am now able to simply comment out:

#add_header        Alternate-Protocol  443:npn-spdy/2,443:npn-spdy/3,443:npn-spdy/3.1;

And that is it!


Browser misinformation about secure sites

Put simply, a certificate is the document that you will see that tells you that a secure connection has been made using mathematical encryption that is currently impossible to break in most cases, i.e. messages are being sent from you to the server and back using a very, very good code. In that case, if people intercept the messages, they are going to have a hard time reading them unless they have the key to the code, which is secret. That’s how it all works.

The thing we are talking about is the secure HTTPS variety of HTTP, which uses security called Transport Layer Security (TLS), often known informally by the name of its predecessor, the Secure Sockets Layer (SSL). But you don't need to be confused by the jargon terms: you may know HTTPS better as the padlock symbol by the address bar.

It’s very likely that you will have seen this message in Chrome:

Your connection is not private
Attackers may be trying to steal your information from whatever.domain.com (for example,
passwords, messages or credit cards). NET::ERR_CERT_AUTHORITY_INVALID

Or else this message in Firefox:

This Connection is Untrusted
You have asked Firefox to connect securely to whatever.domain.com, but we can’t confirm that your connection is secure.
Normally, when you try to connect securely, sites will present trusted identification to prove that you are going to the right place. However, this site’s identity can’t be verified.
What Should I Do?
If you usually connect to this site without problems, this error could mean that someone is trying to impersonate the site, and you shouldn’t continue.

Or else this message in Safari:

Safari can’t verify the identity of website “whatever.domain.com”.
The certificate for this website is invalid. You might be connecting to
a website that is pretending to be “whatever.domain.com”, which could put your
confidential information at risk. Would you like to connect to the website
anyway?

I’m afraid that I don’t have access to Internet Explorer (scheduled to be replaced by Microsoft with a product with a new name, codenamed Project Spartan), so I would appreciate any comments about the message it gives here; I have only Mac OS X and Linux devices. I am being lazy about checking the message Android gives me, but the same problems arise with mobile devices.

The implication of all of these messages is that there is something wrong with these https:// (secure) connections. It isn’t as simple as this: in short, they are pretty much lying to you. There are several issues at stake here, all of which depend on the precise wording:

1. “Not private” (Chrome)

This claim is untrue. The issue is in fact that the connection is secure and may well be private, but there is an unverified possibility that it has been made with a server other than the one claimed, i.e. that it has been intercepted: the interceptor may or may not then pass traffic on to and from the real server in order to gain information. Simply, we don't know whether it is safe or dangerous. Or at least, your computer doesn't know, whether or not you do personally.

Then again, it might just as easily be the correct server: the issue is that Chrome does not know that. What is definitely true is that it’s a far better connection than an unsecured http:// connection because at least you know that the rest of the Internet cannot see the traffic, i.e. it is more private by an order of magnitude than broadcasting any private information in clear text. I am not telling you to trust it, but it’s safer than no encryption at all.

Of course, if you are not intending to enter any private information into this web site anyway, it is misleading to make you worry about it because you are not then automatically at risk. Web traffic does not automatically put your private information at risk unless you exchange it, i.e. you are on a site where you need to be logged in, you are buying things etc. Don’t let the browsers fool you into believing that you are always at risk in some ill-defined way.

2. “Untrusted” (Firefox)

The idea that the connection is untrusted rather than unverified is untrue, although it is not quite as terrible a claim. The question is: trusted by whom? How and why? You, as an ordinary user, have not examined the certificates supplied with the browser either, so you only have it on trust from that browser that they are valid and hence "trusted". It's possible, though rather unlikely, that you downloaded a bad copy of the browser because even the download site was impersonated. But did you check? I bet you didn't.

The certificate on the site that you are connecting to may be equally trustworthy but you haven’t imported it into your browser yet and probably don’t know how to do so. The browser isn’t giving you a good idea of how to do that, either.

So you’ll never get to decide who to trust and do something about it, as this system of certificates originally intended. The system itself works but has been hijacked by browsers and commercial interests, as we will read further below. Use it for your safety, but use it carefully and with knowledge of how they are trying to manipulate your lack of technical expertise.

3. “Trusted identification” (Firefox)

This concept is misleading: trusted by whom? How and why? Again, as in (2) above, the commercial certificate authorities (CAs) are not inherently more trustworthy and, in fact, are able and likely to allow the security agencies in their countries (usually the USA, the UK and other western nations) access to those certificates, enabling connections to be intercepted by government agents. You may not be worried about that aspect, as an ordinary user with nothing to hide from the government (at the moment), or else you may be. But I bet you didn't know that either: you should have been given that choice to make. You weren't told about it.

In fact, unknown certificate authorities might even be more trustworthy, because you may know that the governments or other parties you are concerned about do not have access to the private key for the root certificate; they are not necessarily less trustworthy in every case. Simply, it depends on which certificate or certificate authority we are talking about.

4. “Invalid” (Chrome and Safari)

The idea that the certificate is invalid is in most cases likely to be untrue, although it is sometimes possible. The question that is not made clear enough is why the browser considers it invalid: (a) simply being issued by an unknown authority is not evidence that the certificate is invalid, as is incorrectly claimed here; (b) if, on the other hand, it is out of date, or the domain name on the certificate differs from the one you are connecting to (which does also happen), then yes, it is invalid. This happens occasionally, but mostly through careless administration.

Mixing up these two things creates a simple lie that does nothing to make the Internet more secure or help ordinary users understand what certificates actually do, i.e. certificates are inherently the root of all Internet security and are good when used well.

5. Scaremongering

The phrases “Get me out of here!” and “Back to safety” are simple scaremongering, designed to make the majority of people without technical knowledge run for cover.

What is really happening here then?

People are misled into believing that “trusted” certificates are good and that “untrusted” certificates are bad, without having any idea of why some are trusted and some are not, who issues them and what those authorities actually do for them.

The truth is that those big commercial certificate authorities make money for nothing.

In about five minutes on a Linux server, I can create my own root certificate authority and issue certificates that are as good as, if not sometimes better than, theirs. I can then sign other people's certificates, which states that those certificates are trusted by my certificate authority. If you import my root certificate into your browser, then all certificates that are trusted by me will afterwards be trusted by your browser.
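For the curious, here is roughly what those five minutes look like; a sketch only, with placeholder names and lifetimes:

# Create a private root CA...
openssl genrsa -out myCA.key 4096
openssl req -x509 -new -key myCA.key -days 3650 -subj "/CN=My Private Root CA" -out myCA.crt
# ...then issue a site certificate and sign it with that CA.
openssl genrsa -out example.com.key 2048
openssl req -new -key example.com.key -subj "/CN=example.com" -out example.com.csr
openssl x509 -req -in example.com.csr -CA myCA.crt -CAkey myCA.key -CAcreateserial \
    -days 825 -out example.com.crt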

How do they get away with it?

The advantage that they have over me is that they have an agreement with the major browsers and are automatically trusted; I do not, so my certificates produce the nasty red error. There is a commercial stitch-up which means ordinary users will never want my certificates, because the browsers will never incorporate my root certificate, so my certificate authority is useless.

Meanwhile, people who set up secure web sites are forced to buy from the commercial certificate authorities. They simply sit back, wave a magic wand and wait for your money for a service that anybody can provide for nothing. There is almost no effort required on their part.

Are their certificates “safer”?

Their certificates are no guarantee of security. To get one, I need just one small thing: an email address on the domain name in question. I can buy a domain name for less than £10 and have email forwarded via an address on that domain. That is enough proof to buy a certificate. From then on, the browsers will vouch that I am trustworthy. But there is no reason to believe that my site is safe or that I am not trying to steal your information!

I am not the only one saying this

It’s all over the internet. The best synopsis that I have ever read is by Andrews and Arnold, an extremely professional and expert supplier of broadband and internet services based in Great Britain. The problem with most information is that it’s not aimed at the average reader.

What should you do?

At present, you can’t do much immediately except make your own decisions about which sites to trust. But you should really complain to the browser makers about the scaremongering. They need to make it clear that “unverified” does not mean “untrusted” (which, in turn, is something they should not assert on your behalf) and that, even worse, “unverified” or “untrusted” are entirely different from “invalid”. You can actually influence them, if enough people ask.

Do not stop using encrypted https:// connections: they keep you safe. Be aware that the absence of a scary red browser warning does not mean a site is safe, and the presence of one does not mean it is unsafe. If you are concerned, check that the domain on the certificate is the one you actually tried to visit and that the dates are valid; the browser will tell you this if you select the advanced option to see more details of the certificate. Even this is not a guarantee of safety, however, since those details are simple enough for an attacker to get right too.

Don’t be scared by everything. Consider whether or not the page you are looking at sends any of your private information over the Internet: are you logged in, have you entered anything into a form, have you explicitly given permission to use your private data, have you checked the padlock symbol in the address bar or whether the connection is https:// (secure)? Some sites can be dangerous, but the browser does provide protections.

Ultimately the only protection is to go to sites that you trust. Always (re-)type the address yourself rather than following links (and especially avoid doing so from emails, even if they seem ok, as these are often faked).

Don’t let your choices be made for you

Do not let the browser make your choices for you. That is complacency, and it can lead to you visiting sites that are unsafe while avoiding others that may well be perfectly safe. It can at times be reasonable to override the warnings and view a site anyway. Be aware of the warnings and do consider them, but don't take them as absolute truth.


Reviving the full Welsh breakfast

This is a variant on the usual sort of full breakfast that you see across the British Isles and Ireland, i.e. served with fried eggs, toast and so on. I am a vegetarian, so I don’t add the (for me) horrible bits like bacon and sausages. There are several important differences:

Leeks in a pan

Leeks in a pan

(1) Braise some finely cut leeks in butter until they are soft. You can caramelize these slightly if you prefer them that way, but otherwise cover with a lid and make sure that they remain wet enough to braise rather than fry by adding a few drops of water if necessary. Don't burn the butter or else it will taste bad. Leeks are the really distinctive part of a Welsh breakfast.

(2) add Glamorgan sausages, a traditional vegetarian sausage from South Wales containing herbs, cheese, leeks and flour amongst other things. I buy mine but I ought to learn how to make them.

(3) Mushrooms. If you don’t like them, you’re missing out. The browner ones of the types that are commonly available are nicest: field mushrooms, portabello etc. Traditionally you should add herbs, most commonly thyme, sage, chives, garlic chives or rosemary. However, non-traditional alternatives like Herbes de Provence, Basil Mint (neither like Basil nor Mint despite the name) etc are good too. On occasions, I add some garlic. A little wild garlic is also nice, just for flavour (though you could use it as a vegetable as below, if you like garlic). It is seasonal.

(4) In South Wales especially, and coastal areas, there is a preference for a type of seaweed mush called laverbread. I had some of this when I was younger and wasn’t that impressed, but my tastes have changed and I really need to try it again, to be fair. It has a very strong, unique taste, so be warned. The iodine is very good for you. A very nice, milder substitute is spinach, which is a great source of dietary iron: for best results, wilt slowly with a few drops of water in a covered pan. Again, take care not to burn it. If you like garlic, you could use the leaves from wild garlic in larger quantities, during the season. It is milder and sweeter than garlic.

This is the full Welsh Breakfast. Sadly, very few of the above elements are now commonly seen, as most people have just fallen back on the so-called full “English” breakfast, which in reality is not just English but is common to all of the British Isles and Ireland.

(For those readers who are not familiar with Great Britain, the word England does not cover the entire extent of Great Britain, which is also comprised of Wales, Cornwall and Scotland; meanwhile, the province of Northern Ireland (the larger part but not all of Ulster) remains within the same political entity as Great Britain, while the southern part of Ireland seceded from it: together, they form the “United Kingdom of Great Britain and Northern Ireland”. The incorrect habit of saying “England” for all of these is about as offensive as referring to Canada as part of America, Belgium as part of France or Austria as part of Germany. We are in the terrible habit of saying “Holland” for all of the Netherlands, which is a similar mistake. Using the acronym UK is politically correct jargon, however: we have been known throughout history as Great Britain, or Britain for short, notwithstanding the inclusion of Ireland and later Northern Ireland in the same political state. The “Great” is not a claim of greatness, but is by comparison with the former “Less Britain” i.e. Brittany, a former political state now governed by and included within the French Republic. Note that we don’t say “France and Corsica” for political correctness, or even “the French Republic” in normal speech, just “France” and “Corsica” separately. Similarly, “(Great) Britain” and “Northern Ireland” are fine for most non-official purposes.)


HTTP/2 – a faster Web


Background

The previous versions of HTTP were 0.9 (1991), 1.0 (1996) and the present version, 1.1 (1997, improved in 1999 and 2007), all of which are text protocols. Of these, the majority of Internet traffic now uses HTTP/1.1, but HTTP/1.0 is still used by certain tools that do not require persistent connections, the key innovation introduced in HTTP/1.1 via its keep-alive mechanism. Other improvements included chunked transfer encoding, HTTP pipelining and byte serving, all of which were designed to speed up data transfers to clients.

HTTP/2

This year, version 2 (not 2.0) has finally been released. It was developed out of SPDY, which was created by Google and largely pioneered by the Nginx web server, still responsible for almost all SPDY traffic. It contains a raft of other optimisations but will interoperate with existing HTTP methods. Nginx is committed to implementing HTTP/2 by the last quarter of 2015, the only major web server that has made such a commitment to date.

Unlike its predecessors, it is a binary protocol, which means that data will be sent in a considerably more efficient stream. One small consequence is that you can no longer do the equivalent of this over either port 80 (HTTP) or port 443 (HTTPS):

telnet myserver.net 80
GET / HTTP/1.1
Host: myserver.net
Connection: close

This is sad for people who like old tools to keep working, but it won't be long before tools are in place to do this better over HTTP/2 as well as the older versions, and it's not the end of the world if we can't test connections using the tried-and-trusted but venerable telnet, which is honestly not used for much in real work these days because it is totally insecure.

It is already possible to make standards-compliant HTTP/2 requests using Firefox or Google Chrome Canary (the cutting-edge development version of Chrome), which you can verify by right-clicking on a page, selecting Inspect Element and opening the Network tab before refreshing. In Chrome Canary you will need to add the Protocol column by right-clicking on the table headings and selecting it. However, you will find it relatively hard at present to find a web server capable of HTTP/2; this is only now beginning to be possible.

Servers that work now

I have tested h2o, an experimental, optimised web server that supports HTTP/2 as well as previous versions of the protocol. The other options are nghttp2 (which also includes an experimental HTTP/2 proxy) and Trusterd. You will notice that these need to be compiled, i.e. they are not yet available as packages in any major Linux distribution. The process requires a little more than average sysadmin skill, though it was relatively easy with h2o once the various dependencies were also installed.

Please note that none of the online HTTP response header testing tools I can find is yet capable of HTTP/2, so the server will simply respond to them over HTTP/1.1; you therefore need to do a bit more to verify that HTTP/2 is actually working, as described above, and so far only two major browsers can show the protocol in action.
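That said, the nghttp2 project mentioned above also ships a small command-line client, which gives another way of watching the negotiation once you have compiled it (a hedged sketch; the hostname is a placeholder):

nghttp -nv https://example.com/ | head    # -v prints the HTTP/2 frames, -n discards the body
# or, with a curl built against nghttp2:
curl -I --http2 https://example.com/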

Another thing that you will notice is that these servers seem to fall back to HTTP/1.1 when the connection is insecure, i.e. HTTP rather than HTTPS. This isn't actually imposed by the servers but by the browsers: both Chrome and Firefox require HTTPS for HTTP/2, an issue that has been contentious because of the commercial control of X.509 TLS certificates. The main gain in mandating HTTPS is that it would eliminate a large class of man-in-the-middle (MITM) attacks that arise where only part of a site is secure, i.e. where an attacker monitors traffic and then provides a fake certificate in order to intercept traffic to the real secure site, meaning that all encrypted data, including passwords, could be read by that attacker.

Ok, so why do we care?

If you run a small site, you will probably see rather little change, though it may well be that sites served from content management systems like WordPress, Joomla! and Drupal, which have rather large payloads, as well as any sites with lots of images, CSS or JavaScript, will load noticeably faster. If you have a decent, high-spec server, you may not even notice this.

Those who will care, however, are the people who need to transfer very large amounts of data for big web services. For instance, your social network providers like Twitter and Facebook will want to do this, and you will most probably see page load times decrease as a result. Anybody who is building data-driven web projects, for example universities and industry, will see the benefits.

This will be significant for our work at Morgan Price Networks during 2015-6 in particular, as the new technology is gradually implemented by major web servers. We will probably be using it in production before a lot of people, since we use Nginx for preference (though we still use Apache too). It’s fast and it’s easy to configure securely; in addition, it’s a great reverse proxy; most of all, it moves data fast.


The “neo-Breton” construct: thought experiments and educational practice

Background

There was recently another acrimonious exchange between myself and two other members of the Celtic Linguistics group that I set up on Facebook. It serves little purpose to rehearse it again here, but I will simply say in my defence that I sought to bring the debate back to whether “neo-Breton” is real or an artificial academic construct that does more pedagogical and sociological harm within the Breton-speaking community than anything else. I am not remotely interested in damaging the reputations of those concerned, only in challenging the divisive and discriminatory idea that they have advanced. I tried to draw them into explaining their exact positions, but no clear explanation was forthcoming. For my part, I regret the personal level on which the exchange was conducted and said so openly in the group.

Here, I want to focus mainly on the issue at hand: standard registers and dialects.

The nature of debate in general

This paragraph is a slight tangent which readers interested only in the issue may skip.

I fully understand that this “neo-Breton” idea has been advanced in academic fora (e.g. Trier has been mentioned) and has the support of published academics. No matter how high their standing, all academic issues must be resolved only on the strength of the ideas, not on who said them or how well published they are. This is why I believe passionately in groups such as the Facebook group, because everyone, no matter the state of their knowledge, may participate. There is a duty on academics, long neglected, to convey their knowledge to others. We are teachers, not just researchers jealously guarding our knowledge within a career ladder of academics. We are also just normal people with opinions, sometimes correct or ground-breaking and more often not. So are the senior academics whose work we may quote. Quoting them as authorities is the lazy way out, rather than constantly thinking the ideas through from first principles and trying to find out where they break. This is what the best of those academics whom we quote do themselves: they abandon broken ideas in order to get closer to the truth.

The problem with the “neo-Breton” idea

It was not made clear what “neo-Breton” actually is, except for the claim that it is not mutually intelligible with native dialects. I have heard such claims regarding Welsh and consider them extreme and far too sweeping generalisations to be useful or accurate. In order to be taken seriously, they need to be substantiated on a granular, evidential level, which is hard to achieve in practice. Instead, let us consider other languages. I will use English and Welsh, but any will do, minority or otherwise. That is not to say that the circumstances are always the same, but they are nonetheless usefully comparable. All are languages, and no language is immune from the sociological imperatives that arise from being a communication system.

As a native speaker of English, I am aware that I can understand people in Britain from most sociological backgrounds, though not all, but not always people from elsewhere. Even in Britain, there are exceptions in dialect areas that I have less contact with (including via the media) and where people’s sociological circumstances differ massively from my own. If people come from other extremes of the dialect continuum, e.g. broad Scots (arguably another point in a diasystem of two languages, but that is another issue), or from varieties of speech in the US, Canada etc., there are occasional problems. (I usually find that Australian and New Zealand speech has many of the same dialect inputs as my dialect, so this tends to be easier for me.)

There is a role for standard registers, which arise out of a long process of speakers of different dialects trying to understand each other. Proponents of “neo-Breton” counter – I believe without good foundation – that instead a single constructed variety of Breton was created and continues to be advanced by an unidentified movement. This is the allegation formerly made of Cymraeg Byw and still generally of “learners’ Welsh”. The problem is that it is accurate in the first case (which failed as a result) but far too general in the second. You cannot fairly impose a sweeping judgement on the teaching methods of vast numbers of Welsh or Breton tutors without providing evidential material about how their Welsh or Breton is or isn’t native in style. Thus it descends into simple, unevidenced prejudice. I shall discuss its divisiveness later.

An example of Indian English

Let’s imagine a man from India who learns English for a job in a call centre serving the UK. I shall add some background to make it realistic: he is called Rajesh and has a mix of religions in his family but is himself a Hindu; he is obliged to call himself Mike once he comes into work, where he is obliged to eat western food at lunch, and so on. So much for the stereotypical portrait. He has two evening slots available for English lessons owing to his wife’s work and their family commitments. However, one is less convenient because he will be very tired. This means he is learning from a non-native Indian teacher, fairly proficient, albeit one who makes some grammatical mistakes influenced by his or her first language. (From experience: I am half-Punjabi and can speak confidently about the persistence of this even in fluent English.) He cannot go to the class run by a teacher who was born and bred in Birmingham, who has a native command both of standard English and of his native dialect (language registers) but always has at least a slight Birmingham accent no matter how he speaks.

When Rajesh becomes fluent, is he speaking neo-English? At first he is not very idiomatic but is easily comprehensible to good speakers of English who have had wide exposure to many other speech registers. There are those who can’t follow him, of course, for whatever reasons relating to which speech registers they find easy, based on their personal experiences. Ideally, Rajesh might have gone to Jim’s class but he went to Indra’s instead. On the other hand, Jim never learned English grammar, whereas Indra provides practical lessons in constructing good sentences based on her grammatical knowledge – she doesn’t actually teach grammar. Rajesh needs to improve in the standard register right now.

Later, now living in Birmingham, Rajesh needs to understand native dialects, which he picks up fast because by now his English has a secure grounding. But he struggles visiting his family in Canada because he doesn’t have the exposure to native dialects that Jim does. Indra has more experience but she can’t rival Jim, yet actually her advanced writing skills are in some ways better than his, e.g. in writing formal funding proposals and her own fiction and poetry.

What about Breton?

In Breton, native dialect speech is under threat. There are indeed purists who teach artificial types of Breton but it’s straining reality to breaking point to say that these are in no way mutually intelligible with natural dialects. Only one major proponent of the idea of neo-Breton, Iwan Wmffre, has gone into extensive details about what he considers it to be. But these seem instead to be admittedly very valuable analyses of dialect contractions, common in English as well, e.g. Rajesh might have issues with understanding a Thurrock speaker’s rapid “y’ g’na wai’ tiw yaw bruva az iz bifaw you av yaw ba'(k)un ‘n ash braans?” (my attempted rendering for non-linguists, which represents the relatively difficult phonetics imperfectly).

There are more speakers like Rajesh in the world than speakers like me. This mirrors, on a far greater scale, the strange situation of Breton. Yet we would like to think that, while English will change like any living language, the old dialects will continue and change too, and that “international” speakers will, as they mostly do now, hold up native American or British speech in the broadest and crudest sense as something to emulate. If they live in native speech areas, they typically adapt to the first or main one they encounter, and sometimes to others later. In Breton, we will see the same. The balance of these factors, however, may be critical for dialect survival.

The nature of the standard

It is alleged that the Breton standard was imposed artificially. This cannot mean the orthography, as orthographies are always imperfect and artificial to some degree. Nor can it mean the work starting with Le Gonidec or the KLT agreements of the early 20th century that led ultimately to the once controversial Peurunvan in Nazi-occupied Brittany. These people must have learned from natives where they were not themselves natives, which is exactly what those who oppose “neo-Breton” are (like the rest of us) advocating. Breton was then much stronger. The question is, is the alternative that they oppose real or imagined?

Yet it does seem to hark back to some of those historical individuals, because at least some of those who argue for the existence of “neo-Breton” associate teachers of the standard KLT register en masse with the linguistic purism and prescriptivism of Hémon and Denez, and deny that any such teachers support or teach dialect. This purism, I argue, was simply an attribute of the era in which these two revivalists lived. It doesn’t serve much purpose to over-analyse a part of the history of the Breton revival that is well known. Nor has it been sufficiently clarified whether the opponents of “neo-Breton” (an entity whose existence they have constructed in order to oppose it) are basing the tradition they oppose on history, on the present, or on both. I have been accused of being an ideologue in this cause, but I am not quite clear which cause it is alleged that I advance. I’m not a fan of Denez’ language courses or fiction. I have only a small historical interest in Hémon at best.

Glanville Price used the pejorative term “Cornic” for neo-Cornish. Here, unlike in the case of Breton, the term neo-Cornish is perhaps justified – though many speakers of today’s revived Cornish consider it profoundly insulting – since Cornish had to be revived from records, whereas Breton is not a dead language and its native speech has been extensively recorded in the modern era. The two cannot be compared fairly.

The reality of Breton

There is bad Breton. For the most part, it is increasingly based on French idiom and pronounced in a way heavily influenced by French. The ends of words are articulated unclearly, as in French, and the word pauses and external sandhi (i.e. the way the consonants at the ends and beginnings of words affect each other where they are in contact) are more like French. So are mildly annoying features like repeating “kwa” many times per sentence, much as we use “er” in English; “kwa” is an old loan in Breton, but it was never hitherto inserted so freely as a sentence filler on this scale. Dialect features are mostly or entirely ignored.

However, the standard form was not invented for this purpose. Many natives who can write in standard orthographies and who moderate the features of their dialect when speaking to speakers of other dialects can use multiple registers (cf. a Thurrock English speaker who may, according to personal taste, speak more formally than the above example when in a formal situation or when speaking to someone from elsewhere). Many more natives cannot write in Breton but can moderate their dialect if necessary, the extent of this varying per individual. I know a first-language native of Ar Gemene(z) in the Vannes country (Bro Wened) who will speak without the consonant and vowel changes of Gwenedeg when speaking to non-Gwenedeg speakers depending on how well they understand Gwenedeg. Admittedly he is a teacher. We do this semi-consciously in all languages. I do it in Welsh. First-language speakers do it.

Just because a standard exists and is increasingly used to teach bad Breton does not mean that its only purpose is to create a new, artificial language in place of native Breton, though teaching bad, non-native Breton is of course to be regretted. To say that its origin is as a created language is a revisionist fiction: we can all read the history and see that, despite the purists, this was not in fact so at the time. We can read the literature of many authors and see that some writers were purists but many or most were not, in all dialects.

The divisive nature of the term “neo-Breton”

Here is the reason that I am so animated about opposing the academic construct of “neo-Breton”. Not only is it a catch-all, general observation with little analysis of internal variation, it is in practice used as a pejorative. My own Breton was insulted by a man who, if his claim is believed, heard it for perhaps seconds, but who most likely never heard it. In either event, why would he be so motivated to belittle either me personally or the quality of my Breton? Or that of my former teacher of Breton? I am not involved in Breton language politics by choice and so I am not a member of any cause, unless it is simply the revival of minority languages with a critical focus on their native forms. On this last issue, ironically, we all agree in principle.

When learners around me have been told they are not speaking the “real Welsh” or the “real Breton” because one teacher dislikes the methods of another, the result is usually swift and total disengagement from the language. This is one among many major reasons why so few learners go on to fluency. Another is frustration with the lack of opportunity to speak, a particular minority-language issue. Some learners, like all of us, are simply lazy. The list could go on. But this basic allegation is contained succinctly in the term “neo-Breton”.

Opposing views on dialect teaching

Some people, like myself, believe that the standard is a natural development, a necessity, and that it has an important role in preserving the dialects as they weaken, by providing the means by which speakers of others can find a bridge towards mutual comprehensibility. They can also use it as a stepping stone in acquiring fluency in an unusual dialect with limited currency. The ideal, of course, as pointed out by the self-appointed opponents of the so-called “neo-Breton”, is learning that dialect directly. But in reality, this idea has two serious flaws:

(1) The speakers will find themselves in a tiny dialect world within an already tiny language without sufficient ability to communicate with speakers of other dialects. Often these dialects are now basically dead or moribund, so you could with equal fairness use a term like “neo-Haut-Vannetais” or drill down even further by village. You can teach solely Cenarth or Cricieth Welsh but what will happen when learners are respectively in Caernarfon or Tregaron? You can teach only Leoneg but that will be no use in Bro Wened and vice versa. “Balkanisation” results, ironically in a way that it does not in the Balkans: they can broadly understand each other across the South Slavic diasystem of languages! They too like to over-estimate linguistic divisions for political reasons. I submit that the same happens in dialect snobbery too.

(2) The teaching materials have to be generalised on at least some level. In Welsh, for example, we have so-called “North Welsh” and “South Welsh”, which both encompass massive variation and many dialects as different from the container description as from any other dialect (particularly “South Welsh”). This is just shifting the problem from “standard Welsh” to “standard spoken South Welsh”, just as artificial in practice. We can substitute Leoneg and Gwenedeg or several geographically separated varieties of Kerneveg etc.

There is a third problem that often occurs:

(3) It leads people to denigrate and avoid the formal standard register in its wide variety of forms, which leads to functional illiteracy in Welsh. Written Welsh today is known well by only a few, whereas all can use written English to a massively greater level of competence, which is socially expected. The same problem will occur where a teacher focusses exclusively on Tregerieg, Gwenedeg or any other dialect. People then denigrate the standard written form, sometimes linking it to superficial orthographical matters, and write dialect in a way others cannot easily read. It then becomes impossible to use the language in certain prestige sociological functions, which in Breton’s case are in any case dominated by French. Welsh, by contrast, is used in these roles, and this is one of the reasons why it is stronger within society than Breton.

The unresolved issue: Gwenedeg

There are varieties of Breton that are so different from the standard that none of the usual orthographies represent them well. But this is also true of varieties of English. Given the variability, not all can be as fortunate as those closest to the standard. This doesn’t mean that people can’t or shouldn’t choose forms and structures that best represent what they say. Nor does it mean that they need to change how they speak. Some speakers of Gwenedeg use two orthographies, one of which is purely for Gwenedeg alone. Compare Hindustani, where a range of related dialects and languages are represented in several writing systems that are entirely unreadable by the other groups. Yet Hindi-Urdu is clearly a mutually intelligible idiom with a multitude of internal variations and dialects that can be and are written.

The alternative: a middle way

The only working alternative is to teach some form of colloquial standard (focussed broadly on a natural dialect) that minimises variation enough to be able to teach a single class effectively and without massive disruption and endless tangents in discussing dialects with beginners who aren’t yet capable of putting this into context within their knowledge of the language as it stands at this stage. On the other hand, a teacher is failing in her/his duty if s/he fails to prepare students for dialect variation since they will immediately need to speak to those who have other dialects. While “North Welsh” and “South Welsh” are constructs, the teacher must be open with students about this while not allowing too many tangents during class time. Likewise, KLT and Gwenedeg are also constructs: these are containers for many variations between real dialects and idiolects.

Most learning is not done in the classroom. It occurs afterwards in students’ heads, when they are practising, flicking vaguely through the class materials, thinking about the language or its social context in their lives etc. They then reprocess what they have learned little by little and contextualise it. The teacher must lead them, encourage them and give them what they need to do this. There are better authorities on this than me, but this is basic educational theory.

In every lesson, I teach the standard written form of Breton but pronounce it like native Breton and discuss the variations between, on the one hand, my Breton or that of the speakers on the audio materials and, on the other, what is written on the page. We are always highlighting small differences. This helps the learners develop a critical ear for the sounds of Breton and its dialects, preparing them to speak with people of all dialects.

Personally, I speak a variety of KLT, not exactly the same as the base Leoneg but fairly close with some diphthong and vowel modifications, some loss of /z/, some different word forms and contractions etc. I have not lived in Brittany but I have learned from near-natives and natives, preferring for example recordings of native dialect speakers. In class, of course, I will be a little more standard. One of my learners has lived in Bro-Leon and playfully spurns any other forms, so I encourage her in this. The others make other choices. In Welsh, I speak North Welsh despite never having lived far from Aberystwyth, the result of personal and social connections when I was first learning Welsh. Yet I can write high formal Welsh, office Welsh etc and I can speak a range of registers including a type of North Cardiganshire standard Welsh that I tend to switch to when teaching, because I have learned and taught in Aberystwyth.

Strong languages have registers that have particular, non-competing roles. They are not mispronounced except by foreigners, but we don’t call this neo-[name of language], just non-native or sometimes poor [name of language]. We would advise more contact with natives but we wouldn’t suggest that a person learns, for example, exclusively the speech of the Black Country, which would be a major hindrance in understanding anybody else. At the same time, if they live in the Black Country or love its dialect, it’s great to become a dialect speaker! This person will of course need, in time, to be able to switch register and to moderate that dialect form in order to be comprehensible to the others who s/he meets.

Principles

We must not be blinkered in insisting on dialect and only dialect, yet we must advise all learners to speak natively and dialectally in the spoken register.

We must give learners the tools to learn to speak in whatever dialect form they choose and to understand other dialects, in accordance with their growing critical ability.

When learners become fluent, we should seek to turn them into near-native speakers. In the meantime we must always expose them to dialect speech of all kinds but focus on them speaking one particular form consistently.

The ability to switch registers must grow with speaking confidence, whether in the case of learners or in the case of native speakers whose confidence and ability in, for example, the written or formal register is insufficient for their personal needs.

Written ability is important but it has to be securely based in spoken ability. It is a higher language skill that needs to develop with time and ability in switching registers.

Why is Breton different – well, it isn’t!

In the case of Breton, all of these apply in principle. In practice, exposure to native and/or good near-native speech is critical. Where Breton teaching is bad, this is the reason. It is not because teachers have been wittingly or unwittingly enrolled into a secret cause promoting “neo-Breton”, whichever ill-defined party or parties may be alleged to be responsible for promoting it or having created it. “Bad” usually means heavily altered by French in practice.

There is a possible single exception in that Breton is taught in Wales, most frequently Aberystwyth. Here we may run the risk of anachronism, speaking too similarly to Welsh, purism etc. However, we are aware of these dangers, in my experience, and we use native and near-native recordings, not just our own voices as teachers. We invite real native Bretons. We do whatever we can, limited by not actually being in Brittany.

The bottom line is that Breton is not different. Its circumstances are unique, but all the same things about native-type speech, educational practice and multiple registers exist as they do for any language. Where these start to evaporate or become separated, it is a classic sign (seen to a lesser degree, but still strikingly, in Welsh, for example) that the language itself is weaker. To argue against the standard is to argue against multiple registers and thus for the weakening of a language, i.e. its eventual fragmentation and death.

This is why “neo-Breton” not only does not exist as a single entity but is also a destructive idea that must never be expressed or described in these terms in front of those learning or improving their Breton. What is required is a positive focus on dialects and the standard (in that educational order), which can complement each other. We ought not to submit to divisive dialect snobbery that will weaken the language still further when it is already under extreme threat from French. This is playing language politics to denigrate sections of the linguistic community. Like all political philosophies it is partial, simplistic, narrow and damaging.


Playing with geolocation data

I’ve been looking at the very interesting methods of locating places uniquely on the Earth instead of using outdated, country-specific postal code systems. For instance, Aberystwyth is in the SY23 postal district, i.e. Shrewsbury 23, and Bangor is in LL57, i.e. Llangollen 57, based on towns that are nowhere near them (in the case of Shrewsbury, it is even in a neighbouring country, England!), purely because of the location of the main sorting office. These seem like rather silly, impractical and potentially confusing idiosyncrasies, however quaint the history may be.

The two most interesting methods are Geohash and the slightly more accurate variant Geohash-36 (Open Postcode); the more human-readable World Postcode is based on the latter.
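To give a flavour of how such codes work, here is a minimal sketch in Python of the classic base-32 Geohash encoding. To be clear about what is and isn’t from this post: this is the original Geohash, not the Geohash-36 (Open Postcode) variant used below, which applies the same interval-halving idea with a different, case-sensitive 36-character alphabet, and the function name and precision are my own choices for illustration.

```python
# Minimal sketch of classic Geohash (base-32) encoding, for illustration
# only; Geohash-36 / Open Postcode uses the same interval-halving idea
# but a different 36-character alphabet.
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash_encode(lat: float, lon: float, precision: int = 9) -> str:
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chunk, bit_count, even = 0, 0, True
    out = []
    while len(out) < precision:
        if even:                      # even-numbered bits refine longitude
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                chunk = (chunk << 1) | 1
                lon_lo = mid
            else:
                chunk <<= 1
                lon_hi = mid
        else:                         # odd-numbered bits refine latitude
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                chunk = (chunk << 1) | 1
                lat_lo = mid
            else:
                chunk <<= 1
                lat_hi = mid
        even = not even
        bit_count += 1
        if bit_count == 5:            # every five bits give one character
            out.append(BASE32[chunk])
            chunk, bit_count = 0, 0
    return "".join(out)

# Coordinates of the Great Darkgate site used in the KML further down.
print(geohash_encode(52.414952, -4.083865))
```

Feeding in the Great Darkgate coordinates gives a code beginning “gc”, the cell that covers most of Britain and Ireland; each further character narrows the box, just as each character of a Geohash-36 code does.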

Below are the locations of the site of the former Great Darkgate in Aberystwyth. A plaque on the wall of Halifax Bank of Scotland (the former Halifax Building Society) commemorates where it used to stand. The gate used to house the Old Police Station, a “House of Correction” (an earlier site was in Bridge Street) and a gaol, until it was demolished in the 19th century in order to widen the road. I’ve used a defunct landmark in order to demonstrate that one can be accurate down to one sixth of a metre, so one can exactly locate a site like this one, which now lies somewhere in the middle of the road between the Halifax and the Edwardian Post Office (1901). I have placed it somewhere on the general site where it must have stood.

Open Postcode Geohash-36
http://geo36.org/bdl5JQ6bJM

World Postcode
http://mapplot.org/bdl5JQ6bJM

There is a useful app on Android phones called Geohash-36 that I’ve used. Now you can get the location code of where you are easily too!

Here is the KML data (in XML markup) for the Geohash-36 (Open Postcode) codes at the location:

<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>bdl5JQ6bJM/t</name>
      <description>Site of the former Great Darkgate in Aberystwyth</description>
      <Point><coordinates>-4.083865,52.414952</coordinates></Point>
    </Placemark>
    <Placemark>
      <name>bdl5JQ6Fhq/u</name>
      <Point><coordinates>-4.083982,52.414861</coordinates></Point>
    </Placemark>
    <Placemark>
      <name>bdl5JQ6Fhn/q</name>
      <Point><coordinates>-4.084000,52.414860</coordinates></Point>
    </Placemark>
    <Placemark>
      <name>bdl5JQ6FLR/a</name>
      <Point><coordinates>-4.084000,52.414838</coordinates></Point>
    </Placemark>
  </Document>
</kml>

This might seem like complex code but it’s easily generated using Google Maps. You could, for example, use it to uniquely identify where your business is so that devices like mobile phones could immediately find you. The short code in each <name> element (bdl5JQ6bJM and so on) is all you actually need: the rest is basically extra stuff created by and for machines.
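If you want to pull those short codes and their coordinates back out of a saved KML file, a few lines of Python using only the standard library will do it. This is just a sketch: the filename darkgate.kml is made up, and it assumes the namespaced KML shown above.

```python
# Minimal sketch: list the placemark names (the Geohash-36 short codes)
# and their coordinates from a saved KML file. The filename is made up.
import xml.etree.ElementTree as ET

KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

def placemarks(path: str):
    root = ET.parse(path).getroot()
    for pm in root.iter("{http://www.opengis.net/kml/2.2}Placemark"):
        name = pm.findtext("kml:name", namespaces=KML_NS)
        coords = pm.findtext("kml:Point/kml:coordinates", namespaces=KML_NS)
        yield name, coords.strip() if coords else None

for name, coords in placemarks("darkgate.kml"):
    print(name, coords)
```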
