Scalable Live Streaming with Wowza

[This post is part of the set we’re doing on the tech behind the No Boundaries conference, which happened in Feb 2014.]

There are many tutorials online about how to do this, so I’m not going to present too much code/config. It’s probably most useful if I simply describe the architecture we used, along with its pros and cons.

First off, why not use one of the existing streaming services such as Livestream, Ustream or DaCast? They are all fairly basic services, and don’t come with the flexibility we needed to combine recordings/DVR, multiple bitrate streams, and all the accessibility options (live subtitles, etc.).

That leaves three main players in the live streaming software game: Flash Media Server, Wowza Media Server (now Wowza Streaming Engine), and Flussonic. Having had experience with Wowza, and knowing that in theory it had the capabilities we needed, we went for that.

The original idea was to have a DVR stream: that is, whenever you load the streaming page, you can rewind to any earlier point in the event. This would save us having to mess about with recordings: we’d simply note the timestamp when each speaker started and provide links that would send viewers directly to that point in the DVR stream. This worked fine when we were just using one streaming server, but we hit a snag when we tried to scale it.
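
To make the timestamp idea concrete, here’s a minimal sketch of how those links could be generated. It isn’t the code we ran: the function names, the example times and the `start` query parameter are all made up for illustration, and the real parameter would depend on how the player page handles seeking.

```python
from datetime import datetime


def dvr_offsets(stream_start, talk_starts):
    """Map each speaker to the number of seconds into the DVR stream
    at which their talk began."""
    offsets = {}
    for speaker, started_at in talk_starts.items():
        delta = (started_at - stream_start).total_seconds()
        if delta < 0:
            raise ValueError(f"{speaker} started before the stream did")
        offsets[speaker] = int(delta)
    return offsets


def dvr_link(page_url, offset_seconds):
    """Build a link that asks the player page to seek to an offset.
    The 'start' parameter is a hypothetical convention, not a Wowza
    or Flowplayer feature."""
    return f"{page_url}?start={offset_seconds}"


# Illustrative values only; these are not the real schedule.
stream_start = datetime(2014, 2, 25, 9, 0, 0)
talks = {"first-speaker": datetime(2014, 2, 25, 9, 30, 15)}
offsets = dvr_offsets(stream_start, talks)
link = dvr_link("https://example.org/live", offsets["first-speaker"])
```

The only state needed is the stream start time and one timestamp per talk, which is why this looked so much simpler than managing separate recordings.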

Our planned architecture was to have an in-house Wowza server, which would take video from our encoding computers running Wirecast. However, viewers wouldn’t connect to this in-house server directly, since the bandwidth requirements would be too much even for our top-notch internet connection. Instead, our in-house server would be what’s called an Origin Server, and we’d set up two more Wowza instances on Amazon EC2 to be our Edge Servers. A single feed would go from our origin to each edge, and then viewers would connect to the edges to get the stream relayed to them.
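
The bandwidth argument is easier to see with some rough arithmetic. The sketch below uses invented numbers (1000 viewers, three renditions at 400, 800 and 1500 kbps), not our real figures, and assumes the worst case where every edge pulls every rendition from the origin.

```python
def direct_uplink_mbps(viewers, avg_kbps):
    """Upstream bandwidth needed if every viewer connects to the
    in-house server directly."""
    return viewers * avg_kbps / 1000


def origin_uplink_mbps(edges, rendition_kbps):
    """Upstream bandwidth needed if only the edges connect to the
    origin, each pulling one copy of every rendition."""
    return edges * sum(rendition_kbps) / 1000


# Hypothetical figures for illustration only.
direct = direct_uplink_mbps(viewers=1000, avg_kbps=800)
via_edges = origin_uplink_mbps(edges=2, rendition_kbps=[400, 800, 1500])
```

With these made-up numbers the direct approach needs 800 Mbps of upstream, while the origin/edge layout needs about 5.4 Mbps regardless of how many viewers turn up. The viewer load moves to the edges, which is the whole point of putting them on EC2.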

Unfortunately, when the encoding computers stopped sending a stream to the origin (which they frequently would, e.g. when breaking for lunch), the origin would send a message to the edges to say “stream has stopped”. The edges would then pass on that message to all connected viewers, and everyone’s video would suddenly stop, even if they’d rewound the stream and were watching some video from earlier in the day.

It’s possible that tweaking some Wowza and Flowplayer settings could have helped with this, but we were running out of time to experiment. So instead, we used Wowza’s recording API to record individual talks and created a playback page where viewers could either watch live, or click to see recorded talks.

So the full architecture was as follows:

[Figure: WebStream diagram]

  • Wirecast computers (one with the straight feed, one with the picture-in-picture of the BSL interpreter) sent video to the in-house Wowza server
  • The in-house server transcoded to different bitrates, and also embedded subtitles (see previous post)
  • The edge servers running on EC2 acted as proxies for all these streams, so the website’s Flowplayer instance could connect to them to get the live video
  • Before the conference started, we entered every talk’s details into a web admin interface, so we could just click on each speaker as they came to the lectern to mark them as live
  • Clicking a speaker in the admin interface would send a command to Wowza to tell it to a) stop recording the old speaker, b) start recording the new speaker, and c) copy the recordings of the old speaker out to the edge servers
  • Once the recordings had been copied out to the edge servers, the database was updated so that the website (which continuously polled for the latest playlist) would see that new recordings had been added, and add them to its playlist. Viewers could then click on these playlist items to get the video-on-demand (VOD) files from the edge servers
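
The speaker-switching steps in the last few bullets can be sketched roughly as follows. This is an illustration of the ordering rather than our actual admin code: `TalkSwitcher` and its callables are hypothetical stand-ins for the Wowza recording calls and the file copy, and the in-memory `playlist` list stands in for the database that the website polled.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class TalkSwitcher:
    # Hypothetical hooks; the real ones would talk to Wowza and the edges.
    stop_recording: Callable[[str], str]      # returns the finished file name
    start_recording: Callable[[str], None]
    copy_to_edge: Callable[[str, str], None]  # (file name, edge host)
    edges: List[str]
    playlist: List[dict] = field(default_factory=list)
    current: Optional[str] = None

    def go_live(self, speaker: str) -> None:
        """Handle a click on a speaker in the admin interface."""
        previous = self.current
        finished_file = None
        if previous is not None:
            # a) stop recording the old speaker
            finished_file = self.stop_recording(previous)
        # b) start recording the new speaker straight away
        self.start_recording(speaker)
        self.current = speaker
        if finished_file is not None:
            # c) copy the old speaker's recording out to every edge
            for edge in self.edges:
                self.copy_to_edge(finished_file, edge)
            # Only publish once the file exists on all edges, so the
            # polling website never lists a recording it cannot serve.
            self.playlist.append({"speaker": previous, "file": finished_file})
```

The ordering is the part that matters: the new recording starts before the copy begins, so a slow transfer to EC2 never eats into the next talk, and the playlist entry is added last so viewers only see talks that are ready to play.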