Twilio enhanced with

Twilio is a web service that connects you to voice and text messaging clients. Anybody can sign up and start using it in minutes. In this post, we’re going to use a music service’s standard REST API with Twilio to liven up the standard Twilio hold music.


Twilio has a special markup language called TwiML that you use to create documents that tell it how to respond to voice calls. For example, you can associate a phone number with a URL that generates the following document:
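
A document of that shape might look like the one built below. This is a minimal sketch in Node, not the original post's listing: the helper names (`escapeXml`, `sayAndHangUp`) and the spoken sentences are placeholders, while `Response`, `Say`, and `Hangup` are standard TwiML elements.

```javascript
// Escape characters that are not allowed in XML text.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Build a TwiML document that speaks each sentence, then hangs up.
function sayAndHangUp(sentences) {
  const says = sentences.map((s) => `  <Say>${escapeXml(s)}</Say>`);
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<Response>',
    ...says,
    '  <Hangup/>',
    '</Response>',
  ].join('\n');
}

// Placeholder sentences, one per service being described.
console.log(sayAndHangUp(['Twilio is a cool service.', 'So is our music service.']));
```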

When that phone number is called, Twilio will tell the caller about two cool services, then hang up. The commands in the document are called verbs. The Say verb instructs Twilio to read text to the caller. TwiML also has a verb called Enqueue that places the caller in a hold queue. Normally, while in the hold queue, your poor caller is subjected to the default Twilio hold music. Let’s use a music service to give the caller something better!

Get Signed Up

First, sign up for Twilio here. They will give you a phone number that you can attach our script to.

Next, let’s get an account with the music service. Head over to the sign-up page and create a new account. When asked, say that you are adding music to an app and mark the platform as Other. You’ll then be asked what music you want for your app; search for a station called ‘Spaghetti Westerns’ for our experiment (you can always change this later). You will be asked whether you want to purchase more minutes, but you can do that after we’ve finished testing things out. Once you’re told that you’re ready to integrate, click on Home, click on the app we just created, and then click on the Embed Codes and IDs tab to get the credentials that you’ll need to put in your code to access the API.


The key ingredient for using the music service to power your hold queue music is the waitUrl attribute on the Enqueue tag, as in this document:
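
As a rough illustration, the sketch below builds such a document in Node. The queue name, the greeting, the `/hold-music` path, and the function names are assumptions made for this example; the `waitUrl` attribute and the `Enqueue` verb are the parts taken from TwiML itself.

```javascript
// Escape a value for use in XML text or a double-quoted attribute.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Build TwiML that greets the caller and then places them in a queue.
// waitUrl is the URL Twilio fetches for hold instructions.
function enqueueTwiml(queueName, waitUrl) {
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<Response>',
    '  <Say>Please hold while we connect you.</Say>',
    `  <Enqueue waitUrl="${escapeXml(waitUrl)}">${escapeXml(queueName)}</Enqueue>`,
    '</Response>',
  ].join('\n');
}

// Hypothetical queue name and path, for illustration only.
console.log(enqueueTwiml('support', '/hold-music'));
```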

If provided, Twilio will repeatedly retrieve the waitUrl to get instructions on how to entertain the caller.

We will build a script that requests music from the service and generates a TwiML document to tell Twilio to play the requested music. Our Node script will generate a document that looks like this:
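
A minimal sketch of generating that kind of document follows. The song URL is a made-up placeholder and `playTwiml` is a name chosen for this example; `Play` is the standard TwiML verb for playing an audio file.

```javascript
// Escape characters that are not allowed in XML text.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Build a TwiML document that plays a single audio file.
// When Twilio finishes it, it requests the waitUrl again.
function playTwiml(songUrl) {
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<Response>',
    `  <Play>${escapeXml(songUrl)}</Play>`,
    '</Response>',
  ].join('\n');
}

// Placeholder URL; a real one would come from the music service.
console.log(playTwiml('https://example.com/audio/song.mp3'));
```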

Twilio will play that song, then on completion it will call the script again to get another song to play. The service’s REST API has an endpoint called ‘/play’ that will give us a new song to play. Before we hand the song over to Twilio, we also need to call the ‘/play/:id/start’ endpoint to let the service know that song playback has started.

Aside from authentication credentials, all we need to supply to the ‘/play’ endpoint is a client id. The client id is used to uniquely identify a listener so that DMCA music playback rules can be enforced, and can be created with a post to the ‘/client’ endpoint. Normally we’d cookie the listener with the id, but we can’t cookie a phone client. This script will need to keep a server side cache that maps the caller’s phone number (provided by Twilio) to a client id.
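
The phone-number-to-client-id cache could be sketched as follows. To keep the example free of dependencies it uses a plain `Map` as a stand-in for the ‘memory-cache’ module, and `createClient` is a hypothetical function that would post to the ‘/client’ endpoint and resolve with the new id; neither detail is taken from the original script.

```javascript
// Maps a caller's phone number to a promise for their client id.
const clientIds = new Map();

// Return a promise for this caller's client id, creating one on first contact.
// createClient is a hypothetical async function that would POST to the
// '/client' endpoint and resolve with the newly created id.
function getClientId(phoneNumber, createClient) {
  if (!clientIds.has(phoneNumber)) {
    const pending = Promise.resolve(createClient()).catch((err) => {
      // Forget failed attempts so the next call can try again.
      clientIds.delete(phoneNumber);
      throw err;
    });
    clientIds.set(phoneNumber, pending);
  }
  return clientIds.get(phoneNumber);
}

// In a request handler, the caller's number arrives in Twilio's From parameter:
//   getClientId(req.body.From, createClient).then((clientId) => { ... });
```

Caching the promise rather than the resolved id means two overlapping requests from the same caller share one client creation instead of racing to create two.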


Our Node script uses the ‘express’, ‘superagent’, and ‘memory-cache’ modules, so make sure to install them first by running npm install express superagent memory-cache in your project directory.

The simplest app we can make that responds with a TwiML document looks like the following:
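
One way to write such an app is sketched below. The route is registered through a small function that takes an Express-style app, so the handler can be exercised without a running server; the `mountGreeting` name, the `/` path, and the greeting text are illustrative choices, not the original listing.

```javascript
// The TwiML document returned for every incoming call.
const GREETING_TWIML = [
  '<?xml version="1.0" encoding="UTF-8"?>',
  '<Response>',
  '  <Say>Hello from our Node script.</Say>',
  '</Response>',
].join('\n');

// Register a route that answers every request with the TwiML above.
// app.all is used because Twilio can be configured to use GET or POST.
function mountGreeting(app) {
  app.all('/', (req, res) => {
    res.type('text/xml');
    res.send(GREETING_TWIML);
  });
}

// Wiring it into Express (requires the 'express' module):
//   const express = require('express');
//   const app = express();
//   mountGreeting(app);
//   app.listen(8080);
```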

That script will listen on port 8080 for incoming requests. If you point your Twilio phone number to that host and port 8080, you’ll hear the message spoken to you.

To talk to the music service’s servers, we need to include some authentication with all our requests. This is easily done by base64-encoding your authentication credentials and placing them in the header with each request like so:
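
The sketch below shows one plausible form of that header. It assumes the service accepts the standard HTTP Basic scheme, in which two credentials are joined with a colon before encoding; the `token` and `secret` parameter names and the sample values are hypothetical stand-ins for the credentials from your account.

```javascript
// Build an HTTP Basic Authorization header value from two credentials.
// Assumption: the service uses the standard "user:password" Basic format.
function basicAuthHeader(token, secret) {
  const encoded = Buffer.from(`${token}:${secret}`).toString('base64');
  return `Basic ${encoded}`;
}

// With superagent, the header would be attached to each request:
//   request.post(url).set('Authorization', basicAuthHeader(token, secret))
console.log(basicAuthHeader('my-token', 'my-secret'));
```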

One caveat for our script is that we can only deliver music to people within the United States due to licensing restrictions. This is easily handled by looking at the ‘FromCountry’ parameter passed to our script from Twilio. When it is equal to ‘US’, we’re good to use the music; otherwise, we need to serve something else to the caller.
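
That check could be sketched like this. The function names, the fallback message, and the `buildMusicTwiml` callback are assumptions for illustration; only the ‘FromCountry’ parameter and the ‘US’ comparison come from the description above.

```javascript
// True when Twilio reports that the call originates from the United States.
function canPlayLicensedMusic(params) {
  return Boolean(params) && params.FromCountry === 'US';
}

// Choose hold instructions: licensed music for US callers (built by a
// hypothetical buildMusicTwiml callback), a spoken fallback for everyone else.
function holdTwiml(params, buildMusicTwiml) {
  if (canPlayLicensedMusic(params)) {
    return buildMusicTwiml(params);
  }
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<Response>',
    '  <Say>Thanks for holding. We will be with you shortly.</Say>',
    '  <Pause length="10"/>',
    '</Response>',
  ].join('\n');
}
```

The `Pause` after the message keeps the fallback from repeating immediately, since Twilio fetches the waitUrl again as soon as the document finishes.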

The final implementation of our app sticks together all the pieces above and looks like the following (available on GitHub here):

If you map your Twilio phone number to this script, you’ll be greeted and then placed in a hold queue that will play our ‘Spaghetti Westerns’ music!

Future Tweaks

TwiML also has a Conference verb where you can specify actions to take while callers are waiting for other conference attendees. The script we made will work just as well in that situation.

TwiML also has verbs to get input from the caller. With only a few tweaks, you can modify this script to let the caller switch to a different music station.

While this is up and running you can log into your account and use our music management interface to change up the music all you want without having to change any code.

Happy hacking!
