Turning The Page: Using computer vision to tell stories

“Imagine if your well-thumbed, outdated guidebook could talk. Think of the stories it would tell about the places it’s been, the characters encountered and narrow escapes along the way.”

That’s the premise of Turning The Page, an installation by Stand + Stare and Tim Cole first shown at Mayfest 2013.

Participants take a seat at a desk covered in travel ephemera and put on a pair of headphones. As they flick through the travel guide on the desk, certain pages trigger audio & video: the book’s owner recounting stories of her journey.

How does it work?

The travel guide on the desk is a normal book, with no added electronics. The desk is illuminated by an anglepoise desk lamp. Concealed within the lamp is a camera pointed at the book.

The camera is connected to a computer inside the desk which processes the live video to ‘see’ which page of the book is currently open.

Also inside the desk is a projector for playing back video on the table surface. When the system ‘sees’ that a particular page is open on the desk, the matching audio & video tale is played.

Which page is open?

The camera concealed in the lamp is a normal USB webcam. We process the video from the camera in real time to detect which page the book is open to. The software is trained to recognise the illustrative photos on the book’s pages. For this we use the IN2AR library in Adobe Flash, compiled to a native Mac Adobe AIR application.
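The actual installation uses IN2AR’s natural-feature tracking inside an AIR application, which is far more robust than anything shown here. Purely as an illustrative sketch of the underlying idea, comparing each camera frame against a set of trained page images and reporting the best match, here is a minimal average-hash matcher in Python (all names and thresholds are invented for illustration):

```python
# Sketch only: the installation uses the IN2AR library in Adobe AIR,
# not this code. This shows the general idea of matching a live camera
# frame against trained page images using a simple average hash.
# Real natural-feature tracking is far more tolerant of lighting,
# viewing angle and partial occlusion.

def average_hash(pixels):
    """Hash a small grayscale image (list of rows of 0-255 values):
    each bit records whether a pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return sum(x != y for x, y in zip(a, b))

def detect_page(frame, trained_pages, max_distance=10):
    """Return the page number whose trained image best matches the
    frame, or None if nothing matches closely enough."""
    frame_hash = average_hash(frame)
    best_page, best_dist = None, max_distance + 1
    for page_number, page_pixels in trained_pages.items():
        d = hamming(frame_hash, average_hash(page_pixels))
        if d < best_dist:
            best_page, best_dist = page_number, d
    return best_page if best_dist <= max_distance else None
```

In a real pipeline each webcam frame would be downscaled to the same small size as the trained page images before hashing, and a match would only fire after it held steady for several consecutive frames, so that a page mid-turn doesn’t trigger a story.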

Playing audio & video

The installation is controlled through the theatre control software QLab. We’ve used it on previous projects, where it has proven stable and powerful, and it makes changing the script easy.

The camera processing software sends QLab a trigger when a story page is detected. QLab then plays the relevant sequence of audio & video.
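The post doesn’t say which trigger mechanism connects the detection software to QLab, so the following is only an illustrative sketch. One common way to trigger QLab remotely is its OSC interface (UDP port 53000 in recent versions), where a message such as /cue/{number}/start fires a cue. This sketch builds and sends such a message using only the Python standard library; the cue number and host are made-up examples:

```python
# Illustrative sketch: send a "start cue" trigger to QLab over OSC.
# The installation's actual trigger mechanism isn't specified in the
# post; QLab's OSC interface is one common way to do this.
import socket

def osc_message(address):
    """Encode a minimal OSC message with no arguments: the address
    string and an empty type-tag string, each null-terminated and
    padded to a multiple of 4 bytes per the OSC spec."""
    def pad(b):
        return b + b"\x00" * (4 - len(b) % 4)
    return pad(address.encode("ascii")) + pad(b",")

def trigger_cue(cue_number, host="127.0.0.1", port=53000):
    """Tell QLab (listening on `host`) to start the given cue."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(osc_message(f"/cue/{cue_number}/start"), (host, port))
    finally:
        sock.close()

# e.g. when the page-detection software sees story page 12:
# trigger_cue(12)
```

QLab then takes over: the cue itself can be a group that plays the matched audio & video sequence, which keeps the playback script editable without touching the detection code.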

Why not RFID?

For Stand + Stare’s previous project, Theatre Jukebox, we had to identify which postcard was placed on a table. For this we used RFID rather than computer vision: each postcard contained a single RFID tag, detected by an RFID reader under the table surface.

This was technically simpler than computer vision, but it was not an option for Turning The Page, for two reasons. Firstly, the book would need to contain 20 or so RFID tags in close proximity, and most RFID readers have difficulty reading multiple tags at the same time because their signals interfere with each other. We would need to find a reader/tag combination with collision avoidance.

Secondly, each tag is quite thick compared to paper: about 0.6 mm. In the cardstock of a postcard this is not a problem, but the paper in travel guides is very thin, and the presence of a tag would be very noticeable.

For these reasons we put no technology in the book at all. Using computer vision also means we can easily change the book we are detecting without having to modify the object, and we can move beyond the book to other objects: travel tickets, souvenirs and the like.

Turning The Page was funded by REACT. The next REACT sandbox is themed Objects and will open for funding proposals in September 2013. Turning The Page images in this post provided by Stand + Stare.