OpenHMD and the Oculus Rift

For some time now, I’ve been involved in the OpenHMD project, working on building an open driver for the Oculus Rift CV1, and more recently the newer Rift S VR headsets.

This post is a high-level overview of how the 2 devices work, for people who might have used or seen them but don’t know much about the implementation. I also want to talk about OpenHMD and how it fits into the evolving Linux VR/AR API stack.

OpenHMD

http://www.openhmd.net/

In short, OpenHMD is a project providing open drivers for various VR headsets through a single simple API. I don’t know of any other project that provides support for as many different headsets as OpenHMD, so it’s the logical place to contribute for the largest effect.
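
To give a feel for how simple the API is, here’s a minimal sketch of opening the first detected headset and reading its orientation. It’s modelled on the simple example that ships with OpenHMD – check openhmd.h for the exact include path, names and constants.

/* Build with something like: gcc -o simple simple.c $(pkg-config --cflags --libs openhmd) */
#include <openhmd.h>
#include <stdio.h>

int main(void)
{
    /* Create a library context and scan for supported devices */
    ohmd_context *ctx = ohmd_ctx_create();
    int num_devices = ohmd_ctx_probe(ctx);
    if (num_devices <= 0) {
        printf("No HMD detected: %s\n", ohmd_ctx_get_error(ctx));
        return 1;
    }

    printf("Opening %s %s\n",
           ohmd_list_gets(ctx, 0, OHMD_VENDOR),
           ohmd_list_gets(ctx, 0, OHMD_PRODUCT));

    ohmd_device *hmd = ohmd_list_open_device(ctx, 0);

    /* Pump the context and read back the current orientation quaternion */
    for (int i = 0; i < 1000; i++) {
        ohmd_ctx_update(ctx);

        float quat[4];
        ohmd_device_getf(hmd, OHMD_ROTATION_QUAT, quat);
        printf("orientation: %f %f %f %f\n", quat[0], quat[1], quat[2], quat[3]);
    }

    ohmd_ctx_destroy(ctx);
    return 0;
}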

OpenHMD is supported as a backend in Monado, and in SteamVR via the SteamVR-OpenHMD plugin. Working drivers in OpenHMD open up a range of VR games – as well as non-gaming applications like Blender. I think it’s important that Linux and friends not get left behind in what is basically a Windows-only activity right now.

One downside is that it does come with the usual disadvantages of an abstraction API, in that it doesn’t fully expose the varied capabilities of each device, but only their common denominator. I hope we can fix that in time by extending the OpenHMD API, without losing its simplicity.

Oculus Rift S

I bought an Oculus Rift S in April, to supplement my original consumer Oculus Rift (the CV1) from 2017. At that point, the only way to use it was in Windows via the official Oculus driver as there was no open source driver yet. Since then, I’ve largely reverse engineered the USB protocol for it, and have implemented a basic driver that’s upstream in OpenHMD now.

I find the Rift S a somewhat interesting device. It’s not entirely an upgrade over the older CV1. The build quality and some of the specifications are actually worse than the original device’s – but one area where it is a clear improvement is the tracking system.

CV1 Tracking

The Rift CV1 uses what is called an outside-in tracking system, which has 2 major components. The first is input from Inertial Measurement Units (IMUs) on each device – the headset and the 2 hand controllers. The 2nd component is infrared cameras (Rift Sensors) that you space around the room; a calibration procedure then lets the driver software calculate their positions relative to the play area.

IMUs provide readings of linear acceleration and angular velocity, which can be used to determine the orientation of a device, but don’t provide absolute position information. You can derive relative motion from a starting point using an IMU, but only over a short time frame as the integration of the readings is quite noisy.
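
To make the drift problem concrete, here’s a toy dead-reckoning step in C. This isn’t code from any driver – real implementations use quaternions and proper bias handling – but it shows why one integration (orientation) is usable while two integrations (position) are not.

typedef struct { float x, y, z; } vec3;

/* One naive dead-reckoning step from IMU samples.
 * gyro is angular velocity in rad/s, accel is linear acceleration in m/s^2
 * with gravity already removed, dt is the time since the previous sample. */
void imu_step(vec3 gyro, vec3 accel, float dt,
              vec3 *angles, vec3 *velocity, vec3 *position)
{
    /* Orientation: a single integration of the gyro. Errors grow only
     * linearly with time, and the accelerometer's gravity vector can be
     * used to correct tilt, so 3DOF orientation works from the IMU alone. */
    angles->x += gyro.x * dt;
    angles->y += gyro.y * dt;
    angles->z += gyro.z * dt;

    /* Position: a double integration of the accelerometer. Noise and bias
     * are integrated twice, so the estimate drifts away within a fraction
     * of a second unless something external (the cameras) corrects it. */
    velocity->x += accel.x * dt;
    velocity->y += accel.y * dt;
    velocity->z += accel.z * dt;

    position->x += velocity->x * dt;
    position->y += velocity->y * dt;
    position->z += velocity->z * dt;
}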

This is where the Rift Sensors get involved. The cameras observe constellations of infrared LEDs on the headset and hand controllers, and use those in concert with the IMU readings to position the devices within the playing space – so that as you move, the virtual world accurately reflects your movements. The cameras and LEDs synchronise to a radio pulse from the headset, and the camera exposure time is kept very short. That means the picture from the camera is completely black, except for very bright IR sources. Hopefully that means only the LEDs are visible, although light bulbs and open windows can inject noise and make the tracking harder.

Rift Sensor view of the CV1 headset and 2 controllers.
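
The first step in using those frames is pulling the bright LED spots out of an otherwise black image. Here’s a naive illustrative sketch of that stage – not the OpenHMD code, and the threshold and merge distance are made-up values:

#include <math.h>

typedef struct { float sum_x, sum_y, weight; } blob;

/* Collect weighted centroids of bright spots in an 8-bit greyscale IR frame.
 * Returns the number of blobs found; the centroid of blob i is
 * (blobs[i].sum_x / blobs[i].weight, blobs[i].sum_y / blobs[i].weight). */
int find_led_blobs(const unsigned char *frame, int width, int height,
                   blob *blobs, int max_blobs)
{
    const int threshold = 200;      /* LED pixels are close to saturation */
    const float merge_dist = 8.0f;  /* pixels closer than this join one blob */
    int num_blobs = 0;

    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            int v = frame[y * width + x];
            if (v < threshold)
                continue;

            /* Attach this bright pixel to a nearby blob, or start a new one */
            int b = -1;
            for (int i = 0; i < num_blobs; i++) {
                float cx = blobs[i].sum_x / blobs[i].weight;
                float cy = blobs[i].sum_y / blobs[i].weight;
                if (fabsf(cx - x) < merge_dist && fabsf(cy - y) < merge_dist) {
                    b = i;
                    break;
                }
            }
            if (b < 0 && num_blobs < max_blobs) {
                b = num_blobs++;
                blobs[b].sum_x = blobs[b].sum_y = blobs[b].weight = 0.0f;
            }
            if (b >= 0) {
                blobs[b].sum_x += (float)x * v;
                blobs[b].sum_y += (float)y * v;
                blobs[b].weight += v;
            }
        }
    }

    return num_blobs;
}

The hard part comes after this: working out which blob is which LED on which device, and fusing that with the IMU data to get a pose.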

If you have both IMU and camera data, you can build what we call a 6 Degrees of Freedom (6DOF) driver. With only IMUs, a driver is limited to providing 3DOF – allowing you to stand in one place and look around, but not to move.

OpenHMD provides a 3DOF driver for the CV1 at this point, with experimental 6DOF work in a branch in my fork. Getting to a working 6DOF driver is a real challenge – even the official drivers from Oculus still receive regular updates to tweak the tracking algorithms.

I have given several presentations about the progress on implementing positional tracking for the CV1, most recently at linux.conf.au 2020 in January. There’s a recording at https://www.youtube.com/watch?v=PTHE-cdWN_s if you’re interested, and I plan to talk more about that in a future post.

Rift S Tracking

The Rift S uses Inside Out tracking, which inverts the tracking process by putting the cameras on the headset instead of around the room. With the cameras in fixed positions on the headset, the cameras and their view of the world move as the user’s head moves. For the Rift S, there are 5 individual cameras pointing outward in different directions to provide (overall) a very wide-angle view of the surroundings.

The role of the tracking algorithm in the driver in this scenario is to use the cameras to look for visual landmarks in the play area, and to combine that information with the IMU readings to find the position of the headset. This is called Visual Inertial Odometry.

There is then a 2nd part to the tracking – finding the position of the hand controllers. This part works the same way as on the CV1: looking for constellations of LEDs on the controllers and matching what you see to a model of the controllers.

This is where I think the tracking gets particularly interesting. Finding where the headset is in the room and finding the controllers require 2 different types of camera view!

To find the landmarks in the room, the vision algorithm needs to be able to see everything clearly and you want a balanced exposure from the cameras. To identify the controllers, you want a very fast exposure synchronised with the bright flashes from the hand controller LEDs – the same as when doing CV1 tracking.

The Rift S satisfies both requirements by capturing alternating video frames with fast and normal exposures. Each time, it captures all 5 cameras simultaneously and stitches the views together into 1 video frame, which it delivers over USB to the host computer. The driver then needs to classify each frame as a normal or fast exposure and dispatch it to the appropriate part of the tracking algorithm.

Rift S – normal room exposure for Visual Inertial Odometry.
Rift S – fast exposure with IR LEDs for controller tracking.

There are a bunch of interesting things to notice in these camera captures:

  • Each camera view is inserted into the frame in its own native orientation, so external information is needed to make use of the image data in each one.
  • The cameras have a lot of fisheye distortion that will need correcting.
  • In the fast exposure frame, the light bulbs on my ceiling are hard to tell apart from the hand controller LEDs – another challenge for the computer vision algorithm.
  • The cameras are infrared-only, which is why the Rift S passthrough view (if you’ve ever seen it) is in grey-scale.
  • The top 16-pixels of each frame contain some binary data to help with frame identification. I don’t know how to interpret the contents of that data yet.
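
Putting those pieces together, the frame handling in the driver looks roughly like the sketch below. This is a hypothetical illustration rather than the actual driver code – the side-by-side view layout and the brightness heuristic for telling the exposures apart are assumptions (a real driver would use the identification data in those top rows instead).

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define NUM_CAMERAS 5

typedef struct {
    const unsigned char *pixels;  /* top-left pixel of this camera's view */
    int width, height, stride;    /* stride is the full stitched-frame width */
} camera_view;

/* Guess the exposure type from overall brightness: the short controller
 * exposures come out almost entirely black. */
static bool is_fast_exposure(const unsigned char *frame, size_t num_pixels)
{
    unsigned long long sum = 0;
    for (size_t i = 0; i < num_pixels; i++)
        sum += frame[i];
    return (sum / num_pixels) < 16;  /* made-up mean-brightness threshold */
}

/* Placeholder hooks for the two halves of the tracking algorithm */
static void vio_update(const camera_view *views, int n)
{
    (void) views;
    printf("normal exposure -> visual-inertial odometry (%d views)\n", n);
}

static void controller_tracking_update(const camera_view *views, int n)
{
    (void) views;
    printf("fast exposure -> controller LED tracking (%d views)\n", n);
}

/* Split one stitched USB frame into per-camera views and dispatch it */
void handle_stitched_frame(const unsigned char *frame, int width, int height)
{
    camera_view views[NUM_CAMERAS];
    int view_width = width / NUM_CAMERAS;  /* assumed side-by-side packing */

    for (int i = 0; i < NUM_CAMERAS; i++) {
        views[i].pixels = frame + i * view_width;
        views[i].width = view_width;
        views[i].height = height;
        views[i].stride = width;
    }

    if (is_fast_exposure(frame, (size_t)width * height))
        controller_tracking_update(views, NUM_CAMERAS);
    else
        vio_update(views, NUM_CAMERAS);
}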

Status

This blog post is already too long, so I’ll stop here. In part 2, I’ll talk more about deciphering the Rift S protocol.

Thanks for reading! If you have any questions, hit me up at thaytan@noraisin.net or @thaytan on Twitter.

New gst-rpicamsrc features

I’ve pushed some new changes to my Raspberry Pi camera GStreamer wrapper, at https://github.com/thaytan/gst-rpicamsrc/

These bring the GStreamer element up to date with new features added to raspivid since I first started the project, such as adding text annotations to the video, support for the 2nd camera on the compute module, intra-refresh and others.

You can now dynamically update any of the properties, where the firmware supports it. So you can implement digital zoom by adjusting the region-of-interest (roi) properties on the fly, or update the annotation text, or change video effects and colour balance, for example.
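
As a rough sketch of what that looks like from application code (I’m using the C API here; the roi-* property names are from memory, so check gst-inspect-1.0 rpicamsrc for the exact names and value ranges):

#include <gst/gst.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    gst_init(&argc, &argv);

    /* Record H.264 from the camera to a file */
    GError *error = NULL;
    GstElement *pipeline = gst_parse_launch(
        "rpicamsrc name=cam bitrate=2000000 ! video/x-h264,width=1280,height=720 ! "
        "h264parse ! matroskamux ! filesink location=zoom-test.mkv", &error);
    if (!pipeline) {
        g_printerr("Failed to build pipeline: %s\n", error->message);
        return 1;
    }

    GstElement *cam = gst_bin_get_by_name(GST_BIN(pipeline), "cam");
    gst_element_set_state(pipeline, GST_STATE_PLAYING);

    /* After a few seconds, zoom in to the centre quarter of the sensor by
     * shrinking the region of interest while the pipeline keeps running */
    sleep(5);
    g_object_set(cam, "roi-x", 0.25, "roi-y", 0.25,
                      "roi-w", 0.5, "roi-h", 0.5, NULL);
    sleep(5);

    /* Send EOS and wait for it, so the muxer can finish the file properly */
    gst_element_send_event(pipeline, gst_event_new_eos());
    GstBus *bus = gst_element_get_bus(pipeline);
    gst_message_unref(gst_bus_timed_pop_filtered(bus, GST_CLOCK_TIME_NONE,
        GST_MESSAGE_EOS | GST_MESSAGE_ERROR));

    gst_element_set_state(pipeline, GST_STATE_NULL);
    gst_object_unref(bus);
    gst_object_unref(cam);
    gst_object_unref(pipeline);
    return 0;
}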

The timestamps produced are now based on the internal STC of the Raspberry Pi, so the audio/video sync is tighter. Although it was never terrible, it’s now more correct and slightly less jittery.

The one major feature I haven’t enabled yet is stereoscopic handling. Stereoscopic capture requires 2 cameras attached to a Raspberry Pi Compute Module, so at the moment I have no way to test that it works.

I’m also working on GStreamer stereoscopic handling in general (which is coming along). I look forward to releasing some of that code soon.

 

Network clock examples

Way back in 2006, Andy Wingo wrote some small scripts for GStreamer 0.10 to demonstrate what was (back then) a fairly new feature in GStreamer – the ability to share a clock across the network and use it to synchronise playback of content across different machines.

Since GStreamer 1.x has been out for over 2 years, and we get a lot of questions about how to use the network clock functionality, it’s a good time for an update. I’ve ported the simple examples to account for API changes and to use the gobject-introspection based Python bindings, and put them up on my server.

To give it a try, fetch play-master.py and play-slave.py onto 2 or more computers with GStreamer 1 installed. You need a media file accessible via some URI to all machines, so they have something to play.

Then, on one machine run play-master.py, passing a URI for it to play and a port to publish the clock on:

./play-master.py http://server/path/to/file 8554

The script will print out a command line like so:

Start slave as: python ./play-slave.py http://server/path/to/file [IP] 8554 1071152650838999

On another machine(s), run the printed command, substituting the IP address of the machine running the master script.

After a moment or two, the slaved machine should start playing the file in synch with the master:

Network Synchronised Playback

If they’re not in sync, check that you have the port you chose open for UDP traffic so the clock synchronisation packets can be transferred.
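
Under the hood, both scripts boil down to a handful of GStreamer calls. Here’s a rough C sketch of the same mechanism (the real examples use the Python bindings; the URI, port and addresses are placeholders, and you need to link against gstreamer-net-1.0):

#include <gst/gst.h>
#include <gst/net/gstnet.h>

int main(int argc, char **argv)
{
    gst_init(&argc, &argv);

    GstElement *playbin = gst_element_factory_make("playbin", NULL);
    g_object_set(playbin, "uri", "http://server/path/to/file", NULL);

    /* Master: publish the system clock on a UDP port and pick a base time
     * that all the machines will share. The long number in the command the
     * master script prints is this shared base time. */
    GstClock *clock = gst_system_clock_obtain();
    GstNetTimeProvider *provider = gst_net_time_provider_new(clock, NULL, 8554);
    GstClockTime base_time = gst_clock_get_time(clock);

    /* A slave does this instead, using the master's address and base time:
     *   GstClock *clock = gst_net_client_clock_new("net", "master-ip", 8554, 0);
     *   GstClockTime base_time = <the value printed by the master>;
     */

    /* Make the pipeline use the shared clock and base time, and stop it
     * from choosing a new base time when it goes to PLAYING */
    gst_pipeline_use_clock(GST_PIPELINE(playbin), clock);
    gst_element_set_base_time(playbin, base_time);
    gst_element_set_start_time(playbin, GST_CLOCK_TIME_NONE);

    gst_element_set_state(playbin, GST_STATE_PLAYING);

    GstBus *bus = gst_element_get_bus(playbin);
    gst_message_unref(gst_bus_timed_pop_filtered(bus, GST_CLOCK_TIME_NONE,
        GST_MESSAGE_EOS | GST_MESSAGE_ERROR));

    gst_element_set_state(playbin, GST_STATE_NULL);
    gst_object_unref(bus);
    g_object_unref(provider);
    gst_object_unref(clock);
    gst_object_unref(playbin);
    return 0;
}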

This basic technique is the core of my Aurena home media player system, which builds on top of the network clock mechanism to provide file serving and a simple shuffle playlist.

For anyone still interested in GStreamer 0.10 – Andy’s old scripts can be found on his server: play-master.py and play-slave.py

GStreamer talk at OSSbarcamp tomorrow

OSSbarcamp logo

If you’re in Dublin tomorrow (Saturday 28th March), and you’re interested in Open Source, feel free to come along to the OSSbarcamp at DIT Kevin St and enjoy some of the (completely free!) talks and demos. I’ll be presenting an introduction to the GStreamer multimedia framework in the afternoon.

Other talks I’ll be attending include my wife Jaime’s talk on using Git, Luis’ “Being Creative With Free Software”, and Stuart’s Advanced Javascript presentation.

Details of times and the talk schedule are at http://www.ossbarcamp.com/

Christmas Holidays

My last post was months ago, as usual – the day before we moved from Barcelona to Ireland. Since then, we’ve been having fun – we found a house to live in, unpacked our things, and made some new friends. I started my job at Sun, which is working out really well, and we bought a bunch of things and explored our new habitat.

Now, after 3 and a bit months in Dublin, we’re off to Sweden and Norway for Christmas and New Year’s, respectively. We are in Göteborg at the moment, and for the next 2 days, and then catching the train across to Växjö for Christmas with our friends from Australia who moved there in July. Afterward, we head up to Stockholm, and then across to Norway for New Year’s, where we will get to see Uraeus in his natural habitat.

Congratulations to Andy Wingo on his new job – I hope it turns out fun!

LCA 2007 redux

I arrived back in Barcelona from Australia with thomasvs on Monday morning. We were there for the LCA conference, and the associated FOMS conference. I had a great time, getting back to the old stomping ground and geeking it up. Here’s a rundown:

We arrived in the evening on Wed 10th and were met by Shane & Kate at the airport. Had a little bit of confusion as to whether Thomas had also arranged for Lindsay to pick us up, but it turned out not. Went to Shane & Kate’s house, played with their Wii a bit and hit the hay.

Thursday and Friday, we spent at FOMS, which provided a nice chance to meet a bunch of FOSS Multimedia hackers I didn’t yet know, and catch up with others that I did, including fellow Fluendian Mike Smith. The conference was very well structured and run, and I think we covered a lot of ground and built some interesting bridges.
Shane & Kate hosted a bbq at their place on Thursday night, which was a lot of fun even though I flaked out on the floor of their study by about 10:30pm, knocked out by a flu I picked up on the plane on the way over. My friend Miki also came along to the bbq after she finished work.

Friday night was supposed to be the Annodex Foundation AGM at the James Squire brewery at King’s Wharf, but it turned out to just be a FOMS attendees’ get-together at the brewery after they discovered the Internet access at the pub was too flaky. A fun night, either way. Thomas blogged a little about his impressions of FOMS too. I’d like to say thanks to all the people that put FOMS together, especially Silvia & Shane, and to the other attendees.

On Saturday, I caught a lift with my little brother up to our parents’ house on the Central Coast, and spent the night up there while Thomas & Mike went to a concert in Sydney. I had a nice afternoon and evening with a big part of my family unit, even though I couldn’t go visit my grandparents (because of the flu).

On Sunday my father had offered to take us all on a Hunter Valley wine+cheese tour, so we picked up Thomas, Mike and Miki at Tuggerah train station and headed up to the Hunter Valley for the day. We had a beautiful sunny day, toured the vineyards, tried some good wines, jams, olive oils, chocolate and cheeses. I bought 5 bottles of wine, some chocolates and some lilly pilly jam. Thomas bought some wine, and some particularly strong-scented cheese, which would prove useful later in the trip. Sunday night we caught the train back to Sydney, and Thomas & I picked up the keys to our dorm accommodation for the rest of the trip.

Monday morning was the first mini-conf day of LCA. This year, LCA was held at the University of NSW, which I used to spend a fair bit of time around. It’s been a while since then, though, so the campus was simultaneously familiar and strange. We picked up our sweet LCA bags and swag at the registration desk, and headed into the conference open. As usual, Jdub was entertaining. Talks I went to on Monday, in between hacking in the pavilions and re-discovering my way around UNSW:

  • Show and Tell: The Pedagogical Arguments for FOSS by Donna Benjamin
  • Using Avahi the “Right Way” by Trent Lloyd & Lennart Poettering
  • Adventures in Linux on Programmable Logic Devices by Dr John Williams
  • GNOME Love Session

Monday night was the Speakers Dinner, and we were treated to a nice harbour cruise plus meal, with Sasha’s girlfriend Penelope as my date so that she could get a ticket, since several of the other LCA rego desk volunteers were already coming and she wasn’t. Anthony Baxter gave an entertaining talk titled ‘Style over substance’, which I completely ignored later in the week when giving my tutorial, causing several people in the audience to fall asleep.

Tuesday, I enjoyed Chris Blizzard’s keynote followed by:

  • Jokosher: The GNOME approach to audio production by Jono Bacon
  • De-mystifying PCI by Kristen Carlson Accardi
  • Wesnoth for Kernel Hackers (and everyone else) by Rusty Russell
  • Suspend & Resume to RAM repair workshop by Matthew Garrett

Rusty’s talk was interesting and well presented, as usual. It was also memorable because totem wouldn’t play on the big screen: the Xv overlay just displayed as an empty square on the projector. I jumped up and used gstreamer-properties to change it to XShm output, which fixed the problem, but then his sound wouldn’t play, so Rusty stopped Wesnoth (to free the sound device), which for some reason meant that sound would play, but the video overlay went back to being broken. At that point, he switched to mplayer for his presentation, but after the talk I took another look and discovered that he was using totem-xine, and then I couldn’t figure out why my change had fixed the Xv overlay in the first place 🙂
Went to the Google Conference party at the Roundhouse for a while in the evening, and chowed down on a couple of sausage sandwiches, but left early so I could get some washing done.

Wednesday morning, Thomas and I were up bright and early for our turn at the Speaker’s Adventure. All the speakers had to nominate one of the mornings to go on the adventure, but we weren’t told in advance what it was. Since the conference is over now, I can reveal that it was a trip up Centrepoint Tower to do the Sky Walk (although the secret was already leaked much earlier than this). The Sky Walk is a fun little jaunt where you dress up in fancy jumpsuits, remove any objects that you could possibly drop off the edge, and take a trip around the outside of Centrepoint, tethered securely to a little railway line that ensures you can’t do anything foolish. All in all, pretty fun, although I’m not sure I’d be prepared to pay the RRP for it.

Thanks to Pia and others, who (among other conference duties) got up bright and early to deliver speakers to the tower and get us back in time for the conference open. Wednesday morning’s keynote was from Andy Tanenbaum, giving a presentation about some of the recent work they’ve been doing in Minix, which was very interesting. *insert standard observations about micro vs monolithic kernels*

Wednesday’s talks:

  • The PulseAudio Sound Server by Lennart Poettering
  • Linux on the Cell Broadband Architecture by Arnd Bergmann
  • Desktops on a diet – old pants back on! by Carsten Haitzler
  • nouveau – reverse engineered nvidia drivers by Dave Airlie
  • Linux Clusters by Sulamita Garcia
  • X Monitor Hotplugging Sweetness by Keith Packard

To be continued….

Back to AU

For the 5th time in as many months, I’m flying halfway around the world on Tuesday to hit Sydney for FOMS and linux.conf.au. Huzzah!

This time though, I’m not flying alone – I’m bringing one of my Belgian workmates – seen here looking worshipful at a Pearl Jam concert.

I’ll be giving a tutorial @ LCA on Thursday 18th, about GStreamer – an introduction to pipeline building and writing a simple element.

If you’re going to be in Sydney and want to make sure we meet up, drop me an email

SSH trick

A while ago, I wanted to copy some stuff from my laptop to a machine behind a proxying firewall.

Very quickly, I got sick of copying something to the firewall, logging in, then copying to the final machine, so I put together a small ssh proxy script that would log into the firewall for me when I requested the destination machine (sunshine), and then use nc to connect to sunshine.

But the problem is that sometimes I carry my laptop into the house where ‘sunshine’ lives, so I extended it to become the script ssh-through-fw.

With that script in an appropriate location, I add this to my ~/.ssh/config:

host sunshine

ProxyCommand $HOME/.install/bin/ssh-through-fw 192.168.1. user@firewall %h %p

Where 192.168.1. is the prefix of the IP range used in the network behind the firewall.

Now, when I am running remotely, connecting to sunshine happens through the firewall, but when I’m behind the firewall it connects directly to the machine without me thinking about it.

Mono in Barcelona

Went to Miguel’s Mono Talk at the university last night.

Nothing too unexpected in the talk, but it was nice to hang out with a bunch of Free Software Barcelonians that I’d never met.

As an added bonus, during the talk I discovered some brain-dead code in our GStreamer GstAdapter; fixing it made for an easy 90x speedup when collecting large numbers of buffers.