SINLAB: Streaming VR video

What is VR video? Do you see VR video when you watch 360 content in an Oculus headset? Or do you see VR video when your right and left eyes see slightly different images and you can actually perceive depth? Both are required for a real VR experience, but how do you stream them to your headset?


Video streaming is something that we experience every day. There is interactive streaming, which we experience with Zoom, and on-demand streaming, which we experience on Netflix or YouTube.

But there is also something that is called VR video streaming.

Most of the time, people mean by VR video that you can move your head wearing a head-mounted display (Oculus, Vive, ...) and look freely in all directions, but without control over the location of the camera in the virtual world. This is also known as 360 video.

Other times, people want to see real depth, and that means that slightly different images must be presented for the left and right eyes, and they call that VR video as well. This is known as stereoscopic video.

Is it easy to bring these two ideas together? The answer is: not really, because 360 video requires that the camera rotates but stays in the same place, whereas stereoscopic video requires two cameras that look in the same direction but from slightly different places.

If we want both, we must bring at least two 360 cameras and compute (interpolate or extrapolate) the stereoscopic video in real time: this allows horizontal head rotation of several degrees. We can cover all 360 horizontal degrees if we use three cameras, and we can also go vertical with four 360 cameras arranged in a pyramid.
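
To make the interpolation step more concrete, here is a minimal sketch of disparity-based view interpolation in Python with NumPy. It assumes a per-pixel disparity map between two horizontally offset views is already available; the function name and the simple forward-warping strategy are illustrative assumptions, not the method any thesis has to follow.

```python
import numpy as np

def interpolate_view(left: np.ndarray, disparity: np.ndarray, alpha: float) -> np.ndarray:
    """Forward-warp the left view towards the right view by a fraction of the baseline.

    left:      (H, W) or (H, W, 3) image from the left camera
    disparity: (H, W) horizontal shift, in pixels, of each left-view pixel
               as seen from the right camera
    alpha:     0.0 returns the left view, 1.0 approximates the right view
    """
    h, w = disparity.shape
    out = np.zeros_like(left)
    xs = np.arange(w)
    for y in range(h):
        # Every source pixel moves alpha * disparity pixels; rounding to the
        # nearest column leaves small holes, which a real renderer would fill.
        tx = np.clip(np.round(xs - alpha * disparity[y]).astype(int), 0, w - 1)
        out[y, tx] = left[y, xs]
    return out
```

A virtual view halfway between the two cameras corresponds to `alpha = 0.5`; to deliver stereo, two such virtual views (one per eye) must be synthesised for every frame, which is why the disparity estimation and warping have to run in real time.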

But how do we record and compress this kind of video data from up to four 360 cameras? All of these video frames are extremely similar, so we should be able to compress them relative to each other, inspired by MPEG's coding for stereoscopic videos. And perhaps we can code them in a different manner and make it easier for the client to show stereoscopic frames?
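
As a toy illustration of that idea, the sketch below predicts the right-eye frame from the left-eye frame and stores only the residual. This is the intuition behind inter-view prediction, not MPEG's actual multiview coding, and all frame contents and names are made up for the example.

```python
import numpy as np

def encode_residual(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    """Use the left-eye frame as the prediction for the right-eye frame
    and keep only the prediction error (the residual)."""
    return right.astype(np.int16) - left.astype(np.int16)

def decode_right(left: np.ndarray, residual: np.ndarray) -> np.ndarray:
    """The decoder applies the same prediction and adds the residual back."""
    return np.clip(left.astype(np.int16) + residual, 0, 255).astype(np.uint8)

if __name__ == "__main__":
    # Synthetic, smoothly varying 'left' frame and a slightly shifted 'right' frame.
    left = np.tile((np.arange(1920) // 32).astype(np.uint8), (1080, 1))
    right = np.roll(left, 4, axis=1)

    residual = encode_residual(left, right)
    assert np.array_equal(decode_right(left, residual), right)  # lossless round trip
    # Most residual values are zero, which is exactly what makes inter-view
    # prediction attractive for the subsequent entropy coder.
    print("non-zero residual pixels:", np.count_nonzero(residual))
```

A real multiview encoder adds disparity compensation before taking the difference, so that the residual stays small even for larger baselines.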

In these theses, we explore the means of compression, transfer, decompression and rendering.

Potential Contributions

  • Develop and compare tiling schemes and appropriate compression tools that make it possible to encode, stream, decode and display 360 video in real time; or
  • Implement a fast, non-ML depth estimation method, such as optical-flow-based depth estimation, that makes it possible to interpolate views between several cameras in real time (see the sketch after this list); or
  • Compare inside-out and outside-in trackers for ultra-low-latency and AI-predicted head tracking; these trackers may be based on visible markers or natural features.
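
For the depth-estimation direction above, the following sketch shows the general idea of optical-flow-based depth estimation. It assumes OpenCV (cv2) is available and that the two input views are rectified; the focal length, baseline and file names are placeholders, not parameters of any actual camera rig.

```python
import cv2
import numpy as np

FOCAL_PX = 1000.0    # focal length in pixels (placeholder value)
BASELINE_M = 0.065   # distance between the two cameras in metres (placeholder value)

def depth_from_flow(left_gray: np.ndarray, right_gray: np.ndarray) -> np.ndarray:
    """Estimate a dense depth map from two rectified, horizontally offset views."""
    # Dense Farneback optical flow between the two views; for a rectified pair,
    # the horizontal flow component approximates the stereo disparity.
    # Arguments: prev, next, flow, pyr_scale, levels, winsize, iterations,
    #            poly_n, poly_sigma, flags
    flow = cv2.calcOpticalFlowFarneback(left_gray, right_gray, None,
                                        0.5, 3, 21, 3, 5, 1.2, 0)
    disparity = np.maximum(np.abs(flow[..., 0]), 1e-3)   # avoid division by zero
    return FOCAL_PX * BASELINE_M / disparity              # depth in metres

if __name__ == "__main__":
    left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)    # hypothetical input files
    right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)
    depth = depth_from_flow(left, right)
    print("median depth:", float(np.median(depth)), "metres")
```

Farneback flow on the CPU is unlikely to reach real time at VR resolutions; the disparity-to-depth conversion stays the same when the flow is computed on the GPU, for example with CUDA, as listed under the learning outcomes.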

All software should be released on GitHub.

Learning outcome

  • Understand the ecosystems surrounding the two major protocols for video streaming on the Internet, RTP and HTTP, and use both.
  • Understand bandwidth adaptation for streaming video over the Internet.
  • Learn details about video coding, compression, and inter- and extrapolation.
  • Understand how to conduct user studies to assess whether a video coding and compression method works.
  • Learn how to use CUDA for hardware compression, decompression, interpolation and extrapolation.

Conditions

We expect that you:

  • have been admitted to a master's program in MatNat@UiO - primarily PROSA
  • take this as a long thesis
  • will participate actively in the weekly SINLab meetings
  • are present in the lab and collaborate with other students and staff
  • are willing to share your results on GitHub
  • have taken IN3230 or include IN4230 in the study plan
  • include the course IN5060 in the study plan, unless you have already completed a course on classical (non-ML) data analysis
Published 2 Oct. 2023 15:18 - Last modified 2 Oct. 2023 15:32

Supervisor(s)

Scope (credits)

60