VidPanos: Generative Panoramic Videos from Casual Panning Videos


Supplementary Material


Jingwei Ma, Erika Lu, Roni Paiss, Shiran Zada, Aleksander Holynski, Tali Dekel, Brian Curless, Michael Rubinstein, Forrester Cole

 

 


#1 Real Video Results & Comparisons

Below we show our diffusion-based results on real videos. Click here to see the full set of results and comparisons.

[Video examples 0-10]



 


#2 Synthetic Video Results

Below we show results on the four videos that are shown in the paper. Click here to see the full set of results.

[Video examples 0-3]



 


#3 MAGVIT Comparisons

Below we show comparisons with two ways of running MAGVIT [1] (see the supplementary PDF for details).

[Video examples 0-3, each with panels (a) and (b)]

 


#4 Flow-based Method Comparisons

Below we show comparisons with two flow-based methods, ProPainter [2] and E2FGVI [3]. Click here to see the full set of results.

[Video examples 0-2]



 


#5 Ablations

We show two ablations of our Lumiere-based method: (1) naive Lumiere and (2) Lumiere without temporal coarse-to-fine. We also show two Phenaki-based ablations: (1) naive Phenaki and (2) Phenaki with coarse-to-fine.

[Video example 0, and video examples 1-4, each with panels (a) and (b)]

 


#6 Flow Visualizations

We visualize and compare optical flow computed on the results of our method and of the interpolation baseline. Click here to see more examples.

[Video examples 0-3]



 


#7 Comparison with Panoramic Video Textures

Below is a comparison with Agarwala et al. [4].

 


#8 Results before spatial super-resolution stage

Below we show a few "Ours Lumiere" results before and after the spatial super-resolution stage. For the full set of results, click here.

[Video examples 0-1]

 


#9 Baseline results generated with different seeds

Below we show a few different samples for the non-deterministic baselines: MAGVIT type 1, naive Phenaki, and naive Lumiere.

scuba - MAGVIT baseline1 (2 seeds)


ski - MAGVIT baseline1 (2 seeds)


scuba - Naive Phenaki (2 seeds)


ski - Naive Phenaki (2 seeds)


scuba - Naive Lumiere (2 seeds)


ski - Naive Lumiere (2 seeds)




References

[1] Lijun Yu, Yong Cheng, Kihyuk Sohn, José Lezama, Han Zhang, Huiwen Chang, Alexander G. Hauptmann, Ming-Hsuan Yang, Yuan Hao, Irfan Essa, and Lu Jiang. MAGVIT: Masked Generative Video Transformer. CVPR 2023.

[2] Shangchen Zhou, Chongyi Li, Kelvin C.K. Chan, and Chen Change Loy. ProPainter: Improving Propagation and Transformer for Video Inpainting. ICCV 2023.

[3] Zhen Li, Cheng-Ze Lu, Jianhua Qin, Chun-Le Guo, and Ming-Ming Cheng. Towards An End-to-End Framework for Flow-Guided Video Inpainting. CVPR 2022.

[4] Aseem Agarwala, Ke Colin Zheng, Chris Pal, Maneesh Agrawala, Michael Cohen, Brian Curless, David Salesin, and Richard Szeliski. Panoramic Video Textures. SIGGRAPH 2005.