Providing omnidirectional depth along with RGB information is important for numerous applications such as VR/AR. However, since omnidirectional RGB-D data is not always available, synthesizing RGB-D panoramas can be useful. Prior work has therefore attempted to synthesize RGB-D panoramas from limited RGB-D input information; however, even a small error in the generated depth map can cause unnatural artifacts when the result is reconstructed as a 3D model. In this paper, we study the problem of synthesizing an RGB-D panorama that can provide a complete 3D model, under arbitrary configurations of cameras and depth sensors. To this end, we propose a novel bi-modal (RGB-D) panorama synthesis (BIPS) framework. We design a generator that fuses the bi-modal information and train it with residual depth-aided adversarial learning (RDAL). RDAL enables the synthesis of realistic indoor layout structures and interiors by jointly inferring the RGB panorama, layout depth, and residual depth. In addition, we propose a method for generating the layout depth map and the residual depth map required by RDAL. Extensive experiments show that our method synthesizes high-quality indoor RGB-D panoramas and provides more realistic 3D indoor models than prior methods.