<!DOCTYPE html>
<html>
    <head>
        <meta charset="utf-8">
        <title>IPCV Research Group</title>
        <link rel="stylesheet" href="css/normalize.css">
        <link rel="stylesheet" href="css/style.css">
        <link rel="icon" href="images/favicon_2.png">
        <link rel="preconnect" href="https://fonts.googleapis.com">
        <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
        <link href="https://fonts.googleapis.com/css2?family=Poppins&family=Roboto+Condensed&display=swap" rel="stylesheet">
    </head>
    <body>
        <header>
            <nav>
                <ul class="navBar">
                    <li class="navBar"><a href="index.html">Home</a></li>
                    <li class="navBar"><a href="#Abstract">Abstract</a></li>
                    <li class="navBar"><a href="#Architecture">Architecture</a></li>
                    <li class="navBar"><a href="#Results">Results</a></li>
                    <li class="navBar"><a href="#Future_Work">Future Work</a></li>
                    <li class="navBar"><a href="#Publication">Pre-print</a></li>
                </ul>
            </nav>
        </header>

        <h1>Deep Camera Pose Regression Using Pseudo-LiDAR</h1>

        <a id="Abstract">
            <h2>Abstract</h2>

            <div class="container-1400">
                <div class="paragraph">
                    <p>An accurate and robust large-scale localization system is an integral component of active research areas such as autonomous vehicles and augmented reality. To this end, many learning algorithms have been proposed that predict 6DOF camera pose from RGB or RGB-D images. However, previous methods that incorporate depth typically treat the data the same way as RGB images, often adding depth maps as additional channels to RGB images and passing them through convolutional neural networks (CNNs). In this paper, we show that converting depth maps into pseudo-LiDAR signals, previously shown to be useful for 3D object detection, yields a better representation for camera localization tasks, because the projected point clouds can be used to accurately determine 6DOF camera pose. We demonstrate this by first comparing the localization accuracy of a network operating exclusively on pseudo-LiDAR representations with that of networks operating exclusively on depth maps. We then propose FusionLoc, a novel architecture that uses pseudo-LiDAR to regress a 6DOF camera pose. FusionLoc is a dual-stream neural network that aims to remedy common issues with typical 2D CNNs operating on RGB-D images. The results from this architecture are compared against various other state-of-the-art deep pose regression implementations using the 7 Scenes dataset. We find that FusionLoc performs better than a number of other camera localization methods; notably, it is on average 0.33 m and 4.35&#176; more accurate than RGB-D PoseNet. Establishing the validity of pseudo-LiDAR signals over depth maps for localization raises new considerations for implementing large-scale localization systems.</p>
                </div>
            </div>
        </a>

        <a id="Architecture">
            <h2>Architecture</h2>

            <div class="images">
                <img src="./images/rgbpointloc.drawio(1)(1).png" alt="Architecture diagram" style="width: 700px;">
            </div>
        </a>

        <br>

        <a id="Results">
            <h2>Results</h2>
            <h4>Depth-only PoseNet results</h4>
            <div class="images">
                <figure class="left">
                    <img src="./images/chess_results_DPoseNet-trimmy(1).png" alt="" width="200" height="200">
                    <figcaption>Chess</figcaption>
                </figure>
                <figure class="left">
                    <img src="./images/fire_results_DPoseNet-trimmy(1).png" alt="" width="200" height="200">
                    <figcaption>Fire</figcaption>
                </figure>
                <figure class="left">
                    <img src="./images/heads_results_DPoseNet-trimmy(1).png" alt="" width="200" height="200">
                    <figcaption>Heads</figcaption>
                </figure>
                <figure class="left">
                    <img src="./images/office_results_DPoseNet-trimmy(1).png" alt="" width="200" height="200">
                    <figcaption>Office</figcaption>
                </figure>
                <figure class="left">
                    <img src="./images/pumpkin_results_DPoseNet-trimmy(1).png" alt="" width="200" height="200">
                    <figcaption>Pumpkin</figcaption>
                </figure>
                <figure class="left">
                    <img src="./images/redkitchen_results_DPoseNet-trimmy(1).png" alt="" width="200" height="200">
                    <figcaption>Red Kitchen</figcaption>
                </figure>
                <figure class="left">
                    <img src="./images/stairs_results_DPoseNet-trimmy(1).png" alt="" width="200" height="200">
                    <figcaption>Stairs</figcaption>
                </figure>
            </div>
            <h4>PointNet-Pose results</h4>
            <div class="images">
                <figure class="left">
                    <img src="./images/chess_results_RGBPointLoc(1).png" alt="" width="200" height="200">
                    <figcaption>Chess</figcaption>
                </figure>
                <figure class="left">
                    <img src="./images/fire_results_RGBPointLoc(1).png" alt="" width="200" height="200">
                    <figcaption>Fire</figcaption>
                </figure>
                <figure class="left">
                    <img src="./images/heads_results_RGBPointLoc(1).png" alt="" width="200" height="200">
                    <figcaption>Heads</figcaption>
                </figure>
                <figure class="left">
                    <img src="./images/office_results_RGBPointLoc(1).png" alt="" width="200" height="200">
                    <figcaption>Office</figcaption>
                </figure>
                <figure class="left">
                    <img src="./images/pumpkin_results_RGBPointLoc(1).png" alt="" width="200" height="200">
                    <figcaption>Pumpkin</figcaption>
                </figure>
                <figure class="left">
                    <img src="./images/redkitchen_results_RGBPointLoc(1).png" alt="" width="200" height="200">
                    <figcaption>Red Kitchen</figcaption>
                </figure>
                <figure class="left">
                    <img src="./images/stairs_results_RGBPointLoc(1).png" alt="" width="200" height="200">
                    <figcaption>Stairs</figcaption>
                </figure>
            </div>
            <h4>FusionLoc results</h4>
            <div class="images">
                <figure class="left">
                    <img src="./images/chess_FusionLoc-trimmy(1).png" alt="" width="200" height="200">
                    <figcaption>Chess</figcaption>
                </figure>
                <figure class="left">
                    <img src="./images/fire_FusionLoc-trimmy(1).png" alt="" width="200" height="200">
                    <figcaption>Fire</figcaption>
                </figure>
                <figure class="left">
                    <img src="./images/heads_FusionLoc-trimmy(1).png" alt="" width="200" height="200">
                    <figcaption>Heads</figcaption>
                </figure>
                <figure class="left">
                    <img src="./images/office_FusionLoc-trimmy(1).png" alt="" width="200" height="200">
                    <figcaption>Office</figcaption>
                </figure>
                <figure class="left">
                    <img src="./images/pumpkin_FusionLoc-trimmy(1).png" alt="" width="200" height="200">
                    <figcaption>Pumpkin</figcaption>
                </figure>
                <figure class="left">
                    <img src="./images/redkitchen_FusionLoc-trimmy(1).png" alt="" width="200" height="200">
                    <figcaption>Red Kitchen</figcaption>
                </figure>
                <figure class="left">
                    <img src="./images/stairs_FusionLoc-trimmy(1).png" alt="" width="200" height="200">
                    <figcaption>Stairs</figcaption>
                </figure>
            </div>

        </a>

        <a id="Future_Work">
            <div class="container-1400">
                <h2>Future Work</h2>
                <div class="paragraph">
                    <p>While FusionLoc is competitive with other camera pose regression methods, we believe there is still room for improvement and further testing. We can incorporate multi-scale grouping in the set abstraction (SA) layers for more robust features. It may also be beneficial to add temporal constraints to learn features that are consistent across point sets and images, similar to MapNet. Finally, we believe FusionLoc can produce even better results with improved depth maps. 7 Scenes was collected using the Kinect v1; since then, depth cameras and depth estimation techniques, including monocular depth estimation, have improved significantly. We would like to further test FusionLoc on depth maps generated with a variety of techniques and tools.</p>
                </div>
            </div>
        </a>

        <a id="Publication">
            <div class="container-1400">
                <h2>Pre-print</h2>
                <div class="publication">
                    <p>Ali Raza, Lazar Lolic, Shahmir Akhter, Alfonso Dela Cruz, Michael Liut.</p>
                    <p><a href="https://arxiv.org/abs/2203.00080">Deep Camera Pose Regression Using Pseudo-LiDAR</a></p>
                </div>
            </div>
        </a>

        <br><br><br><br><br>
    </body>
</html>