Waterway-BEV: Generate Bird's Eye View Layouts of a Waterway From a First-Person View Camera Using Cross-View Transformers

Ma, F, Jiang, X, Chen, C, Sun, J, Yan, XP and Wang, J (2025) Waterway-BEV: Generate Bird's Eye View Layouts of a Waterway From a First-Person View Camera Using Cross-View Transformers. IEEE Transactions on Intelligent Transportation Systems, 26 (6). pp. 8078-8096. ISSN 1524-9050

Accepted version.pdf - Accepted Version (1MB)

Abstract

In autonomous ship navigation, constructing bird's-eye view (BEV) layouts of waterways is an important task. A human helmsman can form such a layout from sight alone. To simulate this ability, a novel neural network-based algorithm named Waterway-BEV is proposed, which reconstructs a local map, consisting of the waterway layout and ship occupancies in the BEV, from a single first-person view (FPV) monocular image. Waterway-BEV employs an efficient SEResNeXt encoder to extract features from FPV monocular images, capturing deep semantic information related to waterways and ships. Because the available information differs between perspectives, Waterway-BEV incorporates a Cross-View Transformation Module, which takes the constraint of cycle consistency between views into account and exploits their correlation to strengthen the view transformation and scene understanding. To fully leverage the feature output of the SEResNeXt encoder, Waterway-BEV employs a decoder based on a dedicated lightweight network, which decodes the enhanced BEV feature maps and generates the BEV layout. By using Focal Loss as the loss function for model optimization, Waterway-BEV accounts for the quantity and classification difficulty of ship samples during training, thereby improving generation performance and convergence speed. In experiments on waterway BEV layout generation, Waterway-BEV reached an mIoU of 97.8% and an mAP of 98.2%, outperforming other state-of-the-art (SOTA) algorithms. In particular, in specialized scenarios such as waterway crossroads and tasks involving small target ships, Waterway-BEV consistently generated satisfactory BEV layouts, demonstrating robustness and applicability.
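
For readers unfamiliar with the Focal Loss and mIoU mentioned in the abstract, the following is a minimal NumPy sketch of both quantities for a per-pixel occupancy map. It is an illustration only, not the authors' implementation: the function names `binary_focal_loss` and `mean_iou` are hypothetical, the defaults `gamma=2.0` and `alpha=0.25` are the commonly cited values from the original Focal Loss formulation rather than settings reported for Waterway-BEV, and the mIoU here is assumed to be a plain average of per-class IoU, which may differ from the paper's exact evaluation protocol.

```python
import numpy as np

def binary_focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over a predicted occupancy map.

    probs   : predicted probability of the positive class (e.g. "ship") per pixel
    targets : ground-truth labels, 1 for positive pixels and 0 otherwise
    gamma, alpha : assumed defaults, not values taken from the paper
    """
    probs = np.clip(np.asarray(probs, dtype=np.float64), eps, 1.0 - eps)
    targets = np.asarray(targets, dtype=np.float64)
    # p_t is the probability the model assigns to the true class of each pixel.
    p_t = np.where(targets == 1, probs, 1.0 - probs)
    # alpha_t re-weights positives vs. negatives to counter class imbalance.
    alpha_t = np.where(targets == 1, alpha, 1.0 - alpha)
    # (1 - p_t)^gamma shrinks the loss of pixels that are already classified well.
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)))

def mean_iou(pred, target, num_classes):
    """Average intersection-over-union across classes present in pred or target."""
    pred = np.asarray(pred)
    target = np.asarray(target)
    ious = []
    for c in range(num_classes):
        p, t = pred == c, target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class absent from both maps; skip it rather than count it
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))
```

With `gamma=0` and `alpha=0.5` the focal loss reduces to half the ordinary binary cross-entropy, which is a convenient sanity check; increasing `gamma` shifts the training signal toward hard, misclassified pixels such as small or distant ships.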

Item Type: Article
Additional Information: © 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Uncontrolled Keywords: 46 Information and Computing Sciences; 4611 Machine Learning; Eye Disease and Disorders of Vision; 0801 Artificial Intelligence and Image Processing; 0905 Civil Engineering; 1507 Transportation and Freight Services; Logistics & Transportation; 3509 Transportation, logistics and supply chains; 4602 Artificial intelligence; 4603 Computer vision and multimedia computation
Subjects: T Technology > TA Engineering (General). Civil engineering (General)
V Naval Science > VM Naval architecture. Shipbuilding. Marine engineering
Divisions: Engineering
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date of acceptance: 22 March 2025
Date of first compliant Open Access: 13 June 2025
Date Deposited: 13 Jun 2025 13:26
Last Modified: 13 Jun 2025 13:30
DOI or ID number: 10.1109/TITS.2025.3554717
URI: https://researchonline.ljmu.ac.uk/id/eprint/26586