Pipeline Architecture
The preprocessing pipeline consists of 11 scripts that must be executed in filename order. All configuration is centralized in config.yaml.
ARIS Extraction
Extract individual frames and metadata from proprietary .aris files
- prep_1_aris_extract.py: Extracts frames as .pgm and metadata as .csv
ARIS Transformation
Convert raw sonar data to human-readable polar images
- prep_2_aris_to_polar.py: Creates polar-transformed .png images
- prep_3_aris_calc_optical_flow.py: Calculates optical flow for motion detection
- prep_4_aris_find_offsets.py: GUI for marking motion onset/end
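To illustrate what a polar transformation of sonar data involves, the sketch below maps a raw frame, stored as rows of range bins by columns of beams, onto a fan-shaped image using nearest-neighbor lookup. The function name, the beam ordering, and all geometry parameters (field of view, range window, output size) are illustrative assumptions; they are not taken from prep_2_aris_to_polar.py or config.yaml.

```python
import math


def polar_to_fan(frame, fov_deg, r_min, r_max, out_h, out_w, fill=0):
    """Map a (range bin x beam) frame onto a fan-shaped image.

    Assumptions (hypothetical, not from the pipeline): frame[i][j] is the
    intensity at range bin i and beam j, beams sweep left to right across
    fov_deg, and range bins are evenly spaced between r_min and r_max.
    """
    n_ranges = len(frame)
    n_beams = len(frame[0])
    half_fov = math.radians(fov_deg) / 2.0
    # Half of the horizontal extent covered by the fan at maximum range.
    half_width = r_max * math.sin(half_fov)

    image = [[fill] * out_w for _ in range(out_h)]
    for row in range(out_h):
        # Row 0 is the farthest range; the sonar sits at the bottom center.
        y = r_max * (1.0 - (row + 0.5) / out_h)
        for col in range(out_w):
            x = ((col + 0.5) / out_w * 2.0 - 1.0) * half_width
            r = math.hypot(x, y)
            theta = math.atan2(x, y)  # 0 means straight ahead
            # Pixels outside the fan keep the fill value.
            if not (r_min <= r < r_max) or abs(theta) > half_fov:
                continue
            i = int((r - r_min) / (r_max - r_min) * n_ranges)
            j = int((theta + half_fov) / (2.0 * half_fov) * n_beams)
            image[row][col] = frame[min(i, n_ranges - 1)][min(j, n_beams - 1)]
    return image
```

Nearest-neighbor lookup keeps the example short; an interpolating lookup would give smoother images at the cost of more code.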
Gantry Processing
Extract trajectories from ROS bags
- prep_5_gantry_extract.py: Extracts trajectories as .csv from ROS bags
- prep_6_gantry_find_offsets.py: Automatically detects motion onsets
GoPro Processing
Cut, downsample, and analyze video footage
- prep_7_gopro_cut.bash: Cuts clips from raw footage using timestamps
- prep_8_gopro_downsample.bash: Re-encodes to multiple resolutions
- prep_9_gopro_calc_optical_flow.py: Calculates optical flow for matching
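The cut and downsample steps can be pictured as two command invocations per clip. The source does not state which tool the bash scripts call, so the sketch below assumes ffmpeg and only builds the argument lists; the function names, file names, and target height are hypothetical.

```python
def cut_command(src, dst, start_s, end_s):
    """Build an ffmpeg argument list that copies [start_s, end_s) out of src.

    Assumption: ffmpeg is the underlying tool. With stream copy ("-c copy")
    the cut is fast but snaps to keyframes rather than exact timestamps.
    """
    if end_s <= start_s:
        raise ValueError("end_s must be greater than start_s")
    return [
        "ffmpeg",
        "-ss", f"{start_s:.3f}",          # seek to the clip start
        "-i", src,
        "-t", f"{end_s - start_s:.3f}",   # clip duration in seconds
        "-c", "copy",
        dst,
    ]


def downsample_command(src, dst, height):
    """Build an ffmpeg argument list that re-encodes src to the given height.

    "scale=-2:H" keeps the aspect ratio and rounds the width to an even value.
    """
    return ["ffmpeg", "-i", src, "-vf", f"scale=-2:{height}", "-c:a", "copy", dst]
```

Such a list could be passed to `subprocess.run(cmd, check=True)`. Running the cut before the downsample means only the short clip is re-encoded.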
Data Matching
Synchronize all sensor modalities
- prep_x_match_recordings.py: GUI for pairing and time offset adjustment
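The source describes pairing and offset adjustment as a GUI task. As a rough illustration of how a starting offset could be estimated before manual adjustment, the sketch below searches for the integer lag that maximizes the correlation between two motion-magnitude series sampled at the same rate. The function name and the brute-force search are assumptions for illustration, not the method used by prep_x_match_recordings.py.

```python
def best_lag(reference, other, max_lag):
    """Return the lag (in samples) at which other[t + lag] best matches reference[t].

    Both inputs are sequences of motion magnitudes at the same sampling rate.
    A positive result means the motion appears later in `other`.
    """
    def centered(values):
        mean = sum(values) / len(values)
        return [v - mean for v in values]

    ref = centered(reference)
    oth = centered(other)

    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Sum of products over the overlapping part of the two series.
        score = 0.0
        for t, r in enumerate(ref):
            k = t + lag
            if 0 <= k < len(oth):
                score += r * oth[k]
        if score > best_score:
            best, best_score = lag, score
    return best
```

Dividing the returned lag by the common sampling rate gives the offset in seconds. Series recorded at different rates would need resampling first.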
Configuration: config.yaml
All scripts read their settings from a centralized configuration file. This ensures consistency across the pipeline.
Key Design Decisions
ARIS as Ground Truth
All preprocessing decisions prioritize the ARIS data. The sonar provides the most reliable timestamps and serves as the reference for synchronization.
GoPro Synchronization Challenge
The GoPro Hero 8 does not provide synchronized timestamps. The team used optical flow matching to align video footage with ARIS motion patterns. Timestamps were manually extracted from audio tracks where motor engagement was clearly identifiable.
Motion Onset Detection
Optical flow calculation helps identify when actual motion begins in each recording. This is critical for:
- Trimming recordings to relevant portions
- Synchronizing sensor modalities
- Matching GoPro clips to ARIS recordings
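One simple way to turn a per-frame optical flow magnitude into motion boundaries is to threshold it and require a minimum run length, so that isolated spikes are ignored. The sketch below is a minimal example under that assumption; the function name, threshold, and run length are hypothetical and do not describe how the pipeline scripts detect onsets.

```python
def find_motion_bounds(magnitudes, threshold, min_run=3):
    """Return (onset, end) frame indices of sustained motion, or None.

    A frame counts as moving when its magnitude exceeds `threshold`.
    Only runs of at least `min_run` consecutive moving frames are kept.
    Onset is the start of the first such run, end is the last frame of
    the last such run.
    """
    onset = end = None
    run_start = None
    # The trailing sentinel closes a run that reaches the final frame.
    for i, m in enumerate(list(magnitudes) + [float("-inf")]):
        if m > threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start >= min_run:
                if onset is None:
                    onset = run_start
                end = i - 1
            run_start = None
    return None if onset is None else (onset, end)


flow = [0.1, 0.2, 0.1, 2.0, 0.1, 1.5, 1.8, 2.2, 1.9, 0.2, 0.1]
bounds = find_motion_bounds(flow, threshold=1.0, min_run=3)
```

Here the single spike at index 3 is discarded and the sustained run from index 5 to 8 is reported.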
Execution Order
Scripts must be run in this specific order:
- prep_1_aris_extract.py - Extract ARIS frames
- prep_2_aris_to_polar.py - Transform to polar coordinates
- prep_3_aris_calc_optical_flow.py - Calculate ARIS optical flow
- prep_4_aris_find_offsets.py - Mark ARIS motion boundaries (GUI)
- prep_5_gantry_extract.py - Extract gantry trajectories
- prep_6_gantry_find_offsets.py - Detect gantry motion onsets
- prep_7_gopro_cut.bash - Cut GoPro clips
- prep_8_gopro_downsample.bash - Downsample to target resolutions
- prep_9_gopro_calc_optical_flow.py - Calculate GoPro optical flow
- prep_x_match_recordings.py - Match and synchronize all data (GUI)
- release_1_export.py - Export final dataset
- release_2_archive.bash - Create distribution archives
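Because the step markers are embedded in the filenames, a plain lexicographic sort reproduces the order listed above: the digits 1 to 9 sort before "x", and "prep" sorts before "release". The sketch below builds a run plan from that observation. The interpreter names are assumptions, and the two GUI steps still require manual interaction, so this is not an unattended runner.

```python
from pathlib import Path

# Assumed interpreter for each script type (hypothetical mapping).
INTERPRETERS = {".py": "python", ".bash": "bash"}


def plan(script_names):
    """Return [interpreter, script] pairs in pipeline execution order."""
    steps = []
    for name in sorted(script_names):
        suffix = Path(name).suffix
        if suffix not in INTERPRETERS:
            raise ValueError(f"unsupported script type: {name}")
        steps.append([INTERPRETERS[suffix], name])
    return steps
```

Each pair could then be passed to `subprocess.run(step, check=True)` so that a failing step stops the pipeline before later steps consume incomplete output.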
Performance Considerations
- ARIS polar transformation: Can be slow for large datasets. Use aris_to_polar_skip_existing: True to resume interrupted runs.
- GoPro downsampling: Processing 5.3K video at 60fps is extremely time-intensive. Downsampling cut clips is more efficient than downsampling full footage.
- Optical flow: Both lk (Lucas-Kanade) and farnerback methods are supported. LK is generally faster for feature tracking.
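A skip-existing option such as aris_to_polar_skip_existing typically works by checking for each output file before doing the expensive work. The sketch below shows that pattern; the function name, the directory layout, and the .pgm to .png naming are assumptions for illustration rather than the pipeline's actual behavior.

```python
from pathlib import Path


def pending_frames(frame_paths, out_dir, skip_existing=True):
    """Return (input, output) path pairs that still need processing.

    Assumption: each input frame maps to an output file with the same
    stem and a .png extension inside out_dir.
    """
    out_dir = Path(out_dir)
    todo = []
    for frame in map(Path, frame_paths):
        target = out_dir / (frame.stem + ".png")
        # An output that already exists is treated as finished work.
        if skip_existing and target.exists():
            continue
        todo.append((frame, target))
    return todo
```

One caveat of this pattern is that a file left half-written by an interrupted run also counts as existing. Writing to a temporary name and renaming on completion avoids that.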
Output Structure
After preprocessing, data is organized as:
Next Steps
ARIS Extraction
Learn about sonar data extraction and polar transformation
GoPro Processing
Understand video clip extraction and downsampling
Gantry Extraction
Extract trajectories from ROS bags
Export
Assemble and package the final dataset