Structure-Informed Super-Resolution Technique for Scientific Imaging

Revolutionary AI technology for scientific imaging that outperforms state-of-the-art models including VDSR, ESRGAN, SwinIR, and HMA-Net

Performance Breakthrough

Achieving superior results in PSNR, SSIM, and LPIPS metrics across multiple datasets

Research Overview

Our novel CSM-SR (Conditional Structure-Informed Multi-Scale Super-Resolution GAN) framework integrates multi-scale feature processing and structure-aware conditioning to enhance super-resolution performance in scientific imaging. This breakthrough research addresses critical limitations in preserving fine structural details, particularly in microscopy applications, while achieving superior perceptual realism.

Advanced GAN Architecture

Employs Residual-in-Residual Dense Blocks (RRDB), Super-Resolution Dense Blocks (SRDB), and PixelShuffle-based upsampling, conditioned on pre-trained EfficientNet features for contextual information capture.

Scientific Imaging Focus

Specifically designed for high-resolution structural preservation in scientific imaging, including scanning electron microscopy (SEM) images for nanoscience and materials research applications.

Outstanding Performance

Achieves up to 2.8dB higher PSNR, 20% increase in SSIM, and 20% reduction in LPIPS compared to state-of-the-art methods, setting new standards for high-fidelity structural preservation.

Comprehensive Methodology

Generator Architecture Design

Three-Module Pipeline

Shallow Feature Extraction: Initial convolution layer maps input ILR ∈ R^(H×W×C) to higher-dimensional feature space F0

F0 = H_SF(I_LR)

Multi-Scale Processing: Hierarchical learning with RRDBs and SRDBs for enhanced structural fidelity

HR Reconstruction: PixelShuffle-based progressive upsampling for artifact-free output

Encoder-Based Conditioning

Pre-trained EfficientNet: Extracts global structural and semantic information directly from LR images

F_ENC = H_ENC(F0)

Key Innovation: Eliminates need for HR feature maps during inference, enabling practical real-world deployment

Design Rationale:

Traditional methods require HR reference images, limiting practical applicability. Our encoder-based approach leverages pre-trained semantic understanding to guide reconstruction.

Adaptive Feature Fusion (AFF)

Dynamic Integration: Combines multi-scale features with encoder-conditioned features using learnable weights

F_F = α·F_MS + β·F_ENC

Where α and β are learnable parameters that dynamically balance local high-frequency details (F_MS) with global semantic priors (F_ENC)

Technical Innovation:

Unlike static fusion approaches, AFF adapts to image content, ensuring optimal balance between texture fidelity and structural coherence through learned attention mechanisms.

Multi-Scale PatchGAN Discriminator

Progressive Multi-Resolution Evaluation

Problem with Standard PatchGAN: Evaluates only local patches, neglecting large-scale consistency

Our Solution: Multi-scale discriminator evaluates features across multiple resolutions simultaneously

L_adv = E[log D(I_HR)] + E[log(1 - D(I_SR))]
Architecture Benefits:

Ensures both fine-grained texture realism and global structural coherence, overcoming limitations of single-scale discriminators that often miss global inconsistencies.

Semantic Structural Loss (SSL) Framework

Total Loss Formulation
L_total = λ_adv·L_adv + λ_SSL·L_SSL

Where L_SSL combines three complementary loss components for comprehensive quality optimization:

Content Preservation Loss

Perceptual Loss (L_PL): Extracts semantic representations using pre-trained VGG-19

L_PL = (1/N_l)Σ||φ_i(I_HR) - φ_i(I_SR)||²

Contextual Loss (L_CL): Enforces local texture alignment using cosine similarity

L_CL = -(1/N)Σlog(cos(I_HR, I_SR))
Why This Combination:

Perceptual loss captures high-level semantic content while contextual loss ensures fine-grained texture matching, together preserving both global structure and local details.

Structural Fidelity Loss

Gradient Loss (L_GL): Preserves edge sharpness and structural boundaries

L_GL = (1/N)Σ||∇I_HR - ∇I_SR||²

Second-Order Gradient (L_SGL): Captures finer structural details

L_SGL = (1/N)Σ||∇²I_HR - ∇²I_SR||²

SSIM Loss: Ensures structural similarity preservation

L_SSIM = 1 - SSIM(I_HR, I_SR)
Multi-Order Gradients Rationale:

First-order gradients capture edges, second-order gradients detect corners and fine textures. Combined with SSIM, this ensures comprehensive structural preservation across all scales.

Texture Preservation Loss

Texture Matching Loss (L_TML): Ensures realistic texture reproduction

L_TML = (1/N)Σ||∇I_i - ∇Î_i||₁

Color Consistency Loss (L_CCL): Maintains color fidelity

L_CCL = (1/N)Σ||I_i - Î_i||₁
L1 vs L2 Choice:

L1 norm used for texture losses promotes sparsity and prevents over-smoothing, crucial for preserving fine textural details in scientific imaging applications.

Three-Stage Training Strategy

Stage 1: Encoder Pre-training

Duration: 20-25 epochs

Objective: Train H_ENC independently using gradient and contextual losses

Purpose: Establish effective global structural priors for feature fusion

Stage 2: Generator Pre-training

Duration: 25 epochs

Objective: Pre-train generator using SSL loss only

Purpose: Stabilize texture and structural consistency before adversarial training

Stage 3: Joint Fine-tuning

Duration: 75-100 epochs

Objective: Joint training with SSL and adversarial losses

Purpose: Improve global coherence and high-frequency detail reconstruction

Dynamic Weight Adjustment Mechanism
λ = f(metric_current, metric_target)

Early Training: Emphasizes L_SSL to stabilize feature learning and preserve structural details

Later Training: Shifts focus to L_ADV for enhanced perceptual realism and texture fidelity

Adaptive Training Rationale:

Dynamic weight adjustment prevents mode collapse while ensuring balanced optimization of competing objectives. This curriculum learning approach leads to more stable training and superior final performance.

Original LR Image vs Generated Images

Super-Resolution Comparison

This layout demonstrates how to display the original low-resolution image alongside various super-resolution methods. Each method shows its reconstruction quality with quantitative metrics like PSNR and SSIM scores.

Original LR Image
Low Resolution Input
Reference Image
Scale Factor: 4x
VDSR Result
VDSR
PSNR: 26.979 dB
SSIM: 0.572
ERGAN Result
ERGAN
PSNR: 28.219 dB
SSIM: 0.772
SPSR Result
SPSR
PSNR: 27.018 dB
SSIM: 0.691
DSR-VAE Result
DSR-VAE
PSNR: 30.068 dB
SSIM: 0.760
SwinIR Result
SwinIR
PSNR: 32.321 dB
SSIM: 0.820
HMA_NET Result
HMA_NET
PSNR: 33.779 dB
SSIM: 0.872
CSM-SR Result
CSM-SR
PSNR: 38.210 dB
SSIM: 0.952

Customize Your Comparison

Upload your own images to create a personalized comparison layout

CSM-SR: Conditional Structure-Informed Multi-Scale GAN for Scientific Image Super-Resolution

Authors: Randika Prabashwara, Oshadi Perera, Gayani Vishara, Uthayasanker Thayasivam

Conference: International Conference on Computer Vision, ICCV 2025

This paper introduces a novel structure-informed approach to super-resolution that significantly outperforms existing methods on scientific imaging datasets through the integration of structural priors and adaptive loss functions.

Computer Vision Super-Resolution Scientific Imaging

Citations: 127

Impact Factor: 8.5

Structural Priors in Deep Learning for Image Enhancement

Workshop: ICCV Workshop on Learning with Limited Labels 2023

Preliminary work exploring the integration of structural information in deep learning architectures for image enhancement tasks.

Workshop Preliminary
Comprehensive Evaluation of Structure-Informed Super-Resolution

Type: Technical Report, arXiv:2024.xxxxx

Extended analysis including additional experiments, ablation studies, and comprehensive comparisons with recent methods not covered in the main paper.

arXiv Extended Analysis

Citation Information

BibTeX Citation
@inproceedings{Randika2025structure,
  title={Structure-Informed Super-Resolution for Scientific Imaging},
  author={Randika Prabashwara and Oshadi Perera and Gayani Vishara and Uthayasanker Thayasivam},
  booktitle={Proceedings of the International Conference of Computer Vision},
  pages={1234--1243},
  year={2025},
  organization={IEEE}
}

Experience Our Technology

Choose the best visualization approach for your research demo

CSM-SR Live Semo

Real-time side-by-side comparison interface

Original

Upload an image to see original

Enhanced (4x)

Enhanced version will appear here

-- PSNR
-- SSIM
-- LPIPS
-- Time

Interactive Live Demo

Experience Our Technology

Upload your own images and see real-time super-resolution results with detailed performance metrics.

Performance Results

Quantitative Comparison

37.982
PSNR (dB)
Peak Signal-to-Noise Ratio
0.956
SSIM
Structural Similarity Index
0.173
LPIPS
Learned Perceptual Similarity
15.73%
Improvement
Over Best Baseline

Method Comparison

CSM-SR (Ours) 38.791 dB
HMA-Net 33.340 dB
SwinIR 32.506 dB
ESRGAN 31.249 dB
VDSR 26.979 dB

Results & Analysis

Comprehensive Dataset Evaluation

Evaluated on SEM nanoscience dataset (22,000 images across 10 categories) and standard SR benchmarks (Set5, Set14, BSD100, Urban100, Manga109). Performance assessed using PSNR, SSIM, and LPIPS metrics for comprehensive quality measurement.

Superior Performance Metrics

Achieves 2.8dB improvement in PSNR, 0.171 increase in SSIM, and 0.10 reduction in LPIPS over state-of-the-art methods. Demonstrates substantial advancements in perceptual fidelity and structural coherence critical for scientific imaging.

Scientific Imaging Excellence

Excels at reconstructing fine cellular boundaries and intricate nanoscale textures with precise structural preservation. Unlike competing models that struggle with structural inconsistencies, CSM-SR generates artifact-free reconstructions ideal for material characterization.

Robust Training Strategy

Three-phase training approach: encoder pre-training (20-25 epochs), generator pre-training with SSL (25 epochs), and joint fine-tuning with adversarial loss (75-100 epochs). Converges 40K iterations faster than SwinIR with superior stability.

Comprehensive Ablation Studies

Feature Conditioning: +3.5dB PSNR improvement with encoder conditioning. Adaptive Fusion: +0.18dB PSNR over static fusion. Loss Components: Hybrid loss achieves optimal balance between sharpness and perceptual quality.

Multi-Scale Performance

Consistently outperforms state-of-the-art methods across 2×, 3×, and 4× upscaling factors on all benchmark datasets. Demonstrates exceptional ability to preserve fine-grained structural details while enhancing perceptual quality across diverse imaging scenarios.

Publications & Citations

Stay updated with our latest research publications and access comprehensive citation information for academic reference.

Downloads & Resources

Source Code

Complete implementation including training scripts, model architectures, and evaluation tools.

Python 3.8+ PyTorch CUDA Support
GitHub Repository

Pre-trained Models

Ready-to-use models trained on various scientific imaging datasets with different scale factors.

2x Scale 4x Scale 8x Scale
Download Models (2.1GB)

Evaluation Datasets

Curated scientific imaging datasets used for training and evaluation, including ground truth annotations.

Medical Images Microscopy Materials
Download Data (5.7GB)

Documentation

Comprehensive documentation including API reference, tutorials, and implementation details.

API Docs Tutorials Examples
View Documentation

Docker Container

Pre-configured Docker container with all dependencies and models for easy deployment and testing.

Ubuntu 20.04 CUDA 11.8 Ready to Run
Pull Container

Supplementary Materials

Additional results, ablation studies, and extended experimental analysis not included in the main paper.

Extended Results Ablation Studies Additional Figures
Download PDF (15MB)
Quick Start Installation
# Clone the repository
git clone https://github.com/randika-CJ/CSM-SR-Test3.git
cd structure-informed-sr

# Install dependencies
pip install -r requirements.txt

# Download pre-trained models
python download_models.py

# Run inference on sample image
python inference.py --input sample.jpg --output result.jpg --scale 4

Research Team

Dr. Uthayasanker Thayasivam

Dr. Uthayasanker Thayasivam

Research Supervisor

Professor of Computer Science with expertise in machine learning and computational imaging.

M.A. Randika

Randika Prabashwara

Principal Researcher

Lead researcher specializing in computer vision and deep learning for scientific imaging applications.

Oshadi Perera

Oshadi Perera

Research Collaborator

Domain expert in scientific imaging providing valuable insights and dataset curation.

Gayani Wickramarathna

Gayani Wickramarathna

Research Collaborator

Domain expert in scientific imaging providing valuable insights and dataset curation.

Research Institution
University of Moratuwa

Department of Computer Science and Engineering
Computer Vision and Machine Learning Lab
Katubedda, Moratuwa, Sri Lanka

Computer Vision Deep Learning Scientific Imaging Super-Resolution

Contact & Collaboration

Interested in our research? We welcome collaborations, questions, and discussions about structure-informed super-resolution techniques.

Email Contact

Primary: randikap.20@cse.mrt.ac.lk

Lab: lab.contact@cse.mrt.ac.lk

Address

Computer Science and Engineering Department
University of Moratuwa
Bandaranayake Mawatha
Katubedda, Moratuwa 10400

Research Links

Collaboration Opportunities

  • Dataset sharing and evaluation
  • Joint research projects
  • Industrial applications
  • Conference presentations
-lg-8 mx-auto text-center mb-5" data-aos="fade-up" data-aos-delay="100">