SEM Analysis Pipeline

[PyTorch] SEM image dataset of ancient bronze molds · SAM zero-shot segmentation · VGG classification · Applied to archaeological research

A full-stack computer vision system for analyzing the microstructure of ancient Chinese bronze ware molds (土范) from scanning electron microscope (SEM) images. Built for Prof. Jing Zhichun (荆志淳) at SUSTech’s Institute for Advanced Study in Social Science, the pipeline applies SAM-based zero-shot particle segmentation and VGG-B classification to identify three material components — gravel (沙砾), clay (黏土), and cavities (空腔) — and produces quantitative composition reports. Results were directly applied to ongoing archaeological research on Shang Dynasty bronze casting techniques.

Highlights

  • Applied SAM (ViT-H, Meta’s largest checkpoint) for zero-shot particle segmentation — no domain-specific retraining; IoU/stability thresholds tuned for SEM material contrast
  • Implemented VGG-B for three-class mask classification: gravel (沙砾) / clay (黏土) / cavity (空腔) — 92% test accuracy; linear layer trimmed for inference efficiency
  • Designed full pipeline: sliding-window crop (224×224) → SAM segmentation → mask extraction → VGG classification → color overlay reconstruction → confidence + area statistics
  • Built a custom dataset from real SEM specimens provided by Prof. Jing Zhichun, covering 500×–2000× magnification across multiple archaeological sites and dynasties
  • Delivered quantitative composition output per image: per-class area pie chart and confidence distribution, directly usable as archaeological evidence
  • Full-stack web app (Vue.js + Flask) with live drag-and-drop SEM upload, dynamic colored mask overlay, and auto-generated analysis dashboard

Pipeline Overview

The pipeline processes any SEM image through four stages:

  1. Sliding-window crop — input image split into 224×224 tiles (no overlap, no gaps) for memory-efficient SAM processing
  2. SAM segmentation — ViT-H SAM generates all particle masks per tile; parameters tuned (pred_iou_thresh=0.94, stability_score_thresh, min_mask_region_area=1024) for SEM contrast
  3. VGG-B classification — each SAM mask region is extracted from the original image and classified as gravel, clay, or cavity; masks colored by category
  4. Result reconstruction — tiles stitched back; colored masks alpha-composited over original; confidence and area statistics computed
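
Stage 1 amounts to computing non-overlapping 224×224 crop boxes. A minimal sketch of that tiling step (`tile_boxes` is a hypothetical helper; here edge tiles are clamped to the image border, whereas the original pipeline may pad instead):

```python
TILE = 224  # matches the VGG input size used downstream


def tile_boxes(width: int, height: int, tile: int = TILE):
    """Return (left, upper, right, lower) crop boxes covering the image
    with no overlap and no gaps. When the image size is not a multiple
    of `tile`, the last row/column of boxes is clamped to the border."""
    boxes = []
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            boxes.append((left, top,
                          min(left + tile, width),
                          min(top + tile, height)))
    return boxes
```

Each box can be fed to PIL's `Image.crop` (or NumPy slicing) to produce the tiles that SAM processes independently, which bounds peak memory regardless of the input image size.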

Segmentation Results

SEM images of bronze mold specimens after processing through the SAM → VGG pipeline. Colors indicate the classified material: red = gravel (沙砾), green = clay minerals (黏土), blue = cavities (空腔).

Raw SEM input — bronze mold specimen at 1000×
Pipeline output — SAM-segmented and VGG-classified: red=gravel, green=clay, blue=cavity
Four processed specimens — variation in gravel/clay/cavity ratios reflects differences in mold material composition across specimens
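
The masks behind these overlays come from SAM's automatic mask generator. A configuration sketch with the thresholds quoted in the pipeline above, assuming Meta's `segment_anything` package and a locally downloaded ViT-H checkpoint (the path is illustrative):

```python
# Config fragment -- requires the `segment_anything` package and the
# ViT-H checkpoint file; not runnable without both.
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator


def build_mask_generator(checkpoint: str = "sam_vit_h_4b8939.pth",
                         device: str = "cuda") -> SamAutomaticMaskGenerator:
    sam = sam_model_registry["vit_h"](checkpoint=checkpoint).to(device)
    return SamAutomaticMaskGenerator(
        sam,
        pred_iou_thresh=0.94,       # tuned for SEM material contrast
        min_mask_region_area=1024,  # drop tiny speckle regions
        # stability_score_thresh is also tuned in the pipeline
        # (its value is not stated here).
    )

# masks = build_mask_generator().generate(tile)
# Each entry is a dict with 'segmentation' (bool HxW), 'area', 'bbox',
# 'predicted_iou', and 'stability_score'.
```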

Web Application

The frontend (Vue.js + Vite) exposes three sections:

  • Pipeline visualization — carousel walkthrough of the processing stages for non-technical stakeholders
  • Static gallery — curated processed specimens; click to expand original → segmented overlay comparison
  • Live demo — drag-and-drop SEM image upload; backend processes and returns: SAM overlay image, three per-class mask images, confidence distribution chart, area pie chart per category
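
The area pie chart reduces to counting labeled pixels in the stitched result. A small sketch, assuming each pixel of the final label map carries an integer class ID (the 1/2/3 encoding and `area_fractions` helper are illustrative):

```python
import numpy as np

# Hypothetical label encoding for the stitched classification map.
CLASSES = {1: "gravel", 2: "clay", 3: "cavity"}


def area_fractions(label_map: np.ndarray) -> dict:
    """Each class's share of the classified (non-background) area,
    i.e. the numbers behind the per-image pie chart."""
    counts = {name: int((label_map == k).sum()) for k, name in CLASSES.items()}
    total = sum(counts.values()) or 1  # avoid division by zero on empty maps
    return {name: c / total for name, c in counts.items()}
```

The per-class confidence distribution can be accumulated the same way, by collecting the VGG softmax score of each mask under its assigned class.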

The Flask backend wraps the full pipeline as REST endpoints (/upload, /img/<id>, /process/<id>), with results cached server-side per image ID.


SEM Dataset

Bronze mold specimen at 500× magnification — raw SEM input showing the particle distribution before segmentation

The dataset consists of real SEM images of excavated bronze mold specimens provided by Prof. Jing Zhichun, covering mold materials from multiple archaeological sites and time periods. Images span 500×–2000× magnification. A second set of reference specimens (UBC series) provides comparative clay samples for cross-site analysis.


Technical Summary

| | |
| --- | --- |
| **Language** | Python 3, JavaScript |
| **Models** | SAM (ViT-H, Meta), VGG-B (PyTorch) |
| **Task** | 3-class microstructure segmentation + classification (gravel / clay / cavity) |
| **Accuracy** | 92% test accuracy (VGG-B on classified masks) |
| **Frontend** | Vue.js, Vite |
| **Backend** | Flask, OpenCV, NumPy |
| **Dataset** | Real SEM images at 500×, 1000×, 2000× — provided by SUSTech archaeology lab |
| **Application** | Prof. Jing Zhichun — SUSTech Institute for Advanced Study in Social Science |
| **GPU** | ~7 GB VRAM (SAM ViT-H inference) |