SEM Analysis Pipeline

[PyTorch] SEM image dataset of ancient bronze molds · SAM zero-shot segmentation · VGG classification · Applied to archaeological research

A full-stack computer vision system for analyzing the microstructure of ancient Chinese bronze ware molds (土范) from scanning electron microscope (SEM) images. Built for Prof. Jing Zhichun (荆志淳) at SUSTech’s Institute for Advanced Study in Social Science, the pipeline applies SAM-based zero-shot particle segmentation and VGG-B classification to identify three material components — gravel (沙砾), clay (黏土), and cavities (空腔) — and produces quantitative composition reports. Results were directly applied to ongoing archaeological research on Shang Dynasty bronze casting techniques.

Highlights

  • Applied SAM (ViT-H, Meta’s largest checkpoint) for zero-shot particle segmentation — no domain-specific retraining; IoU/stability thresholds tuned for SEM material contrast
  • Implemented VGG-B for three-class mask classification: gravel (沙砾) / clay (黏土) / cavity (空腔) — 92% test accuracy; linear layer trimmed for inference efficiency
  • Designed full pipeline: sliding-window crop (224×224) → SAM segmentation → mask extraction → VGG classification → color overlay reconstruction → confidence + area statistics
  • Built a custom dataset from real SEM specimens provided by Prof. Jing Zhichun, covering 500×–2000× magnification across multiple archaeological sites and dynasties
  • Delivered quantitative composition output per image: per-class area pie chart and confidence distribution, directly usable as archaeological evidence
  • Full-stack web app (Vue.js + Flask) with live drag-and-drop SEM upload, dynamic colored mask overlay, and auto-generated analysis dashboard

Pipeline Overview

The pipeline processes any SEM image through four stages:

  1. Sliding-window crop — input image split into 224×224 tiles (no overlap, no gaps) for memory-efficient SAM processing
  2. SAM segmentation — ViT-H SAM generates all particle masks per tile; parameters tuned (pred_iou_thresh=0.94, stability_score_thresh, min_mask_region_area=1024) for SEM contrast
  3. VGG-B classification — each SAM mask region is extracted from the original image and classified as gravel, clay, or cavity; masks colored by category
  4. Result reconstruction — tiles stitched back; colored masks alpha-composited over original; confidence and area statistics computed
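
Stage 1 amounts to computing non-overlapping 224×224 crop boxes. A minimal sketch of that tiling step (`tile_boxes` is a hypothetical helper; here edge tiles are clamped to the image border, whereas the original pipeline may pad instead):

```python
TILE = 224  # matches the VGG input size used downstream


def tile_boxes(width: int, height: int, tile: int = TILE):
    """Return (left, upper, right, lower) crop boxes covering the image
    with no overlap and no gaps. When the image size is not a multiple
    of `tile`, the last row/column of boxes is clamped to the border."""
    boxes = []
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            boxes.append((left, top,
                          min(left + tile, width),
                          min(top + tile, height)))
    return boxes
```

Each box can be fed to PIL's `Image.crop` (or NumPy slicing) to produce the tiles that SAM processes independently, which bounds peak memory regardless of the input image size.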

Segmentation Results

SEM images of bronze mold specimens after processing through the SAM → VGG pipeline. Colors indicate the classified material: red = gravel (沙砾), green = clay minerals (黏土), blue = cavities (空腔).

Raw SEM input — bronze mold specimen at 1000×
Pipeline output — SAM-segmented and VGG-classified: red=gravel, green=clay, blue=cavity
Four processed specimens — variation in gravel/clay/cavity ratios reflects differences in mold material composition across specimens
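
The masks behind these overlays come from SAM's automatic mask generator. A configuration sketch with the thresholds quoted in the pipeline above, assuming Meta's `segment_anything` package and a locally downloaded ViT-H checkpoint (the path is illustrative):

```python
# Config fragment -- requires the `segment_anything` package and the
# ViT-H checkpoint file; not runnable without both.
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator


def build_mask_generator(checkpoint: str = "sam_vit_h_4b8939.pth",
                         device: str = "cuda") -> SamAutomaticMaskGenerator:
    sam = sam_model_registry["vit_h"](checkpoint=checkpoint).to(device)
    return SamAutomaticMaskGenerator(
        sam,
        pred_iou_thresh=0.94,       # tuned for SEM material contrast
        min_mask_region_area=1024,  # drop tiny speckle regions
        # stability_score_thresh is also tuned in the pipeline
        # (its value is not stated here).
    )

# masks = build_mask_generator().generate(tile)
# Each entry is a dict with 'segmentation' (bool HxW), 'area', 'bbox',
# 'predicted_iou', and 'stability_score'.
```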

Web Application

The frontend (Vue.js + Vite) exposes three sections:

  • Pipeline visualization — carousel walkthrough of the processing stages for non-technical stakeholders
  • Static gallery — curated processed specimens; click to expand original → segmented overlay comparison
  • Live demo — drag-and-drop SEM image upload; backend processes and returns: SAM overlay image, three per-class mask images, confidence distribution chart, area pie chart per category
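
The area pie chart reduces to counting labeled pixels in the stitched result. A small sketch, assuming each pixel of the final label map carries an integer class ID (the 1/2/3 encoding and `area_fractions` helper are illustrative):

```python
import numpy as np

# Hypothetical label encoding for the stitched classification map.
CLASSES = {1: "gravel", 2: "clay", 3: "cavity"}


def area_fractions(label_map: np.ndarray) -> dict:
    """Each class's share of the classified (non-background) area,
    i.e. the numbers behind the per-image pie chart."""
    counts = {name: int((label_map == k).sum()) for k, name in CLASSES.items()}
    total = sum(counts.values()) or 1  # avoid division by zero on empty maps
    return {name: c / total for name, c in counts.items()}
```

The per-class confidence distribution can be accumulated the same way, by collecting the VGG softmax score of each mask under its assigned class.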

The Flask backend wraps the full pipeline as REST endpoints (/upload, /img/<id>, /process/<id>), with results cached server-side per image ID.


SEM Dataset

Bronze mold specimen at 500× magnification — raw SEM input showing the particle distribution before segmentation

The dataset consists of real SEM images of excavated bronze mold specimens provided by Prof. Jing Zhichun, covering mold materials from multiple archaeological sites and time periods. Images span 500×–2000× magnification. A second set of reference specimens (UBC series) provides comparative clay samples for cross-site analysis.


Technical Summary

| | |
| --- | --- |
| **Language** | Python 3, JavaScript |
| **Models** | SAM (ViT-H, Meta), VGG-B (PyTorch) |
| **Task** | 3-class microstructure segmentation + classification (gravel / clay / cavity) |
| **Accuracy** | 92% test accuracy (VGG-B on classified masks) |
| **Frontend** | Vue.js, Vite |
| **Backend** | Flask, OpenCV, NumPy |
| **Dataset** | Real SEM images at 500×, 1000×, 2000× — provided by SUSTech archaeology lab |
| **Application** | Prof. Jing Zhichun — SUSTech Institute for Advanced Study in Social Science |
| **GPU** | ~7 GB VRAM (SAM ViT-H inference) |