Revive your old photos from scanned documents using AI-powered detection and extraction
Nothing out of the box worked well for extracting individual photos from album scans, so I built NerdScan. It achieves 100% accuracy on my private dataset of 39 scans, though I haven't done deeper evaluations yet.
- AI-Powered Detection: Uses state-of-the-art AI object detection to find photos in scans
- Smart Photo Extraction: Precisely crops and saves individual photos from cluttered scans
- Visual Feedback: Creates clear visualizations of detection results for easy verification
- Intelligent Dating: Automatically adds EXIF metadata based on folder structure with sequential dates
- Flexible Organization: Option to preserve original folder structure or use flat output
- Smart Filtering: Removes overlapping or low-confidence detections for clean results
- User-Friendly CLI: Beautiful command-line interface with progress tracking and rich output
- Highly Configurable: Fine-tune every aspect of detection to suit your needs
If you have old photo albums, scrapbooks, or document collections with multiple photos per page, NerdScan makes digitizing them effortless. Instead of manually cropping each photo, simply scan entire pages and let NerdScan handle the rest - it finds, extracts, and even dates your photos automatically!
NerdScan uses the Grounding DINO object detection model from Hugging Face (IDEA-Research/grounding-dino-base) to find photos in your scanned images. It leverages natural language prompts like "an old photo" to identify photos with high accuracy. The detected photos are then precisely cropped, and can be automatically tagged with dates based on your folder structure.
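As a rough illustration of the cropping step, the sketch below turns a floating-point detection box into an integer rectangle that stays inside the scan. It assumes boxes arrive as (x0, y0, x1, y1) pixel coordinates; the helper name `to_crop_box` is hypothetical and is not taken from NerdScan's code.

```python
import math


def to_crop_box(box, image_width, image_height):
    """Convert a float (x0, y0, x1, y1) box into an integer crop rectangle.

    Hypothetical helper: rounds outward so no photo edge is cut off, then
    clamps the result to the image bounds.
    """
    x0, y0, x1, y1 = box
    left = max(0, math.floor(x0))
    top = max(0, math.floor(y0))
    right = min(image_width, math.ceil(x1))
    bottom = min(image_height, math.ceil(y1))
    if right <= left or bottom <= top:
        return None  # box is degenerate or lies entirely outside the image
    return (left, top, right, bottom)


print(to_crop_box((-3.2, 10.7, 120.1, 5000.0), 100, 200))  # (0, 10, 100, 200)
```

The resulting (left, top, right, bottom) tuple is in the order that Pillow's `Image.crop` expects, so it could be passed straight to an image library of that kind.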
- AI-Powered Detection: Uses the Grounding DINO object detection model from Hugging Face
- Text Prompting: Leverages natural language prompts to find photos in scanned images
- Confidence Filtering: Removes low-confidence detections to reduce false positives
- Overlap Handling: Identifies and filters overlapping detections to avoid duplicates
- Smart Cropping: Precisely extracts each detected photo as a separate image
- Metadata Enhancement: Sets EXIF date data based on folder names (e.g., "1979")
- Quality Preservation: Saves high-quality JPG files with original color profiles
- Automatic Year Extraction: Detects years from folder names (e.g., "1979")
- Smart Dating: Creates sequential dates within the year for multiple photos
- Comprehensive Metadata: Sets dates in multiple formats for maximum compatibility
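The dating features above could work roughly as in the following sketch. The year is read from the nearest parent folder whose name contains a four-digit year, and each photo gets its own date within that year. The exact scheme NerdScan uses is not documented here, so "January 1 plus one day per photo, at noon" is an assumption, and both function names are hypothetical. The output string uses the standard EXIF date layout `YYYY:MM:DD HH:MM:SS`.

```python
import re
from datetime import datetime, timedelta
from pathlib import Path
from typing import Optional

# Four-digit years from 1800 to 2099, not embedded in a longer number.
YEAR_PATTERN = re.compile(r"(?<!\d)(1[89]\d{2}|20\d{2})(?!\d)")


def year_from_path(path: Path) -> Optional[int]:
    """Return the year found in the nearest parent folder name, if any."""
    for folder in path.parents:
        match = YEAR_PATTERN.search(folder.name)
        if match:
            return int(match.group(1))
    return None


def sequential_exif_date(year: int, index: int) -> str:
    """Assumed scheme: photo number `index` is dated January 1 plus `index` days."""
    days_in_year = (datetime(year + 1, 1, 1) - datetime(year, 1, 1)).days
    offset = min(index, days_in_year - 1)  # never spill into the next year
    moment = datetime(year, 1, 1, 12, 0, 0) + timedelta(days=offset)
    return moment.strftime("%Y:%m:%d %H:%M:%S")


year = year_from_path(Path("input/1979/scan1.jpg"))
print(year)                           # 1979
print(sequential_exif_date(year, 2))  # 1979:01:03 12:00:00
```

Giving each photo a distinct timestamp keeps photo library software from collapsing them into one moment and preserves the extraction order when sorting by date.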
- Python 3.7+
- [Optional] CUDA-capable GPU for faster processing
- Clone this repository:
    git clone https://github.com/klimentij/NerdScan.git
    cd NerdScan
- Create a virtual environment and install dependencies with uv:
    uv venv
    source .venv/bin/activate  # On Windows: .venv\Scripts\activate
    uv pip install -r requirements.txt
Put your scanned images (like JPGs) into the input directory. NerdScan supports images of any size and resolution - it's been tested with scans up to 1200 DPI, handling extremely large files without issues.
Recommendation: Organize them into subfolders named after the year the photos were taken (e.g., input/1979/scan1.jpg, input/1980/scan2.jpg). NerdScan will automatically detect the year from the folder name and use it to set the EXIF creation date for the extracted photos, assigning sequential dates within that year.
python main.py
This will:
- Scan for images in the input directory
- Extract detected photos to output/crops
- Save visualizations to output/visualizations
NerdScan features a rich command-line interface with detailed progress tracking and vibrant output:
python main.py --help
Option | Description | Default |
---|---|---|
-i, --input | Input directory containing scanned images | "input" |
-o, --output | Output directory for results | "output" |
--text-prompt | Text prompt for AI detection | "a photo. a picture. a photograph." |
--single-image | Process a single image instead of directory | None |
--preserve-structure | Preserve folder structure in output | False |
--remove-overlaps | Remove overlapping detection boxes | False |
--overlap-threshold | Ratio threshold for considering boxes as overlapping (0.0 to 1.0). A higher value allows more overlap. | 0.05 |
--confidence-threshold | Minimum confidence score to keep detections (0.0 to 1.0). Lower values find more potential photos but may increase false positives. Higher values are stricter. | 0.15 |
--seed | Random seed for reproducibility | 42 |
--sample-size | Number of random images to process | All |
--device | Device to run model on ('cuda' or 'cpu') | Auto-detect |
Process a single image:
python main.py --single-image path/to/scan.jpg
Preserve the input folder structure in the output:
python main.py -i input -o output --preserve-structure
For older or vintage photos:
python main.py -i input -o output --text-prompt "an old photograph. a vintage photo."
For Polaroid photos:
python main.py -i input -o output --text-prompt "a polaroid photo. an instant camera picture."
More strict detection (fewer false positives):
python main.py -i input -o output --confidence-threshold 0.25
More lenient detection (fewer missed photos):
python main.py -i input -o output --confidence-threshold 0.10
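To make the effect of the threshold concrete, here is a minimal sketch of confidence filtering. It assumes each detection is a dict with a "score" key and that a score equal to the threshold is kept; both are illustrative assumptions, not NerdScan's actual data structures.

```python
def filter_by_confidence(detections, threshold=0.15):
    """Keep detections whose score is at least `threshold` (assumed inclusive)."""
    return [det for det in detections if det["score"] >= threshold]


detections = [
    {"label": "a photo", "score": 0.90},
    {"label": "a photo", "score": 0.20},
    {"label": "a photo", "score": 0.12},
]

print(len(filter_by_confidence(detections, 0.25)))  # 1 (strict)
print(len(filter_by_confidence(detections, 0.15)))  # 2 (default)
print(len(filter_by_confidence(detections, 0.10)))  # 3 (lenient)
```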
Allow less overlap between detected boxes (removes more):
python main.py -i input -o output --remove-overlaps --overlap-threshold 0.01
Allow more overlap between detected boxes (removes fewer):
python main.py -i input -o output --remove-overlaps --overlap-threshold 0.10
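One plausible reading of the overlap threshold is sketched below. It assumes the overlap ratio is the intersection area divided by the area of the smaller box, and that boxes are visited from highest to lowest score, dropping any box whose ratio against an already kept box exceeds the threshold. NerdScan's real definition may differ, and the function names are hypothetical.

```python
def area(box):
    """Area of an (x0, y0, x1, y1) box; zero if the box is empty."""
    x0, y0, x1, y1 = box
    return max(0.0, x1 - x0) * max(0.0, y1 - y0)


def overlap_ratio(a, b):
    """Assumed metric: intersection area over the smaller box's area."""
    intersection = area((max(a[0], b[0]), max(a[1], b[1]),
                         min(a[2], b[2]), min(a[3], b[3])))
    smaller = min(area(a), area(b))
    return intersection / smaller if smaller > 0 else 0.0


def remove_overlaps(detections, threshold=0.05):
    """Greedily keep the highest-scoring boxes that do not overlap too much."""
    kept = []
    for det in sorted(detections, key=lambda d: d["score"], reverse=True):
        if all(overlap_ratio(det["box"], k["box"]) <= threshold for k in kept):
            kept.append(det)
    return kept


detections = [
    {"box": (0, 0, 100, 100), "score": 0.9},
    {"box": (50, 50, 150, 150), "score": 0.5},    # overlaps the first box by 25%
    {"box": (200, 200, 300, 300), "score": 0.7},
]
print(len(remove_overlaps(detections, threshold=0.05)))  # 2
print(len(remove_overlaps(detections, threshold=0.30)))  # 3
```

Under this reading, a lower threshold removes more boxes because even a small shared area counts as an overlap, which matches the behaviour described for the two commands above.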
Process a random sample of 10 images:
python main.py -i input -o output --sample-size 10
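Sampling combined with a fixed seed could be implemented along these lines. This is a sketch that assumes the scan list is sorted before sampling so that the same seed always selects the same files; `sample_scans` is a hypothetical name.

```python
import random


def sample_scans(paths, sample_size=None, seed=42):
    """Return a reproducible random subset of `paths`, or all of them."""
    ordered = sorted(paths)  # stable starting order, independent of disk listing order
    if sample_size is None or sample_size >= len(ordered):
        return ordered
    # A private Random instance avoids touching the global random state.
    return random.Random(seed).sample(ordered, sample_size)


scans = [f"input/1979/scan{i}.jpg" for i in range(1, 21)]
first = sample_scans(scans, sample_size=10)
second = sample_scans(list(reversed(scans)), sample_size=10)
print(first == second)  # True
```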
output/
├── crops/                  # Extracted photos
│   ├── file1_001.jpg
│   ├── file1_002.jpg
│   └── ...
└── visualizations/         # Detection visualizations
    ├── file1_visualization.jpg
    └── ...
With --preserve-structure enabled, the subfolder structure from the input directory will be maintained.
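The mapping from a scan to its crop files might look like the sketch below. The `<name>_<NNN>.jpg` pattern follows the tree shown above; placing preserved subfolders directly under output/crops is an assumption, and `crop_output_path` is a hypothetical helper.

```python
from pathlib import Path


def crop_output_path(scan_path, input_dir, output_dir, index,
                     preserve_structure=False):
    """Build the destination path for the `index`-th photo cropped from a scan."""
    scan_path, input_dir = Path(scan_path), Path(input_dir)
    crops_dir = Path(output_dir) / "crops"
    if preserve_structure:
        # Mirror the scan's subfolder (e.g. "1979") below the crops directory.
        crops_dir = crops_dir / scan_path.parent.relative_to(input_dir)
    return crops_dir / f"{scan_path.stem}_{index:03d}.jpg"


flat = crop_output_path("input/1979/file1.jpg", "input", "output", 1)
nested = crop_output_path("input/1979/file1.jpg", "input", "output", 1,
                          preserve_structure=True)
print(flat.as_posix())    # output/crops/file1_001.jpg
print(nested.as_posix())  # output/crops/1979/file1_001.jpg
```

With flat output, scans that share a file name in different year folders would map to the same crop path, which is one practical reason to enable --preserve-structure for year-based collections.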
- Folder Organization: Name folders with years when possible (e.g., "1979") for automatic EXIF dating
- Scan Quality: Higher quality scans produce better detection results
- Custom Prompts: Use specific text prompts to improve detection for challenging cases
- Confidence Tuning: Adjust confidence threshold based on your specific scans and needs
- No detections: Try lowering the --confidence-threshold (e.g., --confidence-threshold 0.10). This makes the model more lenient.
- Too many false positives: Increase the --confidence-threshold (e.g., --confidence-threshold 0.25). This makes the model stricter.
- Overlapping detections: Ensure --remove-overlaps is enabled. If too many overlaps still remain, try lowering the --overlap-threshold (e.g., --overlap-threshold 0.01) to be more aggressive about removing overlapping boxes. If important overlapping boxes are being removed, try increasing the --overlap-threshold (e.g., --overlap-threshold 0.10).
- Specific photo types not detected: Customize the text prompt (e.g., for Polaroids: --text-prompt "a polaroid photo.") to better match the objects you are looking for.
This project is licensed under the MIT License.
- Grounding DINO - For the object detection model
- HuggingFace Transformers - For model access
- Rich - For beautiful terminal output