Synthetic Dataset Generation | Annotation Pipeline

Developed an end-to-end pipeline for generating and annotating synthetic datasets for computer vision training.
Project Year
2025
Software/Frameworks Used
Blender (Python API), Python, Gradio, OpenCV, NumPy / SciPy, Pillow

Synthetic Data Generation

  • Built procedural scene generation workflows in Blender to create large-scale synthetic datasets
  • Rendered paired outputs:
    • RGB images
    • ID-color (viewport) masks for surface segmentation
  • Enabled controlled variation in environments (lighting, materials, composition)
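
The controlled variation above could be driven by seeded parameter sampling, so that every render is reproducible from its seed. The following is a minimal sketch under that assumption. `SceneParams`, `MATERIALS`, the value ranges, and `id_color` are hypothetical names and numbers chosen for illustration, not taken from the project. Applying the values inside Blender is left out because `bpy` only runs within Blender.

```python
import random
from dataclasses import dataclass

# Assumed material names, for illustration only.
MATERIALS = ["brick", "concrete", "plaster", "wood"]


@dataclass(frozen=True)
class SceneParams:
    """Hypothetical per-render settings sampled from one seed."""
    seed: int
    sun_strength: float
    sun_elevation_deg: float
    wall_material: str
    camera_yaw_deg: float


def sample_scene_params(seed: int) -> SceneParams:
    """Sample lighting, material, and composition values deterministically."""
    rng = random.Random(seed)  # local RNG, so global random state is untouched
    return SceneParams(
        seed=seed,
        sun_strength=rng.uniform(1.0, 8.0),
        sun_elevation_deg=rng.uniform(10.0, 80.0),
        wall_material=rng.choice(MATERIALS),
        camera_yaw_deg=rng.uniform(-30.0, 30.0),
    )


def id_color(index: int) -> tuple:
    """Map a surface index to a distinct flat RGB color for the ID mask.

    Multiplying by an odd constant modulo 2**24 is a bijection, so indices
    in the 24-bit range never collide. The +1 keeps index 0 away from black.
    """
    value = ((index + 1) * 0x9E3779) % (1 << 24)
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF
```

Storing the seed with each render is one way to regenerate a specific RGB image and ID mask pair later.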

Custom Annotation Tool (Gradio-based)

  • Developed an interactive annotation tool for converting synthetic renders into YOLO instance segmentation datasets
  • Implemented dual-group system:
    • Wall (foreground structures)
    • Background (environment surfaces)
  • Designed patch-based and global color selection workflows using ID masks
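
As an illustration of color selection on an ID mask, here is a minimal NumPy sketch. It assumes that "global" means selecting every pixel that shares the clicked pixel's ID color, and that "patch-based" means selecting every ID color found inside a user-drawn rectangle. Both readings, along with the function names and the `tol` parameter, are assumptions rather than the tool's actual behavior.

```python
import numpy as np


def select_global(id_mask: np.ndarray, x: int, y: int, tol: int = 0) -> np.ndarray:
    """Boolean mask of all pixels whose ID color matches the pixel at (x, y)."""
    pixels = id_mask.astype(np.int16)  # avoid uint8 wrap-around when subtracting
    diff = np.abs(pixels - pixels[y, x])
    return np.all(diff <= tol, axis=-1)


def select_patch(id_mask: np.ndarray, x0: int, y0: int, x1: int, y1: int,
                 tol: int = 0) -> np.ndarray:
    """Boolean mask of all pixels matching any ID color inside the rectangle.

    The rectangle covers columns x0..x1-1 and rows y0..y1-1.
    """
    pixels = id_mask.astype(np.int16)
    patch = pixels[y0:y1, x0:x1].reshape(-1, pixels.shape[-1])
    selected = np.zeros(id_mask.shape[:2], dtype=bool)
    for color in np.unique(patch, axis=0):
        selected |= np.all(np.abs(pixels - color) <= tol, axis=-1)
    return selected
```

A nonzero `tol` would help only if the ID render contains anti-aliased or compressed colors; for flat, lossless ID masks an exact match is enough.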

Advanced Annotation Features

  • Connected-component selection for accurate surface-level segmentation
  • Morphological mask expansion for better coverage
  • Non-destructive editing via rectangle eraser
  • Undo system supporting iterative annotation workflows
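
The features above can be sketched with SciPy's `ndimage` module. This is a simplified illustration, not the tool's implementation: `select_component`, `expand_mask`, and `MaskEditor` are hypothetical names, and the snapshot-based undo stack is one assumed way to keep rectangle erasing reversible.

```python
import numpy as np
from scipy import ndimage


def select_component(selection: np.ndarray, x: int, y: int) -> np.ndarray:
    """Keep only the connected region of `selection` that contains (x, y)."""
    labels, _ = ndimage.label(selection)  # 4-connectivity by default in 2D
    label_id = labels[y, x]
    if label_id == 0:  # the clicked pixel is not part of the selection
        return np.zeros(selection.shape, dtype=bool)
    return labels == label_id


def expand_mask(mask: np.ndarray, pixels: int = 1) -> np.ndarray:
    """Grow the mask by `pixels` steps of binary dilation."""
    if pixels <= 0:
        return mask.copy()
    return ndimage.binary_dilation(mask, iterations=pixels)


class MaskEditor:
    """Holds the working mask plus a stack of earlier states for undo."""

    def __init__(self, mask: np.ndarray):
        self.mask = mask.copy()
        self._history = []

    def erase_rect(self, x0: int, y0: int, x1: int, y1: int) -> None:
        """Clear columns x0..x1-1 and rows y0..y1-1, saving the prior state."""
        self._history.append(self.mask.copy())
        self.mask[y0:y1, x0:x1] = False

    def undo(self) -> None:
        """Restore the most recent saved state, if there is one."""
        if self._history:
            self.mask = self._history.pop()
```

Full-mask snapshots are the simplest undo strategy; for very large images, storing only the changed rectangle would use less memory.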

Dataset Export Pipeline

  • Exported annotations in YOLO instance segmentation format (polygon-based)
  • Implemented contour extraction using OpenCV (hole-aware polygons)
  • Normalized coordinates and standardized dataset output
  • Built verification layer for previewing annotations before export

Scalability & Workflow Efficiency

  • Designed batch processing workflow for large datasets
  • Automated pairing of render + mask inputs
  • Reduced manual annotation time by replacing hand-drawn polygons with ID-mask color selection
  • Ensured consistency and reproducibility across datasets
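
One way the render and mask pairing could work is by matching file stems. The sketch below assumes masks are named like the render plus a `_mask` suffix (for example `0001.png` and `0001_mask.png`); that naming convention, the folder layout, and the function name are assumptions for illustration.

```python
from pathlib import Path

IMAGE_EXTS = (".png", ".jpg", ".jpeg")


def pair_renders_and_masks(render_dir, mask_dir, mask_suffix="_mask"):
    """Match each render to the mask with the same stem plus `mask_suffix`.

    Returns (pairs, unmatched): a list of (render, mask) path tuples and a
    list of renders for which no mask was found.
    """
    masks = {}
    for path in Path(mask_dir).iterdir():
        if path.suffix.lower() in IMAGE_EXTS and path.stem.endswith(mask_suffix):
            # Strip the suffix to recover the render's stem.
            key = path.stem[: len(path.stem) - len(mask_suffix)]
            masks[key] = path

    pairs, unmatched = [], []
    for render in sorted(Path(render_dir).iterdir()):
        if render.suffix.lower() not in IMAGE_EXTS:
            continue
        mask = masks.get(render.stem)
        if mask is None:
            unmatched.append(render)
        else:
            pairs.append((render, mask))
    return pairs, unmatched
```

Reporting unmatched renders instead of silently skipping them makes missing masks visible before a long batch run.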

Use Case & Impact

  • Enables rapid generation of training data for computer vision models
  • Reduces dependency on manually annotated real-world datasets
  • Bridges 3D content creation workflows with AI training pipelines
