iOS Vision API Demo: On-Device OCR, Poses, Barcodes
Clone this SwiftUI iOS app to test Apple's Vision framework locally: text recognition, rectangle detection, body pose tracking, and barcode scanning, organized with an MVVM architecture and no cloud dependency.
Implement Four Core Vision Features On-Device
Build privacy-focused computer vision apps by integrating Apple's Vision framework directly into an iOS app. The demo processes images from the camera or photo library entirely on-device, which keeps processing fast and image data on the device. Key implementations:
- Text Recognition (OCR): Use VNRecognizeTextRequest to extract text with confidence scores, visualized in SwiftUI Charts via ConfidenceChart.swift.
- Rectangle Detection: Configure VNDetectRectanglesRequest to identify rectangular shapes in real-time.
- Human Body Pose Detection: Track joints with VNDetectHumanBodyPoseRequest, rendering poses on detected bodies.
- Barcode Detection: Scan multiple formats using VNDetectBarcodesRequest.
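As an illustration of the text recognition item, here is a minimal sketch of running VNRecognizeTextRequest on a single image. The RecognizedLine type and the recognizeText(in:) function are hypothetical names chosen for this example; the demo's VisionService.swift may be organized differently. The sketch also assumes a recent SDK in which request.results is typed as an array of VNRecognizedTextObservation.

```swift
import CoreGraphics
import Vision

/// One recognized line of text with Vision's confidence in the range 0...1.
/// Hypothetical type for this sketch, not taken from the demo.
struct RecognizedLine {
    let text: String
    let confidence: Float
}

/// Runs OCR on a CGImage and returns the top candidate for each detected line.
/// Hypothetical helper; assumes request.results is typed (recent SDKs).
func recognizeText(in image: CGImage) throws -> [RecognizedLine] {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate      // favor accuracy over speed
    request.usesLanguageCorrection = true

    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])            // synchronous; call off the main thread

    let observations = request.results ?? []
    return observations.compactMap { observation in
        // Each observation can offer several candidates; keep the best one.
        guard let best = observation.topCandidates(1).first else { return nil }
        return RecognizedLine(text: best.string, confidence: best.confidence)
    }
}
```

The other three features follow the same pattern: create the request, perform it with a VNImageRequestHandler, then read the request's typed results. The confidence values returned here are the kind of data a chart such as ConfidenceChart.swift could plot.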
All features handle live camera feeds or static images through CameraService.swift and VisionService.swift. The app declares the NSCameraUsageDescription and NSPhotoLibraryUsageDescription keys in Info.plist and requests camera and photo library access only when a feature needs it.
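Requesting access only when needed can be sketched with AVFoundation's authorization API. The ensureCameraAccess function below is a hypothetical helper, not code from the demo's CameraService.swift; it assumes NSCameraUsageDescription is present in Info.plist, since the system terminates an app that requests camera access without it.

```swift
import AVFoundation

/// Checks the current camera authorization and prompts the user only if
/// no decision has been made yet. Hypothetical helper for this sketch.
func ensureCameraAccess(completion: @escaping (Bool) -> Void) {
    switch AVCaptureDevice.authorizationStatus(for: .video) {
    case .authorized:
        completion(true)
    case .notDetermined:
        // Shows the system prompt; the callback may arrive on a background queue.
        AVCaptureDevice.requestAccess(for: .video) { granted in
            DispatchQueue.main.async { completion(granted) }
        }
    case .denied, .restricted:
        completion(false)
    @unknown default:
        completion(false)
    }
}
```

Calling a helper like this at the moment the user opens a camera-based tab, rather than at launch, keeps the permission prompt tied to the feature that needs it.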
MVVM Architecture for Scalable Vision Apps
Structure your Vision-powered iOS app with clean separation:
MyVisionAPI/
├── Models/VisionModels.swift # Results data
├── Services/
│ ├── VisionService.swift # API requests
│ └── CameraService.swift # Input handling
├── Views/
│ ├── WelcomeView.swift
│ ├── ConfidenceChart.swift
│ ├── TextRecognitionView.swift
│ ├── RectangleDetectionView.swift
│ ├── BodyPoseView.swift
│ └── BarcodeDetectionView.swift
├── ContentView.swift # Tab navigation
└── MyVisionAPIApp.swift # Entry point
This setup isolates Vision logic in services, keeps views declarative with SwiftUI, and uses models for structured outputs. Configure app signing in Xcode (the project includes MyVisionAPI.entitlements), then build and run with Cmd+R.
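The separation described above can be sketched as a model, a service protocol, and a view model. All names here (TextResult, TextRecognizing, TextRecognitionViewModel) are hypothetical and chosen for illustration; the types in the demo's VisionModels.swift and VisionService.swift may look different.

```swift
import Combine
import Foundation

// Model: a plain value describing one OCR result (hypothetical shape).
struct TextResult: Equatable {
    let text: String
    let confidence: Float
}

// Service boundary: the view model depends on this protocol, not on Vision,
// so a Vision-backed service or a stub can be injected.
protocol TextRecognizing: Sendable {
    func recognizeText(in imageData: Data) async throws -> [TextResult]
}

// View model: owns the state a SwiftUI view observes and delegates the work.
@MainActor
final class TextRecognitionViewModel: ObservableObject {
    @Published private(set) var results: [TextResult] = []
    @Published private(set) var errorMessage: String?

    private let service: TextRecognizing

    init(service: TextRecognizing) {
        self.service = service
    }

    /// Asks the service for results and publishes either results or an error.
    func process(_ imageData: Data) async {
        do {
            results = try await service.recognizeText(in: imageData)
            errorMessage = nil
        } catch {
            results = []
            errorMessage = error.localizedDescription
        }
    }
}
```

Because the view model only knows the protocol, a stub that returns fixed results can stand in for the Vision-backed service in SwiftUI previews and unit tests.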
Quick Setup and Testing Workflow
Clone the repo, open it in Xcode, select a signing team for the MyVisionAPI target, then run. Test each feature through the tabbed interface:
- Text: Pick image/camera, view extracted text and confidence chart.
- Rectangles: Detect and overlay bounding boxes.
- Poses: Pose estimation on human figures.
- Barcodes: Decode payloads instantly.
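Overlaying a detected rectangle or barcode requires a coordinate conversion, because Vision reports bounding boxes in normalized coordinates (0 to 1) with the origin at the bottom-left, while UIKit and SwiftUI place the origin at the top-left. The sketch below shows that conversion; viewRect(forNormalized:in:) is a hypothetical helper, and it assumes the image fills the view exactly, with no aspect-fit letterboxing.

```swift
import Foundation  // provides CGRect and CGSize

/// Converts a Vision bounding box (normalized, origin at bottom-left) into a
/// rectangle in view coordinates (points, origin at top-left).
/// Hypothetical helper; assumes the image fills the view exactly.
func viewRect(forNormalized box: CGRect, in viewSize: CGSize) -> CGRect {
    let width = box.width * viewSize.width
    let height = box.height * viewSize.height
    let x = box.minX * viewSize.width
    // Flip the y-axis: Vision measures from the bottom, the view from the top.
    let y = (1 - box.maxY) * viewSize.height
    return CGRect(x: x, y: y, width: width, height: height)
}

// A box covering the middle half of the width, between 50% and 75% of the height:
let overlay = viewRect(
    forNormalized: CGRect(x: 0.25, y: 0.5, width: 0.5, height: 0.25),
    in: CGSize(width: 200, height: 400)
)
// overlay is (x: 50, y: 100, width: 100, height: 100)
```

If the image is displayed with aspect-fit scaling, the same conversion would need to be applied to the displayed image's frame rather than the whole view.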
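If a build fails, clean the build folder with Cmd+Shift+K and rebuild; check the Xcode console for runtime errors. Because all processing runs on-device, performance does not depend on network conditions. To contribute, create a branch with git checkout -b feature/name, commit your changes, and push. The project is MIT licensed.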