Real-Time · 98% Accuracy

Turn Hand Signs
Into Words

A real-time ASL fingerspelling recognition system built with deep learning: roughly 7,000 training images, a dual-layer CNN, and Hunspell autocorrect, bridging the gap between sign language and text.


98% Accuracy

~7,000 Training Images

26 ASL Letters

2x CNN Layers


About the Project

Breaking Communication Barriers
with AI

Sign2Text is an end-to-end deep learning pipeline that translates American Sign Language (ASL) fingerspelling into readable text in real time, making communication more accessible for the Deaf and hard-of-hearing community.

🧠

Dual-Layer CNN

TensorFlow/Keras model with two convolutional layers, trained on roughly 7,000 custom images for robust recognition across all 26 ASL letters.

⚡

Real-Time Processing

OpenCV-powered video pipeline with Gaussian blur and adaptive thresholding delivers low-latency predictions at 30 frames per second.

✏️

Hunspell Autocorrect

Intelligent sentence formation using Hunspell dictionary integration to correct spelling errors in predicted character sequences.

🎯

Sub-Classifiers

Dedicated sub-classifiers resolve visually confusing sign pairs like D/R/U, boosting real-world accuracy to 98%.

📊

Frame Validation

Multi-frame consensus voting suppresses noisy single-frame errors and stabilizes letter predictions across video streams.

🔬

Custom Dataset

~7,000 images collected across all 26 letters, with variations in lighting, hand size, and orientation to improve generalization.

Interactive Demo

Try It Yourself

Simulate the Sign2Text pipeline below. Click letters to build words, or let the auto-demo run through an example phrase.


Architecture

How Sign2Text Works

A modular, multi-stage pipeline that goes from raw camera frames to autocorrected text output in milliseconds.

📷

Camera Capture

OpenCV reads real-time video frames from the webcam at 30 FPS. Frames are normalized and resized to a fixed input dimension.

→

🔧

Preprocessing

Gaussian blur removes noise. Adaptive thresholding creates clean binary hand masks. The region of interest (ROI) is cropped and fed to the model.

→

🤖

CNN Inference

A dual-layer TensorFlow/Keras CNN classifies each preprocessed frame into one of 26 letter classes with a softmax confidence score.
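To make the "softmax confidence score" concrete, here is a small sketch of how raw model outputs could be turned into a letter and a confidence. It uses plain Python rather than TensorFlow, and it assumes the 26 classes are ordered A to Z; the names `softmax` and `top_prediction` are illustrative, not taken from the project.

```python
import math
import string

# Assumed class order: index 0 is 'A', index 25 is 'Z'.
LETTERS = string.ascii_uppercase

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_prediction(logits):
    """Return (letter, confidence) for the highest-probability class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LETTERS[best], probs[best]
```

A Keras model whose final layer already applies softmax would return the probabilities directly, in which case only the arg-max step is needed.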

→

✅

Frame Validation

Predictions are buffered over N frames. The dominant class across the buffer is accepted as the stable letter prediction.
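The buffering idea can be sketched as a small majority-vote helper. The class name `FrameVoter`, the window size, and the `min_share` agreement threshold are assumptions for illustration; the page only states that the dominant class across N frames is accepted.

```python
from collections import Counter, deque

class FrameVoter:
    """Accept a letter only when it dominates the last `window` frames."""

    def __init__(self, window=15, min_share=0.6):
        self.window = window
        self.min_share = min_share  # assumed minimum fraction of agreeing frames
        self.buffer = deque(maxlen=window)

    def update(self, letter):
        """Add one per-frame prediction; return a stable letter or None."""
        self.buffer.append(letter)
        if len(self.buffer) < self.window:
            return None  # not enough frames yet
        winner, count = Counter(self.buffer).most_common(1)[0]
        if count / self.window >= self.min_share:
            self.buffer.clear()  # start fresh for the next letter
            return winner
        return None
```

Clearing the buffer after a letter is accepted is one possible design choice; it avoids emitting the same letter on every subsequent frame while the hand is held still.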

→

✏️

Autocorrect

Hunspell dictionary checks formed words against known spellings and suggests corrections, enabling natural sentence assembly.
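The correction step might look something like the sketch below. It assumes a checker object with `spell(word)` and `suggest(word)` methods, which is the general shape Hunspell bindings offer; which binding Sign2Text uses is not stated on the page. So that the example runs without dictionary files, `TinyChecker` is a hypothetical stand-in built on `difflib` and is not Hunspell itself.

```python
import difflib

class TinyChecker:
    """Hypothetical stand-in for a Hunspell binding (NOT Hunspell itself)."""

    def __init__(self, words):
        self.words = [w.lower() for w in words]

    def spell(self, word):
        return word.lower() in self.words

    def suggest(self, word):
        # Closest dictionary words first, by difflib similarity ratio.
        return difflib.get_close_matches(word.lower(), self.words, n=3, cutoff=0.6)

def autocorrect(letters, checker):
    """Join predicted letters into a word and correct it if needed."""
    word = "".join(letters).lower()
    if checker.spell(word):
        return word
    suggestions = checker.suggest(word)
    # Fall back to the raw word when the checker has no suggestion.
    return suggestions[0] if suggestions else word
```

Swapping in a real Hunspell object should only require passing it as `checker`, provided it exposes the same two methods.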

🎯

Sub-Classifier for Confusable Signs

Letters like D, R, U share similar hand shapes. Sign2Text uses dedicated sub-classifiers that activate only when the primary model's confidence falls below a threshold, resolving ambiguity with higher precision.
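The routing logic described here can be sketched as follows. The threshold value, the function name `classify`, and the convention that classifiers are callables returning `(letter, confidence)` are all assumptions for illustration; only the D/R/U group and the below-threshold trigger come from the description above.

```python
# Groups of visually similar letters; only D/R/U is named on the page.
CONFUSABLE_GROUPS = [frozenset("DRU")]

def classify(frame, primary, sub_classifiers, threshold=0.9):
    """Use the primary model, deferring to a sub-classifier when unsure.

    `primary` and each value in `sub_classifiers` are hypothetical callables
    that take a frame and return (letter, confidence). `sub_classifiers`
    maps a confusable group (a frozenset of letters) to its classifier.
    """
    letter, confidence = primary(frame)
    if confidence >= threshold:
        return letter, confidence  # primary model is confident enough
    for group in CONFUSABLE_GROUPS:
        if letter in group and group in sub_classifiers:
            # Low confidence on a confusable letter: ask the specialist.
            return sub_classifiers[group](frame)
    return letter, confidence  # no specialist applies; keep the primary result
```

Running the specialist only on low-confidence frames keeps the common path to a single model call.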

Performance

Model Performance

Evaluated on a held-out test set. Results show consistent accuracy across all 26 ASL letters even in challenging lighting conditions.

Overall Accuracy

Primary CNN classifier on full 26-class ASL alphabet

Training Accuracy: 99.2%

Validation Accuracy: 98.4%

Test Accuracy: 98.0%

Sub-Classifier (D/R/U): 97.5%

Autocorrect Word Hit Rate: 94.0%

Tech Stack

Python TensorFlow Keras OpenCV NumPy Hunspell CNN Gaussian Blur Thresholding

ASL Fingerspelling Alphabet
