
AI-Powered Resume Screening

Built an AI system that evaluates how closely resumes match job descriptions using semantic similarity models. The goal was a scalable screening tool that understands not just the keywords in a resume but the meaning behind candidate experience and job requirements.

Key Skills Demonstrated

  • Generated 23,000 synthetic resumes and 7,800 job descriptions to build a diverse, representative training dataset for model development.
  • Used DeepSeek models to simulate realistic candidate work histories, skills, and experiences across multiple industries and job levels (see the generation sketch after this list).
  • Designed an NLP pipeline using BERT Tokenizer, BERT Encoder, and Sentence Transformers to convert text into high-dimensional embeddings.
  • Applied cosine similarity to measure semantic alignment between resumes and job descriptions, enabling nuanced matching beyond simple keyword overlap (see the embedding-and-matching sketch below).
  • Built a custom neural network to refine similarity scoring, trained with an MSE objective and reaching loss values between 0.0002 and 0.0009 (see the scoring-network sketch below).
  • Developed a PDF resume parser that converts candidate submissions into machine-readable text, identifying inconsistencies and edge cases for future improvement (see the parser sketch below).
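
Below is a minimal sketch of how the DeepSeek-based generation step might look, assuming the OpenAI-compatible chat API that DeepSeek exposes; the prompt wording, model name, and helper function are illustrative assumptions, not the exact code used in the project.

```python
# Sketch: generate one synthetic resume with DeepSeek's OpenAI-compatible API.
# The prompt, model name, and parameters are illustrative assumptions.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",      # placeholder; load from an env var in practice
    base_url="https://api.deepseek.com",  # DeepSeek's OpenAI-compatible endpoint
)

def generate_resume(industry: str, level: str) -> str:
    """Ask the model for one synthetic resume for a given industry and seniority."""
    prompt = (
        f"Write a realistic resume for a {level} candidate in the {industry} industry. "
        "Include a work history, a skills section, and education. Do not use real names."
    )
    response = client.chat.completions.create(
        model="deepseek-chat",
        messages=[{"role": "user", "content": prompt}],
        temperature=1.0,  # higher temperature encourages varied resumes across calls
    )
    return response.choices[0].message.content

print(generate_resume("healthcare", "mid-level")[:500])
```

Looping a call like this over a grid of industries and seniority levels is one way to reach a dataset of the size described above.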
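
The embedding-and-matching step can be sketched with the sentence-transformers library; the specific model checkpoint and the sample texts below are assumptions for illustration, not the project's exact configuration.

```python
# Sketch: encode a resume and a job description, then compare them with cosine similarity.
# The model checkpoint and sample texts are placeholders.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # any BERT-based sentence encoder works here

resume_text = "Data analyst with four years of SQL, Python, and Tableau experience."
job_text = "Seeking a data analyst proficient in SQL, Python, and dashboarding tools."

# Tokenization, BERT encoding, and pooling all happen inside encode().
resume_emb = model.encode(resume_text, convert_to_tensor=True)
job_emb = model.encode(job_text, convert_to_tensor=True)

# Cosine similarity ranges from -1 to 1; values closer to 1 indicate stronger alignment.
score = util.cos_sim(resume_emb, job_emb).item()
print(f"Match score: {score:.3f}")
```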
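
The scoring-network refinement could look something like the PyTorch sketch below; the layer sizes, embedding dimension, and dummy training batch are assumptions rather than the actual architecture.

```python
# Sketch: a small network that maps a (resume, job) embedding pair to a match score,
# trained with MSE loss. Layer sizes and the dummy batch are illustrative.
import torch
import torch.nn as nn

class MatchScorer(nn.Module):
    def __init__(self, emb_dim: int = 384):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(emb_dim * 2, 256), nn.ReLU(),
            nn.Linear(256, 64), nn.ReLU(),
            nn.Linear(64, 1), nn.Sigmoid(),  # squash output to a [0, 1] match score
        )

    def forward(self, resume_emb, job_emb):
        return self.net(torch.cat([resume_emb, job_emb], dim=-1)).squeeze(-1)

model = MatchScorer()
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Dummy batch of 32 embedding pairs with target similarity scores.
resume_emb, job_emb = torch.randn(32, 384), torch.randn(32, 384)
targets = torch.rand(32)

optimizer.zero_grad()
loss = criterion(model(resume_emb, job_emb), targets)
loss.backward()
optimizer.step()
print(f"MSE loss: {loss.item():.4f}")
```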
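
The parser itself can be sketched with the pypdf library; the cleanup rules here are simple placeholders for the real inconsistency handling.

```python
# Sketch: extract and lightly normalize text from a PDF resume using pypdf.
# The cleanup rules are placeholders; real resumes need far more handling.
import re
from pypdf import PdfReader

def parse_resume(path: str) -> str:
    reader = PdfReader(path)
    pages = [page.extract_text() or "" for page in reader.pages]
    text = "\n".join(pages)
    text = re.sub(r"[•·▪]", "-", text)        # replace stray bullet characters
    text = re.sub(r"\s+", " ", text).strip()  # collapse whitespace left by extraction
    return text

# Example usage (the path is hypothetical):
# print(parse_resume("candidate_resume.pdf")[:300])
```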

Project Insights & Learnings

One of the biggest challenges was handling the natural inconsistency in resume formats and job posting structures. Building the parser highlighted how messy real-world data can be and how important preprocessing is for downstream model performance. I spent significant time testing how different inputs affected embedding quality and similarity calculations.
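
A simple way to probe that sensitivity is to score the same content in raw and cleaned form against one job posting; the texts and model checkpoint below are illustrative assumptions.

```python
# Sketch: compare how raw parser output vs. cleaned text scores against the same job posting.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
job = "Hiring a data analyst with SQL, Python, and Tableau skills."

raw = "EXPERIENCE\n- Data  Analyst - ACME\nSQL,Python ,Tableau"
cleaned = "Experience: Data Analyst at ACME. Skills: SQL, Python, Tableau."

job_emb = model.encode(job, convert_to_tensor=True)
for label, text in [("raw", raw), ("cleaned", cleaned)]:
    score = util.cos_sim(model.encode(text, convert_to_tensor=True), job_emb).item()
    print(f"{label}: {score:.3f}")
```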

This project strengthened my ability to build end-to-end NLP systems, from dataset creation and model design to evaluation and practical deployment concerns. It also deepened my understanding of embeddings and how semantic models handle subtle differences in language.
