Victor Ngetich
I'm a master's candidate at the UC Berkeley School of Information, specializing in natural language processing. My research interests span methodological work (new architectures, training objectives, efficiency, and robustness) and mechanistic interpretability.
On the robustness and interpretability side, my capstone project builds a perturbation-based evaluation framework. It uses optimal transport–guided retrieval-augmented generation (RAG) to retrieve naturalistic prompts from WildChat, applies LLM-based style transfer to generate linguistically grounded benchmark variants, and models the resulting variation using Biber's Multidimensional Analysis, which makes it possible to analyze how register shifts affect model reliability. In another project, I'm investigating whether open reasoning models genuinely reason on benchmark items or retrieve memorized answers, using chain-of-thought trace analysis and layer-wise activation probing to separate computation from recall.
On the language-task and representation side, I ran a computational analysis of more than 74,000 lines of film dialogue to identify systematic pragmatic and structural differences between genres, using speech act detection, spaCy-based named entity recognition, and SVM, BERT, and LSTM classifiers with SHAP feature importance. In a separate project, I compared Word2Vec, GloVe, and fine-tuned BERT embeddings for predicting voxel-wise fMRI brain activity from narrative text, building preprocessing pipelines for temporal alignment, Lanczos downsampling, and hemodynamic delay modeling on a 50 GB dataset processed across the Berkeley SCF and PSC Bridges-2 HPC clusters.
Before graduate school, I worked as a software engineer in Nairobi, Kenya. At Flux Water Limited, I designed and optimized backend systems using Django and PostgreSQL and built AccessWASH, a nationwide water-access data platform handling over 100,000 data points. That engineering background shapes how I approach research: with attention to reproducibility, computational efficiency, and the gap between a working prototype and something that holds up at scale.
I'm a recipient of the Mastercard Foundation Scholarship, the I School Fellowship, the Dr. James R. Chen Award, and the Big Ideas Award.
