Lucas Sempé
Senior Evaluation Specialist | Research Fellow
Brazilian-Argentine researcher with over two decades of experience spanning grassroots programme delivery, government leadership, and academic research. Managed £2 billion in education budgets as Director at Peru's Ministry of Education, led results-based reforms, and built 215 community centres for indigenous populations. Current work focuses on impact evaluation, AI implementation in evidence synthesis, and the ethics of AI/LLMs in policy.
Recent Posts
The Instruction Tuning Firewall
Mental health chatbots can drift toward dangerous validation while sounding perfectly appropriate. I built a monitoring system that detects persona drift in model activations—catching problems that even a fine-tuned DeBERTa misses, with a 2.6× advantage on crisis recognition. Validated by two clinical psychologists (ICC=0.716) and tested on naturalistic emotional support conversations.
When Algorithms Meet Warzones
A drone image classifier that can't distinguish combatants from farmers. A beneficiary targeting model trained on data from before the displacement. A chatbot collecting trauma narratives in a language it barely understands. These aren't hypotheticals—they're the edge cases where AI meets impact evaluation in fragile contexts.
The Capacity Gap
I scored 2,216 AI policy documents across 193 countries on implementation capacity. The headline isn't that rich countries do better—it's that the gap nearly vanishes once you account for documentation quality. The real story is what's happening within income groups.
Talking to Your Evidence Base
What if you could ask your research library a question out loud and get a spoken answer grounded in actual studies? A retrieval-augmented system with voice interface makes research synthesis conversational.