Home
Short Bio
I am a Founding Researcher at Elorian AI, where we're building AI that natively understands the visual world through multimodal reasoning. Previously, I was a Staff Research Scientist at Google DeepMind on the Gemini team and a co-creator of OSS Vizier, working on LLM post-training and reward models, prompt/hyperparameter optimization, and theoretical machine learning. I also dabble in AI alignment, counterfactuals/fairness, and the intersection of Faith and AI.
I steward the Global Christians in AI (CHAI) community, where AI practitioners, academics, theologians, and entrepreneurs come together monthly to discuss topics at the intersection of Christianity and AI. We organize talks and socials/workshops at Google and academic conferences; please join the community if you would like to learn more!
I am grateful to have graduated with a PhD in Applied Mathematics and Computer Science from UC Berkeley, where I was fortunate to be advised by Prof. Satish Rao and Prof. Nikhil Srivastava. My interests lie at the intersection of optimization, theoretical computer science, and machine learning. Before that, I graduated from Princeton University in the Great Class of 2014.
Research Focus & Interests
It's more important to ask the right questions. My work sits at the intersection of optimization, learning theory, and large-scale ML systems — right now, the question I find most exciting is how to build AI that natively reasons over the visual world. A summary of the areas I've worked on or am actively exploring (see publications for the full list):
- Multimodal Reasoning & Vision-Language Models — current focus at Elorian AI: building VLMs that reason through images directly rather than translating them to text.
- Prompt & Black-Box Optimization — co-creator of OSS Vizier, with both theoretical foundations and practical algorithms for hyperparameter and prompt tuning. See e.g. non-vacuous bounds for prompt optimization, the Vizier GP Bandit algorithm, and OptFormer.
- LLM Post-training, Alignment & Reward Modeling — preference learning, RLHF, and the gap between stated and learned preferences. See Preference Learning Algorithms Do Not Learn Preference Rankings.
- Randomized Linear Algebra & Sketching — trace estimation, low-rank approximation, and dimensionality reduction. See sparse-vector dimensionality reduction and dynamic trace estimation.
- Bandits, Online Learning & Optimization Theory — adaptive regret, zeroth-order optimization, and distributional RL. See Gradientless Descent and Adaptive Regret for Bandits.
- AI Alignment, Fairness & Faith in AI — counterfactuals, belief diversity, and cultural perspectives in generative AI; also steward of the CHAI community.
Contact & Professional Networks
- LinkedIn: richard-zhang-1b358a39
- Personal Email: qiuyizhang (at) gmail (dot) com
- GitHub: github.com/qiuyiz