Finetuning 500m AI agents in production with 2 engineers — Mustafa Ali & Kyle Corbitt
AI Summary
Video Summary: Scaling Production with AI at Method
Introduction
- Host: Kyle Corbett from Open Pipe
- Guest: Mustafa Ali from Method
- Topic: Scaling production to over 500 million agents using data aggregation.
About Method
- Centralizes liability data from multiple sources (credit bureaus, card networks, banks).
- Provides enhanced data to fintechs, banks, and lenders for:
- Debt management
- Refinancing
- Loan consolidation
- Personal finance management
Challenges Faced
- Early customer requests for more detailed liability data (e.g., payoff amounts, escrow balances).
- Lack of a central API for necessary data; working directly with banks would take years.
- Current solutions involve inefficient manual processes and offshore teams, resulting in:
- High costs
- Slow response times
- Potential for human errors.
Solutions and Technologies Used
- Implementation of AI tools (GPT-4) to parse unstructured data efficiently.
- Developed an agentic workflow using GPT-4 for:
- Data extraction
- Real-time processing.
- Ran into high costs ($70,000 in the first month) and scaling challenges.
Important Metrics
- Error rates fluctuated between models:
- GPT-4: ~11% error rate.
- OpenAI’s 03 Mini: ~4% error rate.
- Latency issues:
- GPT-4 ~1 second response time
- 03 Mini ~5 seconds.
Fine-Tuning Approach
- Adopted custom model fine-tuning to improve performance metrics:
- Reduced error rates significantly.
- Lowered costs for AI operations.
- Fine-tuning required more engineering investment but yielded better results tailored to specific use cases.
Conclusion
- Productionizing AI solutions requires patience and openness from engineering and leadership teams.
- Proper benchmarking and iterative improvements to achieve efficiency and scalability.