Finetuning 500m AI agents in production with 2 engineers — Mustafa Ali & Kyle Corbitt



AI Summary

Video Summary: Scaling Production with AI at Method

Introduction

  • Host: Kyle Corbett from Open Pipe
  • Guest: Mustafa Ali from Method
  • Topic: Scaling production to over 500 million agents using data aggregation.

About Method

  • Centralizes liability data from multiple sources (credit bureaus, card networks, banks).
  • Provides enhanced data to fintechs, banks, and lenders for:
    • Debt management
    • Refinancing
    • Loan consolidation
    • Personal finance management

Challenges Faced

  • Early customer requests for more detailed liability data (e.g., payoff amounts, escrow balances).
  • Lack of a central API for necessary data; working directly with banks would take years.
  • Current solutions involve inefficient manual processes and offshore teams, resulting in:
    • High costs
    • Slow response times
    • Potential for human errors.

Solutions and Technologies Used

  • Implementation of AI tools (GPT-4) to parse unstructured data efficiently.
  • Developed an agentic workflow using GPT-4 for:
    • Data extraction
    • Real-time processing.
  • Ran into high costs ($70,000 in the first month) and scaling challenges.

Important Metrics

  • Error rates fluctuated between models:
    • GPT-4: ~11% error rate.
    • OpenAI’s 03 Mini: ~4% error rate.
  • Latency issues:
    • GPT-4 ~1 second response time
    • 03 Mini ~5 seconds.

Fine-Tuning Approach

  • Adopted custom model fine-tuning to improve performance metrics:
    • Reduced error rates significantly.
    • Lowered costs for AI operations.
  • Fine-tuning required more engineering investment but yielded better results tailored to specific use cases.

Conclusion

  • Productionizing AI solutions requires patience and openness from engineering and leadership teams.
  • Proper benchmarking and iterative improvements to achieve efficiency and scalability.