AI-Powered Restaurant Recommender First Check-In
Date: 4/10/2026
Attendees:
- Stephen Curry
- Charles Leclerc
- John Mayer
- Steven Morrissey
Agenda: Update the timeline for deliverables and assign tasks to each member
Summary
The meeting reviewed whether the project was progressing smoothly and on schedule.
It was decided that the next pressing order of business is to web-scrape data to populate the database.
It was decided that, due to its faster overall retrieval speed, a RAG framework would be used for the implementation.
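To make the RAG decision concrete, the retrieval step can be sketched in a few lines. This is a toy illustration only: the restaurant names and descriptions are placeholders standing in for the scraped menu data, and the bag-of-words "embedding" stands in for whatever embedding model the pipeline ends up using.

```python
from collections import Counter
from math import sqrt

# Toy restaurant "database" standing in for the scraped menu data.
DOCS = {
    "Luigi's": "italian pasta pizza wine",
    "Sakura": "japanese sushi ramen sake",
    "El Toro": "mexican tacos burritos salsa",
}

def embed(text: str) -> Counter:
    """Bag-of-words 'embedding' -- a stand-in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k most similar restaurant docs for the query."""
    q = embed(query)
    ranked = sorted(DOCS, key=lambda name: cosine(q, embed(DOCS[name])), reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    """Retrieval-augmented prompt: retrieved context plus the user question."""
    context = "\n".join(f"{name}: {DOCS[name]}" for name in retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"

print(build_prompt("where can I get sushi and ramen"))
```

In the real pipeline, the retrieved context would come from the scraped database and feed into the hosted Llama model rather than being printed.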
Podcast of Meeting:
See Comments & Concerns for issues.
Comments & Concerns
- Curry brought up that the 8-billion-parameter model might lead to slow prompt processing, and suggested looking into ways to reduce overall runtime, for example during tokenization, vectorization, or similarity search.
- Mayer suggested looking into hosting the model with vLLM.
- Morrissey suggested testing 4-bit quantization of the Llama-Instruct model to reduce parameter precision and thus speed up overall runtime.
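A back-of-envelope calculation shows why 4-bit quantization is attractive for an 8B-parameter model. The numbers below are approximate: they count weight memory only, ignoring activations, the KV cache, and quantization bookkeeping overhead.

```python
PARAMS = 8_000_000_000  # ~8B parameters in Llama-3-8B-Instruct

def weight_memory_gb(bits_per_param: int) -> float:
    """Approximate weight memory in GB, ignoring activations and overhead."""
    return PARAMS * bits_per_param / 8 / 1e9

fp16 = weight_memory_gb(16)  # 16-bit weights
int4 = weight_memory_gb(4)   # 4-bit quantized weights
print(f"fp16: {fp16:.0f} GB, 4-bit: {int4:.0f} GB, reduction: {fp16 / int4:.0f}x")
```

Roughly 16 GB of weights drop to about 4 GB, which both fits smaller GPUs and reduces memory bandwidth pressure during inference.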
Timeline
- Decided on framework
- Webscrape data to populate the database
- Write a system prompt
- Implement a backend prototype
- Host the model
- Create a basic frontend
- Connect the frontend to the backend
Assignments
- Curry: Make use of existing APIs such as Yelp Fusion or Firecrawl to scrape menu data.
- Leclerc: Work with Mayer to implement the RAG pipeline.
- Mayer: Work with Leclerc to implement the RAG pipeline.
- Morrissey: Try running Llama-3-8B-Instruct with 4-bit quantization.
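For Curry's scraping task, a request to Yelp Fusion's business-search endpoint might be assembled as below. No network call is made here; the endpoint and parameter names follow Yelp's public Fusion API, but treat the exact fields as assumptions to verify against the current documentation.

```python
# Yelp Fusion business-search endpoint (verify against current Yelp docs).
API_URL = "https://api.yelp.com/v3/businesses/search"

def build_search_request(api_key: str, location: str,
                         term: str = "restaurants",
                         limit: int = 20) -> tuple[str, dict, dict]:
    """Build the URL, headers, and query params for a business search."""
    headers = {"Authorization": f"Bearer {api_key}"}
    params = {"term": term, "location": location, "limit": limit}
    return API_URL, headers, params

url, headers, params = build_search_request("YOUR_API_KEY", "San Francisco, CA")
# Pass these to e.g. requests.get(url, headers=headers, params=params)
print(url, params)
```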
Unfinished Business: Consensus on RAG vs MCP Framework
RAG Framework
Pros: Fast information retrieval, high control, large tool ecosystem
Cons: Index requires ongoing maintenance; prone to issues when switching models
MCP Framework
Pros: Reduced overhead, reusable across different agents
Cons: Slower, fewer available tools