Building a RAG system, Part 1 Update - Data Pivot, YC Companies

A week ago I posted about a little RAG project I'm working on, and I've spent the last week poking around looking for data about snowshoeing in Tahoe. Along the way I had a bit of an aha moment, in many ways inspired by YC holding their first in-person demo day in a long time this week.

While I love snowshoeing, I'm not sure this is something that would actually get much use, and well - even if I'm just playing around, I like the idea of building something that's actually pretty useful. And, heck, I already know where I like to snowshoe. What I don't know, but find totally fascinating, is which of the YC companies coming out of demo day will end up on the path to unicorn status.

Of course, there's no magic rubric for this, but there are some very clear patterns, many related to fundraising and team building, and a lot of that information is available on X.

So I'm pivoting my tutorial idea here and I'm going to build a RAG system that leverages previous YC funding data. The idea then is that you could share updates about a current YC company, and it could use past YC startup data to better analyze their trajectory. Here's an example of how I think the model could reason.

Suppose there's a strong positive correlation between YC companies raising from a16z and going on to become unicorns. Then sharing with the model that a particular company just raised from a16z would suggest they're on that path. On the other hand, if a company decides to bootstrap, probably not.

Of course, the power of RAG and LLMs here is the ability to take in a lot of data and use it all together, because as we all know, not every company that leaves YC and raises from a16z becomes a unicorn.
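To make the idea a little more concrete, here's a rough sketch of the retrieval half of what I'm describing. Everything in it is an assumption on my part: the records are made up (they are not real YC companies or outcomes), the field names and the `retrieve` and `build_prompt` helpers are hypothetical, and I'm using plain word-overlap similarity as a stand-in for whatever embedding search I end up with.

```python
import math
import re
from collections import Counter

# Entirely made-up records for illustration only.
# These are NOT real YC companies or real outcomes.
PAST_COMPANIES = [
    {"name": "ExampleCo A", "notes": "raised series a from a16z, hired vp engineering", "unicorn": True},
    {"name": "ExampleCo B", "notes": "bootstrapped, small team, profitable", "unicorn": False},
    {"name": "ExampleCo C", "notes": "raised seed from a16z, founders split up", "unicorn": False},
]


def tokenize(text):
    """Lowercase the text and split it into simple alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(update, records, k=2):
    """Return the k past companies whose notes look most like the update."""
    query = Counter(tokenize(update))
    scored = [(cosine(query, Counter(tokenize(r["notes"]))), r) for r in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in scored[:k]]


def build_prompt(update, records, k=2):
    """Assemble the text an LLM would see: retrieved history plus the new update."""
    lines = ["Past companies with similar signals:"]
    for r in retrieve(update, records, k):
        outcome = "became a unicorn" if r["unicorn"] else "did not become a unicorn"
        lines.append(f"- {r['name']}: {r['notes']} ({outcome})")
    lines.append(f"New update: {update}")
    lines.append("Given the history above, how does this company's trajectory look?")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("Company X just raised from a16z", PAST_COMPANIES))
```

Notice that the two made-up a16z companies come back with different outcomes, which is exactly the point: one signal on its own isn't enough, so the model gets to weigh several past examples together. The generated prompt would then be handed to an LLM, which I've left out here since I haven't picked one yet.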

As for how I'm going to collect this data, I'm still evaluating all the different data sources that are out there. This might take a few weeks, so expect a bit of a gap from me while I'm investigating. The nice thing about doing stuff like this is that there's no rush. I'm just incredibly curious about RAG systems and am having fun exploring and sharing what I learn as I go with the handful of you that read this blog.

Okay, well it's Friday morning and I just finished my first cup of coffee, so it's time for more coffee. TGIF everyone and thanks for reading!