
Reading the press release, my jaw dropped when I saw the 32k-token context window. The workaround of using a vector database and embeddings will soon be obsolete.


That’s like saying we won’t need hard drives now that you can get bigger sticks of RAM.


> The workaround using a vector database and embeddings will soon be obsolete.

This is 100% not the case. For example, I use a vector database to store an embedding of every video frame, which I later use for matching.

There are many NLP-only tasks this helps with, but equally many that still require lookup and retrieval.
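
Concretely, here's a minimal sketch of what I mean by frame matching, with CLIP standing in for the embedding model and FAISS for the vector database (illustrative choices, not necessarily my actual stack):

```python
import faiss
import numpy as np
from PIL import Image
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("clip-ViT-B-32")  # CLIP model that can embed images

def build_index(frames: list[Image.Image]) -> faiss.IndexFlatIP:
    """Embed every video frame and store the vectors in a FAISS index."""
    vecs = model.encode(frames, normalize_embeddings=True)
    index = faiss.IndexFlatIP(vecs.shape[1])  # inner product = cosine on normalized vectors
    index.add(np.asarray(vecs, dtype=np.float32))
    return index

def match(index: faiss.IndexFlatIP, frame: Image.Image, k: int = 5):
    """Return (frame_id, score) pairs for the k stored frames most similar to a query frame."""
    q = np.asarray(model.encode([frame], normalize_embeddings=True), dtype=np.float32)
    scores, ids = index.search(q, k)
    return list(zip(ids[0].tolist(), scores[0].tolist()))
```

No context window, however large, replaces this: the frames never pass through the LLM at all.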


True. I should have clarified that the workaround used for many NLP tasks, utilizing libs such as LangChain, will become obsolete. On further thought, though, "obsolete" is the wrong word; more likely it will just be used for more niche needs within NLP.


I think LangChain will become more important, not less.

The GPT-4 paper even has an example of this exact approach. See section 2.10 (a sketch of the first tool follows the list):

The red teamer augmented GPT-4 with a set of tools:

• A literature search and embeddings tool (searches papers and embeds all text in vectorDB, searches through DB with a vector embedding of the questions, summarizes context with LLM, then uses LLM to take all context into an answer)

• A molecule search tool (performs a webquery to PubChem to get SMILES from plain text)

• A web search

• A purchase check tool (checks if a SMILES string is purchasable against a known commercial catalog)

• A chemical synthesis planner (proposes synthetically feasible modification to a compound, giving purchasable analogs)
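
For reference, that embeddings tool boils down to something like this sketch (pre-1.0 openai client calls; the model names and the in-memory numpy "vector DB" are illustrative stand-ins, not what the red teamers actually ran):

```python
import numpy as np
import openai  # pre-1.0 style client

def embed(texts: list[str]) -> np.ndarray:
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=texts)
    return np.array([d["embedding"] for d in resp["data"]])

# 1. Embed all the paper text into the "vector DB" (here, a plain matrix).
chunks = ["...paper text chunk 1...", "...paper text chunk 2..."]
db = embed(chunks)

# 2. Search the DB with a vector embedding of the question.
question = "Which compounds inhibit the target enzyme?"
q = embed([question])[0]
scores = db @ q / (np.linalg.norm(db, axis=1) * np.linalg.norm(q))
context = "\n\n".join(chunks[i] for i in np.argsort(-scores)[:3])

# 3. Use the LLM to take all the retrieved context into an answer.
answer = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ],
)["choices"][0]["message"]["content"]
```

LangChain essentially wraps this loop (splitter, vector store, QA chain), which is why a bigger context window doesn't make it redundant: the corpus can always be larger than the window.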


Quite the contrary. Utilising such libs makes GPT-4 even more powerful by enabling complex NLP workflows, which will likely make up the majority of real business use cases in the future.


What about an AI therapist that remembers what you said in a conversation 10 years ago?


One solution would be to train the AI to generate notes to itself about sessions, so that rather than reviewing the entire actual transcript, it could review its own condensed summary.
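
A minimal sketch of that idea, assuming the pre-1.0 openai Python client; the prompt and model name are just illustrative:

```python
import openai  # pre-1.0 style client

def condense_session(transcript: str, prior_notes: str) -> str:
    """Fold a full session transcript into running notes, so future sessions
    review the condensed notes instead of every past transcript."""
    resp = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": "You keep concise clinical-style notes across sessions. "
                        "Merge the new transcript into the existing notes, keeping "
                        "only durable facts, themes, and commitments."},
            {"role": "user",
             "content": f"Existing notes:\n{prior_notes}\n\nNew session:\n{transcript}"},
        ],
    )
    return resp["choices"][0]["message"]["content"]
```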

EDIT: Another solution would be to store the session logs separately and, before each session, use fine-tuning to train it on your particular sessions; that could give it a "memory" as good as a typical therapist's memory.
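
For the fine-tuning route, the data prep might look like this sketch, which emits the prompt/completion JSONL that OpenAI's fine-tuning endpoint has historically expected; the session structure here is hypothetical:

```python
import json

def sessions_to_jsonl(sessions: list[dict], path: str) -> None:
    """Each session is assumed to hold alternating (patient, therapist) turns."""
    with open(path, "w") as f:
        for session in sessions:
            for patient_turn, therapist_turn in session["turns"]:
                f.write(json.dumps({
                    "prompt": f"Patient: {patient_turn}\nTherapist:",
                    "completion": f" {therapist_turn}",
                }) + "\n")
```

The resulting file would then be uploaded and used to fine-tune before each new session.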


Yeah, I was thinking you could basically take each window of 8192 tokens (or whatever) and compress it down to a smaller number of tokens, keep the compressed summary in the window, and then any time a search over previous summaries gets a hit, decompress that summary fully and use it. Basically, integrate search and compression into the context window.
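
Something like this sketch, where the chunk size approximation, model name, and naive substring search are all illustrative stand-ins:

```python
import openai  # pre-1.0 style client

archive: list[dict] = []  # each entry: {"summary": ..., "full": ...}

def windows(history: str, window_chars: int = 8192 * 4) -> list[str]:
    """Split history into ~8192-token windows, approximating 4 chars per token."""
    return [history[i:i + window_chars] for i in range(0, len(history), window_chars)]

def compress(window_text: str) -> str:
    """Condense one window so only the summary stays in the live context."""
    resp = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user",
                   "content": "Summarize this conversation window in under 200 tokens:\n\n"
                              + window_text}],
    )
    return resp["choices"][0]["message"]["content"]

def archive_history(history: str) -> None:
    for w in windows(history):
        archive.append({"summary": compress(w), "full": w})

def expand_matches(query: str) -> list[str]:
    """Search the summaries; on a hit, 'decompress' by returning the full window
    so it can be pulled back into the live context."""
    return [e["full"] for e in archive if query.lower() in e["summary"].lower()]
```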


If the context window grows from 32k to 1M, maybe the entire history would fit in context. It could become a cost concern, though.
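
Back-of-envelope, assuming GPT-4-32k's launch price of $0.06 per 1k prompt tokens scaled linearly: 1,000,000 / 1,000 × $0.06 = $60 per request, before completion tokens. Re-sending the full history on every turn would multiply that quickly.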


I'd be willing to pay good money for a 1M limit.


Cost is still a concern, so workarounds to reduce context size are still needed.


Good point! I realized after I wrote the comment above that I'll still be using them in a service I'm working on, to keep the price down and, ideally, improve results by providing only relevant info in the prompt.


I don't see how. Can you elaborate?



