
Reading the press release, my jaw dropped when I saw the 32k-token context window. The workaround of using a vector database and embeddings will soon be obsolete.


That’s like saying we won’t need hard drives now that you can get bigger sticks of RAM.


> The workaround using a vector database and embeddings will soon be obsolete.

This is 100% not the case. For example, I use a vector database to store an embedding of every video frame, which I later use for matching.

There are many NLP-only tasks this helps with, but equally many that still require lookup and retrieval.
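
Concretely, here's a minimal sketch of what I mean by frame matching, with CLIP standing in for the embedding model and FAISS for the vector database (illustrative choices, not necessarily my actual stack):

```python
import faiss
import numpy as np
from PIL import Image
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("clip-ViT-B-32")  # CLIP model that can embed images

def build_index(frames: list[Image.Image]) -> faiss.IndexFlatIP:
    """Embed every video frame and store the vectors in a FAISS index."""
    vecs = model.encode(frames, normalize_embeddings=True)
    index = faiss.IndexFlatIP(vecs.shape[1])  # inner product = cosine on normalized vectors
    index.add(np.asarray(vecs, dtype=np.float32))
    return index

def match(index: faiss.IndexFlatIP, frame: Image.Image, k: int = 5):
    """Return (frame_id, score) pairs for the k stored frames most similar to a query frame."""
    q = np.asarray(model.encode([frame], normalize_embeddings=True), dtype=np.float32)
    scores, ids = index.search(q, k)
    return list(zip(ids[0].tolist(), scores[0].tolist()))
```

No context window, however large, replaces this: the frames never pass through the LLM at all.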


True. I should have clarified that the workaround used for many NLP tasks, utilizing libs such as LangChain, will become obsolete. On further thought, though, "obsolete" is the wrong word; more likely it will just be used for more niche needs within NLP.


I think LangChain will become more important, not less.

The GPT-4 paper even has an example of this exact approach. See section 2.10 (a sketch of the first tool follows the list):

The red teamer augmented GPT-4 with a set of tools:

• A literature search and embeddings tool (searches papers and embeds all text in vectorDB, searches through DB with a vector embedding of the questions, summarizes context with LLM, then uses LLM to take all context into an answer)

• A molecule search tool (performs a webquery to PubChem to get SMILES from plain text)

• A web search

• A purchase check tool (checks if a SMILES string is purchasable against a known commercial catalog)

• A chemical synthesis planner (proposes synthetically feasible modification to a compound, giving purchasable analogs)
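
For reference, that embeddings tool boils down to something like this sketch (pre-1.0 openai client calls; the model names and the in-memory numpy "vector DB" are illustrative stand-ins, not what the red teamers actually ran):

```python
import numpy as np
import openai  # pre-1.0 style client

def embed(texts: list[str]) -> np.ndarray:
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=texts)
    return np.array([d["embedding"] for d in resp["data"]])

# 1. Embed all the paper text into the "vector DB" (here, a plain matrix).
chunks = ["...paper text chunk 1...", "...paper text chunk 2..."]
db = embed(chunks)

# 2. Search the DB with a vector embedding of the question.
question = "Which compounds inhibit the target enzyme?"
q = embed([question])[0]
scores = db @ q / (np.linalg.norm(db, axis=1) * np.linalg.norm(q))
context = "\n\n".join(chunks[i] for i in np.argsort(-scores)[:3])

# 3. Use the LLM to take all the retrieved context into an answer.
answer = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ],
)["choices"][0]["message"]["content"]
```

LangChain essentially wraps this loop (splitter, vector store, QA chain), which is why a bigger context window doesn't make it redundant: the corpus can always be larger than the window.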


Quite the contrary. Utilising such libs makes GPT-4 even more powerful by enabling complex NLP workflows, which will likely make up the majority of real business use cases in the future.


What about an AI therapist that remembers what you said in a conversation 10 years ago?


One solution would be to train the AI to generate notes to itself about sessions, so that rather than reviewing the entire actual transcript, it could review its own condensed summary.
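
A minimal sketch of that idea, assuming the pre-1.0 openai Python client; the prompt and model name are just illustrative:

```python
import openai  # pre-1.0 style client

def condense_session(transcript: str, prior_notes: str) -> str:
    """Fold a full session transcript into running notes, so future sessions
    review the condensed notes instead of every past transcript."""
    resp = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": "You keep concise clinical-style notes across sessions. "
                        "Merge the new transcript into the existing notes, keeping "
                        "only durable facts, themes, and commitments."},
            {"role": "user",
             "content": f"Existing notes:\n{prior_notes}\n\nNew session:\n{transcript}"},
        ],
    )
    return resp["choices"][0]["message"]["content"]
```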

EDIT: Another solution would be to store the session logs separately and, before each session, use fine-tuning to train it on your particular sessions; that could give it a "memory" as good as a typical therapist's memory.
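
For the fine-tuning route, the data prep might look like this sketch, which emits the prompt/completion JSONL that OpenAI's fine-tuning endpoint has historically expected; the session structure here is hypothetical:

```python
import json

def sessions_to_jsonl(sessions: list[dict], path: str) -> None:
    """Each session is assumed to hold alternating (patient, therapist) turns."""
    with open(path, "w") as f:
        for session in sessions:
            for patient_turn, therapist_turn in session["turns"]:
                f.write(json.dumps({
                    "prompt": f"Patient: {patient_turn}\nTherapist:",
                    "completion": f" {therapist_turn}",
                }) + "\n")
```

The resulting file would then be uploaded and used to fine-tune before each new session.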


Yeah, I was thinking you could basically take each window of 8192 tokens (or whatever) and compress it down to a smaller number of tokens, keep the compressed summary in the window, and then any time a search over previous summaries gets a hit, decompress that summary fully and use it. Basically, integrate search and compression into the context window.
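
Something like this sketch, where the chunk size approximation, model name, and naive substring search are all illustrative stand-ins:

```python
import openai  # pre-1.0 style client

archive: list[dict] = []  # each entry: {"summary": ..., "full": ...}

def windows(history: str, window_chars: int = 8192 * 4) -> list[str]:
    """Split history into ~8192-token windows, approximating 4 chars per token."""
    return [history[i:i + window_chars] for i in range(0, len(history), window_chars)]

def compress(window_text: str) -> str:
    """Condense one window so only the summary stays in the live context."""
    resp = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user",
                   "content": "Summarize this conversation window in under 200 tokens:\n\n"
                              + window_text}],
    )
    return resp["choices"][0]["message"]["content"]

def archive_history(history: str) -> None:
    for w in windows(history):
        archive.append({"summary": compress(w), "full": w})

def expand_matches(query: str) -> list[str]:
    """Search the summaries; on a hit, 'decompress' by returning the full window
    so it can be pulled back into the live context."""
    return [e["full"] for e in archive if query.lower() in e["summary"].lower()]
```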


If the context window grows from 32k to 1M, maybe the entire history would fit in context. It could become a cost concern, though.
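
Back-of-envelope, assuming GPT-4-32k's launch price of $0.06 per 1k prompt tokens scaled linearly: 1,000,000 / 1,000 × $0.06 = $60 per request, before completion tokens. Re-sending the full history on every turn would multiply that quickly.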


I'd be willing to pay good money for a 1M limit.


Cost is still a concern, so workarounds to reduce context size are still needed.


Good point! I realized after I wrote the comment above that I'll still be using them in a service I'm working on, to keep the price down and, ideally, improve results by providing only relevant info in the prompt.


I don't see how. Can you elaborate?



