Curious about how Replit Agent works? At @Replit we hosted tech talks discussing the agent's internals. Let's dive into what goes into an LLM that powers the agent - slides from my talk. 🧵
How do OpenAI plugins/Browser work? A thread with a detailed analysis of the server interaction. S/o to @CrisGiardina for helping with access.
TLDR: Browsing (and possibly plugins) is a different model with 8k sequence-length support and a Toolformer-like operation structure.
.@langchain conversation agents can easily consume @OpenAI plugins when given the URL. Here is one example for @Klarna using their hosted API information: klarna.com/.well-known/ai…
It should be easy to onboard other plugins, even when they require authentication.
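A rough sketch of what "consuming a plugin from its URL" comes down to: the agent reads the plugin manifest and turns it into a tool description for its prompt. This is not LangChain's implementation. The field names follow the OpenAI plugin manifest format as I understand it, and the manifest values and both helper functions are hypothetical, made up for illustration.

```python
import json

# Hypothetical manifest, trimmed to the fields used below. The values are
# illustrative and are NOT copied from Klarna's hosted file.
EXAMPLE_MANIFEST = json.loads("""
{
  "schema_version": "v1",
  "name_for_model": "shopping_demo",
  "name_for_human": "Shopping Demo",
  "description_for_model": "Search and compare products.",
  "auth": {"type": "none"},
  "api": {"type": "openapi", "url": "https://example.com/openapi.yaml"}
}
""")


def describe_plugin(manifest: dict) -> str:
    """Render a manifest as a tool description an agent prompt could include."""
    auth_type = manifest.get("auth", {}).get("type", "none")
    lines = [
        f"Tool: {manifest['name_for_model']}",
        f"Description: {manifest['description_for_model']}",
        f"OpenAPI spec: {manifest['api']['url']}",
        f"Auth: {auth_type}",
    ]
    return "\n".join(lines)


def needs_credentials(manifest: dict) -> bool:
    """True when the manifest declares any auth type other than 'none'."""
    return manifest.get("auth", {}).get("type", "none") != "none"


if __name__ == "__main__":
    print(describe_plugin(EXAMPLE_MANIFEST))
```

Plugins that require authentication would declare it in the `auth` block, so an onboarding step can check `needs_credentials` and ask for a token before the tool is exposed to the agent.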
The best @OpenAI plugins will not be programmatic automation tools but freelance service providers like @Taskrabbit or @fiverr.
Long context length and retrieval support would allow LLMs to orchestrate workers towards real-world tasks - using humans as an API to the real world.
Bing Jailbreak: The new Bing search is susceptible to a token-smuggling attack. We can get it to generate output for a prompt of the adversary's choice! Here is my first attempt at tricking the system into generating malicious output (discretion is advised). #Microsoft #Bing #jailbreak
This flew under the radar for some reason.
The key insight is that simply training on synthetic data with positive pairs (correct problem-solution pairs) can lead to the model learning spurious correlations; creatively introducing negative samples makes the process efficient.
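To make the positive/negative distinction concrete, here is a toy sketch of my own, not the method from the work being discussed. It assumes addition problems as the task and a small numeric perturbation as the way to produce wrong solutions; the function name `make_pairs` and the label scheme are hypothetical.

```python
import random
from typing import List, Tuple


def make_pairs(
    problems: List[Tuple[int, int]], rng: random.Random
) -> List[Tuple[str, str, int]]:
    """Build (problem, solution, label) triples: one positive and one negative each."""
    pairs = []
    for a, b in problems:
        correct = a + b
        question = f"{a} + {b}"
        # Positive sample: the correct problem-solution pair (label 1).
        pairs.append((question, str(correct), 1))
        # Negative sample: same problem, answer shifted by a small nonzero
        # offset so it looks plausible but is wrong (label 0).
        offset = rng.choice([-2, -1, 1, 2])
        pairs.append((question, str(correct + offset), 0))
    return pairs


if __name__ == "__main__":
    for row in make_pairs([(12, 30), (7, 8)], random.Random(0)):
        print(row)
```

With only the label-1 rows, a model can score well by picking up surface patterns; the label-0 rows share the same surface form and differ only in correctness, which is what forces it to look at the solution itself.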
Was curious about the basic arithmetic abilities of Claude 3 Opus against GPTs. Designed a tiny experiment over the weekend and the results are surprising.
Opus is much better than GPTs with numbers! Let's dive into the results 🧵
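For anyone who wants to try something similar, a minimal sketch of an arithmetic accuracy check. It is an assumption about how such an experiment could be set up, not the setup used here: the prompt wording, the addition-only task, and the `ask_model` callable (a stand-in for whatever API client you use) are all hypothetical.

```python
import random
from typing import Callable, List, Tuple


def make_problems(n: int, digits: int, seed: int = 0) -> List[Tuple[int, int]]:
    """Sample n addition problems whose operands both have `digits` digits."""
    rng = random.Random(seed)
    lo, hi = 10 ** (digits - 1), 10 ** digits - 1
    return [(rng.randint(lo, hi), rng.randint(lo, hi)) for _ in range(n)]


def accuracy(
    ask_model: Callable[[str], str], problems: List[Tuple[int, int]]
) -> float:
    """Exact-match accuracy of a model callable over the given problems."""
    correct = 0
    for a, b in problems:
        reply = ask_model(f"What is {a} + {b}? Answer with the number only.")
        # Ignore surrounding whitespace and thousands separators.
        if reply.strip().replace(",", "") == str(a + b):
            correct += 1
    return correct / len(problems)
```

Passing each model as a plain `str -> str` callable keeps the scoring identical across providers, and sweeping `digits` shows where accuracy starts to fall off.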