AI's Cambrian Explosion

OpenAI's Code Interpreter, Microsoft's Bing Chat & Google Lamenting the Gauntlet of Open Source Innovation

The Big Stuff

OpenAI Code Interpreter Casts a Spell

OpenAI has released the Code Interpreter plugin for ChatGPT, and while it's not yet available to everyone, we've started to see glimpses from people who do have access. What we're seeing is beyond exciting! Code Interpreter is designed to act as a programmer and data analyst. It has demonstrated proficiency in solving mathematical problems, conducting data analysis and visualization, and converting files between formats. This may sound dull until you see the results:

  • Upload a CSV of SF crime data and visualize it (link)

  • Segment music markets based on a spreadsheet and come up with business strategies for each segment (link)

  • Render an animated GIF like The Matrix with just instructions (link)

  • Convert an uploaded GIF to a longer MP4 with slow zoom (link)

  • Extract colors from an image to create a palette.png (link)

  • Make an interactive map of airline delays (link)

  • Decompose seasonality via text (link)

  • The best Twitter thread we've found on this is from artist @SHL0MS, where they analyze Spotify data (link)
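Under the hood, Code Interpreter simply writes and runs ordinary Python in a sandbox. As a rough illustration of the kind of script it might generate for the crime-data example above (the CSV columns here are hypothetical), a minimal sketch using pandas and matplotlib:

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display required
import matplotlib.pyplot as plt

def plot_incidents_by_category(csv_path: str, out_path: str = "incidents.png") -> pd.Series:
    """Count incidents per category and save a bar chart to out_path."""
    df = pd.read_csv(csv_path)
    counts = df["category"].value_counts()  # assumes a 'category' column
    counts.plot(kind="bar", title="Incidents by Category")
    plt.tight_layout()
    plt.savefig(out_path)
    plt.close()
    return counts
```

The magic isn't the code itself, which any analyst could write; it's that the model writes, runs, and iterates on it from a plain-English request.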

Our only question is how we get our hands on this! Code Interpreter is only available to ChatGPT Plus subscribers who sign up for the waiting list.

Or...you could just use Bing Chat for free, right now. What??

Bing Chat is Now Open Preview - Will it Kill ChatGPT?

Microsoft released Bing Chat in Open Preview, which means everyone can access it today for free. The new Bing Chat includes several new features:

  • Multimodal answers: images and graphs appear right in search results.

  • Visual search in chat.

  • Chat history: pick up where you left off.

  • Edge actions: take action on search results. For example, if you ask Bing Chat for movie suggestions and want to play one, you can type "Play on Apple TV" and it will start playing the movie.

  • Plugins: use other apps, like OpenTable, to book reservations.

  • Improved summarization: summarize a long website, or a PDF that you upload.

  • Export & share: export content from a chat session. For example, if Bing Chat wrote something for you, you can download it to a Word doc.

Access it here.

Mind Reading AI

Human thoughts were predicted with 82% accuracy using fMRI recordings

Researchers have developed an innovative brain-computer interface (BCI) that decodes continuous language from non-invasive fMRI recordings. The BCI can reconstruct full sentences by accessing cortical semantic representations, going beyond previous limitations of small word sets. The technology, which works across multiple brain regions, can decode perceived speech, imagined speech, and even silent videos. It requires active user cooperation to work, and the breakthrough could significantly improve communication between humans and machines (link) (link to paper).

Open Source and its "Cambrian Explosion" 

Oops haven't tweeted too much recently; I'm mostly watching with interest the open source LLM ecosystem experiencing early signs of a cambrian explosion.

OpenAI Founding Scientist, Andrej Karpathy

MosaicML Releases MPT-7B

LLMs have been changing the world, but until recently, developing them required billions in funding and huge teams of scientists. Open source has started to change this, with many models being open-sourced: LLaMA by Meta, OpenLLaMA from Berkeley, and Pythia from EleutherAI, to name a few. However, most of these models have not been available for commercial use, or have had small context windows or smaller training sets. MosaicML has changed this by releasing MPT-7B. MPT-7B is available for commercial use, was trained on 1T tokens, is optimized for fast training and inference, and comes with open-source training code. There are four models available: Base, StoryWriter-65k, Instruct, and Chat. StoryWriter-65k has a 65,000-token context window, the largest yet of any LLM. Instruct is a model for short-form instruction following, and Chat is a chatbot-like model for dialogue (link). StoryWriter is an absolute game changer: a 65k context window means a single prompt can accept 65k tokens of text. To put this in perspective, one Twitter user fed the entire text of The Great Gatsby into a prompt and asked for an epilogue! (link)

StableVicuna

Stability AI released StableVicuna, the first open-source chatbot tuned with reinforcement learning from human feedback (RLHF), meaning humans reviewed model outputs and that feedback was used to train the model. StableVicuna can do basic math, write code, and help with grammar.
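To get a feel for how a whole novel fits in a 65k window, here's a back-of-the-envelope check, assuming the common heuristic of roughly 1.33 tokens per word for English text and a word count of about 47,000 for The Great Gatsby (both are approximations, not measurements):

```python
def estimate_tokens(word_count: int, tokens_per_word: float = 1.33) -> int:
    """Rough token estimate using a common English-text heuristic."""
    return round(word_count * tokens_per_word)

# The Great Gatsby runs roughly 47,000 words.
gatsby_tokens = estimate_tokens(47_000)        # ~62,500 tokens
fits_in_storywriter = gatsby_tokens <= 65_000  # True: the novel fits
```

By the same arithmetic, a standard 4k-token window holds only about 3,000 words, which is why StoryWriter's 65k window is such a leap.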

StarCoder

StarCoder-15B was released this week. StarCoder is an open-source LLM trained specifically for coding, and it reaches 40.8% on the HumanEval benchmark, beating Google's PaLM while being 1/30th the size. StarCoder was trained on permissively licensed GitHub data spanning 80 programming languages. There is a VSCode integration available too, which may be a good alternative to Copilot for those looking for a less costly solution.

Benchmarking LLMs with Elo Ratings

Lmsys.org released Chatbot Arena, a benchmark platform that ranks large language models on a leaderboard. With the proliferation of open-source models, its Elo ratings aim to provide open benchmarking that helps users compare the performance of each model (link).

Google - "We have no moat, and neither does OpenAI"

In a leaked document, Google claims that neither they nor OpenAI have a "moat": something that would inhibit competitors.

The most salient portion of the document is this passage:

We have no moat, and neither does OpenAI. We've done a lot of looking over our shoulders at OpenAI. Who will cross the next milestone? What will the next move be?

But the uncomfortable truth is, we aren't positioned to win this arms race, and neither is OpenAI. While we've been squabbling, a third faction has been quietly eating our lunch.

I'm talking, of course, about open source. Plainly put, they are lapping us. Things we consider "major open problems" are solved and in the people's hands today.

Leaked document from Google

The document highlights that while Google's models hold a slight edge in quality, the gap is closing very quickly, and open-source models are "pound-for-pound more capable". See the full document here.

Google is clearly not wrong in this analysis. However, is the company too short-term focused, losing sight of the big picture? Google may be falling into the classic trap of feature-by-feature competitor comparison. Meanwhile, their competitors have their eyes on a different prize: artificial general intelligence (AGI). Sam Altman said this week that OpenAI plans to raise upwards of $100 billion to build AGI.

AGI will undoubtedly be expensive to build, just as the first large-scale GPT models were. But it raises the question of what will happen in the interim: to what extent will LLMs be commoditized?
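Chatbot Arena's Elo ratings, mentioned above, use the same rating system as competitive chess: each human vote between two anonymous models is treated as one head-to-head match. A minimal sketch of the standard Elo update (the K-factor of 32 is the conventional default, not necessarily what Lmsys.org uses):

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

def update_elo(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one comparison."""
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b
```

With two evenly matched models at 1000 each, the expected score is 0.5, so a single win moves the winner up (and the loser down) by 16 points; upsets against higher-rated models move the ratings more.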

Video Search

Video search has mostly been relegated to YouTube, and while YouTube search is great for finding specific videos, it is not great at searching within those videos. For example, let's say you wanted to find the classic dance scene from Pulp Fiction, but you can't remember the movie, and the only thing you remember is that Uma Thurman's character was wearing a white shirt. Plugging "woman wearing a white shirt dancing" into YouTube returns, as its top two videos, "slim busty romanian girl in white shirt dancing in wedding" and "pouring water over yourself while wearing a white shirt". This is not what you want.

However, plug that search into Twelve Labs, and you'll get the Pulp Fiction scene. With Twelve Labs, you can search anything within a video, including objects, text on screen, speech and people. See a demo here (link).

More Big Stuff

OpenAI Keeps Releasing, Google Scrambling, and MidJourney 5.1 Wowing

  • OpenAI may raise up to $100 billion to achieve its aim of developing artificial general intelligence (AGI) (link)

  • OpenAI releases a 32k context window model. This person gave it a 23-page congressional hearing and started asking questions (link)

  • OpenAI releases Shap-E, a text-to-3D model (link)

  • Sam Altman and Greg Brockman on AI and the Future (Podcast) (link)

  • Google Plans to Make Search More ‘Personal’ with AI Chat and Video Clips (link)

  • "Godfather of AI", Geoffrey Hinton leaves Google, and discusses the possible end of humanity from AI (link)

  • Biden meets AI CEOs to discuss AI dangers (link) (link)

  • Midjourney Version 5.1 Released and It's Impressive! (link)

Smaller But Still Cool Things:

  • Chegg stock drops 50% after CEO says ChatGPT is impacting their education platform (link)

  • Box AI can read through extremely long documents, like the Fed's 120 page review of the SVB collapse, and answer any kind of question. (link)

  • Sal Khan's 2023 TED Talk: AI in the classroom can transform education (link)

  • IBM plans to replace 7,800 jobs with AI over time, pauses hiring certain positions (link)

  • Researchers develop novel AI-based estimator for manufacturing medicine (link)

  • Slack releases Slack GPT, native LLM support within Slack (link)

Going Deeper

  • How to summarize a book without sending 100% of your tokens to a LLM (link)

  • Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes (link)

  • A guide to prompting AI (for what it is worth) from Ethan Mollick (link)

  • How to make recipes with ChatGPT including Midjourney prompts to see what the food looks like (link)

Tweets of the Week

Possibly the funniest tweet we've ever seen. Oh Yud.

Eye Candy

All images are Midjourney 5.1, released this week

Ear Candy

Do you have 30 seconds for a quick survey to help us improve Everyday AI?

We'd love your feedback! Click here.

Do you like what you're reading? Share it with a friend.