AI Augmentation Advancing

Tomorrow's interfaces arriving as AI becomes an integral part of operating systems and day-to-day tools.

The Big Stuff

Microsoft Windows Copilot

This week, Microsoft introduced Copilot, an AI-powered assistant built into Windows. Copilot is integrated into the taskbar and offers capabilities such as centralized assistance, customization and commands, summarization, question answering, making calls, and even planning trips. Centralized assistance lets users ask technical questions and get answers right in context, with links to solutions (e.g., how to turn on dark mode).

Developers can leverage Bing and ChatGPT plugins to create innovative experiences for users. Windows Copilot will be available for preview on Windows 11 in June, so stay tuned for updates and be part of this exciting journey.

Video Reconstruction from Brain Waves

Scientists have made significant progress in reconstructing human vision from brain activity, shedding light on our cognitive processes. While previous research focused on reconstructing static images, a new study introduces Mind-Video, a technique that can recover continuous visual experiences in the form of videos. By analyzing brain data using advanced modeling and learning algorithms, Mind-Video can reconstruct high-quality videos at various frame rates. The reconstructed videos were evaluated using metrics for semantic understanding and visual quality, and they outperformed previous methods by a significant margin. Importantly, the researchers demonstrated that Mind-Video aligns with established physiological processes, making it both biologically plausible and interpretable. This breakthrough brings us closer to understanding how the brain processes and represents visual information, paving the way for exciting applications in the future. (link)

Google Bard Now Includes Images in Responses

The first of a series of features announced at Google's I/O conference is now live: Bard, Google's AI, can now include images in its responses, both when they are relevant and when you specifically ask for them, starting with English. Clicking on an image shows its source. But this is just the beginning. In the coming weeks, Google plans to roll out support for more languages, introduce the capability to generate images, and even allow users to prompt Bard with images using Google Lens. This is a significant step towards making Bard more visual and interactive. (link)

Google Introduces AI Generated Ads

Google is ushering in a new era of advertising with the integration of conversational AI into Google Ads. Simply provide it with your landing page, and prepare to be amazed as it generates keywords, headlines, descriptions, images, and additional elements for your campaign.

Adobe’s Generative Fill Drops Some Jaws

Adobe Photoshop is shaking things up with its latest feature, Generative Fill. This AI-powered toolset is all about giving you more control over your images, and it does so in a way that's as simple as typing. With a text prompt, Generative Fill can generate objects and backgrounds, extend images, and remove objects. Love it or hate it, pro photographers are waking up to AI collaboration in their craft.

Perplexity Releases Copilot

Perplexity Copilot is a versatile tool, adept at handling your queries, even in complex scenarios such as trip planning or shopping. By comprehending your initial question, Copilot gathers pertinent information from various sources like the web, news, or Wolfram|Alpha. What enhances its functionality are interactive buttons that pop up as needed, prompting you to clarify or refine your query. This feature helps narrow down your searches, ensuring that you receive more relevant and targeted answers. As a result, you spend less time wading through unrelated responses and making tedious edits to your questions. Whether you're planning a journey or hunting for the best deals, Perplexity Copilot is equipped to make the process more efficient and satisfying. (link)

Google SoundStorm

SoundStorm represents a transformative development in the field of audio technology. Using input data from AudioLM, another prominent system, SoundStorm is capable of swiftly and efficiently generating high-quality audio content.

What distinguishes SoundStorm from other models is its remarkable processing speed. In fact, it's able to create 30 seconds of high-fidelity audio in just half a second when operating on specialized hardware known as a TPU-v4 device. This level of performance is a considerable leap forward in terms of speed compared to traditional audio generation techniques.

But speed isn't SoundStorm's only strength. It also excels in creating extended, authentic-sounding audio sequences. When provided with a dialogue transcript, along with speaker cues and brief voice samples, SoundStorm has the capability to synthesize detailed dialogue segments that faithfully reproduce the speakers' original voices. (link)

More Big Stuff

  • Anthropic raises $450M Series C (link)

  • OpenAI releases shared links (link), an iOS app in multiple countries (link), web browsing with Bing (link), and a blog post on the governance of superintelligence (link)

  • The FDA has approved Neuralink's first human clinical study (link)

Smaller But Still Cool Things:

Whenever I have a new idea, I ask GPT-4 to write the most basic version, I provide feedback, it apologizes, and we iterate until we reach the 1.0 version I have in mind. I use up my GPT-4 quota (25 entries/3 hours) multiple times a day. With the support of GPT-4, I feel unstoppable.

The Leverage of LLMs for Individuals
  • The Leverage of LLMs for Individuals (link)

  • ChatGPT is saving my life (link)

  • 58% of US adults have heard of ChatGPT; 14% have tried it (link)

  • The Optimist’s Guide to Artificial Intelligence and Work [paywall] (link)

  • The man who put Microsoft in the AI lead (link)

  • Carvana created 1.3 million hyper-personalized videos. (link)

  • JPMorgan Chase files for ChatGPT-like investment advice service (link)

  • A view into ChatGPT Code Interpreter’s ability to analyze data (link)

Going Deeper

I’m excited to announce Voyager, the first lifelong learning agent that plays Minecraft purely in-context. Voyager continuously improves itself by writing, refining, committing, and retrieving code from a skill library.

Dr Jim Fan
  • Voyager: An Open-Ended Embodied Agent with Large Language Models (link)

  • Want to level-up your prompt engineering? Read this guide from Microsoft (link)

  • How to use ChatGPT with your Google Drive in 30 lines of Python. (link)

  • LangChain Retrieval Webinar (link)

  • Cohere launches LLM University (link)

  • LIMA: LLaMA 65B + 1000 supervised samples = {GPT4, Bard} level performance (link)

  • Reflective Linguistic Programming (RLP): A Stepping Stone in Socially-Aware AGI (SocialAGI) (link)

  • Any-to-Any Generation via Composable Diffusion (link)

  • BioDEX, a dataset for Biomedical adverse Drug Event Extraction, containing 19k papers and 256k expert-created drug reports. (link)

Eye Candy

DALL-E 2 appears to be getting some upgrades, including text. Here is a preview of what’s to come from OpenAI’s Adam.GPT.

Do you have 30 seconds for a quick survey to help us improve Everyday AI?

We'd love your feedback! Click here.

Do you like what you're reading? Share it with a friend.