⚡️ Continual Training

PLUS: Nvidia could use generative AI to design chips

Good Morning. How do we keep foundational AI models working optimally over time as newer data streams in? A dynamic approach might just be the answer—time to dive in.

Today’s Highlights:

  • Biden Administration unveils AI.gov

  • European AI startup Mistral is looking to raise $300M

  • Nvidia could use generative AI to design semiconductor chips

DEEP DIVE

Apple And CMU Researchers Introduce the First Web-Scale Time-Continual Benchmark

Large multimodal foundation models like CLIP, Flamingo, and Stable Diffusion have triggered a transformation in multimodal learning. However, can they evolve with changing data—especially future data? Researchers from Apple and Carnegie Mellon University may have answered the question.

The researchers have introduced a set of dynamic classification and retrieval tasks spanning 2014 to 2022 to put CLIP models to the test. Models from the OpenCLIP repository held steady, while OpenAI's CLIP models showed a considerable decline in retrieval performance on more recent data.

But why is continuous training of foundation models vital?

  • As Internet data evolves, older benchmarks (like ImageNet) often fall short, requiring the models to synchronize with fluctuating data distributions.

  • The common practice of retraining models from scratch with new image-text data is not only time-consuming but also economically and environmentally taxing.

  • Adapting existing models to new inputs calls for continual learning techniques that work within a limited compute budget.

Enter TiC-DataComp, the researchers' first-ever web-scale time-continual (TiC) benchmark, which compiles 12.7 billion timestamped image-text pairs. The team's commitment to continuous training doesn't end there: they're also leveraging sizeable internet datasets collected from platforms like Reddit and Flickr for this purpose.

Can these learning techniques replace starting models from scratch?

The study suggests they can. By starting from the latest checkpoint and replaying all past data, the cumulative technique delivers performance comparable to a model retrained from scratch, with 2.7 times the compute efficiency.
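For the curious, here is a minimal Python sketch of the idea: warm-start from the latest checkpoint and train on all the data seen so far. It is a toy illustration, not the researchers' code. The names (Checkpoint, train, cumulative, retrain_from_scratch), the year buckets, and the step budgets are all hypothetical, and the cost accounting for the from-scratch baseline is a simplifying assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Checkpoint:
    """Toy stand-in for model weights: records what training it has seen."""
    trained_on: list = field(default_factory=list)  # one entry per training round
    steps_used: int = 0                             # total training steps spent


def train(start: Checkpoint, buckets: list, steps: int) -> Checkpoint:
    """Hypothetical training call: continue from `start` on the given data buckets."""
    return Checkpoint(
        trained_on=start.trained_on + [tuple(buckets)],
        steps_used=start.steps_used + steps,
    )


def cumulative(years: list, steps_per_year: int) -> Checkpoint:
    """Warm-start from the latest checkpoint and replay all past data each round."""
    ckpt = Checkpoint()
    for i in range(len(years)):
        replay_plus_new = years[: i + 1]  # every bucket seen so far, old and new
        ckpt = train(ckpt, replay_plus_new, steps_per_year)
    return ckpt


def retrain_from_scratch(years: list, steps_per_year: int) -> int:
    """Baseline (assumed cost model): each round restarts and re-spends the
    full budget for every bucket collected so far."""
    total = 0
    for i in range(len(years)):
        total += steps_per_year * (i + 1)
    return total


years = [2016, 2018, 2020, 2022]            # hypothetical time buckets
final = cumulative(years, steps_per_year=1000)
scratch_steps = retrain_from_scratch(years, steps_per_year=1000)

print(final.trained_on[-1])   # data used in the last round
print(final.steps_used)       # total steps for the cumulative approach
print(scratch_steps)          # total steps for repeated retraining
```

The step counts here follow only from the toy cost model and are not meant to reproduce the 2.7x figure reported in the study. The point is the shape of the loop: each round reuses the previous checkpoint rather than starting over.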

There are caveats, though. The researchers underscored the importance of learning rate schedules for sequential training and reported trade-offs between replay buffer size and performance on static versus dynamic tasks.

PUNCHLINES

Conducting Arbitrage: AI infrastructure provider Voltage Park plans to lease out its newly acquired $500M worth of Nvidia H100s.

Chit Chat: UK PM Rishi Sunak to hold a live interview with Elon Musk on X during the upcoming AI summit.

Eyes On Me: European AI startup Mistral is looking to raise $300M just four short months after its $113M seed round.

Negative Externalities: Phishing email attacks up by 1,265% since the launch of ChatGPT.

Tour de Force: VentureBeat sets off on a nationwide tour to help businesses integrate GenAI.

TLDR

Biden Administration unveils AI.gov: The White House introduced a new website, AI.gov, to promote responsible use of artificial intelligence (AI) and provide resources and guidelines. The website also serves as a recruitment platform for AI-related federal roles and demonstrates the US's commitment to leading AI advancements.

Nvidia could use generative AI to design semiconductor chips: Nvidia’s research explores the use of its proprietary LLM, ChipNeMo, for designing semiconductor chips. ChipNeMo is trained on internal data and aids chip design by automating tasks and optimizing software, leading to productivity enhancements.

G7 introduces voluntary code of conduct for AI development: The G7 leaders have announced the International Code of Conduct for Organizations Developing Advanced AI Systems, an 11-point framework encouraging dedication to developing safe, secure, and trustworthy AI technologies.

TRENDING TOOLS

📝 PatentPal: Automate the mechanical writing of your patent applications with AI

🇰🇷 Kimchi Reader: Learn Korean in an immersive setup with a popup dictionary

⏱️ flowRL: Personalize user interfaces in real time with AI

💻 Momentic: Enable intelligent End-to-End tests with no code required, all in natural language

🛠️ Taskade: Supercharge team productivity with five AI-powered tools in one

That’s all for today—if you have any questions or something interesting to share, please reply to this email. We’d love to hear from you!

P.S. If you want to sign up for the Supercharged newsletter or share it with a friend, you can find us here.
