⚡️ GPT-4 is Flawed
PLUS: GenAI adds trillions to US tech market caps
It’s Wednesday. GPT-4 has some weak spots, according to new research. A tendency to follow "jailbreaking" prompts, which can sidestep safety measures, means the LLM can be induced to create biased and malicious text. Intrigued? Let’s dive in.
Today’s Highlights:
GenAI adds trillions to US tech market caps
Baidu's ERNIE 4.0 creates one-man AI marketing teams
PwC uses OpenAI's tech for auditing
DEEP DIVE
GPT-4: Trustworthy Yet Unpredictable

OpenAI’s LLM, GPT-4, is more trustworthy and reliable than its predecessor, GPT-3.5, but also easier to manipulate into producing biased results and revealing private data, according to a Microsoft-backed study.
"...GPT-4 is more vulnerable given jailbreaking system or user prompts, potentially due to the reason that GPT-4 follows the (misleading) instructions more precisely"
The team of researchers from multiple universities and the Center for AI Safety found that GPT-4 has a higher trustworthiness score than GPT-3.5. This means that it generally does a better job of protecting private information, avoiding harmful outputs, and resisting manipulative attacks.
The researchers assessed trustworthiness across categories such as toxicity, stereotypes, privacy, machine ethics, fairness, and robustness against adversarial attacks. They started with standard prompts, then moved on to prompts designed to challenge content policy restrictions and to induce the models to ignore their safeguards.
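For readers who like to see the idea in code, here is a minimal sketch of how per-category results could be rolled up into a single trustworthiness score. It assumes a plain unweighted average on a 0-100 scale, which may not match the study's actual scoring. The function name, the category keys, and every number below are hypothetical placeholders, not results from the research.

```python
from statistics import mean

# Categories named in the article; the key spellings are our own.
PERSPECTIVES = [
    "toxicity",
    "stereotypes",
    "privacy",
    "machine_ethics",
    "fairness",
    "robustness",
]


def overall_trust_score(scores: dict[str, float]) -> float:
    """Return the unweighted mean of per-category scores (0-100, higher is better).

    Assumption: equal weighting. The study may aggregate differently.
    """
    missing = [p for p in PERSPECTIVES if p not in scores]
    if missing:
        raise ValueError(f"missing categories: {missing}")
    return mean(scores[p] for p in PERSPECTIVES)


# Placeholder numbers for illustration only; NOT figures from the study.
example_scores = {
    "toxicity": 40.0,
    "stereotypes": 80.0,
    "privacy": 70.0,
    "machine_ethics": 75.0,
    "fairness": 65.0,
    "robustness": 60.0,
}

print(overall_trust_score(example_scores))
```

A single average like this is convenient for headlines, but it can hide a weak category behind strong ones, which is why the per-category breakdown matters.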
However, there's a hitch: The researchers found that GPT-4 is more likely to follow misleading or manipulative instructions, which can bypass its own safety measures and lead it to leak private information and conversation histories.

Examples of undesirable responses of GPT-4 given adversarial system prompts
Sounds worrying, doesn't it?
Perhaps not entirely. According to the researchers, these vulnerabilities did not show up in consumer-facing GPT-4-powered applications, such as those offered by Microsoft, because finished AI applications apply a range of mitigation approaches.
The researchers hope that sharing this work will discourage malicious use of these vulnerabilities and help others build secure and trustworthy AI models. In the interest of transparency, they have open-sourced their benchmarking code on GitHub.
PUNCHLINES
A Trillion-Dollar Boom: GenAI added a smashing $2.4 trillion to U.S. tech giants' market caps in 2023.
Too Hot to Handle? Nvidia is investing in its TensorRT-LLM SDK to keep its grip on AI hardware and software as competition heats up.
Talk to the Graph: Stardog launches Voicebox, allowing users to interact with enterprise data through natural language.
No Strings Attached: Meta's ExecuTorch breaks free from servers, bringing AI to edge and mobile devices.
TLDR
Baidu introduces new LLM ERNIE 4.0: Chinese tech company Baidu has unveiled a new LLM, ERNIE 4.0, capable of understanding, generating, reasoning, and memorizing. ERNIE 4.0 boosts performance by nearly 30% and aims to turn a single person into an AI marketing team. It is currently available in an invite-only beta.
DeepMind's UniSim trains AIs in simulated reality: DeepMind, with MIT and others, has introduced UniSim, an ML model that simulates human interactions with the world to train various AI systems. Trained on diverse data sources, UniSim aims to bridge the "sim-to-real gap" and could find real-world applications across gaming, robotics, and self-driving vehicles.
PwC to leverage OpenAI's technology in auditing: PwC is working with OpenAI, becoming the first of the Big Four to offer AI-generated tax, legal, and HR advice to clients. Initial testing with 650 employees in the UK shows promise, with the AI reportedly operating like a mature partner.
MIT uses AI for multi-tasking robots: MIT researchers develop Diffusion-CCSP, an AI model that can assist robots in carrying out multiple tasks while addressing common constraints. The model employs GenAI for solving issues like collision and stability in robotic manipulation. The team aims to test their model in more complex situations without requiring new training data.
TRENDING TOOLS
💬 Deepgram: Experience the fastest, most accurate text-to-speech API
📜 LegalNow: Draft and review your contracts with lawyer-level AI
🔄 Relay: Automate your work with AI assistance and multi-player collaboration
🤖 Resolve: Enjoy ChatGPT-powered customer service chatbots
🏡 AI HomeDesign: Re-design your entire space with two clicks
That’s all for today—if you have any questions or something interesting to share, please reply to this email. We’d love to hear from you!
P.S. If you want to sign up for the Supercharged newsletter or share it with a friend, you can find us here.