site:venturebeat.com - Search News

Prompt injection is exploiting enterprise AI's biggest design flaws by targeting agents, RAG pipelines and model routers

Prompt injection remains the most effective way to compromise enterprise AI systems because it exploits the fundamental way ...

Claude Code turned every engineer into three. Now companies need more product thinkers

AI compressed the build. Fundamentals matter more, not less, and the product funnel is now where engineers earn their keep.

OpenAI unveils GPT-5.6 Sol, Terra and Luna models — but only accessible to limited preview partners for now, per US Gov

Sol and Terra set new high benchmark scores, while Luna performs near GPT-5.5 levels on several tests despite being ...

OpenAI's updated GPT-5.5 Instant is better at shopping, complex constraints, and understanding user intent — and it's already in the API

OpenAI is moving away from models that require heavy hand-holding and toward systems that can better infer the user’s goal, ...

Liquid AI's smallest model yet LFM2.5-230M beats models 4X its size at data extraction, can run 'anywhere'

LFM2.5-230M proves that while 3-billion-parameter models like VibeThinker are solving advanced calculus, a ...

Xiaomi's HarnessX rewrites its own AI scaffolding mid-task — and smaller models gain the most

Xiaomi's HarnessX autonomously rewrites AI agent harnesses mid-execution, delivering +14.5% avg performance gains — and +44% ...

Mistral launches OCR 4, turning document extraction into a full enterprise AI play

Mistral AI's OCR 4 delivers structured document intelligence with bounding boxes, confidence scores, and self-hosted ...

Anthropic launches Claude Tag, replacing its Slack app with a persistent AI teammate that learns, monitors and works autonomously

Anthropic has launched Claude Tag, a persistent AI agent for Slack that lets enterprise teams delegate work, automate tasks, ...

Alibaba's model never trained as an agent — and improved agent performance across seven benchmarks

Real environments can't inject edge cases on demand. Alibaba's Qwen-AgentWorld simulates them — and outperformed ...

OpenAI unveils first custom AI inference chip, Jalapeño, with Broadcom — and its development was sped-up with OpenAI's own models

The companies attributed this speed to a deep software-hardware co-development process that actively used OpenAI’s own models ...

New agentic memory framework uses 118K tokens per query. LangMem burns through 3.26M.

NUS researchers' MRAgent framework reduces LLM agent memory retrieval to 118K tokens per query — vs. 3.26M for LangMem — using step-by-step reasoning.

How Shopify built an AI stack that doesn't care which models survive

Shopify built an LLM proxy and distillation pipeline so its engineers keep working when any model goes away — and often get ...
