Promptfoo is joining OpenAI (2 minute read)
Promptfoo has agreed to be acquired by OpenAI. It will remain open source and continue to serve users and customers. The company was founded in 2024 to make it easy for developers to systematically test AI applications. It is joining OpenAI so that the security, evaluation, and compliance platform can have the greatest impact on how teams build and deploy AI. OpenAI will improve and integrate Promptfoo's core tech within its model and infrastructure layers so that teams can catch vulnerabilities early and ship secure AI from the beginning.
|
Anthropic Sued US Defense Department (3 minute read)
Anthropic filed lawsuits in California and Washington, DC after the US Department of Defense labeled the company a supply-chain risk, which could bar government contractors from using its models. The dispute followed Anthropic's refusal to allow unrestricted military use of its AI systems, citing concerns over mass surveillance and fully autonomous weapons.
|
Copilot Cowork: A new way of getting work done (5 minute read)
Microsoft's Copilot Cowork automates tasks across Microsoft 365, enhancing productivity through intelligent workflow management. Utilizing Work IQ, it coordinates emails, meetings, and files, performing actions like rescheduling and document preparation while maintaining user control. Copilot Cowork is currently in Research Preview, with broader availability expected in March 2026.
|
The Top 100 Gen AI Consumer Apps (15 minute read)
Generative AI usage in consumer apps has evolved significantly, with products like CapCut, Canva, and Notion integrating AI as core features. ChatGPT remains the leading AI product globally, but competitors like Gemini and Claude are gaining traction, especially with US paid subscriptions. The AI landscape is diversifying geographically and functionally, with notable developments in video generation and agentic AI such as OpenClaw, suggesting a shift beyond traditional browser and app metrics.
|
|
The Debt Beneath the Dream (9 minute read)
SoftBank's shares are deflating fast. Investors are increasingly concerned that the company cannot pay its debts, and creditors are getting nervous. It is unclear how SoftBank will meet its commitments to OpenAI in April. Its borrowed money and illiquid assets mean that the firm's margin for error is far thinner than any announcement ever suggested.
|
How Coding Agents Are Reshaping Engineering, Product, and Design (2 minute read)
Coding agents have collapsed the traditional waterfall process by making implementation cheap and shifting the bottleneck from writing code to reviewing it. Sustaining high throughput is the new constraint, favoring builders with product sense and systems thinkers who can manage rapid cycles. Generalists gain the most advantage by eliminating communication overhead, while the bar for pure specialization rises because domain depth alone is no longer enough to justify a role.
|
When Code is Free, Research is All That Matters (2 minute read)
As implementation costs collapse, the scarce resource becomes knowing what is worth building and whether it is feasible. Software engineering optimizes toward known solutions, but research operates without ground truth, making it resistant to the automation currently impacting traditional development. Research taste remains portable and nearly impossible to train, acting as the ultimate differentiator in a world where barriers to entry have dropped.
|
|
Code Review (7 minute read)
Code Review for Claude Code allows developers to set up automated PR reviews that catch logic errors, security vulnerabilities, and regressions using multi-agent analysis. It analyzes GitHub pull requests and posts findings as inline comments. Findings are tagged by severity, and the tool doesn't approve or block PRs, allowing existing review workflows to stay intact. Code Review is billed based on token usage, with reviews averaging $15 to $25.
|
Teaching LLMs to reason like Bayesians (5 minute read)
Google researchers taught LLMs to reason in a Bayesian manner by training them to mimic the predictions of an optimal Bayesian model. Bayesian inference defines the optimal way to perform updates over multiple interactions. It could help LLMs optimize user interactions by updating their estimates of a user's preference. The researchers' approach significantly improves LLM performance on recommendation tasks and enables generalization to other tasks, suggesting the method taught LLMs to better approximate Bayesian reasoning.
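The updating idea behind the approach can be illustrated with a minimal sketch, assuming a simple Beta-Bernoulli model of a binary user preference. This toy setup (the function name and prior are illustrative, not the researchers' actual training setup) shows how an optimal Bayesian agent refines its estimate over multiple interactions:

```python
# Minimal sketch: Bayesian updating of a user-preference estimate.
# Assumes a Beta-Bernoulli model (hypothetical; the paper's setup may differ).

def bayesian_update(alpha, beta, liked):
    """Return the updated Beta(alpha, beta) posterior after one interaction."""
    return (alpha + 1, beta) if liked else (alpha, beta + 1)

# Start from a uniform Beta(1, 1) prior over "user likes this item type".
alpha, beta = 1.0, 1.0
for liked in [True, True, False, True]:  # four observed interactions
    alpha, beta = bayesian_update(alpha, beta, liked)

estimate = alpha / (alpha + beta)  # posterior mean preference, ~0.67 here
```

Training an LLM to mimic predictions like `estimate` at each step is what pushes it toward this optimal update behavior rather than ad-hoc guessing.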
|
Penguin-VL: Efficient Vision-Language Models (GitHub Repo)
Tencent AI Lab released Penguin-VL, a compact VLM family designed to improve multimodal efficiency by redesigning the vision encoder. It introduces Penguin-Encoder, initialized from a text-only LLM to align visual features with language representations and improve data-efficient multimodal reasoning.
|
|
AI assistants now equal 56% of global search engine volume (3 minute read)
AI assistants generate 45 billion monthly sessions, making up 56% of global search engine volume. Notably, 83% of AI usage occurs in mobile apps, with ChatGPT dominating at 89% of global AI sessions. Despite AI's rise, search engine activity remains crucial, with combined usage growing 26% globally since 2023.
|
No, it doesn't cost Anthropic $5k per Claude Code user (4 minute read)
A recent Forbes article about Cursor claims that Anthropic's $200 per month Claude Code Max plan can consume $5,000 in compute. It is likely that the article's sources are confusing retail API prices with actual compute costs. The actual compute cost is roughly 10% of retail API prices. Most users don't come anywhere near their limits. Anthropic isn't a profitable company, but inference isn't why.
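The gap between the headline number and the likely real cost is simple arithmetic. A sketch using the figures above (the 10% compute-to-retail ratio is the post's estimate, not a published Anthropic number):

```python
# Back-of-envelope check on the "$5,000 per Claude Code user" claim.
retail_api_usage = 5_000    # claimed monthly usage at retail API prices, USD
compute_ratio = 0.10        # estimated compute cost as a share of retail price
subscription_price = 200    # Claude Code Max plan, USD per month

compute_cost = retail_api_usage * compute_ratio  # ~$500, not $5,000
# Even a heavy user at the cap costs an order of magnitude less than claimed,
# and most users consume far less than their limit.
```

So the claimed loss per user conflates what the tokens would cost a customer at retail with what they cost Anthropic to serve.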
|
|
|