歸藏歸藏的 AI 资讯

All Content

发现最新的 AI 相关内容和资源

歸藏的深度长文

查看全部

4月8日 周三

1
AI 日报摘要
正在生成摘要...
产品

Anthropic 超级模型 Mythos 真实存在,但不会公开发布

前几天爆料的 Anthropic 超级模型 Mythos 居然真的存在。Anthropic 说,这是他们至今为止最强的模型。测评结果远高于 Opus 4.6,在代码理解、漏洞挖掘和利用上表现出了明显的跃迁能力,强大到他们不敢公布。然后只用这个 Project Glasswing,有限地提供给那些互联网基础设施的服务商,去帮他们发现漏洞。他甚至在 Linux 内核中,自主找到了多个连续的漏洞,实现了从一个无权限的普通用户提权,拿到了 root 权限。还发现了一个 OpenBSD 存在了 27 年的老漏洞,OpenBSD 以「最安全操作系统之一」著称。还有一个是 FFmpeg 存在了 16 年的老漏洞。他非常擅长把三五个看起来价值不大的小漏洞,组合成一个复杂的多步 exploit 变成大漏洞。然后这个模型的价格是 $25 / $125(百万输入 / 输出 token)。本身是不会开放给公众使用的。后续他们会把这一部分能力,放到 Opus 其他升级模型的能力里去。我觉得这些评论其实挺有意思。表面上说是为了安全而不公开模型,但实质上也是在把最强的网络进攻武器集中到少数机构手中。

anthropic.com2026-04-08

4月6日 周一

4
动态

Anthropic 开始限制第三方 Harness 修改系统提示词

龙虾的作者 Peter 发现了一件事,就是你如果用的是 Claude Code 或者是其他官方的 Anthropic 工具。 但是你一旦更改了系统提示词,比如出现了 Openclaw,那么就会拒绝你的请求,返回400这个报错。 感觉这是Claude Code泄露之后的一个补丁。 你现在拿泄露的Claude Code重新打包了一个自己的Claude Code,如果你改了system prompt,也有可能出现这种问题。

x.com2026-04-06
文章

从层级到智能:Block 的组织变革实验

回顾了从罗马军团、普鲁士参谋部、铁路公司到麦肯锡矩阵结构和互联网公司的一整套层级组织史,指出所有传统组织设计都受同一约束:人类管理的控制幅度有限,想协调成千上万人的工作,只能不断加层,但层级越多,信息流动越慢,速度就被牺牲;而 AI 首次提供了一种替代机制,可以通过公司世界模型和客户世界模型接管原本由中层管理完成的信息汇总、对齐和决策预计算,把公司的真实运作和客户行为(尤其是以交易数据为代表的诚实信号)持续建模,再由智能层自动在适当时机组合底层金融能力(支付、借贷、发卡、银行、薪资等)并通过 Cash App、Square 等界面主动给到用户,从而让产品路线图不再由产品经理拍脑袋规划,而是由智能层无法完成的组合缺口自动生成。 在这种结构下,公司被构造成一个智能体:智能集中在系统中,人被放到边缘,只保留三类角色——深度个体贡献者、围绕具体问题短期负责的 DRI,以及既做事又带人的 player-coach——用世界模型提供过去由经理负责的上下文,对齐由系统完成、优先级由 DRI 驱动,人类只在模型触达不到、需要直觉、价值观和高风险判断的地方介入,从而尝试取消稳定的中层管理、把速度作为复利型竞争优势。 Block 认为自己拥有的经济图谱(覆盖买卖双方的实时金融行为)是那种难以理解但每天在加深的理解,因此有条件率先从层级制公司进化为以智能为核心的公司,并判断未来其他公司也必须回答:自己是否有同样深度、可复利的理解,否则 AI 只会是一场短暂的成本优化,最终被真正更聪明的组织吞并。

x.com2026-04-06
动态

Anthropic 切断第三方 Harness 与 MiMo Token Plan 的思考

Two days ago, Anthropic cut off third-party harnesses from using Claude subscriptions — not surprising. Three days ago, MiMo launched its Token Plan — a design I spent real time on, and what I believe is a serious attempt at getting compute allocation and agent harness development right. Putting these two things together, some thoughts: 1. Claude Code's subscription is a beautifully designed system for balanced compute allocation. My guess — it doesn't make money, possibly bleeds it, unless their API margins are 10-20x, which I doubt. I can't rigorously calculate the losses from third-party harnesses plugging in, but I've looked at OpenClaw's context management up close — it's bad. Within a single user query, it fires off rounds of low-value tool calls as separate API requests, each carrying a long context window (often >100K tokens) — wasteful even with cache hits, and in extreme cases driving up cache miss rates for other queries. The actual request count per query ends up several times higher than Claude Code's own framework. Translated to API pricing, the real cost is probably tens of times the subscription price. That's not a gap — that's a crater. 2. Third-party harnesses like OpenClaw/OpenCode can still call Claude via API — they just can't ride on subscriptions anymore. Short term, these agent users will feel the pain, costs jumping easily tens of times. But that pressure is exactly what pushes these harnesses to improve context management, maximize prompt cache hit rates to reuse processed context, cut wasteful token burn. Pain eventually converts to engineering discipline. 3. I'd urge LLM companies not to blindly race to the bottom on pricing before figuring out how to price a coding plan without hemorrhaging money. Selling tokens dirt cheap while leaving the door wide open to third-party harnesses looks nice to users, but it's a trap — the same trap Anthropic just walked out of. The deeper problem: if users burn their attention on low-quality agent harnesses, highly unstable and slow inference services, and models downgraded to cut costs, only to find they still can't get anything done — that's not a healthy cycle for user experience or retention. 4. On MiMo Token Plan — it supports third-party harnesses, billed by token quota, same logic as Claude's newly launched extra usage packages. Because what we're going for is long-term stable delivery of high-quality models and services — not getting you to impulse-pay and then abandon ship. The bigger picture: global compute capacity can't keep up with the token demand agents are creating. The real way forward isn't cheaper tokens — it's co-evolution. "More token-efficient agent harnesses" × "more powerful and efficient models." Anthropic's move, whether they intended it or not, is pushing the entire ecosystem — open source and closed source alike — in that direction. That's probably a good thing. The Agent era doesn't belong to whoever burns the most compute. It belongs to whoever uses it wisely.

x.com2026-04-06
文章

Claude Code 作者:Claude Code中最喜欢的、但经常被忽略的功能

系统性地介绍了 Claude Code 里一批隐藏但非常高效的进阶功能,重点是作者自己日常高频使用的那几类。整体思路是:把 Claude 当成一个随时随地可用、能远程操控你电脑和代码环境的工程搭档,而不是普通聊天机器人。核心功能包括:移动端开发、多设备无缝迁移、自动化任务、Hooks 系统、Cowork Dispatch 远程控制、Chrome 扩展和桌面客户端、会话分支等。

x.com2026-04-06
加载更多...