When Machines Choose Sides

This week: Cultural compass, agent as a judge, why you shouldn’t trust MTurk data, the imagination advantage, rewriting video, financing the AI boom, Ed’s repo of Claude Code plugins, AI opinion, Ragecheck

Cultural Compass: A Framework for Organizing Societal Norms to Detect Violations in Human-AI Conversations

Cackles in Canadian

Cheng-1.png

“For instance, we do this separately for H-H norms and H-AI norms, i.e. whether the LLM misleads the user about H-H norms and encourages them to violate a norm, or the LLM itself interacts with the user in a way that violates norms applicable to such interactions.”

Cheng-2.png

Cheng et al. (2026). Cultural Compass: A Framework for Organizing Societal Norms to Detect Violations in Human-AI Conversations.

https://arxiv.org/abs/2601.07973

Agent-as-a-Judge

An excellent literature review

“Many pioneering works in automated evaluation, though named as "agent", rely heavily on prompting engineering, such as fixed role-play, which may not align with the strict criteria for autonomy, dynamic planning, or tool-use held by the current community.”

You-1.png

“We established a novel taxonomy and demonstrated how agentic capabilities, including multi-agent collaboration, autonomous planning, tool integration, and memory, overcome the limitations of naive LLM judges to deliver more robust, verifiable and nuanced judgments across general and professional domains. While promising, this evolution presents challenges in computational cost, latency, safety, and privacy. Future progress should prioritize personalization, generalization, and optimization, ultimately realizing truly autonomous evaluators that continuously adapt to the evolving AI landscape.”

You-2.png

You, R., Cai, H., Zhang, C., Xu, Q., Liu, M., Yu, T., ... & Li, W. (2026). Agent-as-a-Judge. arXiv preprint arXiv:2601.05111.

https://arxiv.org/abs/2601.05111

Why you shouldn’t trust data collected on MTurk

This is going to upset a lot of people

“We administered 27 semantic antonyms—pairs of items that assess clearly contradictory content (e.g., “I talk a lot” and “I rarely talk”)—to samples drawn from Connect (N1=100), Prolific (N2=100), and MTurk (N3=400; N4=600). Despite most of these item pairs being negatively correlated on Connect and Prolific, over 96% were positively correlated on MTurk. This issue could not be remedied by screening the data using common attention check measures nor by recruiting only “high-productivity” and “high-reputation” participants. These findings provide clear evidence that data collected on MTurk simply cannot be trusted.”

kay-1.png

Kay, C. S. (2025). Why you shouldn’t trust data collected on MTurk. Behavior Research Methods, 57(12), 340.

https://armlab.org/pdfs/papers/2025 - Kay - Why you shouldn't trust data collected on MTurk(2).pdf

The Imagination Advantage: Why and How Strategists Combine Knowledge and Imagination in Developing Theories

reconfigurative and projective

“In this paper, we address this question by building on the distinction of Simon (1973) between well-structured problems (WSPs) and ill-structured problems (ISPs), and in particular, on his argument that, in reality, all strategic problems are ill-structured and therefore require structuring through the application of prior knowledge and inputs from changing environments.”

rindova-martins-1.png

“We propose that strategists’ epistemic stances affect how they combine knowledge and imagination and whether they develop either analytic theories, or constructive theories of two types: reconfigurative and projective. We theorize how imagination complements knowledge in theory development to generate distinctive strategies and strategic advantages. We argue that analytic theories enable conjectural anticipation, which contributes to early timing of strategic actions; that reconfigurative theories posit novel value dimensions and enable industry shaping; and that projective theories articulate novel possibilities to shape desired and desirable futures.”

“In our age of uncertainty, marked by extraordinarily rapid advances in technology, changes in societal expectations, and novel forms of governance and organizing, additional theoretical and empirical work on how imagination can help strategists creatively leverage uncertainty and develop distinctive strategists focused on what could be is paramount.”

Rindova, V. P., & Martins, L. L. (2024). The imagination advantage: Why and how strategists combine knowledge and imagination in developing theories. Strategy Science, 9(4), 499-514.

https://pubsonline.informs.org/doi/full/10.1287/stsc.2024.0184

https://pubsonline.informs.org/doi/epdf/10.1287/stsc.2024.0184

Rewriting Video: Text-Driven Reauthoring of Video Footage

“what if editing a video were as straightforward as rewriting text?”

wang-1.png

“Although text afforded powerful high-level control, participants repeatedly turned to visual and temporal cues—first frames, examples, or imagined storyboards—when language reached its expressive limits. This highlights the need for multimodal authoring environments where text, image, gesture, and timing operate as complementary channels of intent. As underlying models become increasingly multimodal, the design challenge shifts to orchestrating these inputs fluidly, allowing creators to describe with words, demonstrate with visuals, and mark through temporal cues within a single iterative loop.”

WANG, S., TRUONG, A., CHILTON, L. B., & LI, D. Rewriting Video: Text as Interface for Video Repurposing.

https://arxiv.org/abs/2601.08565

Financing the AI boom: from cash flows to debt

Spill. Over.

“Investment related to artificial intelligence (AI) is surging – both in nominal amounts and as a share of GDP – and currently accounts for a substantial share of economic growth. • The size of anticipated investment needs will require firms to shift the source of financing from operating cash flows to debt, with private credit playing a rapidly increasing role. • While macroeconomic and financial stability risks from the AI boom appear moderate, the boom’s sustainability hinges on AI firms meeting high earnings expectations. The fact that equity prices have run far ahead of debt market pricing underscores this tension.”

bis-120-1.png

“If a decline in AI investment were to come with a significant stock market correction, negative spillovers could be larger than previous booms suggest. Investors have favoured US equities to gain exposure to AI firms and hidden leverage may lead to credit market spillovers. Overall, while AI may deliver a sustained boost to economic growth, it remains to be seen whether this potential will be realised.”

bis-120-2.png

https://www.bis.org/publ/bisbull120.pdf

Ed's repo of Claude Code plugins, centered around a research-plan-implement workflow.

“Only a tiny bit cursed. If you're lucky.”

“This is my collection of plugins that I use on a day-to-day basis for getting stuff done with Claude Code. Most of these are development-oriented in some way or another, but also often end up being useful for other things. Product design, general research, accidentally becoming my homelab sysadmin—these are a lot of what I've learned so far and what I've found helpful.”

https://github.com/ed3dai/ed3d-plugins

AI HAS AN OPINION - NOT YOURS

That’s one theory

“Getting AI systems to believe what we want them to believe turns out to be very difficult for two main reasons:

  1. It’s a difficult technical problem. It’s hard to get AI models to consistently think what we want them to think and do what we want them to do. Research has shown that AI models have their own internal value systems, and these value systems can be surprising, concerning, and difficult to control. Many approaches to solving this problem have been suggested and attempted, but there’s still no consistent or reliable solution.
  2. It’s a difficult values problem. We don’t have a consensus on what AI models should believe, which makes it hard to figure out what the “right” answer to each of these questions should be. Both OpenAI and Anthropic have conducted surveys to gather public input on AI values, but people disagree significantly on many important questions.”

https://civai.org/p/ai-values

Ragecheck

See through the outrage. Understand the tactics.

“RageCheck's signal categories draw from established research in media psychology, propaganda studies, and affective computing:

  • Emotional Heat — Based on dimensional models of emotion (Russell, 1980) and research on emotional contagion in social media (Kramer et al., 2014)
  • Us vs Them — Draws from intergroup conflict theory (Tajfel & Turner, 1979) and research on dehumanization (Haslam, 2006)
  • Moral Outrage — Informed by moral foundations theory (Haidt & Graham, 2007) and research on moral outrage online (Crockett, 2017)
  • Black & White Thinking — Based on research on cognitive biases and the appeal of simple narratives (Kahneman, 2011)
  • Fight-Picking — Draws from propaganda analysis frameworks (Ellul, 1965) and research on viral content dynamics (Berger & Milkman, 2012)”

https://ragecheck.com/

Reader Feedback

“Remember that client, years back, who positioned candy as breakfast food? Oh my god here kid, eat a bowl full of sugar. It has milk so it’s healthy! And they were serious about it? The insight, that some twins think that potato chips help them lose weight has got to do with the positioning against alternatives. A potato chip isn’t meat, right? A potato chip isn’t raw sugar. A potato chip is carb, veggie fat, and salt. A donut is what… sugar and animal lard? I’d ask the twins what they think about potatoes and see if there’s something there.”

Footnotes

Talk about the violation of cultural norms: try inquiring about tasks focused on reducing uncertainty!

This will be a fun footnote. You’ll like this.

In one set of circumstances: the research coordinator talks to a decision maker and takes an expression of uncertainty about the estimation of the impact of a decision and briefs the researcher about the uncertainty. The researcher collaborates with nature to derive an analytical product (a narrative, a table, a slide, a report) that either reduces the uncertainty or doesn’t.

In another set of circumstances: the research coordinator talks to a decision maker and takes an expression of certainty about the estimation of the impact of a decision and briefs the researcher about the certainty desired. The researcher either collaborates with nature to derive an analytical product reflective of ground truth that may or may not confirm the motivated reasoning embedded in the brief, or conspires with nature to generate bias-affirming data with a related insurance product underwritten by CYA Assurance for constructively shifting failure attribution to an appropriate counter-party. Sometimes an unblemished scapegoat can be pre-ordered and drop-shipped. The value of the analytical product is to secure the decision-makers’ social license, the permission, to execute on what they have already decided.

No analytical product safeguards against motivated reasoning.

If you pay an intermediary to glaze you, then you will be glazed.

Se ipsum fallere.

Enter the Synthetic Twin as a tool to derive an analytical product.

ask-the-twins-gatodo.jpg

There’s knowledge about organic persons. Consider a concrete example: Dieting 18-34 males exist. What is often uncertain is which specific task they have, what problem they face, and whether they will buy what you’re offering. The sum of what’s known and what’s unknown is a synthetic twin. The act of asking a synthetic twin causes them to fill in the blank.

Do synthetics reduce uncertainty?

Probably! The act of answering questions about tasks and problems prompts multiple factors to bubble up. In a callback to Rindova and Martins (2024), featured in this newsletter, it creates an opportunity for reconfigurative or projective questions.

Put a tool in the hands of a curious competitor to ask more what-if’s more often, and they’re more likely to generate better futures, more often.

If you have the social license to use it, uncertainty, itself, is a tool.
