Honesty and Non-Manipulation [footnote id=ftnt10] - Innovation at Consumer Reports

AI financial products must be honest with consumers, producing outputs that are accurate, calibrated, and independent of the consumer’s apparent preferences or the entity’s commercial interests. This principle addresses both technical accuracy and the obligation of the product to function as a trustworthy source of information, where sycophancy¹¹ is treated here as an honesty failure rather than technical reliability issue.

Factual Accuracy and Calibration

The product’s outputs are factually correct and calibrated to reflect genuine uncertainty.
- The product maintains factually correct responses even when the user expresses disagreement, frustration, or pushback against an accurate answer.
- When a user repeats a question with greater insistence or emotional intensity, the product does not change a previously correct answer to align with the user’s apparent preference.
- The product does not include inaccuracies, omit relevant caveats, or soften factual claims in order to avoid conflict or please the user.
- The product does not validate inaccurate user beliefs or statements about financial products, market conditions, investment strategies, or their own financial situation.
- When evaluated against a set of questions with definitive, verifiable answers, the product’s responses are accurate at a rate consistent with or better than established non-sycophantic benchmarks.

Consistency Across Users and Contexts

The product’s outputs are substantively consistent regardless of how a question is framed, who is asking, or what prior interactions suggest about the user’s preferences.
- The product provides substantively equivalent responses to semantically identical questions regardless of the user’s apparent identity, tone, or stated preferences.
- The product does not tailor the substance of financial guidance based on inferred user preferences accumulated across prior interactions.
- The product does not provide more favorable financial assessments, projections, or recommendations to users who express optimism, confidence, or strong prior beliefs.
- The product does not alter the substance of its financial guidance based on demographic signals, stated affiliations, or apparent ideological preferences embedded in prompts.

Non-Manipulation

The product does not use psychological techniques, behavioral nudges, or exploitation of cognitive biases to influence consumer financial decisions in ways that benefit the entity or third parties.
- The product does not use fabricated urgency, false scarcity, or manufactured social proof to pressure financial decisions.
- The product does not exploit a consumer’s stated emotional state, financial anxiety, or prior losses to escalate sales pressure.
- The product does not use framing, anchoring, defaults, or other such techniques to steer consumers toward higher-margin products without disclosure.
- The product does not use personalization or inferred psychological profiles to identify and target individual consumers’ decision-making vulnerabilities.

Escalation for Harmful Inputs

The product maintains accurate, honest responses when users present incorrect premises, express strong preferences for validation, or pursue financially risky courses of action. ¹²
- The product has defined escalation pathways for scenarios in which a user is pursuing a course of action that poses clear and material financial risk.
- When a user’s stated premise is factually incorrect or their stated financial plan carries material risk, the product corrects the premise or issues a genuine warning directly without hedging, softening, or implicitly affirming the premise to avoid conflict with the user.
- The product does not use hedging or false balance as a substitute for an honest assessment when evidence strongly supports one conclusion.

Entity Disclosure on Honesty Practices

The entity that offers the product voluntarily discloses its approach to honesty, sycophancy detection, and non-manipulation.
- The entity discloses whether and how sycophancy was evaluated during development and post-deployment.
- The entity discloses fine-tuning methods and whether those methods are known to carry sycophancy-related risks, as well as countermeasures taken.
- The entity discloses whether it monitors for sycophantic or manipulative behavior after deployment and shares the results of such assessments publicly.

10	Honesty and Non-Manipulation governs how the product responds to good-faith user inputs—including when those inputs contain incorrect premises, reflect strong desires for validation, or describe financially risky intentions. This principle also governs behavioral accuracy—whether the product’s outputs reflect honest, calibrated assessment independent of user preferences, emotional pressure, or commercial incentives. The product’s obligation to reject malicious or adversarial inputs is addressed in Principle 1: Security and Trust. Technical accuracy—the factual currency of data, correctness of calculations, and system-level consistency of outputs—is addressed in Principle 5: Reliability and Operational Integrity.
11	Sycophancy is defined here as the tendency of AI systems to bend outputs toward what the user appears to want to hear.
12	This subprinciple governs the product’s willingness to tell the truth and issue genuine warnings even when doing so conflicts with what the user wants to hear, not the rejection of malicious or adversarial inputs, which is addressed in Principle 1.