CR AI Fellow: Building Trust and Safety in Generative AI through Observability

Consumer Reports (CR) has been exploring generative artificial intelligence (AI) through the launch of AskCR, an experimental chatbot that helps people get to CR’s trusted information faster. But as with all generative AI experiences, harnessing that power in a way that is consistent, transparent, and accountable presents its own challenges (see part 1 and part 2 of CR’s Responsible Tech series on that process). The new and evolving capabilities of generative AI — delivered through applications incorporating Retrieval-Augmented Generation, input guardrails, and other design patterns — are exciting in terms of what they can offer consumers, but they also raise new problems with respect to trust, clarity, and accountability.

I joined CR as a Summer AI Fellow through Stanford’s Tech Ethics and Policy Fellowship Program to explore the intersection of responsible deployment and policy for generative AI. I worked with the AskCR team, researching, analyzing, and proposing trust and safety considerations for AskCR. I also worked with the advocacy team, considering, quantifying, and analyzing emerging AI policy trends and building a comprehensive state-level AI policy tracker. The policy work, while interesting on its own, also informed my work on AskCR: it refined my approach to integrating ethical considerations and regulatory insights into the product development process, allowing me to better anticipate and address potential trust and safety issues in our AI deployment.

In this post, I will dive into the research insights that shaped my process and walk through the strategies we are exploring to increase trust and safety in AskCR.

Research

My initial research focused on building a foundational understanding of the current ethical landscape of generative AI. This exploration included reading through key articles and research papers, such as Google DeepMind’s Ethics of Advanced AI Assistants and NIST’s Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile. I identified and dove further into various ways in which generative AI can be developed and deployed safely. It was interesting to take stock of the extensive research that exists, and then observe that only some generative AI chatbots have effectively implemented these safety concepts, highlighting a significant gap in the field. I enjoyed having in-depth conversations with members of the AskCR team about the aspects of these ideas they had already considered, the ways in which they deployed AskCR responsibly, and how they continue to improve it.

My research informed the development of a prototype addressing the trust and safety implications that deployers and developers of generative AI must prioritize. I decided to focus first on anthropomorphism to better understand its implications for AskCR.

Anthropomorphism

AI is not the first technology in which anthropomorphism, the attribution of human-likeness to non-human entities, has been observed. A comprehensive meta-analysis published in Science Robotics investigates the effectiveness of anthropomorphism in human-robot interaction, finding that there is a positive overall effect of anthropomorphism on a person’s perception of a robot (Roesler, Onnasch, & Manzey, 2021). A different paper systematically discusses moral concerns with voice assistants, including a section focusing on the new ways that voice assistants present ethical quandaries relating to anthropomorphism (Seymour, Coté, Zhan & Such, 2023).

AI driven by large language models (LLMs) is usually designed to engage in fluent, human-like conversations with users. This inherently encourages anthropomorphism: the more convincing the conversation, the more readily users attribute human-like qualities to the system. This tendency can influence users’ expectations and interactions with generative AI, leading to potential misunderstandings about the capabilities, limitations, and reliability of these systems. While deployers of generative AI chatbots aim to build user trust by creating realistic conversation environments, it’s critical for developers to assess the potential trade-offs involved. Striking the right balance between fostering trust and ensuring users are fully aware of the technology’s limitations is essential to prevent over-reliance or misplaced confidence in AI systems.

Anthropomorphism is expressed in chatbots through various design elements that make the technology appear more human-like. These include emotional language that expresses empathy or enthusiasm in responses to foster a connection with users. The use of personal pronouns like “I” or “we” also adds a personal touch to interactions, making the chatbot feel more like a conversation partner than just a machine. A virtual avatar that replicates facial expressions, motions, and even lip syncing during conversation further amplifies the sense of interacting with a human, creating a more captivating and relatable experience.

Through my discussions with the AskCR team, I discovered that they had carefully considered the conversational dynamics of their chatbot, particularly how word choices, tones, and voices could influence user perception. They had experimented with different approaches, refining their use of language and personal pronouns through iteration. What I learned from these conversations, along with more directed user research, was that in the context of product research and shopping, consumers place far greater importance on the underlying processes and decision-making logic than on the technology’s interface or expressive qualities. This insight underscored the importance of prioritizing transparency and explainability in AskCR as I considered its trust and safety implications.

Data Transparency

In the context of AskCR, which uses a Retrieval-Augmented Generation architecture, understanding and showcasing data sources is crucial for maintaining transparency and managing user expectations. The Retrieval-Augmented Generation system in AskCR produces responses by dynamically retrieving information from an established set of sources, specifically CR product articles and product data. This method capitalizes on CR’s unique knowledge: the reliable, data-driven product testing the organization provides.
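
To make the retrieval side of that architecture concrete, here is a minimal sketch in Python of how a set of articles might be embedded and searched semantically. The toy bag-of-words “embedding,” the example articles, and the function names are assumptions for illustration only; they are not AskCR’s actual implementation.

```python
# Minimal, self-contained sketch of the retrieval side of a RAG system.
# The toy embedding below stands in for a real embedding model, and the
# two example articles stand in for CR's actual product content.

def embed(text: str, dims: int = 64) -> list[float]:
    """Toy bag-of-words vector; a real system would call an embedding model."""
    vec = [0.0] * dims
    for token in text.lower().split():
        vec[hash(token) % dims] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = (sum(x * x for x in a) * sum(y * y for y in b)) ** 0.5
    return dot / norm if norm else 0.0

# Hypothetical corpus: titles and bodies of CR-style product articles.
corpus = {
    "Best Air Purifiers of the Year": "Air purifiers tested for noise, efficiency, and cost...",
    "Dishwasher Buying Guide": "Key features to weigh when shopping for a dishwasher...",
}
index = {title: embed(title + " " + body) for title, body in corpus.items()}

def retrieve(query: str, k: int = 3) -> list[str]:
    """Return the titles of the k articles most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda title: cosine(q, index[title]), reverse=True)
    return ranked[:k]

print(retrieve("what is the best air purifier?"))
```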

It is imperative that consumers are aware of the scope and purpose of AskCR so they can effectively calibrate their expectations and fully understand the system’s capabilities. Reviews of feedback from user testing of AskCR surfaced a notable problem: users were not aware of the sources AskCR could draw from, and were unhappy with the canned responses they received to questions falling outside its intended scope. To support this level of transparency, clear documentation must be provided about the sources of data the chatbot draws from. I began thinking about prototyping and implementing a model card for AskCR that explains the purpose of the chatbot, outlines the iteration it has undergone, and describes the processes involved in its operation.

Although slightly different, a similar approach can be seen in the system cards offered by developers of foundation models. For example, GPT-4, the large language model released by OpenAI in March 2023, is accompanied by a 60-page system card. The document details the safety challenges posed by the model’s capabilities and limitations, as well as the safety techniques adopted during the development process. Similarly, creating a “model card” for AskCR would offer users a comprehensive view of the chatbot’s operational framework, fostering greater trust and helping users set accurate expectations. By clearly documenting its purpose, iterative improvements, and data sources, the model card would enhance transparency and align user understanding with the system’s functionality.
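
As a rough sketch of what such a model card might contain, the outline below collects the kinds of fields discussed in this section into a simple structure. The field names and example values are hypothetical, not CR’s actual documentation.

```python
import json

# Hypothetical model card outline for a RAG-based chatbot like AskCR.
# Every field name and value here is illustrative, not CR's real documentation.
model_card = {
    "name": "AskCR",
    "purpose": "Help people get to CR's trusted product information faster.",
    "architecture": "Retrieval-Augmented Generation over CR articles and product data",
    "data_sources": ["CR product articles", "CR product test data"],
    "intended_scope": "Product research and shopping questions covered by CR content",
    "out_of_scope": "Topics outside CR's published testing and reporting",
    "known_limitations": [
        "May decline questions that fall outside its source material",
        "Responses depend on the coverage and freshness of retrieved articles",
    ],
    "iteration_history": ["Prompt, routing, and tone refinements based on user testing"],
    "feedback_channel": "In-app feedback on individual responses",
}

print(json.dumps(model_card, indent=2))
```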

Conversations with the AskCR team revealed the work and brainstorming they had already put into similar ideas during the development of AskCR. Given the feedback from user testing, there was a consensus that increasing transparency around the data sources was crucial. I began brainstorming ideas for a model card, focusing on ways to display the overarching information about the technology used within AskCR. As I read further research on transparency in AI systems, I found the following quote: “Transparency: humans tend to trust virtual and embedded AI systems more when the inner logic of these systems is apparent to them, thereby allowing people to calibrate their expectations with the system’s performance” (The Ethics of Advanced AI Assistants, page 121). Based on this premise, I began brainstorming an observability feature that would offer users insights into what happens on the backend with each query, increasing the explainability of the system’s logic to users.

Observability Feature 

As chatbot technology continues to evolve and advance, observability features are becoming increasingly important tools for transparency and user trust. Airlines, banks, and other organizations are developing chatbots, and integrating observability features into them can be critically important. Allowing users to understand how each response is generated not only enhances user trust but can also increase the usefulness of the application by pointing users to sources of information and more authoritative content beyond the chatbot’s immediate response. In addition, these features can increase users’ understanding of why their query generated a particular response, helping them interpret the results and refine their questions to get more relevant information.

For AskCR, an observability feature is crucial for allowing users to understand the processes that occur on the backend with each query, making responses more transparent. The user will see that their query is first sent through a “sanitization” step, such as OpenAI’s guardrails, to ensure the resulting response adheres to safety and ethical standards. The query is then routed based on intent and topic, which refines it to match the response style that will be most helpful to the user. The question is also rewritten into a refined query to improve the likelihood of a relevant match between the query embeddings and the document embeddings being searched, information that will now be shown to the user. The refined query then goes through a semantic search across CR product articles before passing through reranking software such as Cohere, which prioritizes the most relevant results before the final text is synthesized.
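
To make that flow concrete, here is a minimal Python sketch of such a pipeline. Every function is a placeholder standing in for the real component named above (guardrails, intent routing, query refinement, embedding search, Cohere-style reranking, and synthesis); none of this is AskCR’s actual code, and the step names and trace structure are my own assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class QueryTrace:
    """Records each backend step so it can later be surfaced to the user."""
    steps: list[dict] = field(default_factory=list)

    def log(self, name: str, output: str) -> None:
        self.steps.append({"step": name, "output": output})

# Each function below is a stand-in for the real component described above.
def sanitize(query: str) -> str:
    # Input guardrails / safety filtering would run here.
    return query.strip()

def route(query: str) -> str:
    # Intent and topic classification; here a trivial keyword check.
    return "product_question" if "best" in query.lower() else "general_question"

def refine(query: str, intent: str) -> str:
    # Rewrite the question into a retrieval-friendly form.
    return f"[{intent}] {query}"

def semantic_search(refined: str) -> list[str]:
    # Embedding search over CR articles would return candidate passages.
    return ["CR article passage A", "CR article passage B"]

def rerank(refined: str, passages: list[str]) -> list[str]:
    # A reranker (e.g., Cohere) would reorder candidates by relevance.
    return passages

def synthesize(refined: str, passages: list[str]) -> str:
    # The LLM would generate the final answer from the top-ranked passages.
    return f"Answer drawn from: {', '.join(passages)}"

def answer(query: str) -> tuple[str, QueryTrace]:
    """Run the full pipeline, logging each step for the observability feature."""
    trace = QueryTrace()
    clean = sanitize(query)
    trace.log("sanitize", clean)
    intent = route(clean)
    trace.log("route", intent)
    refined = refine(clean, intent)
    trace.log("refine", refined)
    passages = semantic_search(refined)
    trace.log("search", f"retrieved {len(passages)} passages")
    ranked = rerank(refined, passages)
    trace.log("rerank", f"top result: {ranked[0]}")
    final = synthesize(refined, ranked)
    trace.log("synthesize", final)
    return final, trace
```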

By incorporating an observability feature that describes each of these steps, AskCR would provide users with a clear view of these backend processes, increasing trust by showing the logical flow of their queries and the subsequent responses generated. This transparency would also allow users to give more targeted feedback on specific parts of the process – whether the question routing was incorrect, whether the reranking actually improved response quality, and so on – allowing the AskCR team to identify areas for improvement more effectively. This feature could help set a new benchmark for transparency in LLM-based chatbots.
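
Building on that idea, the trace collected during query processing could be serialized into a payload for the frontend popup, with a per-step feedback slot so users can flag exactly where the pipeline went wrong. The payload shape below mirrors the pipeline sketch above and is an assumption for illustration, not AskCR’s actual API.

```python
import json

# Hypothetical payload shape for the observability popup. The step data mirrors
# the pipeline sketch above; all field names are assumptions, not AskCR's API.
trace_steps = [
    {"step": "sanitize", "output": "what is the best air purifier for allergies?"},
    {"step": "route", "output": "product_question"},
    {"step": "refine", "output": "[product_question] best air purifier for allergies"},
    {"step": "search", "output": "retrieved 2 passages"},
    {"step": "rerank", "output": "top result: CR article passage A"},
    {"step": "synthesize", "output": "Answer drawn from: CR article passage A, CR article passage B"},
]

payload = {
    "answer": "Answer drawn from: CR article passage A, CR article passage B",
    "trace": trace_steps,  # step-by-step breakdown rendered in the popup
    # One feedback slot per step so users can flag exactly where things went wrong.
    "feedback": [{"step": s["step"], "helpful": None, "comment": ""} for s in trace_steps],
}

print(json.dumps(payload, indent=2))
```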

The remainder of the summer was spent developing and implementing a prototype of the feature. I iterated on the prototype in Figma, animating layers and creating the applicable components to ensure clarity and ease of use. I created detailed product requirement documents to align the rest of the team on big-picture items like objectives and functionality, as well as details like word choice and design. Working in a local branch of AskCR, I began designing the frontend implementation and connecting it to the backend. The feature will be implemented as an interactive popup within the application, providing users with a step-by-step breakdown of the query-processing pipeline. It continues to progress toward full deployment, and will eventually foster greater trust and engagement with the application.

As I wrap up my time at CR, I feel lucky to have had the opportunity to explore the intersection of AI policy and deployment. The people I have worked with have not only broadened my perspective on the field’s most pressing challenges, but also strengthened my optimism for the future of AI. I look forward to continuing to work in this field in one way or another, and am excited to see what Consumer Reports accomplishes in the future. I am always happy to chat about related topics (or really about anything broadly related to technology or policy), so feel free to reach me at jacobr1@stanford.edu.
