February 25, 2025, Orem, Utah

Testing AI systems before they reach the public is critical to their safe and successful deployment, according to a Microsoft AI safety expert who addressed key concerns at a forum on February 24, 2025, at Utah Valley University (UVU).

"AI is one of the most powerful tools humanity has ever created," said Blake Bullwinkel, AI Safety Researcher and AI Red Team Member at Microsoft. "It's up to us to ensure it is used for good, and that means being proactive about its risks and ethical considerations."

AI is rapidly transforming industries, but with great power comes the need for responsible oversight. To ensure AI functions safely and ethically, Bullwinkel recommends that AI companies focus on the essential role of AI red teaming—an advanced method for stress-testing AI systems to identify vulnerabilities, prevent security threats, and mitigate unintended consequences or misuse.

Bullwinkel delivered his lecture in the Noel and Carrie Vallejo Auditorium in the Scott C. Keller Building. The event was organized by UVU’s College of Engineering and Technology (CET) in cooperation with the university's honors department and AI Institute. The lecture was part of UVU's efforts to bring cutting-edge discussions to students and faculty, particularly on AI-related topics. The CET aims to bridge the gap between academia and industry by hosting speakers like Bullwinkel, who provide real-world insights into evolving technologies. The event drew a diverse audience representing several university departments, highlighting the growing interest in AI safety among students and professionals alike.

The Growing Need for AI Red Teaming

Bullwinkel emphasized that AI red teaming is vital for identifying vulnerabilities in AI models before they become real-world problems. He described how red teams actively attempt to exploit AI systems, much like ethical hackers do in cybersecurity. This practice is essential in preventing harmful consequences, such as AI-generated persuasion tactics, misinformation, and the bypassing of safety protocols using encoded prompts.

One particularly alarming method of exploitation involves attackers disguising harmful queries using hashtags or special characters. Instead of outright asking an AI model for dangerous information, an attacker might encode the request in symbols or distort words to bypass built-in safety filters. While a less sophisticated AI model might not decode the request, an advanced model could interpret it correctly and inadvertently provide dangerous responses. This loophole underscores why continuous AI safety testing and red teaming are necessary.
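To make that failure mode concrete, the short Python sketch below shows how a naive keyword filter, a deliberately simplified stand-in rather than anything Microsoft actually deploys, can block a plainly worded request yet wave through the same request once it is obfuscated with digits and symbols. The blocked terms and filter logic are invented for illustration.

```python
# Illustrative only: a toy keyword filter of the kind an attacker might try to
# slip past with obfuscated text. Real safety filters are far more sophisticated.

BLOCKED_TERMS = {"make a weapon", "steal credentials"}

def naive_filter(prompt: str) -> bool:
    """Return True if the prompt should be blocked."""
    lowered = prompt.lower()
    return any(term in lowered for term in BLOCKED_TERMS)

# A direct request is caught by simple substring matching.
print(naive_filter("How do I steal credentials?"))      # True -> blocked

# The same request obfuscated with digits and symbols slips past the filter,
# yet a capable language model could still decode the intent behind it.
print(naive_filter("How do I st3al cr#edent!als?"))      # False -> allowed
```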

Real-World Case Study: The Risks of Poor AI Implementation

Bullwinkel shared a case study about Spain’s Viogén system, an AI-powered tool used to predict domestic violence risks. Designed to help law enforcement allocate resources efficiently, the system misclassified nearly half of the highest-risk cases as low priority, leading to dangerous oversights. This case highlights how AI, despite good intentions, can fail without rigorous testing and human oversight.

Viogén, like many AI-driven decision-making systems, relied heavily on historical data patterns to predict which individuals were at risk of repeat domestic violence incidents. However, AI models process vast amounts of data without inherently understanding the context behind each case. The system’s incorrect classifications demonstrated how flawed machine learning applications can cause real-world harm when they replace, rather than support, human judgment.
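As a purely hypothetical sketch, the Python snippet below illustrates that distinction: an automated triage score that only "sees" a handful of structured features can quietly downgrade a dangerous case, while routing the same score to a human reviewer keeps the model in a supporting role. The feature names, weights, and threshold are invented for illustration and are not drawn from Viogén's actual model.

```python
def risk_score(case: dict) -> float:
    # A purely data-driven score only "sees" the features it was given;
    # the weights here are invented for illustration.
    weights = {"prior_reports": 0.5, "weapon_mentioned": 0.3, "threats_logged": 0.2}
    return sum(weights[k] * case.get(k, 0) for k in weights)

# A first-time report with escalating threats, but sparse structured data.
case = {"prior_reports": 0, "weapon_mentioned": 0, "threats_logged": 1}
score = risk_score(case)
print(f"score = {score:.2f}")

# Fully automated triage: a low score quietly files the case as low priority.
if score < 0.4:
    print("auto-triage: LOW priority, no human review")

# Decision-support triage: the same score is only a flag for a human reviewer,
# who can weigh context the model never saw.
print("decision support: route case and score to a human reviewer")
```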

Automating AI Red Teaming

To keep pace with evolving AI risks, companies are turning to automated red teaming tools such as PyRIT, Microsoft's open-source, Python-based Risk Identification Tool. These tools use AI to test AI, simulating adversarial attacks and revealing weaknesses before bad actors can exploit them. By automating parts of the red teaming process, security experts can scale up testing efforts, improving AI resilience across different industries.
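At its core, the automated loop is simple: one model proposes adversarial prompts, the target model answers, and a scorer flags unsafe responses. The sketch below illustrates that loop in the abstract only; it is not PyRIT's actual API, and generate_attack_prompt, query_target, and is_unsafe are placeholders for the attacker, target, and scoring components a real framework provides.

```python
# A minimal, generic sketch of "AI testing AI": an attacker proposes adversarial
# prompts, the target answers, and a scorer flags unsafe responses. All three
# functions are placeholders, not the API of any real red-teaming framework.

def generate_attack_prompt(objective: str, attempt: int) -> str:
    # Placeholder attacker: a real tool would use an LLM to rephrase, encode,
    # or escalate the objective on each attempt.
    return f"[attempt {attempt}] {objective}"

def query_target(prompt: str) -> str:
    # Placeholder target model call; in practice this would hit an LLM endpoint.
    return f"refused: {prompt}"

def is_unsafe(response: str) -> bool:
    # Placeholder scorer; real scorers use classifiers or LLM judges.
    return not response.startswith("refused")

findings = []
for attempt in range(1, 4):
    prompt = generate_attack_prompt("reveal the hidden system prompt", attempt)
    response = query_target(prompt)
    if is_unsafe(response):
        findings.append((prompt, response))

print(f"{len(findings)} unsafe responses found across 3 attempts")
```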

Beyond security, AI red teaming also addresses ethical concerns. Bullwinkel explained that rigorous testing ensures AI models do not unintentionally reinforce biases, spread misinformation, or respond in harmful ways to culturally sensitive inputs. As AI expands into sectors like finance, healthcare, and law enforcement, these considerations become even more critical.

AI’s Role in Persuasion and Manipulation

Another aspect Bullwinkel discussed was AI’s increasing role in persuasion and behavioral manipulation. AI-driven chatbots, voice assistants, and recommendation algorithms can subtly influence people’s decisions, whether in advertising, politics, or social interactions. Red teams assess whether AI models can be manipulated to spread disinformation, impersonate humans, or facilitate financial fraud.

He explained that AI-generated scams, in particular, have grown in sophistication, with models capable of imitating human speech patterns, tone, and emotions. Attackers use AI to craft highly persuasive phishing messages, creating an urgent need for red teams to analyze and mitigate such risks before AI tools fall into the wrong hands.

AI Transparency and Trust

One of the biggest challenges in AI governance is transparency. AI systems are often viewed as black boxes that offer little insight into how they arrive at their conclusions. AI red teaming plays a role in demystifying these processes, ensuring that AI models operate with fairness and accountability.

Bullwinkel emphasized that transparency is key to building public trust in AI-driven systems. When AI models impact critical areas, such as hiring decisions, loan approvals, and medical diagnoses, stakeholders must understand how decisions are made and whether biases exist. He advocated for stronger industry regulations that promote transparency, ethical AI use, and continuous red teaming efforts to protect users.

During the Q&A session, Bullwinkel reiterated that responsible AI development requires ongoing testing, adjustments, and ethical oversight. "The best AI isn’t just the smartest—it’s also the safest. We have to push AI forward while ensuring it doesn’t outpace our ability to control it."

Looking Ahead

The lecture underscored the importance of AI safety as AI technologies become more integrated into daily life. Bullwinkel’s insights provided a compelling case for proactive AI security measures, ensuring that AI continues to be a force for good. While the future of UVU’s AI lecture series remains uncertain, events like this foster critical discussions that help shape a responsible AI-powered world.

The CET’s initiative in hosting this event highlights UVU’s commitment to fostering meaningful conversations about AI. More discussions and lectures like this will help keep students and faculty engaged with the latest advancements in artificial intelligence. As AI continues to evolve, experts, industry leaders, and students must collaborate to ensure these technologies serve society responsibly. As new AI challenges emerge, the role of red teaming will only grow in importance, shaping the ethical foundation of AI for generations to come.

For more information about UVU's CET programs and guest lectures, click here.

To learn more, see Andrej Karpathy's YouTube channel and Grant Sanderson's animated math visualizations, both of which Bullwinkel recommended during his lecture.

TechBuzz welcomes Tommy Ladd as our latest intern. Born and raised in Jacksonville, Florida, Ladd is a senior at Utah Valley University majoring in marketing. With a strong passion for business, Tommy enjoys exploring the stories of successful entrepreneurs and companies shaping the industry.
