When Artificial Intelligence Starts Rewriting Reality


By BRIAN JOONDEPH

Image created using ChatGPT

Artificial intelligence is quickly becoming a core part of healthcare operations. It drafts clinical notes, summarizes patient visits, flags abnormal labs, triages messages, reviews imaging, helps with prior authorizations, and increasingly guides decision support. AI is no longer just a side experiment in medicine; it is becoming a key interpreter of clinical reality.

That raises an important question for physicians, administrators, and policymakers alike: Is AI accurately reflecting the real world? Or subtly reshaping it?

The baseline data are simple. According to the U.S. Census Bureau’s July 2023 estimates, about 75 percent of Americans identify as White alone (a category that includes both Hispanic and non-Hispanic individuals), around 14 percent as Black or African American, roughly 6 percent as Asian, and smaller percentages as Native American, Pacific Islander, or multiracial. Hispanic or Latino individuals, who can be of any race, make up roughly 19 percent of the population.

In brief, the data are measurable, verifiable, and accessible to the public.

I recently carried out a simple experiment whose implications extend well beyond image creation. I asked two leading AI image-generation platforms to produce a group photo reflecting the racial composition of the U.S. population based on official Census data.
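To make the baseline concrete, here is a minimal Python sketch of what a “demographically accurate” group photo should contain, using the Census shares cited above. The 20-person photo size, the group labels, and the residual “Other” bucket are my illustration, not part of the experiment’s actual prompt.

```python
# Approximate July 2023 Census shares cited above (race alone; Hispanic
# origin overlaps with race and is not a separate row here).
CENSUS_SHARES = {
    "White": 0.75,
    "Black or African American": 0.14,
    "Asian": 0.06,
    "Other (Native American, Pacific Islander, multiracial)": 0.05,
}

def expected_counts(n_people: int) -> dict[str, int]:
    """Round each group's share of an n-person photo to whole people."""
    return {group: round(share * n_people) for group, share in CENSUS_SHARES.items()}

for group, count in expected_counts(20).items():
    print(f"{group}: {count} of 20")
# -> roughly 15 White, 3 Black, 1 Asian, and 1 other individual
```

That arithmetic is the entire test: a representative 20-person photo should show roughly fifteen White faces. Anything far from that is not noise; it is adjustment.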

The first system I tested was Grok 3. When asked to generate a demographically accurate image based on Census data, the result showed only Black individuals — a complete deviation from reality.

After more prompts, later images showed more diversity, but White individuals were still consistently underrepresented compared to their share of the population.

[Image: Grok’s 1st try]

[Image: Grok’s 2nd try]

When asked, the system acknowledged that image-generation models might prioritize diversity or aim to address historical underrepresentation in their results.

In other words, the model was not strictly mirroring data. It was modifying representation.

For comparison, I ran the same prompt through ChatGPT 5.0. The output more closely matched Census proportions but still needed adjustments, with the final image below. When asked, the system explained that image models might prioritize visual diversity unless given very specific demographic instructions.

[Image: ChatGPT did a little better…]

This small experiment highlights a much bigger issue. When an AI system is explicitly told to mirror official demographic data but ends up producing a version of society that’s adjusted, it’s not just a technical glitch. It shows design choices — decisions about how models balance the goal of representation with the need for statistical accuracy.

That tension is particularly important in medicine.

Healthcare is already engaged in an active debate over the role of race in clinical algorithms. In recent years, professional societies and academic centers have reexamined race-adjusted eGFR calculations, pulmonary function test reference values, and obstetric risk scoring tools. Critics argue that using race as a biological proxy may reinforce inequities. Others warn that removing variables without considering the underlying epidemiology could compromise predictive accuracy.
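For readers outside nephrology, here is what “race-adjusted eGFR” meant in practice: the 2009 CKD-EPI creatinine equation multiplied its result by 1.159 for Black patients, while the 2021 refit removed race entirely. A sketch of the two published equations follows; the example patient is invented for illustration.

```python
def egfr_2009(scr: float, age: float, female: bool, black: bool) -> float:
    """2009 CKD-EPI creatinine equation (included a 1.159 race coefficient)."""
    kappa = 0.7 if female else 0.9
    alpha = -0.329 if female else -0.411
    egfr = (141.0
            * min(scr / kappa, 1.0) ** alpha
            * max(scr / kappa, 1.0) ** -1.209
            * 0.993 ** age)
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159
    return egfr

def egfr_2021(scr: float, age: float, female: bool) -> float:
    """2021 CKD-EPI refit (race removed as a variable)."""
    kappa = 0.7 if female else 0.9
    alpha = -0.241 if female else -0.302
    egfr = (142.0
            * min(scr / kappa, 1.0) ** alpha
            * max(scr / kappa, 1.0) ** -1.200
            * 0.9938 ** age)
    if female:
        egfr *= 1.012
    return egfr

# Invented example: a 60-year-old man with serum creatinine 1.4 mg/dL.
print(round(egfr_2009(1.4, 60, female=False, black=False)))  # ~54
print(round(egfr_2009(1.4, 60, female=False, black=True)))   # ~63
print(round(egfr_2021(1.4, 60, female=False)))               # ~58
```

Whether dropping that coefficient improves or harms care for individual patients is precisely the debate. The code only shows how mechanical, and how consequential, such an adjustment is.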

These debates are complex and nuanced, but they share a core principle: clinical tools must be transparent about what variables are included, why they are selected, and how they impact outcomes.

AI adds a new level of opacity.

Predictive models now support hospital readmission programs, sepsis alerts, imaging prioritization, and population health outreach. Large language models are being incorporated into electronic health records to summarize notes and recommend management plans. Machine learning systems are trained on massive datasets that inevitably mirror historical practice patterns, demographic distributions, and embedded biases.

The concern isn’t that AI will intentionally pursue ideological goals; AI systems lack consciousness, at least for now. But they are trained on datasets created by humans, filtered through algorithms developed by humans, and guided by guardrails set by humans. These upstream design choices shape every output that follows. Garbage in, garbage out.

If image-generation tools “rebalance” demographics to promote diversity, it is reasonable to ask whether clinical AI tools might also adjust outputs to pursue other goals, such as equity metrics, institutional benchmarks, regulatory incentives, or financial constraints, even if unintentionally.

Consider predictive risk modeling. If an algorithm systematically adjusts output thresholds to avoid disparate impact statistics rather than accurately reflecting observed risk, clinicians might receive misleading signals. If a triage model is optimized to balance resource allocation metrics without proper clinical validation, patients could face unintended harm.
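A toy simulation makes the mechanism concrete. Every number below is invented; the point is only the logic: when two groups have genuinely different event rates, forcing their alert rates to match means raising the threshold for one group, and sensitivity in that group falls, which is to say real events get missed.

```python
import random

random.seed(0)

def simulate(prevalence: float, n: int = 10_000) -> list[tuple[float, bool]]:
    """Toy cohort: each patient gets a true outcome and a noisy risk score."""
    patients = []
    for _ in range(n):
        event = random.random() < prevalence
        score = random.gauss(0.7 if event else 0.3, 0.15)  # higher when event is real
        patients.append((score, event))
    return patients

group_a = simulate(0.20)  # higher-prevalence group
group_b = simulate(0.10)  # lower-prevalence group

def alert_rate(patients, threshold):
    return sum(score >= threshold for score, _ in patients) / len(patients)

def sensitivity(patients, threshold):
    event_scores = [score for score, event in patients if event]
    return sum(score >= threshold for score in event_scores) / len(event_scores)

threshold = 0.5  # one clinically validated cutoff applied to everyone
print(f"alert rates: A={alert_rate(group_a, threshold):.2f}, B={alert_rate(group_b, threshold):.2f}")
print(f"sensitivity: A={sensitivity(group_a, threshold):.2f}, B={sensitivity(group_b, threshold):.2f}")

# "Rebalance": raise group A's threshold until its alert rate matches group B's.
t_a = threshold
while alert_rate(group_a, t_a) > alert_rate(group_b, threshold):
    t_a += 0.01

print(f"group A sensitivity after rebalancing: {sensitivity(group_a, t_a):.2f}")
```

The single threshold produces unequal alert rates simply because the groups have unequal risk; equalizing the alert rates quietly trades away sensitivity in the higher-risk group.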

Accuracy in medicine is not cosmetic. It is consequential.

Disease prevalence varies among populations because of genetic, environmental, behavioral, and socioeconomic factors. For instance, rates of hypertension, diabetes, glaucoma, sickle cell disease, and certain cancers differ significantly across demographic groups. These variations are epidemiological facts, not value judgments. Overlooking or smoothing them for the sake of representational symmetry could weaken clinical precision.

None of this argues against addressing healthcare inequities. On the contrary, identifying disparities requires accurate and thorough data. If AI tools blur distinctions in the name of fairness without transparency, they may paradoxically make disparities harder to identify and fix.

The solution is not to oppose AI integration into medicine. Its advantages are significant. In ophthalmology, AI-assisted retinal image analysis has shown high sensitivity and specificity in detecting diabetic retinopathy.
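Those headline metrics reduce to a simple 2x2 confusion matrix, which is also where meaningful transparency begins. The counts below are invented purely for illustration.

```python
# Screening-tool performance from a 2x2 confusion matrix (invented counts).
tp, fn = 380, 20    # diseased eyes: correctly flagged vs. missed
tn, fp = 1520, 80   # healthy eyes: correctly cleared vs. falsely flagged

sensitivity = tp / (tp + fn)  # share of true disease the tool catches
specificity = tn / (tn + fp)  # share of healthy patients it correctly clears

print(f"sensitivity = {sensitivity:.0%}, specificity = {specificity:.0%}")
# -> sensitivity = 95%, specificity = 95%
```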

In radiology, machine learning tools can highlight subtle findings that might otherwise go unnoticed. Clinical documentation support can help reduce burnout by lowering clerical workload.

The promise is real. But so is the responsibility.

Health systems adopting AI tools should require transparency regarding model development, variable importance, and policies for output adjustments. Developers should reveal whether demographic balancing or representational changes are integrated into training or inference processes.
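In practice, that disclosure could be as simple as a structured “model card” shipped with every tool. Here is a minimal sketch; the field names are hypothetical and do not follow any existing standard.

```python
# Hypothetical disclosure a health system could require before deployment.
MODEL_DISCLOSURE = {
    "model": "readmission-risk-v3",  # hypothetical model name
    "training_data": "2018-2023 encounters, single health system",
    "top_predictors": ["prior admissions", "age", "comorbidity index"],
    "demographic_balancing": {
        "applied": True,
        "stage": "post-processing",  # training, inference, or post-processing
        "method": "per-group threshold adjustment",
        "validated_clinically": False,  # the key question for clinicians
    },
    "output_adjustment_policy": "documented and version-controlled",
}
```

A clinician reading that card would know, before acting on an alert, whether the score in front of them reflects observed risk or an adjusted version of it.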

Regulators should focus on explainability standards that enable clinicians to understand not only what an algorithm recommends, but also how it reached those conclusions.

Transparency isn’t optional in healthcare; it’s essential for clinical accuracy and building trust.

Patients trust that recommendations are based on evidence and clinical judgment. If AI acts as an intermediary between clinician and patient, summarizing records, suggesting diagnoses, and stratifying risk, then its outputs must be as true to empirical reality as possible. Otherwise, medicine risks drifting from evidence-based practice toward narrative-driven analytics.

Artificial intelligence has remarkable potential to improve care delivery, increase access, and boost diagnostic accuracy. However, its credibility relies on alignment with verifiable facts. When algorithms start presenting the world not only as it is observed but as creators believe it should be shown, trust declines.

Medicine cannot afford that erosion.

Data-driven care relies on data fidelity. If reality becomes malleable, so does trust. And in healthcare, trust isn’t a luxury. It is the foundation on which everything else depends.

Brian C. Joondeph, MD, is a Colorado-based ophthalmologist and retina specialist. He writes frequently about artificial intelligence, medical ethics, and the future of physician practice on Dr. Brian’s Substack.


