Artificial intelligence can now generate videos, images, and audio that closely resemble real people. These creations, known as deepfakes, are not edited recordings. They are entirely synthetic outputs produced by machine learning models trained on large amounts of data.
How AI Deepfakes Are Created
Deepfakes are generated using generative models: neural networks trained on massive datasets of faces, voices, and movements. These models don’t simply copy existing recordings. Instead, they learn the patterns that make people look, sound, and behave the way they do, then use that knowledge to synthesize new, convincing content.
Most modern deepfake systems are built on one of two main model types: generative adversarial networks (GANs) and diffusion models.
GANs function as a sort of digital sparring match. On one side, there’s a generator trying to create synthetic content that could pass as real. On the other, there is a discriminator that works to tell real from fake. The generator improves by learning to fool the discriminator, and over time, the synthetic outputs become more lifelike.
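To make that push-and-pull concrete, here is a minimal sketch of a GAN training loop in PyTorch. It uses a toy one-dimensional dataset rather than face images, and the network sizes, learning rates, and step count are illustrative assumptions, not taken from any real deepfake system.

```python
import torch
import torch.nn as nn

# Toy "real" data: samples drawn from N(4, 1); a real system would use face images.
real_data = lambda n: torch.randn(n, 1) + 4.0

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))  # generator: noise -> sample
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))  # discriminator: sample -> logit

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(2000):
    # Discriminator pass: score real samples as real (1) and generated samples as fake (0).
    real = real_data(64)
    fake = G(torch.randn(64, 8)).detach()          # detach: don't update G on this pass
    loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator pass: produce samples the discriminator scores as real.
    fake = G(torch.randn(64, 8))
    loss_g = bce(D(fake), torch.ones(64, 1))       # the "fool the discriminator" objective
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

print(G(torch.randn(5, 8)).detach().squeeze())     # samples should drift toward ~4
```

The same dynamic applies at scale: the generator only improves because the discriminator keeps raising the bar.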
In face-swapping systems, for example, the model learns a shared representation of facial features, which allows it to overlay one person’s expressions and movements onto another, while keeping key elements of their identity intact.
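As a rough illustration of that shared representation, the sketch below follows the classic autoencoder-style face-swap layout: one shared encoder that captures expression and pose, plus a separate decoder per identity. The layer sizes and 64x64 frames are placeholder assumptions; real pipelines use convolutional networks and extensive training.

```python
import torch
import torch.nn as nn

LATENT = 64  # size of the shared latent space (illustrative)

# One encoder shared across identities; one decoder per person.
shared_encoder = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 256), nn.ReLU(),
                               nn.Linear(256, LATENT))
decoder_a = nn.Sequential(nn.Linear(LATENT, 256), nn.ReLU(), nn.Linear(256, 64 * 64))
decoder_b = nn.Sequential(nn.Linear(LATENT, 256), nn.ReLU(), nn.Linear(256, 64 * 64))

def swap_a_to_b(frame_a):
    """Render person A's expression and pose with person B's appearance."""
    z = shared_encoder(frame_a)            # expression + pose in the shared latent space
    return decoder_b(z).view(-1, 64, 64)   # decoded with B's identity-specific decoder

fake = swap_a_to_b(torch.rand(1, 64, 64))  # untrained here; training would use
print(fake.shape)                          # per-identity reconstruction losses
```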
Early GAN-based systems weren’t exactly subtle. They often left behind telltale signs such as odd lighting, flickering eyes, or blink rates that felt just a bit off. But improvements in training data, architecture design, and post-processing have polished these rough edges. Today’s outputs are far more refined and, in many cases, harder to distinguish from real footage.
That said, deepfake technology has continued to evolve, and many of the latest systems now rely on diffusion models.
These work quite differently. During training, the model learns how images degrade when noise is added gradually, and, crucially, how to reverse that process. When generating content, it starts with pure noise and slowly refines it, step by step, until a realistic image or video frame emerges. The result tends to be smoother, more stable visuals than what earlier GAN-based methods produced.
What makes diffusion models especially versatile is that they can be steered. Identity information, text prompts, or even audio cues can guide the generation process, making the output highly controllable. That flexibility, however, poses new challenges for detection tools, many of which were originally tuned to spot the specific flaws left behind by GANs.
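The sketch below illustrates the idea in plain NumPy: a linear noise schedule, a forward function that blends clean data with noise, and a DDPM-style reverse loop that starts from pure noise. The `predict_noise` function is a placeholder for the trained denoising network (and its conditioning inputs), so the output here is meaningless; the point is the shape of the process, not a working generator.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000
betas = np.linspace(1e-4, 0.02, T)       # noise schedule (assumed linear)
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)

def q_sample(x0, t, noise):
    """Forward process: blend a clean sample x0 with Gaussian noise at step t."""
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise

def predict_noise(x_t, t, cond=None):
    """Placeholder for a trained denoising network (e.g. a U-Net).
    In a real system this predicts the noise in x_t, optionally guided by
    conditioning signals (identity embeddings, text, audio). Here it just
    returns zeros so the script runs end to end."""
    return np.zeros_like(x_t)

# Sampling: start from pure noise and refine it step by step.
x = rng.standard_normal((8, 8))          # stand-in for an image
for t in reversed(range(T)):
    eps = predict_noise(x, t, cond="identity/text/audio embedding")
    mean = (x - betas[t] / np.sqrt(1.0 - alpha_bar[t]) * eps) / np.sqrt(alphas[t])
    noise = rng.standard_normal(x.shape) if t > 0 else 0.0
    x = mean + np.sqrt(betas[t]) * noise  # DDPM-style reverse update
```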
Creating a convincing deepfake video requires consistency over time. Lip movements need to align precisely with speech. Lighting must remain stable from frame to frame. Head and eye motion should feel natural and fluid.
To manage all this, many systems combine visual generators with audio processing networks. Some use transformer-based models to match speech sounds (phonemes) with mouth shapes (visemes). Others apply motion-tracking or smoothing techniques to keep everything coherent and believable as the video plays out.
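A toy example of those two ingredients is sketched below: a hard-coded phoneme-to-viseme lookup standing in for the learned alignment models, and an exponential moving average standing in for motion smoothing. Both the mapping table and the smoothing factor are illustrative assumptions, not values from any real system.

```python
import numpy as np

# Illustrative phoneme -> viseme lookup; real systems learn this mapping
# (often with transformer models) rather than using a fixed table.
PHONEME_TO_VISEME = {
    "AA": "open", "IY": "spread", "UW": "rounded",
    "M": "closed", "B": "closed", "P": "closed", "F": "lip-teeth",
}

def visemes_for(phonemes):
    return [PHONEME_TO_VISEME.get(p, "neutral") for p in phonemes]

def smooth_landmarks(frames, alpha=0.6):
    """Exponential moving average over per-frame landmark coordinates: a simple
    stand-in for the motion-smoothing step that keeps generated faces from
    jittering between frames."""
    out = [frames[0]]
    for f in frames[1:]:
        out.append(alpha * out[-1] + (1.0 - alpha) * f)
    return np.stack(out)

frames = np.cumsum(np.random.randn(30, 68, 2) * 0.5, axis=0)  # jittery landmark track
print(visemes_for(["M", "AA", "IY"]))   # ['closed', 'open', 'spread']
print(smooth_landmarks(frames).shape)   # (30, 68, 2)
```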
The end result is synthetic media that moves, reacts, and responds in ways that mimic real human behavior with remarkable accuracy.
Synthetic Harm and Public Trust
Now, with this level of accuracy comes a growing sense of apprehension.
Much of the public concern around deepfakes has focused on abuse, particularly non-consensual intimate imagery, political misinformation, and the erosion of trust in audio-visual evidence. These issues are real and can have serious implications. Studies have shown, for instance, that this synthetic harm disproportionately affects women, and that the growing use of deepfakes in disinformation campaigns has contributed to rising public skepticism toward legitimate media.
But beyond personal and political harm, deepfakes introduce a broader epistemic challenge: they destabilize the credibility of recorded evidence. When people can no longer be sure what’s real, trust in journalism, democratic institutions, and even scientific data begins to erode.
Scientific Integrity and Synthetic Data
One of the most pressing challenges is the potential misuse of generative AI to produce scientific data, medical imagery, or entire datasets that appear authentic but are fabricated.
A 2025 PNAS article warns that researchers, companies, or regulators may use generative models to fabricate results that appear methodologically sound. These synthetic datasets might then be presented as the outcome of actual experiments in an attempt to mislead the scientific community.
The risks include:
- Irreproducible results: Fabricated data may influence others who attempt to build on it, leading to wasted effort and misleading conclusions.
- False confidence in findings: Charts, figures, and tables may appear methodologically sound but could be entirely synthetic.
- Privacy breaches: Even when datasets are "synthetic," if the generative models were trained on real human data, it’s possible to reverse-engineer sensitive personal details (a simple screening heuristic for this risk is sketched after this list).
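As a hedged illustration of that last risk, the sketch below flags synthetic records that sit suspiciously close to real training records, one simple memorization check. The feature dimensions, threshold, and the `nearest_real_distance` helper are hypothetical; a real privacy assessment would rely on formal techniques such as membership-inference testing or differential privacy.

```python
import numpy as np

def nearest_real_distance(synthetic, real):
    """For each synthetic record, the Euclidean distance to its closest real
    training record. Synthetic rows that sit unusually close to a real row are
    a red flag that the generator may have memorised (and could leak) that
    individual's data. A toy screening heuristic, not a privacy guarantee."""
    dists = np.linalg.norm(synthetic[:, None, :] - real[None, :, :], axis=-1)
    return dists.min(axis=1)

real = np.random.rand(500, 8)                    # stand-in for real patient features
synthetic = np.vstack([np.random.rand(95, 8),    # mostly novel samples...
                       real[:5] + 1e-3])         # ...plus a few near-copies
d = nearest_real_distance(synthetic, real)
print("suspiciously close synthetic rows:", int((d < 0.01).sum()))
```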
These scenarios blur the boundary between real and synthetic harm. A fabricated MRI image, for instance, might influence diagnosis or treatment decisions, not through malice, but because ethical safeguards were absent.
This raises urgent questions about professional trust, the evidentiary standards in science, and the long-term credibility of academic publishing.
A Case Study: Deepfakes in Forensic Science
One area of particular concern is forensic science, a field that depends heavily on the integrity of digital evidence. Forensic practitioners increasingly face the task of verifying the authenticity of video and audio content that may have been synthetically manipulated.
As deepfakes become more realistic, traditional methods of detection, such as identifying visual artefacts or irregular speech patterns, are no longer sufficient.
A 2025 systematic review of deepfakes in forensic science (Loovens & Tinmaz, Forensic Imaging) surveyed 36 studies from 2021 to 2024, identifying a clear surge in academic attention to this issue. Key insights from the review include:
- Threat to evidence credibility: Deepfakes threaten to erode the reliability of digital evidence, which could compromise the fairness of legal proceedings.
- Detection methods: Current forensic techniques include convolutional neural networks (CNNs), spatio-temporal modeling, and frequency analysis (a toy example of the latter is sketched after this list), all aimed at spotting subtle inconsistencies in synthetic media. However, these models vary widely in methodology, accuracy, and reproducibility.
- Lack of standardization: A major limitation across the reviewed studies was the absence of a unified framework for detecting and reporting deepfakes in forensic contexts.
- Psychological and legal impact: The possibility of being falsely depicted or misrepresented in deepfaked evidence creates psychological harm and legal uncertainty for both victims and professionals.
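To give a flavour of the frequency-analysis approach mentioned above, the sketch below computes one toy spectral feature, the share of image energy at high spatial frequencies, that a detector might feed into a classifier. The cutoff value and the use of a random array as a stand-in image are assumptions; this is not a validated forensic test.

```python
import numpy as np

def highfreq_energy_ratio(image, cutoff=0.25):
    """One simple frequency-analysis statistic: the share of spectral energy
    above a cutoff radius. GAN-generated images have been reported to show
    characteristic high-frequency artefacts; this ratio is a toy feature a
    detector might use, not a validated forensic test on its own."""
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(image))) ** 2
    h, w = image.shape
    yy, xx = np.mgrid[-h // 2:h - h // 2, -w // 2:w - w // 2]
    radius = np.sqrt((yy / (h / 2)) ** 2 + (xx / (w / 2)) ** 2)
    return spectrum[radius > cutoff].sum() / spectrum.sum()

# In practice this feature would be computed for many known-real and known-fake
# images and combined with CNN and spatio-temporal features in a classifier.
img = np.random.rand(256, 256)   # stand-in for a greyscale face crop
print(round(highfreq_energy_ratio(img), 3))
```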
The review highlights the need for technological innovation, but also stresses the importance of interdisciplinary collaboration across forensic science, computer vision, ethics, and law. It offers several recommendations for addressing these issues, including:
- Developing benchmark datasets and standardized evaluation metrics
- Promoting explainable AI in forensic settings
- Integrating forensic deepfake detection into courtroom admissibility standards
Without these safeguards, deepfakes could be used to falsify evidence, discredit witnesses, or undermine legal procedures, all of which jeopardize the justice system’s integrity.
As the review concludes, maintaining the credibility of digital evidence will require not only better detection tools, but also strong ethical frameworks and clear legal guidelines tailored to forensic applications.
Responding to the challenges of synthetic harm requires a coordinated effort across technology, law, academia, and civil society.
Regulatory and Ethical Responses
Right now, there aren’t many formal rules that directly address how synthetic media can or can’t be used in scientific work. Some legal efforts focus on malicious deepfakes, like those used in harassment or political misinformation, but when it comes to science, the picture is murkier. Was the synthetic data used responsibly? Was it clearly labeled? Or was it passed off as real? These questions aren’t always easy to answer.
This is where journals, funders, and institutions need to step up.
For example, publishers could ask researchers to disclose when generative tools were used, build in screening for altered images or figures, or request access to source data when something seems off. Conferences and preprint platforms could do the same, maybe through simple checklists or clearer ethics policies.
Ethics boards also have a growing role to play here. As AI tools become part of everyday research, questions around consent, privacy, and how data is generated get more complicated. If a synthetic medical image closely resembles a real patient, even unintentionally, that still raises real concerns. We need better guidance around these gray areas.
Education is just as important. Most researchers haven’t been trained to spot synthetic manipulation or think through its implications. Building AI literacy and digital ethics into scientific training could make a big difference, helping people use these tools thoughtfully and flag potential issues early.
And because science is global, the solutions need to be too. That means more collaboration across borders, between journals, universities, funders, and ethics bodies, to create consistent, practical standards for how we handle synthetic content in research.
If we’re going to bring these technologies into science (and in many ways, they’re already here), we need the guardrails to match. Not to slow things down, but to make sure innovation doesn’t come at the cost of credibility.
Conclusion
The ethical risks of synthetic harm in science aren’t something we can afford to ignore. As deepfake technologies become more advanced and easier for anyone to use, they start to chip away at something science depends on: trust.
Deepfakes can damage democratic processes, invade personal dignity, and blur the line between what’s real and what’s fabricated. In a scientific context, that’s especially worrying. A manipulated clinical image or a forged video presented as evidence can distort decisions, outcomes, and public confidence.
So the response can’t be panic or blanket bans. It needs to be measured and well thought out. We have to build frameworks that recognize the difference between harmful misuse and legitimate innovation. Above all, the focus should stay on protecting people, preserving reliable evidence, and maintaining accountability in scientific work.
Ultimately, whether these technologies strengthen scientific discovery or slowly erode its foundations will depend on how deliberately and responsibly we choose to handle them now.
References and Further Reading
- Fisher, S. A. et al. (2024). Moderating Synthetic Content: The Challenge of Generative AI. Philosophy & Technology, 37(4), 133. DOI:10.1007/s13347-024-00818-9. https://link.springer.com/article/10.1007/s13347-024-00818-9
- Singh, S., & Dhumane, A. (2025). Unmasking digital deceptions: An integrative review of deepfake detection, multimedia forensics, and cybersecurity challenges. MethodsX, 15, 103632. DOI:10.1016/j.mex.2025.103632. https://www.sciencedirect.com/science/article/pii/S2215016125004765
- Umbach, R. et al. (2024). Non-Consensual Synthetic Intimate Imagery: Prevalence, Attitudes, and Knowledge in 10 Countries. In Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI '24), Honolulu, HI, USA. ACM, New York, NY, USA. DOI:10.1145/3613904.3642382. https://dl.acm.org/doi/10.1145/3613904.3642382
- Pawelec, M. (2022). Deepfakes and Democracy (Theory): How Synthetic Audio-Visual Media for Disinformation and Hate Speech Threaten Core Democratic Functions. Digital Society, 1(2), 19. DOI:10.1007/s44206-022-00010-6. https://link.springer.com/article/10.1007/s44206-022-00010-6
- Vaccari, C., & Chadwick, A. (2020). Deepfakes and Disinformation: Exploring the Impact of Synthetic Political Video on Deception, Uncertainty, and Trust in News. Social Media + Society. DOI:10.1177/2056305120903408. https://journals.sagepub.com/doi/10.1177/2056305120903408
- Clark, S., & Lewandowsky, S. (2026). The continued influence of AI-generated deepfake videos despite transparency warnings. Communications Psychology, 4(1), 13. DOI:10.1038/s44271-025-00381-9. https://www.nature.com/articles/s44271-025-00381-9
- Resnik, D. B. et al. (2025). GenAI synthetic data create ethical challenges for scientists. Here’s how to address them. Proceedings of the National Academy of Sciences of the United States of America, 122(9), e2409182122. DOI:10.1073/pnas.2409182122. https://www.pnas.org/doi/10.1073/pnas.2409182122
- Loovens, J., & Tinmaz, H. (2025). A Systematic Literature Review of Deepfakes in Forensic Science. Forensic Imaging, 43, 200647. DOI:10.1016/j.fri.2025.200647. https://www.sciencedirect.com/science/article/abs/pii/S2666225625000259
- Amerini, I. et al. (2025). Deepfake Media Forensics: Status and Future Challenges. Journal of Imaging, 11(3), 73. DOI:10.3390/jimaging11030073. https://pmc.ncbi.nlm.nih.gov/articles/PMC11943306/