In the beginning, before I knew anything about statistics, I knew it was bitterly divided into two rival camps, two schools of thought that opposed each other with every fiber of their beings. These were the Bayesians and the Frequentists.
I first learned of this rivalry in a very curious way. During my undergraduate years, I discovered to my delight that universities had access to an endless archive of academic papers. These were exclusive communications among scientists that were unavailable to the general public, documents which even a clueless novice like me could download and read for free using the campus internet. Like many an academic, I fell into a regular practice of browsing for interesting papers and squirreling them away in my Read Later folder.
One day I happened upon a strange kind of paper. It didn’t describe an experiment, a study, or a mathematical derivation. It was instead a bundle of short essays, a collection of responses, one after another, to the question of which approach to statistics was best, Bayesianism or Frequentism. What drew me to this journal article was how deliciously scathing the arguments were. Experts accused other experts of misunderstanding not just Statistics 101, but the basic principles of learning and life itself, things about being alive that even a child would have grasped in their earliest years.
Dear reader, can you imagine my delight at entering this world that had previously been invisible to me, where passionate, world-class scientists argued over the nature of reality itself? It was as if I had stumbled into a glade of arch-wizards battling in the moonlight. I was transfixed.
Frequentists, as I understood them, were the establishment. They wore suits and ties. They had office jobs. Theirs was a world of committee meetings and quarterly reports, of clean lines and neat conclusions. They treated probability as something hard and factual, something you could count, and touch, and measure.
Bayesians seemed to me a little wild-eyed, bordering on the religious. It did not help that the movement traced its origins to a literal religious leader, the Reverend Thomas Bayes. They talked about mathematical elegance and logical coherence with passionate fervor, as if they were not just practically useful but moral goods in themselves. For Bayesians, probability was a deeply held personal sense of how the world tended to go, constrained only by the need to be consistent with their beliefs as a whole, and by compatibility with the available data. 
The Problem of Science
Human beings collect data because we want to learn something about the world. Every experiment is an act of faith that knowledge is possible, that the patterns of life can, at least in part, be measured and understood through reason. We assume that if we measure, record, and compare, we will find patterns, and these will reveal objective, generalizable truths about reality. We call these empirical truths science.
Suppose that we are curious to know whether use of Tylenol during pregnancy causes autism. Within our study, we find that autism appears more often among the children of women who took Tylenol during pregnancy than among the children of those who did not. Two possibilities present themselves. Either Tylenol really does cause autism, or the pattern is a coincidence. Logically, the data we observe can only have arisen either in a universe where Tylenol causes autism or in one where it does not. The challenge for the scientist is to determine which of those universes we inhabit.
Mathematically, we can approach this problem through the lens of probability. Given our data, we want to know how likely it is that we live in a universe where Tylenol causes autism. This is not absolute truth but truth conditioned on what we observe. It depends on our data. In essence, we are asking what our data say about the world.
The Power of Probability
A useful way to think about probability is as an accounting of all the ways in which the world could unfold. It represents the share of all possible pathways that lead to a particular outcome. A coin lands heads half the time because half of all possible coin-flipping pathways end with heads and half with tails. Likewise, the probability of observing our data is the total weight of all possible pathways that could have produced it, some from worlds where Tylenol causes autism, others from worlds where it does not. The probability that we live in the first kind of world is simply the share of all pathways leading to our data that correspond to a world where Tylenol causes autism.
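To put the accounting in symbols, here is a sketch of the paragraph above, writing D for our data and W1 and W2 for the two kinds of world; there is nothing here beyond standard probability:

```latex
% Total weight of all pathways that produce the data:
P(D) = P(D \cap W_1) + P(D \cap W_2)
% The share of those pathways that pass through the first kind of world:
P(W_1 \mid D) = \frac{P(D \cap W_1)}{P(D \cap W_1) + P(D \cap W_2)}
```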
If all this talk of pathways and worlds feels a bit abstract to you, I completely understand. It feels that way to me too, even after all my years of studying statistics. But think about something as ordinary as a coin flip. The coin lifts off your thumb, spinning through the air, tumbling end over end as the air pushes against it in unpredictable ways. It rises, slows, and falls back down again, landing wherever that invisible tangle of forces takes it. No two flips are ever the same, yet all that chaos resolves into a perfect fifty percent balance between heads and tails. Finding mathematical order in such apparent randomness seems impossible, yet the world is constructed in such a way that the proportion of heads across many tosses converges toward a stable, replicable number. The same deep mechanism that turns chaos into order is operative in every dataset we collect. That power to draw knowledge out of the confusion of the world is what first made me fall in love with probability and statistics.
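A quick simulation makes that convergence visible. The sketch below idealizes the flip as a fair Bernoulli trial; the seed and the checkpoints are arbitrary choices for the illustration:

```python
import random

random.seed(0)  # arbitrary seed, for a reproducible illustration

heads = 0
for n in range(1, 100_001):
    heads += random.random() < 0.5  # one idealized fair flip
    if n in (10, 100, 1_000, 10_000, 100_000):
        # The running proportion wanders early on, then settles near 0.5.
        print(f"{n:>7} flips: proportion of heads = {heads / n:.4f}")
```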
Getting back to the problem at hand, I have so far made it seem very simple. I have made it seem as if, given data from a well-designed study, we could simply plug in the numbers and know, with as much certainty as the data allow, which kind of world we live in. Not quite. The probability of getting our data via a particular causal pathway has two components. The first is the probability of being on that pathway to begin with. The second is the probability of observing our data, given that we are already on that pathway. The first is the real problem. 
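In the earlier notation, for any world W, the decomposition reads:

```latex
% Probability of reaching the data through world W:
P(D \cap W) = P(W) \cdot P(D \mid W)
% P(W): the probability of being on that pathway to begin with (the hard part)
% P(D | W): the probability of the data, given that world (the computable part)
```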
To know the true probability of getting our data, we would need to know the baseline probabilities of living in each kind of world, and that is something we can never observe directly. There are two ways to interpret these unknown probabilities. The first assumes that the inherent plausibility of a belief is an objective feature of reality. The second imagines the world itself emerging from a process of random selection among every competing possibility. In the first case, when we model these probabilities mathematically, we are attempting to model a god-like mind. In the second case, we are attempting to model God.
The Frequentist View
The Frequentist response is to accept the limitation. We acknowledge that we can never know the true probabilities and adapt our analysis to stay within those constraints. If we cannot know the true probability that a particular kind of world exists, we should not guess. Instead, we focus on what can be known, the probability of observing our data if that world were real.
Because the highest possible probability of being in any particular world is one, there is a hard logical limit on how likely any explanation can ever be. If the data are extremely unlikely within a given world, no adjustment for the probability of being in that world can make the pathway to the data plausible. Therefore, we can recognize when a probabilistic pathway provides a poor explanation for the data and treat that as evidence of its implausibility.
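That limit follows from one line of algebra. In the notation used earlier, for any world W:

```latex
% Since P(W) <= 1, the prior can only shrink a pathway's weight:
P(D \cap W) = P(W) \cdot P(D \mid W) \le P(D \mid W)
% So if P(D | W) is tiny, the pathway through W is implausible
% no matter what the unknowable P(W) turns out to be.
```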
The limitation of this approach is that we cannot meaningfully compare probabilities across different worlds. In many practical cases, the data we collect are unlikely simply because all possible outcomes are unlikely. As a result, we may rule out certain explanations as implausible when, in truth, each explanation in a logically exhaustive list of possible worlds is improbable. Yet logic still dictates that one of them must be true. In such situations, we would wish to choose the most likely explanation, but estimation of that probability lies beyond the scope of the Frequentist perspective, which regards it as fundamentally unknowable.
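To see how every exact outcome can be unlikely, consider a toy case with made-up numbers: even for a perfectly fair coin, the single most probable head count in 1,000 flips carries only about 2.5 percent probability.

```python
from math import comb

n, k = 1000, 500  # hypothetical experiment: the most probable outcome for a fair coin
prob = comb(n, k) * 0.5 ** n
print(f"P(exactly {k} heads in {n} fair flips) = {prob:.4f}")  # roughly 0.025
```

A small probability for the observed data therefore cannot, by itself, tell us which world we inhabit; it may simply reflect that the space of possible outcomes is vast.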
The Bayesian View
The Bayesian response to this lack of knowledge is to start with a guess. If we do not know how likely it is that our world has a given property, we can assign numbers that seem reasonable to us and update them as new data arrive. This process has a remarkable property. With enough data, it converges toward the same answer independent of the initial guesses. Two Bayesians, given the same statistical model and enough data, will eventually agree even if they start from very different initial beliefs. This is a mathematically provable fact.
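Here is a minimal sketch of that convergence, using a coin-bias question with conjugate Beta priors; the true bias, the priors, and the checkpoints are all hypothetical choices for the illustration.

```python
import random

random.seed(1)   # arbitrary seed, for reproducibility
true_bias = 0.7  # the hidden truth, used only to generate the data

# Two Bayesians with sharply different initial beliefs about the bias,
# each encoded as a Beta(alpha, beta) prior over the probability of heads.
optimist = [20.0, 2.0]  # starts out convinced the coin favors heads
skeptic = [2.0, 20.0]   # starts out convinced the coin favors tails

for n in range(1, 10_001):
    flip_is_heads = random.random() < true_bias
    for belief in (optimist, skeptic):
        belief[0 if flip_is_heads else 1] += 1  # conjugate Beta-Binomial update
    if n in (10, 100, 1_000, 10_000):
        means = [a / (a + b) for a, b in (optimist, skeptic)]
        print(f"after {n:>6} flips: posterior means = {means[0]:.3f} and {means[1]:.3f}")
```

With ten flips the two posteriors still echo their priors; by ten thousand, both sit close to the true bias.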
What remains uncertain is what this convergence really means. What we can say for sure is that the answer to which Bayesian procedures converge is consistent with the results we would obtain in a world that truly arose out of random selections of causal rules. In other words, if the universe had in fact been assembled through such a process of chance, Bayesian reasoning would still uncover its underlying structure. For those who view probability as a way of quantifying the plausibility of beliefs rather than a literal feature of nature, this result is encouraging because it shows that rational methods of reasoning about uncertainty align with the way the world would behave if chance were real. Whether that description fits our own world is something we cannot know, though in simple cases such as coin flips, the mathematical theory seems to match our physical reality exactly.
Despite its mathematical elegance, Bayesian reasoning rests on a very particular notion of truth. We know that, given enough evidence and the same model, different practitioners will eventually reach the same conclusion, but the framework says little about what these probabilities mean along the way. In practice, each scientist’s initial assumptions can shape the results, a feature that seems at odds with the ideal of objectivity in science. It also extends probability to questions that are, in principle, unknowable, making its claims difficult to falsify. For these reasons, Bayesian methods are more tightly regulated in legal and medical settings than their Frequentist counterparts.
The Moral of the Story
Historically, the differences between the major schools of statistical thought have been framed as philosophical, grounded in questions about the nature of knowledge. Bayesianism is often described as a form of rational subjectivity, treating probability as a property of the mind, constrained by internal consistency and by the data. Frequentism is cast as its opposite, treating probability as an objective feature of the world and resisting any extension of statistical reasoning beyond what can be directly observed.
In truth, these approaches are responses to a concrete scientific problem: how to infer general rules from limited data when the underlying propensities of the universe are unknown. Any method for doing this, even our intuitive sense of what is true, requires us implicitly or explicitly to assign some degree of plausibility to the possibilities before us. Any method of applying mathematics to the physical world requires a logically coherent response to the problem of inferring generalizable truths from noisy observations. Bayesianism and Frequentism are not fanciful philosophical indulgences but practical necessities, forced upon us by the central challenge of science, the task of drawing meaningful conclusions from data.

