There is a way which seemeth right unto a man, but the end thereof are the ways of death. – King Solomon (Proverbs 14:12 KJV)
A look at History
Those who fail to learn from History are doomed to repeat it. — George Santayana
1. For generations, philosophers have been concerned with what truth is. Many theories have been developed, but the one most commonly accepted is the “correspondence theory of truth”, which I too happen to subscribe to. According to this theory, truth is what describes the real world as it actually is. In Aristotle’s words, “To say of what is, that it is, or of what is not, that it is not, is true.” Or in more chewable words by Alfred Tarski: “The sentence ‘snow is white’ is true if and only if snow is white.” The idea here is simply that propositions are like maps, and are therefore said to be ‘true’ if what they tell us corresponds to what is on the territory.
2. At the moment, the question of what truth is seems to be a settled case. There are no serious objections to the correspondence theory of truth. [Of course there are people who think there’s no such thing as truth, but then there are also people who think Justin Bieber is a great singer. You can’t waste time arguing with those.] The bigger epistemological issue is rather how to get to the Truth. Different methods have been proposed, the first being the armchair philosophy of people like Plato, many religious founding fathers, and to a lesser extent Socrates, who thought that we can sit on our couch and get to the Truth through mere thinking and logical deduction.
The Bayesian Enlightenment
Those who ignore Bayes’ Theorem are doomed to reinvent it. – Aimable
3. History goes on and on. And then comes the Scientific Method in the 17th century. This is simply the collection of data through observation and experimentation, and the formulation and testing of hypotheses. It is, in my opinion, the greatest idea/innovation in human history. (Thanks to Francis Bacon and the giants on whose shoulders he stood.)
4. Although the idea of the Scientific Method is new, it is mirrored in many ancient stories. In the Bible, for example, Daniel conducted a complete scientific experiment with a control group and a treatment group, and properly concluded that his hypothesis (that a vegetarian diet is healthier than what the king was suggesting) was true. Earlier than that, Elijah settled the problem of Baal by conducting a well-controlled experiment, obliging Baal’s priests to put their lives where their mouths were. Hypothesizing that Jehovah is the real God, Elijah asked that the priests of Baal place their bull on an altar (control group), while Elijah would place a similar bull on a similar altar, with the same wooden fuel. Neither side would be allowed to start the fire; whichever God is real would call down fire on His sacrifice. Very scientific! Elijah even poured water on his altar to emphasize his deliberate acceptance of the burden of proof, and to push the p-value far below 0.05. The fire came down on Elijah’s altar (observation) and the people of Israel served as the peer review committee, shouting “The Lord is God!”. Acta est fabula, plaudite!
5. Although the Scientific Method has roots in Empiricism, inductive reasoning and some very clever heuristics such as Occam’s razor (see the next post for details), its theoretical foundation lies in Bayesianism.
You may have thought that Bayesianism has something to do with Bayes’ theorem (the rule of conditional probability), and it does. That, however, is an established mathematical formula which both Bayesians and their counterparts, Frequentists, use. The two schools of thought don’t differ about the formulas they use or the results they get, but about what those results mean. The central dogma of Bayesian thinking is that probability means uncertainty, or more specifically an agent’s degree of belief, and has nothing whatsoever to do with likelihood or relative frequency in the real world, contrary to what Frequentists argue. If a Bayesian says that there is a 0.3 probability that it’s raining outside, she means that she believes to a degree of 30% that it’s raining, while a Frequentist would mean that if we simulated such a day a sufficiently large number of times, about 30 out of every 100 of those days would be rainy.
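For reference, the formula that both schools share can be written as follows. The symbols H (a hypothesis) and E (a piece of evidence) are my own labels, chosen here only for illustration:

```latex
P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)},
\qquad
P(E) = P(E \mid H)\, P(H) + P(E \mid \neg H)\, P(\neg H)
```

Read the Bayesian way, P(H) is the degree of belief in H before seeing E (the prior), and P(H | E) is the degree of belief after seeing it (the posterior).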
6. There is another way to put this: probability describes an agent’s ignorance (or knowledge) about the world and has nothing to do with how the world itself is. There is an example I often use to illustrate this. Suppose that I toss a die and get the number “3”. Bob and Mike saw me tossing the die but haven’t seen the result yet. I ask them: “The number which came up is not “4”. Do you think it’s less or greater than “4”?” Note that the die has already been tossed and I know the answer very well. The truth is there in the real world. The correct probabilities here are P(less than four) = 3/5 and P(greater than four) = 2/5, because the answer is among the numbers 1, 2, 3, 5 and 6. Suppose that I further tell Bob: “Hey, you know what, I want to make things easier for you. The number is certainly not “1” or “2”.” In the light of this new information, for Bob P(less than four) = 1/3 while P(greater than four) = 2/3, because he now knows that it is 3, 5 or 6. For Mike the probabilities are still the same. Yes, the world is out there, and it is what it is. It’s either raining or not. It’s either tails or heads. It’s either a boy or a girl. Probabilities are just in our minds.
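The die example can be checked with a few lines of Python. This is only a minimal sketch: the helper `probability` and the variable names are mine, and it assumes a fair die, so every outcome an observer has not ruled out is equally believable.

```python
from fractions import Fraction


def probability(event, possible_outcomes):
    """Degree of belief in `event`, given the outcomes an observer still considers possible."""
    favourable = [o for o in possible_outcomes if event(o)]
    return Fraction(len(favourable), len(possible_outcomes))


def less_than_four(n):
    return n < 4


die = {1, 2, 3, 4, 5, 6}

# Both observers are told "the number is not 4".
mike_knows = die - {4}
# Bob is additionally told "it is not 1 or 2".
bob_knows = mike_knows - {1, 2}

p_mike = probability(less_than_four, mike_knows)
p_bob = probability(less_than_four, bob_knows)

print("Mike: P(less than four) =", p_mike)  # 3/5
print("Bob:  P(less than four) =", p_bob)   # 1/3
```

The die is the same in both calls. The only thing that differs is the set of outcomes each observer has not yet ruled out, which is exactly the sense in which the probabilities live in Bob’s and Mike’s heads.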
Go to the Territory and Update your Beliefs.
“If the fact will not fit the theory—let the theory go.”
― Agatha Christie
“In theory, theory and practice are the same. In practice, they are not.” –Jan L. A.
7. Another important aspect of Bayesianism illustrated in the above example is that, according to this world view, we ought to change our beliefs as soon as we acquire new knowledge (evidence). After all, our brains are just our map-making engines and the real world is the territory. If a map does not correspond to the territory, it’s always the map’s problem. It needs to be changed.
8. When I’m coding, for example, I suspect with high probability that there are some bugs in the code, based on my previous experience (priors). This probability has nothing to do with the actual state of the code, since either it has bugs or it doesn’t. It’s all about the information I have about the code. The only way to learn more about its actual state is to run a test. If it passes, I “update” my beliefs about the code, employing Bayes’ rule to lower the probability that it has bugs, but I’m still not quite sure. [There is no point at which I will be 100% certain that the code is bug-free.]
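Here is a rough sketch of that update in Python. All the numbers are made up for illustration (they are assumptions, not measurements), the function name `update` is mine, and the loop assumes that each passing test counts as independent evidence, which real test suites rarely guarantee.

```python
def update(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' rule: P(H | E) computed from P(H), P(E | H) and P(E | not H)."""
    numerator = p_evidence_if_true * prior
    evidence = numerator + p_evidence_if_false * (1.0 - prior)
    return numerator / evidence


# Illustrative numbers only (assumptions):
p_buggy = 0.8          # prior belief that the code has a bug
p_pass_if_buggy = 0.3  # a buggy program can still pass a given test
p_pass_if_clean = 1.0  # bug-free code always passes

history = [p_buggy]
for _ in range(5):     # five passing tests in a row
    p_buggy = update(p_buggy, p_pass_if_buggy, p_pass_if_clean)
    history.append(p_buggy)

for n_passed, belief in enumerate(history):
    print(f"after {n_passed} passing tests: P(buggy) = {belief:.4f}")
```

With these numbers, each passing test lowers the belief that the code is buggy, but the belief never reaches zero, which matches the bracketed remark above.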
That’s the essence of Bayesianism and hence of the scientific method: go to the territory, gather data and then update your beliefs. Let Nature speak for herself. You can’t just sit in your room and make a map of a city.
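The principles of rationality are laws in the same sense as the second law of thermodynamics: obtaining a reliable belief requires a calculable amount of entangled evidence, just as reliably cooling the contents of a refrigerator requires a calculable minimum of free energy. – LW
>>> import numpy
>>> from sklearn.naive_bayes import MultinomialNB
>>> # Toy placeholder data (an assumption; the original leaves X and y undefined):
>>> # four samples, two non-negative count features, two classes.
>>> X = numpy.array([[2, 0], [0, 3], [1, 0], [0, 1]])
>>> y = numpy.array([0, 1, 0, 1])
>>> clf = MultinomialNB()
>>> clf.fit(X, y)
>>> clf.predict(X)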
9. Traditionally, we have treated rational thinking, such as using the scientific method, as something to do because it’s virtuous. The rules of logic were seen as social rules, with violations interpreted as cheating. For many it’s still the same, because their practice of rationality goes only as far as their class and dinner debates go. They avoid errors of reasoning just because “if you don’t, they will call your argument fallacious”.
But science in real life has taught us to understand things differently. We should follow the rules of Bayesian/scientific thinking because we need to know how the world works, and that’s the only way. Even if the whole world votes that the law of gravity is nonsense, you will still fall off a cliff. If you don’t employ systematic Bayesian/scientific reasoning while building a robot or any other engineering artifact, it will most likely fail to run. You need evidence, you need data, not to sound reasonable but to actually accomplish something.
10. Data means a lot in my academic and professional life. My Machine Learning/AI projects obviously involve large datasets and statistical inference techniques. You can’t properly rank products or reviews for your users if you don’t take into account their previous preferences. You can’t properly predict a farmer’s production without relying on previous records. You won’t be able to properly make complex classifications without relying on previous ones. Even my experience in Information Security has shown me that, day after day, it’s all becoming more and more about data analysis. To be able to predict and then prevent cyber-threats, you need to look into large datasets about the nature and behavior of previous attacks and make sense of them. Paul Graham’s Bayesian spam filter comes to mind as a quick and simple example of a successful application of Machine Learning techniques in Infosec.
My Utopian Views
In short, the greatest contribution to real security that science can make is through the extension of the scientific method to the social sciences and a solution of the problem of complete avoidance of war.
— Edward U. Condon
11. Businesses have begun to understand the role of data. They are using it to improve our search results, translate languages, suggest products we may be interested in, improve our news feeds, accurately suggest photo tags, and suggest relevant people as “people we may know”. This cannot be achieved without rigorous analysis of consumer behaviour.
12. Call me crazy, but I dream of a world where the use of data is not limited to academia and businesses but is found in all aspects of our daily lives. Where using evidence and trial and error is the norm.
13. I dream of a world where data doesn’t necessarily have to be “big”, or institutionalized. A world where everyone understands that an open mind and evidence-driven thinking are required if we want things to work. Where one’s relationship decisions, for example, are based on an analysis of each case on its own merits, instead of relying on random principles, often from people who have never even had a relationship of their own, or from people who have a vested interest in having you make certain choices. This often leads you to take for granted what people rarely get twice. I couldn’t be any louder.
14,15. I stand by my dreams because we have examples of countries taking off against the odds because their leaders have a practical, or dare I say Bayesian, mindset. They focus on what works in the particular situation and historical context of their countries instead of ideal notions such as liberal (western) democracy, free speech, free market, “risk averse approach”, or other political science and (M)BA theories. A principle may be beautiful and appealing and “seemeth right unto a man, but the end thereof may be the ways of death.” Let that voice be heard in policy-making rooms and corporate board meetings. A true expected utility maximizer, which most leaders claim to be, should aim to maximize utility, not formality, defensibility, or methodicalness. What counts is what works. All else can be reserved for armchair philosophy and class discussions.
It’s worth noting that I’m not dismissing Frequentism entirely. I use some of its algorithms, such as least squares linear regression and expectation-maximization, when solving data analysis and ML problems. The point is that Bayesianism makes more sense in practice, and illuminates my understanding of what those frequentist methods actually mean. For example, the law of large numbers, often considered to be a curse of Bayesian thinking, actually makes sense from a Bayesian point of view. [We are almost sure that the sun will rise tomorrow because the proposition that it won’t has been constantly proven wrong in the past, raising our confidence.]
My dream goes even further: leaders should apply the Bayesian approach in those offices that have as many problems as they have employees, or those whose heads are changed almost as often as the front page of the calendar. The “fail faster” approach is Bayesian, but it should go along with an evidence-minded executive who is ready to learn from previous mistakes. That can be achieved by making a long-term plan in which each stage is given the time it needs to follow its natural course (you know, getting nine women pregnant won’t get you a child in one month), and by convincing the executive in charge that their priority is to execute the long-term plan, and not necessarily to come up with novel alternatives which are not backed by any decent field-level data. Here is what happens instead: a new executive, Bob, takes office, and one of his staffers, Mat, tells him: forget evidence, you need novel decisions. Mat shows Bob how his predecessors did close to nothing to commission the kind of solid evidence needed, and how by the time he is able to produce his own evidence base, he’ll be on to his next job. They then decide to take dogmatic action without solid evidence. And the cycle continues. Until, hopefully, Mat is no more, and Bob is a man with a multinational reputation in the field.