It is finally here! Our data-packed, evidence-based book on major issues affecting the health of the U.S. population, including smoking, diet, and physical activity, and on the policy options to move us in the right direction, is now available. You can download a no-cost PDF version of this book (and other books from the Roadmap series) from the website of Arizona State University’s Healthcare Delivery and Policy Program. A paperback version is also available from Amazon (no profits to us). We hope that this book will be useful to a wide range of people interested in the topics of population health, physical activity, exercise and diet. We have focused on basic data related to these topics and what policies might be used to promote healthier lifestyles for both individuals and society as a whole.
A couple of weeks ago Ethiopia’s Genzebe Dibaba broke the women’s world record for the 1500m run with a time of 3:50.07. The believability of this performance will certainly be questioned because most of the women’s world records in track and field have been stagnant for decades and date to the era of industrial strength doping in the 1980s and 90s. The 1500 record was set by a Chinese athlete in 1993 who was almost certainly doping. Many of the men’s distance running records are also “old” and occurred after the emergence of the blood boosting drug EPO in the late 1980s and before the advent of better (but far from perfect) drug testing regimens in the later 2000s.
A reasonable rule of thumb is that world records in women’s middle and long distance running “should” be on the order of about 11-12% slower than men’s. This is based on the fact that maximal aerobic power is typically that much lower in elite women than men, while other key physiological factors that determine running performance, such as lactic acid buildup and running efficiency, are generally similar. The fastest time by a man for 1500m in the pre-EPO era was set by Said Aouita at 3:29.46 in 1985! The best time since drug testing got better is 3:26.69 by Asbel Kiprop of Kenya set earlier this year (the world record for men is 3:26 set by Hicham El Guerrouj in 1998).
Historically even better performances, but not faster times, were achieved by Jim Ryun and Kip Keino in the late 1960s. Ryun ran a 3:33.1 on a cinder track at the LA Coliseum in 1967. It was also hot that day. A modern optimally tuned track might be worth 3% and if you adjust Ryun’s performance you get an estimated time of about 3:26 and change.
An even more remarkable performance came a year later when Kip Keino ran 3:34.9 at high altitude to win the gold medal at the Mexico Olympics. Mexico City has an altitude of almost 7,400 feet (2,250m), and the best data suggests that lack of oxygen at that altitude should reduce aerobic power by about 10%. Now Keino was altitude adapted because he had spent his life in the highlands of Kenya, but adaptation only gets you so much. So if we are conservative and adjust his performance by 5% an estimated time just over 3:24 seems “possible”. Old school “point tables” from the 1960s and early 70s also suggest that the 5000m times run by Dibaba and also her world record holding sister equate to times under 3:50.
Which brings me back to Dibaba and the women’s 1500m record. Her time is a little more than 12% slower than what Keino might have run and between 11 and 12% slower than the projection for Ryun. It is just over 11% slower than the best time for men since drug testing got better. There are all sorts of reasons to be suspicious of any world record in sports like track and cycling, and the East Africans have done their share of doping. However, given the analysis above, Dibaba’s record seems like it is at the edge of believable to me.
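For readers who want to check the arithmetic, here is a minimal Python sketch of the adjustments and comparisons above. It assumes that a track "worth 3%" and an altitude adjustment of 5% both mean multiplying the finishing time by 0.97 and 0.95 respectively, and that "percent slower" is measured relative to the men's time. The function names are my own and purely illustrative.

```python
def to_seconds(mark: str) -> float:
    """Convert a race time written as 'M:SS.ss' into seconds."""
    minutes, seconds = mark.split(":")
    return int(minutes) * 60 + float(seconds)


def adjust(mark: str, pct_faster: float) -> float:
    """Assumed adjustment: shave pct_faster percent off the time (in seconds)."""
    return to_seconds(mark) * (1 - pct_faster / 100)


def pct_slower(womens_mark: str, mens_seconds: float) -> float:
    """How much slower the women's time is, as a percent of the men's time."""
    return (to_seconds(womens_mark) / mens_seconds - 1) * 100


DIBABA = "3:50.07"

# Ryun, 1967, cinder track: assume a modern track is worth 3%.
ryun_adjusted = adjust("3:33.1", 3)
# Keino, 1968, Mexico City: conservative 5% altitude adjustment.
keino_adjusted = adjust("3:34.9", 5)
# Kiprop, best time since drug testing got better: no adjustment.
kiprop = to_seconds("3:26.69")

gap_keino = pct_slower(DIBABA, keino_adjusted)
gap_ryun = pct_slower(DIBABA, ryun_adjusted)
gap_kiprop = pct_slower(DIBABA, kiprop)

for name, gap in [("Keino", gap_keino), ("Ryun", gap_ryun), ("Kiprop", gap_kiprop)]:
    print(f"Dibaba vs. {name}: {gap:.1f}% slower")
```

Under these assumptions the adjusted times come out near 3:26.7 for Ryun and just over 3:24 for Keino, and all three gaps land between 11% and 13%, which is in line with the rule of thumb. A different reading of "worth 3%" (for example, dividing by 1.03 instead of multiplying by 0.97) would shift the estimates slightly.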
I have recently had the opportunity to hear tech industry leaders discuss how the combination of gene sequencing in large populations plus various forms of “big data” were going to transform medical knowledge, medical practice, and ultimately public health. To be frank these have been pretty standard recitations of the catechism that once we know your genome and link it to enough data about you we will be able to Predict and Prevent most diseases and/or Personally (or Precisely) treat them in a way that maximizes your Participation in all of the relevant decision making and outcomes. This general scheme has been called P4 Medicine.
As I heard these recitations, a couple of things hit me and I began to wonder just how insulated the major players in the tech world are from medical and biological reality. So I will list a few concepts for the techies to consider.
- It is all about MAGOTS or multiple assorted genes of tiny significance. This is a term coined by the writer David Dobbs and is a pretty good description of the fact that for most common diseases a clear picture of how genetic factors contribute to them has not emerged even when hundreds of thousands of people have been studied. It also seems like the picture is not going to get a whole lot clearer when millions of people are studied. So the signal might not be there. There are also a host of pretty straightforward statistical considerations about what makes a useful clinical test that the tech folks may not have been exposed to. Giving people useful advice based on a biomarker is more than just considering the odds associated with a gene variant. For many common diseases so-called gene scores don’t improve risk prediction much if any over conventional means.
- For some uncommon and very rare diseases seen in children, gene sequencing is providing insights into causes. Unfortunately, many of these tragic diseases are essentially one-offs and it is unlikely that knowledge of the gene defect is going to lead to breakthrough therapies. Gene therapy has been a bust so far and there are currently no licensed products in spite of 25 plus years of strong efforts in the area. There have been reports of some niche successes but it is unclear how long lasting they will be.
- In tech there is something called Moore’s Law about the computing power of semiconductors doubling every 24 or so months. In drug development there is something called Eroom’s Law that describes how, in spite of all the advances in molecular biology and omics, it is getting harder and harder and more expensive to develop new drugs – the reverse of Moore’s Law. There are many potential reasons for this, but unlike most tech things the cost to develop and market new drugs is not coming down, it is skyrocketing. The chart below shows this. Maybe if the techies study up on this chart they will understand they are dealing with a different animal and that what they think about when they think about hardware, search engines, apps, big data, and gizmos of various sorts doesn’t apply to biology and medicine. Bill Gates for one seems to be coming to that realization, but it only took ten years.
- Whatever the limitations in the biology, no worries for the techies. They can just use big data approaches to mine medical records and the smart watch monitors that “everyone” will soon be wearing. The problem here is that electronic medical records are primarily billing, coding, and compliance documents. The quality of the data has far more limitations than is generally known. As for all of this remote monitoring, first people actually have to wear the monitors, second the information has to be reliable, and third people then might have to change their behaviors based on all of this monitoring. There are a lot of what-ifs in all of this and it is unclear just how willing most people are to be actively or passively monitored. More importantly, all sorts of people know they need to not smoke, exercise more, and eat less but getting them to do it is going to be a challenge. Maybe the gizmos will work, but my bet is they will end up like a lot of exercise equipment that gets bought, used for a while, and then ends up stored in the basement. Sort of like “all diets work” provided people adhere to them.
- Of course one of the promises of tech is that all of this is going to reduce costs. Well, as mentioned above the costs of developing drugs are going up and for cancer the price of new drugs is unrelated to outcomes. There is also evidence that getting a gene screen leads to more, not less, medical usage by anxious people who in reality have nothing to worry about. Then there are likely to be a large number of people in what might be called the genomic twilight zone, with tests that are a little off and no clear course of preferred action. Also, if people do choose to take action, at least some of these actions like extra scans, tests, and biopsies are not without risk. They also will increase costs. Monitors that track people and get people to change behavior might work, if people use them.
- Now we can forgive the techies for not knowing much biology and not having full knowledge of the limitations of the biological ideas underpinning P4 medicine. However, shouldn’t we expect them to know about the limitations of “big data”? Robert McNamara – at some level the inventor of big data – attempted to “manage” the Viet Nam War on the basis of metrics, analytics and hard data. He had tried to do the same when he was the president of Ford Motor Company and in both cases, but especially Viet Nam, his approach became a sort of tragic cult of data unrelated to reality. The chart below summarizes what has been termed the McNamara Fallacy and is one I use in my talks to academic audiences all over the world on these topics. To me it summarizes many of the perils of big data.
Ultimately, the techies have a lot of money and a lot of toys and a lot of influence. However, it is unclear if they have any insight into what they don’t know or the inherent limitations of their “model”. The blind faith they have in their world view and their self-image as modern day frontiersmen creating a better world is also a disturbing echo of Robert McNamara.
Over the last few months I have run across a couple of ideas — really catchy phrases — that are influencing the way I think about trends and hopefully progress (or lack of it) in medicine. The phrases are idea bubbles, biological plausibility, and bio-babble.
Idea Bubbles & Alzheimer’s Disease
I ran across the phrase idea bubbles when I was doing a web search on the amyloid hypothesis for Alzheimer’s Disease. The idea that first emerged in the 1990s is that a buildup of amyloid proteins in the brain is central to the development of Alzheimer’s. This has led to the development of animal models that generate excess amyloid in their brains and also drugs that either slow the buildup or help clear it. It has also led to a number of promising early stage human drug trials in patients with Alzheimer’s that have ultimately failed in larger trials. At this time there are no effective approved anti-amyloid therapies on the market in spite of this vast effort.
All of this was reviewed in a great blog post on Forbes by David Grainger, in which he discusses why the hypothesis lives on to fight another day and why drug companies, investors and the scientific community are continuing to make “large bets” on the amyloid hypothesis:
— There are a few important lessons from this sorry tale, that extend well beyond Alzheimer’s Disease. It highlights the danger of what I previously called “idea bubbles” – that a hypothesis gains so much credibility over a long period of time that even when the data tells you otherwise, adherents (acolytes may be a better word) question everything but the hypothesis.–
As I drilled down I found the definition of an idea bubble and how it relates to better known bubbles like stock market bubbles.
— An “ideas bubble” occurs when, over a long period of time, positive social feedback disconnects the perceived validity (of the idea) from the real underlying validity – in the same way price and value dissociate in a stock market bubble. –
Bubbles are sustained by gold rush mentalities, optimism, the fact that careers have been invested in one thing or another, and the general problem of sunk costs.
Bubbles similar to Alzheimer’s and amyloid have occurred for all sorts of cancer therapies (surgery, radiation, chemo) over the last hundred years, and I wonder if the next big bubble is going to be the precision medicine bubble. As the noted public health expert David Hunter recently pointed out:
— “In searching for a cure for cancer, we have repeatedly climbed on various bandwagons. They include the radical mastectomy for breast cancer, high-dose chemotherapy, immunotherapy, and — more recently — molecularly “targeted” therapies. In each case, it took someone with courage to point out the limitations or futility of the approaches.
Hope is critical to cancer patients and those treating them, but hope that is not rooted in the facts risks becoming an illusion. As Mikkael Sekeres of the Cleveland Clinic has commented, we should not delude ourselves into believing targeted therapies will be a panacea for cancer treatment.”
Bubbles & Biological Plausibility
One of the things required for a bubble (or perhaps a bio-bubble) to take off is a narrative that makes biological sense and can then underpin a big idea. In a great editorial in the British Medical Journal, David Healy traces how the serotonin hypothesis emerged as an explanation for depression and led to the generation of whole new classes of drugs with marginal efficacy that have been vastly over-used. Another good example is the idea that “free-radicals” cause cancer, aging, and heart disease and that taking anti-oxidants can make people healthier. In fact big clinical trials show the opposite: anti-oxidants can be associated with worse instead of better outcomes. However, the theory lives on as do the pitches. Each of the cancer therapies mentioned above had at the time of their adoption a tight and biologically plausible back story.
The biology of most hard to treat or cure diseases is complicated and usually defies a simple linear story. However, we persist in seeking such stories. One example that hits home for me as an anesthesiologist and physiologist is what has been described as the “Cult of the Swan-Ganz Catheter”. In the early 1970s it became possible to routinely put big catheters into the hearts of most patients in intensive care units. The idea was that by carefully and precisely measuring the pressure, oxygen levels, and blood flow at various places in the heart, “goal directed” therapy could be used to give just the right amount of fluid and drugs to patients. This would then improve outcomes for hard to treat diseases like heart failure or severe infections.
Sounds like a good idea, but it has not worked. What is also interesting is that the general narrative appeared in the early 70s, was questioned in the middle 80s, was not really evaluated objectively on a large scale until the 90s, and a firm consensus about the limitations of these catheters only really emerged in the 2000s with an “obituary” written in 2013.
As an aside, when I was a resident in the late 1980s, the placement of these catheters in the ICU was almost like a religious ceremony or sacrament. The more senior Drs. served as high priests while the younger interns and medical student acolytes watched on and waited for ordination and their chance at the altar.
Ideas that are perhaps too good to be true die hard. This is true in medicine and health in so many ways. That some of the “smartest people around” continue to fall into the same cognitive traps over and over again should make us all think twice before jumping on any bandwagons that are “sure” to cure anything.
There has been yet another wave of “too much” exercise stories in the media based on a recent study of 1 million women from the UK. The idea is that moderate levels of physical activity most days with occasional bouts of strenuous activity can cause big drops in both cardiovascular and all-cause mortality. However, doing a lot of hard training is not as beneficial.
This topic has been recycling for the last couple of years. Alex Hutchinson (who has a Ph.D. in physics) has done an excellent numerical/statistical breakdown on one of the key studies that “supports” the too much exercise hypothesis. Put simply there are many limitations to the whole argument. I have done a couple of posts on what both the epidemiology and physiology tell us on the topic. The first was in 2012 and another one with Brad Stulberg in 2014. I too am a skeptic.
I am in the camp that 30-60 minutes of physical activity most days is the sweet spot for general health and that more is not better, but it is not worse either. Those who really push it most days are also likely motivated by things other than return on investment thinking about their health. Perhaps they want to compete in races or are into pushing themselves to meet more hard core physical goals.
The Swedish Skiers
Whenever this topic comes up I also bring up a paper that followed about 50,000 male finishers of the 90km (~55 miles) Vasaloppet cross country ski race in Sweden. This study used the Swedish medical records system to look at mortality in the race finishers. In preparing for an upcoming talk on exercise and health, I asked my colleague Andy Miller to generate some figures from the skiing study. The one below shows that mortality among race finishers is about 50% or less of the expected rate gleaned from Swedish population records. It also shows that finishing more races was not associated with an uptick in mortality; if anything it was associated with a downtick. Who knows exactly what these folks were doing, but those who finished a number of races certainly had to be doing a lot of strenuous training over many years.
I have repeatedly asked those in the “too much” camp to rebut this paper and point out any major flaws in it. The bottom line is that it is at least as strong as the studies “supporting” the too much exercise hypothesis. Until data comes along that clearly refutes the data in the chart above I will remain a skeptic.
There is a fascinating recent study from Finland on pairs of identical twins with very different exercise habits. This is unusual because widely divergent behavior patterns between identical twins are uncommon. There were some pretty striking differences in things like exercise capacity, metabolism and even brain structure in the active vs. inactive twins showing that even when the “genes are the same” behavior can really make a difference. The details of the paper were beautifully summarized by Gretchen Reynolds in the New York Times with some excellent insights from the authors of the paper included in her article. Some additional thoughts about what this all means are available in an excellent commentary by Alex Hutchinson in Runner’s World.
This study and the outstanding pieces by Gretchen and Alex reminded me of a paper from the early 1980s on the different physiological adaptations to strength and endurance training. The paper included the pictures below of identical twin brothers. One was an endurance runner, the other a weight lifter.
The picture speaks for itself. The lifter was 16kg (35 lbs) heavier than the runner, but the runner’s heart was about 25% larger and his maximal oxygen uptake more than 50% greater than his brother’s. Of note, the height of the brothers and things like their hair patterns are strikingly similar. For those who want to know more about the strengths and limitations of twin studies and what can be inferred from them here is an informative link.
That such big differences in physiology can be seen in people who have “identical” genes is pretty convincing evidence that for many things our genes are not our destiny.
One of the ideas riding the wave of enthusiasm for precision medicine is that with enough big data it should be possible to make increasingly accurate “forecasts” about who gets what disease and how it might be prevented, treated or even “cured”. An analogy to precision weather forecasting and climatology is frequently drawn. Cynicism aside about just how good weather forecasting is and how much it has improved, there are a couple of basic intellectual issues with the comparison that are typically glossed over by advocates of the analogy.
Problem 1: The Nature of the Data
Weather data includes things like continuously monitored surface temperature and wind patterns over essentially most of the world. Some of the data is very granular with high spatial and temporal resolution. Things like pressure measurements, above ground temperatures, below ground temperatures, satellite photos, and information on things like humidity are available. There is also a vast store of historical records dating back 100 or more years in many places. This type of multilevel, highly accurate data with essentially continuous time resolution is simply not available even in the most monitored humans living in the real world even with the best monitors. The accuracy of various wearable devices, the granularity of the data, and the historical information they provide pale in comparison to the available weather related data. As someone who has been making some of the most detailed possible measurements of human physiology since the late 1970s, I can say that things have been miniaturized and made portable, but the quality of the data has not improved and in many ways has gotten sloppier or at least harder to calibrate.
Problem 2: Predicting What?
The other thing to remember is that with weather prediction the goal is to predict what it is going to be like “outside” in a given place on a given day at a given time. Precision weather forecasting does not tell us anything about the temperature and humidity inside “your house” much less inside a given room inside the house. To make that sort of estimate all sorts of additional information is needed about the size of the house, the surface area exposed to the outside world, the heating and cooling system, how insulated the house is, how good the thermostat is, etc. Then there is always the possibility that a window is open or that on a cold day you choose to wear a sweater and reduce the temperature “set-point” on the thermostat. The same issues also apply to a given room in the house.
The point here is that for human disease, except perhaps for some elements of dermatology, we are generally interested in what is happening inside a specific room inside the body like the “heart” room, or the “liver” room or the “kidney” room. For things like diabetes or high blood pressure that affect multiple rooms, we are interested in the overall house. Also many diseases of specific rooms also frequently do collateral damage to “the rest of the house”. In many of these diseases the ultimate problem that “brings people to the Dr.” has something to do with a complex feedback control system that has gone haywire. That is certainly the case for diabetes, heart failure, and high blood pressure. In heart failure shortness of breath and exercise intolerance is usually the problem patients complain about vs. a weak heart.
So the weather is an outside condition we are trying to predict based on outside data. Medical conditions are generally inside conditions and predicting them from outside data of questionable quality with limited time resolution and historical tracking is clearly an area where the precision medicine vs. precision weather analogy breaks down. Things like biopsies, images and blood tests are inside samples but they are small snapshots and not the sort of continuous measures available to the weather forecasters.
Problem 3: What About Inside-out Prediction?
The flip side of the weather analogy is the idea that if you know enough about the building blocks (the cells for example) that make up the house you can predict what is going on inside the house as a whole. Of course the outside world influences what is happening inside and those who favor an inside-out paradigm tend to ignore or discount that problem. Another issue is that unlike static structures humans can move around and change their behavior depending on the conditions outside. When I lived in Arizona I went outside mostly during the cooler parts of the day. In Minnesota where I now live, most of the year, I go outside during the warmer parts of the day. A cell based approach to modeling what is going on inside the body can miss this key but obvious point.
Then there is the problem of the cells as building materials. Imagine decorative concrete blocks like the ones used in the wall shown below.
Depending on the orientation of such blocks, a wall made from them can have very different properties. Flip them on their side and a solid vs. porous wall “emerges”. Thus, the temperature inside a structure made from such concrete blocks could vary widely depending on their orientation. However, subject the blocks from a wall of any design to chemical analysis and the “basic” properties of the wall are the same. Things of course get even more complicated if you add a heating and cooling system with a thermostat or other design features that influence the temperature in your concrete block building. This sort of inside-out modeling would be less problematic if the DNA in our cells was a better blueprint for the “whole building”, but it turns out that DNA is a pretty sloppy and much more adaptive blueprint than was once thought.
I would be curious to see just how much better or more accurate weather forecasting has gotten over the years. If there is data on this topic perhaps someone will post a source in the comments section. In the meantime, I hope the concepts noted above make you question the precision medicine, precision weather analogy.