EDGE MASTER CLASS 2010 CANCERING
CANCERING: Listening In On The Body's Proteomic Conversation (PART I)
Right now, I am asking a lot of questions about cancer, but I probably should explain how I got to that point, why somebody who's mostly interested in complexity, and computers, and designing machines, and engineering, should be interested in cancer. I'll tell you a little bit about what I am doing in cancer, but before I tell you about that, I'm going to tell you about proteomics. Before I tell you about proteomics, I want to get you to think about genomics differently because people have heard a lot about genes and genomics in the last few years, and it's probably given them a misleading idea about what's important, how diseases work, and so on. Let me start by talking about genes, and giving you a different way of looking at genes. I want to start by clearing up, well maybe not misunderstandings, but putting a different emphasis on how genes work. That will explain why I'm interested in proteomics, and that will explain why I'm interested in cancer.
You've probably heard the genome described as a blueprint for producing an organism. That's a very misleading analogy because a blueprint is interesting because it says how everything is connected, and how the parts relate to each other. In fact, the genome, at least the part of the genome that we understand how to read, actually doesn't tell you that at all. It's kind of a list of the parts. It does have some control information on it about when different parts should be made, but for the most part we don't know how to read that control information right now. What we know how to read is the parts list. While that's a very useful thing, it's probably not the most important thing that we need to know to understand what's going on. An analogy might be restaurants.
Let's say you were trying to understand the difference between a great restaurant and a bad restaurant, and what you had to work with was a list of the ingredients that they had in their storehouse. Sure enough, if you snuck in at night and looked at their inventory list, you might be able to tell some things about the restaurant. You could probably tell the difference between a French restaurant, and a Chinese restaurant just by the ingredients list. And indeed, you can tell the difference between a European person, and an Oriental person just by looking at their ingredients list, but you probably can't tell a lot about what their personality is like. Now, sometimes you can tell about defects, so if a restaurant was completely missing salt, or they only had lard for oil, you could say, "Well, this restaurant might be improved if they started using salt, or if they had some butter instead of lard", or something like that. So there might be some gross things that you can tell about inadequacies, that they were missing a key ingredient, something was broken about the ingredients list, but to really understand whether the food was good or not, or how they were making food, you really have to watch what's going on in the kitchen, and watch the process. You have to actually watch the dynamic process; the list of ingredients doesn't work. The way I think about this is more like computer programs. The genome is like a listing of your operating system, but missing all the control information, so missing all the jumps, and things like that, and that's a fairly useless thing to have if you're trying to figure out, if you're trying to debug a program. It's not totally useless, but it's not that informative. What a programmer would really want to know is they'd like to dynamically look at what's going on inside the machine, what's getting loaded into the registers. 
That kind of dynamic trace is much more useful for debugging than a kind of partial listing of the code.
If I put it that way, you might ask what's the big deal about the genome, why all the excitement about genomics? I think it has a couple of historical reasons. One of them is the gene is the great theoretical triumph of biology, it's the one kind of theoretical construct that was predicted. Like the physicists predicted there ought to be a positron, and when they looked, there was a positron. That happens all the time in physics. The equation says there should be a black hole, and we look, and we find black holes. In biology, that almost never happens, and the great dramatic example of it happening was genes. So genes were kind of theoretically predicted by Mendel and they were the core of what Darwin needed, that unit of inheritance. Then Watson and Crick looked and actually discovered it! It was like actually finding the black hole that was predicted. That was in some sense the most exciting thing that ever happened in biology, and since it so stands out, there's nothing close to that, it almost has a religious significance in biology. It is the triumph of the one great theory.
The other thing about it, a practical thing, is that it turned out that people like Kary Mullis worked out this very neat way with tools that the biologists had on their bench, so they could actually measure a gene. In fact, you can almost do genetics in your own kitchen with a few extra pieces of equipment, if you have the right enzymes around. You heat something up, and cool it down, and heat it up, and cool it down, and then you pour it in some jello, run an electric field across it and you actually get a read-out of this nice, digital picture. So not only was it a theoretical construct that had been predicted by biology, but it was also accessible to experiments with the stuff that people had lying around in their labs.
Of course, now we sequence genes with much more sophisticated equipment that does it much more rapidly. But it got its start because everybody could do it in their lab. They could see the genes, so everybody could get in the genetics business immediately, and start getting really interesting genetics results. For instance, the field of zoology was transformed by being able to tell what's related to what, like kind of the trick of telling the difference between the French restaurant and the Chinese restaurant. By looking at the ingredients you could find the complete tree of family relationships, and so there's a huge amount of good science that suddenly became possible. You could get a lot of hard data.
Of course, people immediately looked at what medical applications it could have. There are dramatic medical examples where you're missing a key ingredient, or one of your ingredients is broken, where there is a disease associated with that: a mutation in a gene, or a missing gene, or a duplicated gene, or something like that. Cystic fibrosis is an example of that, where the problem is in a single gene. So there are definitely examples like that, conditions that can be identified, and understood in a certain sense by looking at this parts list, or looking at this ingredients list. But if you really want to know what's going on, in most cases a much more interesting thing to do is to look at the dynamics. That is in the proteins that are actually getting generated. Some of them are getting generated directly from genes, some of them are getting generated and then modified after they are produced. There's a lot that happens after the genetics. And the proteins are controlling which genes are expressed.
So, to me, there is a much more interesting kind of analogy, based on process. The analogy we have so far is about structure. We emphasize the structure of things, so we think of the building blocks, and the things that get built, and the parts.
I think it's much more interesting to look at the process that builds all of these parts. It's true that the human body is an amazing structure, but what's much more interesting is the process that builds it, that maintains it, modifies it. That's not really in the genes, it's in the conversation that's happening between all the parts of the body, and the conversation is happening within the little molecular machines within the cell, or between the cells in the body. Your body has tens of trillions of cells in it, more than the population of the earth, and all these cells are talking to each other, sending each other signals, and there's signaling going on within the cell. To emphasize this other way of looking at it I like to look at the genome, not just as a parts list, but as the vocabulary list for this conversation. It's a useful thing to know, but the really interesting thing to do is to listen in on the conversation. What are these machines all saying to each other?
That's what proteomics is about. Proteomics is the study of all the proteins. "Omics" means "the study of all". The idea became popular when people like Wally Gilbert started saying, "We should have all the human genes." Then by generalization, people were saying, "Well, we should know all the proteins in the body, we should know all the connections between neurons, and we should know all the metabolites." There are a lot of different kinds of "omics". What is really interesting about proteomics is the dynamic conversation; it's the study of the molecules that the genes are making, the ones that are controlling the genome. It is the conversation between the parts. This conversation is happening within the cell, and between cells, and the elements of this conversation are proteins that are being sent around, and being absorbed by the cells, or being sent from one part of the cell to another.
It's taking place in the medium of proteins, and so if you could see where all of those proteins are, and how they're dynamically changing, then you would, in fact, be listening in on the conversation. That would be a great thing to hear. Biologists have recognized for a long time that it's a great thing to do. They've tried to do it. It's turned out to be technically much, much more difficult than genomics, for a couple of reasons. One of them is it's essentially an analog process, not a digital process. It matters how much of the protein is there. But another thing is there just wasn't this wonderful technology for dealing with it like replicating DNA. You couldn't really do it well with the equipment that was lying around in the lab. People have tried to do it, but it was a very unrepeatable process, a very noisy process, and so the first publications about it tended to be wrong because people had mismeasured it; they couldn't measure the same thing a second time. So basically what happened was that it kind of got a bad name in biology, and people said, "Well, we can't get much useful information out of this", because, in fact, they couldn't get it with the stuff they had lying around the lab.
That's where I came in. I had looked at this in the abstract, years and years ago, and I thought that this would be a great thing to do, but when I looked into the details, I thought it would be too difficult. Then just a few years ago, I got approached by the oncologist, David Agus, who said, "We really need this information for treating cancer patients", and he convinced me to look at it again with the new tools that had come along. The tools typically are things like mass spectrometers for weighing molecules, and liquid chromatography, which is basically sliding a molecule past a bunch of other molecules, and seeing how much it sticks. We can also make antibodies that stick to very specific molecules.
That is a set of tools that hasn't changed very much, but when I started looking at it, I realized that the big problem was that people were using these tools basically on a lab bench, and treating it almost like they were treating genomics, as if it was a digital process. They were going through a sequence of experimental steps, but the way that they were controlling it couldn't possibly be good enough to even get the same result twice if they measured the same sample, much less to look for subtle things in the changes. I realized that it really needed a couple of other things, one of which is some better application of physics, which is how the instruments were actually tuned to do this problem. Another thing it needed was some plain process engineering. What was required was much more like making a semiconductor line than it was like sequencing DNA. There were many, many steps that had to be refined, and highly, highly controlled in order to get a repeatable result with proteins. So this was essentially an engineering problem.
There are certainly hundreds of thousands of different protein variants, and maybe more. Nobody really knows which variations are significant. But certainly every gene produces a protein, and then those proteins get modified by the processes, and combined, and produce other proteins, and so on. The big problem was that there was no repeatable way of looking at all the proteins, say in a drop of blood, so that you could measure the same drop of blood and see the same proteins. Part of the problem is that they occur in vastly different amounts. Some of them are a million times more diluted than others, so there's a huge dynamic range. But also there are hundreds of steps in the process of measuring them. So if you're doing this with graduate students in a bio lab, and one of them goes and has a cup of coffee at one step, and you leave it 15 seconds longer in an enzyme, you get a completely different result.
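One way to make "measure the same drop of blood and see the same proteins" concrete is to run the same sample several times and look at how much each feature's intensity varies. The sketch below is only an illustration under stated assumptions: the feature names, the intensity values, and the 10 percent threshold are all hypothetical, and nothing here describes how any real proteomics pipeline scores repeatability. It uses the coefficient of variation (standard deviation divided by mean) so that abundant and very dilute features can be compared on the same footing.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Spread of repeated measurements relative to their mean."""
    return stdev(values) / mean(values)


def unstable_features(replicates, max_cv=0.1):
    """Return the names of features whose repeated runs disagree too much.

    replicates: dict mapping a feature name to the intensities measured
    in repeated runs of the SAME sample (hypothetical data layout).
    max_cv: hypothetical acceptance threshold, here 10 percent.
    """
    return sorted(
        name
        for name, runs in replicates.items()
        if coefficient_of_variation(runs) > max_cv
    )


# Made-up intensities spanning a millionfold dynamic range.
example_runs = {
    "abundant_feature": [1_000_000.0, 1_010_000.0, 990_000.0],
    "dilute_feature": [1.0, 1.02, 0.98],
    "noisy_feature": [5.0, 9.0, 2.0],
}

print(unstable_features(example_runs))
```

Because the variation is measured relative to each feature's own mean, the abundant and dilute features in this made-up data both pass, and only the feature whose runs genuinely disagree is flagged.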
What needs to be done is a super tightly controlled engineering process. Since that was essentially an engineering problem, I thought that could be an interesting problem for an engineer like me to work on, so I started working on that. Then it turns out that once you get that, there's a huge mathematical problem at the end, which I was also interested in. It was a computing problem, of interpreting all of these results. If you know that this protein is going up or down, how do you make any sense of that, and correlate that to anything useful. That is essentially a computing problem. Since it was a computing problem, and an engineering problem, I thought that I had something to bring to the table, and started working in the area really just to get the engineering worked out. Applied Minds can do projects in the exploratory stage without going off and getting any funding; we do it with our own profits from other projects, so we started exploring proteomics with David Agus. And that's how we realized that we could do it if we could really build a line like an assembly line for doing it, which involved robotics, and changing the mass spectrometers, and things like that. We got to the point where we actually knew how to do it, and at that point we raised some money from some angel funders, and made a company called Applied Proteomics, which has worked out how to do this, and built this assembly line which does these hundreds of steps, and measures along the way, and does it in an automated way. For the first time, the results are accurate and repeatable. When we test the same sample, we get the same result. You can take a drop of blood, and get a repeatable measurement over a hundred thousand repeatable stable features. We don't know necessarily what all of them are, but many thousands of them we can identify as known proteins, and we now have genes associated with them. 
Often that means we know something about the function, or where they are created in the body, or something like that. Let me show you the results of that process, which is on this slide.
Figure 1: Differential Feature
This is actually a small part of the measurement that we get out of a drop of blood; it is a small part of a bigger picture. We've spread out the fragments of proteins in two dimensions here. It's a little bit analogous to a gel that you might see, or a gene chip: the same protein feature will always appear in the same position every time. The brightness shows how much of that protein there is. This display actually doesn't show you too much of the dynamic range in brightness, but we are measuring that. In the horizontal direction we're measuring the mass of the protein fragment. The vertical axis is how slippery it is. People have produced pictures like this before, but what's interesting here is that every time you do this, the features come out in exactly the same places. That hasn't been true before.
Just to show you how precise these pictures are, you notice that these things tend to occur in these little groups of stripes, tick, tick, tick. You see there are several of them in each group, and they kind of trail off; it's almost like a ring, or an echo. Well, the reason for that is that carbon has different isotopes, and so if there is an extra neutron, if you have a different isotope of carbon in the protein, then it's going to be slightly heavier. That distance between the stripes is actually the weight of one neutron; it gives you an idea of how precisely we're measuring things. There's nothing in between because there's no such thing as half a neutron. In fact, measuring things so precisely we can often tell by the shape how many carbon atoms there are in the protein.
The amazing thing about this picture is I'm actually showing you two measurements of two different blood samples on top of each other.
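As an aside on the isotope stripes described above, the arithmetic can be sketched in a few lines. This is a simplified illustration, not the analysis used to produce the slide. It assumes that only carbon contributes heavy isotopes (real fragments also contain hydrogen, nitrogen, oxygen and sulfur), that each carbon atom is independently carbon-13 with the natural abundance of about 1.07 percent, and that the stripe spacing is the carbon-13 minus carbon-12 mass difference of about 1.00336 daltons, which is close to, though not exactly, the mass of a free neutron. Because a mass spectrometer measures mass divided by charge, the spacing seen on the axis is divided by the charge state of the fragment; the function names are hypothetical.

```python
from math import comb

C13_ABUNDANCE = 0.0107        # natural abundance of carbon-13, about 1.07%
C13_C12_MASS_DIFF = 1.003355  # mass of 13C minus mass of 12C, in daltons


def isotope_envelope(n_carbons, n_peaks=4):
    """Relative heights of the stripes for 0, 1, 2, ... heavy carbons.

    Binomial model: each of the n_carbons atoms is 13C with
    probability C13_ABUNDANCE, independently of the others.
    """
    p = C13_ABUNDANCE
    return [
        comb(n_carbons, k) * p**k * (1 - p) ** (n_carbons - k)
        for k in range(n_peaks)
    ]


def stripe_spacing(charge):
    """Distance between neighboring stripes on the mass-to-charge axis."""
    return C13_C12_MASS_DIFF / charge


def estimate_carbons(first_peak, second_peak):
    """Estimate the carbon count from the first two stripe heights.

    In the binomial model the ratio of the second stripe to the
    first is n * p / (1 - p), so n can be recovered from that ratio.
    """
    ratio = second_peak / first_peak
    return round(ratio * (1 - C13_ABUNDANCE) / C13_ABUNDANCE)


envelope = isotope_envelope(50)
print(envelope)
print(estimate_carbons(envelope[0], envelope[1]))
```

For a small fragment the stripes trail off, as in the picture, because each additional heavy carbon is less likely than the last. The shape of that trail is what carries the information about the number of carbon atoms.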
One of them is shown in red, and one of them is in green. Things look yellow because the two are exactly on top of each other, because the two blood samples are mostly the same. But, if I look closely at this ... actually let me just find a spot which is different ... okay, well, actually here is a good spot. There is some protein that was in one of the blood samples that wasn't in the other one, so you only see it as the green. There is another place where you see there's a little red down there. That is something that is in one sample and not the other. It's almost as though we've got a digital read-out of this highly analog process. That's the amazing sort of engineering feat: I don't believe anyone else has achieved that kind of repeatable precision over that much range before. What we know is the relative concentration of each of those proteins.
Now, this pair of tests might be the same person at two different times, or it might be two different people. They probably both have the gene to produce that protein, but for some reason one of them is saying this, and the other one isn't saying this at this time. Now if we have a hundred thousand of these features, and we do have more than a hundred thousand, then the question becomes what do they mean, what do we do with it? That's the stage that we're at now. It may be that for some of those, like a genetic test, a single feature will actually tell us something. But probably much more of the information is in the patterns and combinations, and so on. For instance, let's say that we go to cancer patients, and we try out a drug on them, and we find out that only 10 percent of them respond to the drug. It would be very nice if there were some genetic marker that told us which 10 percent responded to the drug, because it's a miracle drug for those 10 percent, but it's a useless drug if only 10 percent of the people respond to it, and it makes 20 percent of the people sick.
You would like to know which are which, and it was a great hope that maybe you would be able to find genetic markers to do that. There are a few drugs that that's true for, but by and large, that information doesn't seem to be just in the vocabulary list. The information is much more likely to be in here, where there's something dynamically happening, and so we can start to say, "If we see this pattern of protein expression, it means that you've got this thing going on metabolically." All of a sudden we've got hundreds of thousands of symptoms to look at, if you will, or hundreds of thousands of indicators, at the level of what's actually going on.
In the process, I started looking at how we treat cancer, and how we think about cancer. This is another area where I think there's a wrong paradigm that has gotten started because there's been a great success, and that success has been overgeneralized. In this case the great success has been the treatment of infectious diseases and the germ theory of disease. This is the greatest success of a theory in medicine. That was a very cool development, because if you can figure out what species of germ you were infected with, then that sets how you should treat the disease. You could treat the disease with something that would kill that germ. That became the general paradigm of medicine. You would do a diagnosis, a differential diagnosis to figure out what the infectious agent was, and then you would apply a treatment that was very specific for that agent. That's the thing that doctors are basically trying to do, identify the disease, and treat the diagnosis according to the best method. That allows science to come in because you can objectively test whether a particular treatment is effective, or not effective, when dealing with that diagnosis. Does quinine help the symptoms of malaria? Is penicillin the best way to treat anthrax? Once you know what's best, that's the thing that doctors are taught to do.
Interestingly enough, that way of looking at things is not the only one in the history of medicine. Historically, doctors had theories that are today more like Ayurvedic medicine, with its emphasis on balances between various forces in the body. Or in the West, a medieval doctor might have tried to make you less choleric or more phlegmatic. The idea was to try to restore the order of the various forces that were controlling the body. It's interesting, at the time that the germ theory of disease was really exploding, and antibiotics were being discovered, J.B.S. Haldane said, "This is a disaster for medicine because we're going to get focused on these germs, and we're going to forget about the system." He was right. Indeed, if you look at what happened, it was a disaster for treating diseases like cancer because we started thinking of them almost like they're infectious diseases. It's a habit of thought, so when a patient comes in, we diagnose them, and we put them in a category, and then we try to apply the treatment that is shown to work on that category. We do a blind clinical trial of how people that are in that category respond to a certain drug. That makes a lot of sense for infectious diseases because infections are species, they speciate, and divide out, so putting them in categories makes a huge amount of sense. But a systems disease like cancer, or an auto-immune disease, is a break down in the system, much more like a program bug. We would never think of debugging a computer by putting it into one of twelve categories, and doing something based on the category. Actually we do, it is kind of "help-desk debugging" that doesn't work very well in complex situations. There is a big difference between help-desk programming debugging, and the kind of debugging a programmer really does when they're trying to more subtly fix a program. What we've got in medicine now is kind of help-desk debugging. We put you into a category. 
In cancer we start by putting it in a category that's based on the part of the body where symptoms of the cancer have been shown. Then we test drugs that way: Does this drug work on lung cancer, and if it does, well, it's not approved for prostate cancer because we tested it on lung cancer. That's a whole other experiment, that's a different category of disease. Then we subcategorize them. We take a biopsy sample, and we say, "Well, these cells are kind of squishy and long, and those are kind of round, so we have the squishy, long cancer, and the round cancer." We declare that we have two forms of breast cancer. We keep coming up with more kinds of cancer as we measure more things, and then we subdivide the categories. There used to be dozens of kinds of cancer, and now there are hundreds of kinds of cancer. But I actually think there are millions or billions of kinds of cancer. Cancer is a failure of the system. Happy families are all alike, but unhappy families are all unhappy in their own special way, and happy bodies are kind of all alike, but when they break down, they all break down in their own special ways. The breaking down is at the level of this conversation that's going on between the cells, that somehow the cells are deciding to divide when they shouldn't, not telling each other to die, or telling each other to make blood vessels when they shouldn't, or telling each other lies. Somehow all the regulation that is supposed to happen in this conversation is broken. Cancer is a symptom of that being broken, and so when we see a whole bunch of cells starting to divide uncontrollably in an area, we call that "cancer", and depending on the area, we'll call it "lung cancer", or "brain cancer". But that's not actually what's wrong, that's a symptom of what's wrong. 
To use another kind of analogy, let's say we didn't understand anything about plumbing, but occasionally we came home and our living room is filling up with water, and sometimes we come home, the kitchen is filling up with water, and so we start describing the problem as, "Well, my house has water, that's the problem." We might even divide it and say, "My house has kitchen water, or my house has living room water." If plumbers were like doctors the best they might be able to say is "we've learned about kitchen water, and if we pour a lot of drano in the kitchen, then kitchen water sometimes goes away. Living room water is fixed by pouring a lot of tar on the roof." Indeed, there might be ways of fixing the problem, but what you really need is to understand about plumbing. You should be worried about the process that's creating the water, and understanding about what's supposed to be draining, and what's supposed to be holding it, and so on. In fact, we misunderstand cancer by making it a noun. Instead of saying, "You know, my house has water", we say, "My plumbing is leaking." Instead of saying, "Somebody has cancer", we should say, "They're cancering." The truth of the matter is we're probably cancering all the time, and our body is checking it in various ways, so we're not cancering out of control. Probably every house has a few leaky faucets, but it doesn't matter much because there are processes that are mitigating that, by draining away the leaks. Cancer is probably something like that. In order to understand what's actually going on, we have to look at the level of the things that are actually happening, and that level is proteomics. Now that we can actually measure that conversation between the parts, then we're going to start building up a model that's a cause-and-effect model: This signal causes this to happen, that causes that to happen. 
Maybe we will not understand to the level of the molecular mechanism but we can have a kind of cause-and-effect picture of the process, more like we do in sociology or economics.
Whatever the treatment of cancer, or auto-immune disease, neurodegenerative disease or other system diseases will be like in the future, there won't be a diagnosis step, or at least that's not what will determine your treatment. Instead, what we'll do is we'll go in, we'll measure you by imaging techniques, and taking your blood, looking at the proteins, things like that, build a model of your state, have a model of how your state progresses, and we'll do it more like global climate modeling. We'll build a model of you just like we build a climate model of the globe, and it will be a multi-scale, multi-level model. Just as a global climate model has models of the oceans, and the clouds, the CO2 emissions, and the uptake of plants and things like that, this model will have models of lots of complicated processes happening at lots of different scales, and the state variables of this model will be by and large the proteins that are moving back and forth, sending the signals between these things. There will be other things, too. But most of the information is in the proteins. There will be a dynamic time model of how these things are signaling each other, and what's being up-regulated, and down-regulated, and so on.
Then, we will actually simulate that under lots of different treatment scenarios; we'll simulate for your cancering, how we can tweak it back into a healthy state, having it guided back toward a healthy state. It will be a treatment that's very specific. We'll look at those and see which ones are most likely to bring you to a healthy state, and we'll start doing that, and we may treat you in a very different way than we've ever treated any other human before, but the model will say that for you that's the correct sequence to treat it.
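The idea of simulating a model under different treatment scenarios and picking the one that ends closest to a healthy state can be illustrated with a deliberately tiny toy. Everything in it is invented for illustration: the two state variables (a "growth" signal and a "restraint" signal), the equations, the parameter values, and the scenario names are assumptions, not a model of any real disease or of the work described in this talk. A real model would have vastly more state variables and would be calibrated against measurements.

```python
import math

# Hypothetical healthy levels of (growth signal, restraint signal).
HEALTHY = (1.0, 1.0)

# Hypothetical treatment scenarios: "kill" suppresses the growth
# signal directly, "boost" pushes the restraint signal up.
SCENARIOS = {
    "no treatment": {},
    "boost restraint only": {"boost": 0.25},
    "suppress growth only": {"kill": 0.6},
    "low dose of both": {"kill": 0.3, "boost": 0.125},
}


def simulate(treatment, start=HEALTHY, drive=1.5, t_end=10.0, dt=0.01):
    """Step the toy model forward in time with simple Euler integration.

    drive > 1 stands in for the broken regulation: with no treatment
    the growth signal keeps increasing instead of holding steady.
    """
    growth, restraint = start
    kill = treatment.get("kill", 0.0)
    boost = treatment.get("boost", 0.0)
    for _ in range(int(round(t_end / dt))):
        d_growth = growth * (drive - restraint - kill)
        d_restraint = 0.5 * (1.0 - restraint) + boost
        growth += dt * d_growth
        restraint += dt * d_restraint
    return growth, restraint


def distance_from_healthy(state):
    """How far the simulated end state is from the healthy state."""
    return math.dist(state, HEALTHY)


def best_scenario(scenarios=SCENARIOS):
    """Pick the scenario whose simulated end state is closest to healthy."""
    return min(
        scenarios,
        key=lambda name: distance_from_healthy(simulate(scenarios[name])),
    )


for name, treatment in SCENARIOS.items():
    print(name, distance_from_healthy(simulate(treatment)))
print("chosen:", best_scenario())
```

Even in this toy, the scenario that comes closest to the healthy state is a combination that neither single intervention would suggest on its own, which is the kind of answer a category-based approach cannot produce.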
Right now this would be a huge change in medicine. For instance, the way we pay for medicine is dependent on the diagnosis. You pay a certain amount for prostate cancer, and you pay a different amount for lung cancer. That determines what part of the hospital you get routed to, which doctor sees you, what the insurance company will pay for. If you take that out of the system right now, it's a completely different kind of a system. I don't think this will be an easy switch and I don't know what the sociological/economic processes will be. But it will happen because it will start working better. It will probably happen first with desperate people who aren't getting fixed with the normal methods, will go to this alternate process, and when enough of them start getting fixed by this alternate process, then that will, by some complex sequence that I won't even try to predict, eventually change medicine. Of course there is a lot to be done to make this work. We are dealing with very different time scales and different space scales, too. There are useful things that are hour-to-hour time scale, in your bloodstream, and your cell level is probably more like minute-to-minute, or even faster. Right now we're just beginning to be able to measure proteomics within a single cell, and so right now what we are doing with the National Cancer Institute is trying to bring all of those time scales, and space scales together into a model. This is probably, with today's technology, a ridiculous stretch, but we're at least attempting to do it. We're measuring things both at the inside the cell level, measuring at gene expression the production of proteins, the placement of proteins within the cell, then the conversation between the cells. That we're trying to do at more the minutes time scale, and then we're doing what's happening in the body and the blood, on the days time scale. We should be probably measuring it over hours, but with current technology we can't afford to do that. 
We're doing this in mice right now, but we can only draw so much blood from a mouse. Those kinds of things are limiting us. We're also doing it with imaging, the actual geometry of the tumors, so we're actually trying to measure geometry; we're trying to measure the genetic evolution of the tumor because the tumor is not homogeneous. Genetically it's different inside than outside, and we are trying to make a model, like one of those kind of global climate models, for lymphoma in mice. We can get genetically identical mice, and we can actually very reliably give them the same kind of lymphoma, and so we can repeat the experiments. So it's much better than the global climate modeling situation. With global climate, we have one experiment to calibrate our model, and we're in the middle of it. There is no control. In the mice we can do a lot, we can try different variables, and then also try what the effects are of different things that we can do in terms of treatments, of giving chemotherapy of various sorts, or heating them up, changing the pH of their blood, doing all kinds of things like that, and begin to get a perturbation model of not just how the system normally works, but how it works under different kinds of perturbations. Then hopefully, eventually, we'll get to the point where we have a good enough model that we can actually predict: if we do this to this mouse we can actually make it live longer.
We are already learning a lot. One thing is, for this mouse study, we're combining some new techniques like proteomic techniques with a lot of techniques that were developed for other reasons, for instance imaging techniques that are very detailed.
We're actually putting little windows into the mouse, and watching the tumor grow, and then we can use things that have antibodies that bind to certain kinds of proteins that are being expressed, and so we can actually see where those proteins are being expressed in the living mouse, so see geometrically where they're being expressed. There are techniques, for instance, where we can actually look within a cell, and see where a protein is within the cell. We can actually do microscopy below the wavelength of light now, which is a fantastic advance, by using basically little flashes of light, and computing on top of it. There are huge advances in the technique and instrumentation and so on that are making this at least conceivable for the first time. It's only a matter of time before it will be possible, and it's quite probable that this first attempt is too early, but we are attempting to do this with a consortium of people in places like Stanford, and Cold Spring Harbor, and USC, UT, NYU and Caltech.
The National Cancer Institute has actually given our group five years of funding, assuming we keep making progress. They had this crazy idea of getting people like me, who are not really biologists, to be the principal investigators of these centers, to work with clinicians to design the program of research, which is then being carried out by a lot of people who know things like how to put windows into mice, and how to image a tumor, or how to get antibodies to glow. So we're using all of those biological lab techniques to do something that's really more like a physical sciences model. I'm optimistic that we'll have enough success that people will at least try to repeat this form of experiment. Whether we actually are able to make accurate predictions is yet to be seen. The great coup would be if we got to the point where we could say, "We can predict that if we do 'this' to this mouse, then we can take care of 'that' in the mouse". That would be success.
But we can learn a lot without getting that far.