Monday, July 5, 2010

The Beginning

The phone rang in my San Diego home, and on the other end was John Finn. "Hello, Mark?" he said. "How are you doing?"


It had been eight years or more since I had spoken to him. We had worked together in herbicide discovery at American Cyanamid in Princeton, New Jersey. He had been a good collaborator. We had begun our careers in the early eighties and had experienced a special time by any measure as the Ag Division exploded in growth. When we arrived, the top-selling product of the division was an organophosphate insecticide, terbufos, with annual sales of around $90 million. Before the end of the 1980s, sales would measure in the hundreds of millions from fifteen new products, most significantly the imidazolinone herbicides. We rode a veritable tsunami of commercial success, and the leadership ploughed some of the profits back into "ARD", the Ag Research Division.


Our personal contributions towards the commercial successes were somewhat modest. I was the first plant biochemist hired, and my contribution was to help explain the mechanism of action of the herbicides. This was worked out by others around the time I was hired, but I did my part in developing enzyme assays as screens to support discovery research. John worked initially on filling out the patent space for the imidazolinones, always in search of new products. One part of his work helped focus attention on the last imidazolinone to be commercialized. But his main contribution, as far as I was concerned, was his innovation in bringing new tools and approaches to research, especially chromatography and combinatorial chemistry.


At one point, I had collaborated with John to produce large arrays of compounds without purification. The idea we tested was to improve on the standard output in synthetic chemistry, about one compound per week per chemist. This output came from a chemist conceiving an end product, then working through the steps to the product and finally purifying it. Our idea was that most such products are useful only as information to direct the next iteration, and that most of the purification effort was wasted on something not actually thought to be useful. If the assays could tolerate crude preparations, as they did in natural product research, why not make an array of crude preparations and then see if the information is useful enough for the next iteration? John's try at this resulted in hundreds of compounds per week and a significantly larger chemical space explored.


Anyway, John and I developed a common sense of efficiency and drive. When he left the company, I co-wrote a scathing condemnation of chemistry management. It was my distress about his departure, but it was also my distress about the gap in innovation I felt with his absence. Fortunately, the letter was received positively, and the innovation John provided was seen as important to continue; John went on with two other companies, and I lost track of him.



I consider the discipline of small molecule drug discovery from the perspective of a biochemist and a capitalist. Maybe the order should be reversed because for me, the goal of attaining a commercial success is paramount. I can only enjoy the journey if the chance of success exists. Without the chance the activity is a bit sterile and academic. If there is no point to the work, then almost any scientific endeavor is equivalent and difficult to prioritize. I guess the premise of drug discovery is the promise of a product; if the research does not have the promise, then the work is hollow. Perhaps this judgement is too harsh, but there it is.


Monday, June 28, 2010

Drug Design and Lead Optimization


Tools for protein structure determination and visualization
Tools for analyzing protein structures and protein-ligand interactions are often critical for drug design and lead optimization. As with many scientific aspects of drug discovery, the origins of such tools are in public sector entities. Many of these tools remain relevant, and are available to non-profit organizations for free, or to industry-associated institutions for a fee. Reviews, open-source software links, tutorials, and other information on this topic are available at www.umass.edu/microbio/chime/top5.htm. A related collection of websites can be accessed at the Computational Chemistry List Ltd's website (www.ccl.net/chemistry/links/software/index.shtml).
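
To give a feel for what such tools do under the hood, here is a minimal Python sketch that reads atom records in the PDB fixed-column text format and lists protein atoms lying within a distance cutoff of a ligand. It is only an illustration, not how any of the packages below actually work: the five-atom fragment, the residue name LIG, the function names and the 4.0 Å cutoff are all assumptions made up for the example.

```python
import math

# Hypothetical five-atom fragment in PDB fixed-column format (not a real entry).
PDB_TEXT = """\
ATOM      1  OG  SER A 200      10.000  10.000  10.000  1.00 20.00
ATOM      2  CB  ALA A 201      20.000  10.000  10.000  1.00 20.00
HETATM    3  C1  LIG A 301      12.000  10.000  10.000  1.00 20.00
HETATM    4  O1  LIG A 301      10.000  13.000  10.000  1.00 20.00
HETATM    5  O   HOH A 401      10.500  10.000  10.000  1.00 20.00
"""


def parse_atoms(pdb_text):
    """Read ATOM/HETATM records using the fixed column positions of the PDB format."""
    atoms = []
    for line in pdb_text.splitlines():
        record = line[0:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue
        atoms.append({
            "record": record,
            "name": line[12:16].strip(),      # atom name, columns 13-16
            "resname": line[17:20].strip(),   # residue name, columns 18-20
            "chain": line[21],                # chain identifier, column 22
            "resseq": int(line[22:26]),       # residue number, columns 23-26
            "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        })
    return atoms


def ligand_contacts(atoms, ligand_resname, cutoff=4.0):
    """Return (residue, protein atom, ligand atom, distance) for pairs within cutoff."""
    protein = [a for a in atoms if a["record"] == "ATOM"]
    ligand = [a for a in atoms
              if a["record"] == "HETATM" and a["resname"] == ligand_resname]
    contacts = []
    for p in protein:
        for lig in ligand:
            d = math.dist(p["xyz"], lig["xyz"])
            if d <= cutoff:
                residue = f'{p["resname"]}{p["resseq"]}'
                contacts.append((residue, p["name"], lig["name"], round(d, 2)))
    return contacts


contacts = ligand_contacts(parse_atoms(PDB_TEXT), "LIG")
for residue, protein_atom, ligand_atom, distance in contacts:
    print(residue, protein_atom, ligand_atom, distance)
```

A plain distance cutoff is the crudest possible definition of a contact; the dedicated packages add hydrogen-bond geometry, surfaces and much more, but the starting point is the same table of coordinates.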

Collaborative Computational Project No 4
One of the oldest protein structure determination software suites still in use is that of the Collaborative Computational Project No 4 (CCP4). A simple, well-organized website (www.ccp4.ac.uk/) provides up-to-date documentation on all of the CCP4 software, with additional useful information provided at the CCP4 wiki (ccp4wiki.org). This wiki contains several topic links, of which the bioinformatics link is the most relevant for drug design and lead optimization (strucbio.biologie.uni-konstanz.de/ccp4wiki/index.php/Bioinformatics).

PyMOL
The structural biology research field lost one of its most significant contributors with the sudden death of PyMOL's founder, Warren DeLano, in November 2009. The open-source code of PyMOL remains available (www.pymol.org/rel), and a separate wiki regarding the program is also available (www.pymolwiki.org/index.php/Main_Page). In January 2010, Schrödinger Inc (www.schrodinger.com) announced that an agreement had been reached for the company to continue the development, support and sales of PyMOL, including providing support for the program's open-source community.

YASARA View
Yet Another Scientific Artificial Reality Application (YASARA; www.yasara.org) is a molecular visualization, modeling and dynamics program that was created in 1993. Despite being created almost 20 years ago, the program remains relevant because it is frequently improved. A progressive series of programs, the first of which is YASARA View, is available at www.yasara.org/products.htm#view. Recently written macros can be accessed at www.yasara.org/macros.htm; among other applications, they can be used to analyze a molecular trajectory in the form of a table of energies, to determine minimum energy and time-averaged structures, and to run a molecular dynamics simulation of a membrane protein.
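
The kind of trajectory analysis these macros perform can be illustrated outside YASARA with a short Python sketch. This is not YASARA's macro language and does not reproduce any of its macros; the three-snapshot, two-atom trajectory, the field names and the units are all invented for illustration, and the snapshots are assumed to be equally spaced in time and already superposed on a common reference.

```python
# Hypothetical trajectory: one entry per snapshot with a time (ps),
# a potential energy (kJ/mol) and coordinates (angstroms) for two atoms.
frames = [
    {"time_ps": 0.0, "energy": -1200.0, "coords": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]},
    {"time_ps": 1.0, "energy": -1250.0, "coords": [(0.2, 0.0, 0.0), (1.7, 0.0, 0.0)]},
    {"time_ps": 2.0, "energy": -1230.0, "coords": [(0.1, 0.3, 0.0), (1.6, 0.3, 0.0)]},
]


def minimum_energy_frame(frames):
    """Return the snapshot with the lowest energy in the table."""
    return min(frames, key=lambda frame: frame["energy"])


def time_averaged_coords(frames):
    """Average each atom's coordinates over all snapshots.

    Assumes equally spaced snapshots that are already superposed, so a
    plain arithmetic mean per atom and per axis is meaningful.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0]["coords"])
    averaged = []
    for atom in range(n_atoms):
        sums = [0.0, 0.0, 0.0]
        for frame in frames:
            for axis in range(3):
                sums[axis] += frame["coords"][atom][axis]
        averaged.append(tuple(total / n_frames for total in sums))
    return averaged


best = minimum_energy_frame(frames)
average_structure = time_averaged_coords(frames)
print("lowest-energy snapshot at", best["time_ps"], "ps:", best["energy"])
print("time-averaged coordinates:", average_structure)
```

Note that a time-averaged structure is a statistical summary and may not be physically realistic on its own, which is why it is usually reported alongside the minimum energy snapshot.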

Proteopedia
Proteopedia (www.proteopedia.org/wiki/index.php/Main_Page) is described as 'the free, collaborative 3D encyclopedia of proteins and other molecules'. A brief video guide provides a useful introduction to this resource (www.proteopedia.org/wiki/VideoGuide/ProteopediaVideoGuide1/ProteopediaVideoGuide1.htm), while a more complete description has been published in a software article (www.genomebiology.com/2008/9/8/R121). Proteopedia provides webpages with embedded 3D structures surrounded by descriptive text containing hyperlinks that change the appearance (ie, view, representation, color or labels) of the displayed, rotating 3D structure. In the webpages with more extensive information, the descriptive text is heavily linked, both to the Jmol figures and to external resources. The wiki design allows authorized users to contribute webpages, and Proteopedia has already 'seeded' 65,000 webpages (using data from the RCSB Protein Data Bank [PDB]) to promote content development. The most significant tools in Proteopedia are the green links, which provide specific information regarding a protein or ligand in the form of a highlighted link, and the pop-up Jmol window, which changes according to the highlighted link selected. In addition, the Jmol applet delivers a number of functionalities, including different view, style, color, surface and measurement options. The 'AChE inhibitors and substrates' webpage (www.proteopedia.org/wiki/index.php/AChE_inhibitors_and_substrates) is a good example of a well-developed Proteopedia webpage.

Protein Data Bank tutorials
The RCSB PDB (www.rcsb.org/pdb) frequently adds new functionalities that aid in the understanding of protein structures and protein-ligand interactions. In April 2010, the PDB announced a collaboration with OpenHelix to provide free-access tutorials and training materials for the database (available at www.openhelix.com/pdb). The downloadable material includes an online narrated tutorial that guides users through searches, report generation, options for exploring structures, and many of the research and educational resources and tools available at the RCSB PDB. The tutorial runs in any browser, and can be navigated using chapters and forward and backward sliders. Other learning resources available include PowerPoint slides, handouts and PDF hands-on exercises. A related website also facilitated by OpenHelix is the Protein Structure Initiative Structural Genomics Knowledgebase (SGKB) (http://kb.psi-structuralgenomics.org/KB/index.html). The SGKB is also worth monitoring, as it is a free, comprehensive resource dedicated to advancing methodology in protein structure determination. Produced in a collaboration between the Protein Structure Initiative (PSI) and Nature Publishing Group (NPG), the site is actively maintained and offers updates on new information through monthly email notification or RSS feed.



NIH Research Resource Center for the Development of Multiscale Modeling Tools for Structural Biology
The homepage of the NIH Research Resource Center for the Development of Multiscale Modeling Tools for Structural Biology (MMTSB; www.mmtsb.org) includes several links of interest. Of particular relevance is a recent update on the Tool Set Documentation (http://blue11.bch.msu.edu/mmtsb/Main_Page). Common applications of the MMTSB toolset are displayed at http://blue11.bch.msu.edu/mmtsb/Common_applications_of_the_MMTSB_toolset, including the running and analyzing of protein simulations (from PDB files to CHARMM trajectory), protein-RNA complex simulations (from PDB files to CHARMM trajectory) and protein simulations with replica-exchange; generating and visualizing the electrostatic surface potential of a macromolecule; and patching residues.

Tools based on docking algorithms
Ligand-protein docking algorithms are constantly being developed and refined. e-LEA3D (http://bioinfo.ipmc.cnrs.fr/lea.html ), a computational-aided drug design web server based on such algorithms, is described in detail in a recent paper by Dominique Douguet (Nucleic Acids Res (2010):doi:10.1093/nar/gkq322; nar.oxfordjournals.org/cgi/content/abstract/gkq322). The e-LEA3D web server integrates complementary tools to allow fragment-based drug design. Particularly relevant is the capacity to 'invent' new ligands that optimize user-specified scoring functions. e-LEA3D appears to be design- and diversity-driven. Thus, the liability of reliance on a particular scoring function is compensated by the ability to generate diversity. In addition, the scoring function is presented with both structure- and ligand-based evaluations. This de novo approach contrasts with in silico screening approaches. The approach used in e-LEA3D is also oriented toward scaffold-hopping. In addition to the tool that enables the 'invention' of new ligands, the program has a second tool that provides an alternative virtual-screen approach, while a third module offers a combinatorial chemistry design based on commercially available reactants and a user-drawn scaffold.

FTMap (http://ftmap.bu.edu/) is an algorithm that maps sites in proteins with a good likelihood of binding high-affinity ligands, thus identifying 'hot-spots' for drug design, and is described in detail in a recent paper by Ryan Brenke et al (Bioinformatics (2009) 25(5):621-627; bioinformatics.oxfordjournals.org/cgi/content/abstract/btp036). Examples of such mapping are provided within the FTMap website.

A related website of interest is DISI (http://wiki.compbio.ucsf.edu/wiki/index.php/Main_Page), self-described as a ‘community-based project to document virtual screening and computer aided drug design’. DISI is an acronym for ‘Documentation is still incomplete’, but the content is focused on docking tools developed and used at the University of California, San Francisco. For orientation, there are twenty-seven tutorials on the various docking programs (http://wiki.compbio.ucsf.edu/wiki/index.php/Category:Tutorials), most of which relate to the DOCK Blaster program (http://blaster.docking.org). A useful primer on DOCK Blaster, titled “Automated Docking Screens: A Feasibility Study”, was published in J. Med. Chem., 2009, 52 (18), pp 5712–5720 and is available at http://pubs.acs.org/doi/abs/10.1021/jm9006966. The free-access program requires only a PDB code to launch a full screen of large small-molecule libraries, such as ZINC (http://zinc.docking.org/), the free database of commercially available compounds for virtual screening. The continuous improvements in automated docking methodology exemplified here further enable drug design as a force in drug discovery and lead optimization.