Back to Molecular Med Monthly Articles main page

Volunteering for the Cure:
Finding Cancer-Fighting Drugs Through Massively Distributed Virtual Screening

According to Davin Potts, Chief Scientist at United Devices, Inc. (UD) in Austin, Texas, most personal computers sit idle about 95% of the time, and that idle capacity can be harnessed and aggregated to perform useful work as a distributed virtual supercomputer. To that end, United Devices, a provider of Internet and intranet distributed software and services, develops and manages the infrastructure required to aggregate idle computation, storage, and bandwidth resources on the Internet and corporate intranets.

The company's Global MetaProcessor platform is an Internet-based grid that organizations can use to create a computing backbone by harnessing and aggregating the idle cycles of networked PCs, servers, and workstations. It is analogous to an electrical grid of high-tension cables through which power is distributed regionally. The public grid, which resides at Grid.org, allows database files to be accessed, applications to be distributed and shared, and individuals and organizations to collaborate on a massive scale with computing power comparable to that of a supercomputer. Reasons for deploying a grid architecture include the speed of parallel processing over serial computing; time and cost savings in drug development and in other large, computing-intensive projects that would otherwise be unfeasible; and a better return on investment from existing assets, including human capital and computing equipment.

For example, the not-for-profit Cancer Research Project, launched in 2001, uses United Devices' Global MetaProcessor platform to perform research and analysis on a massive scale. This large-scale, distributed public grid project is the brainchild of Dr. Graham Richards, Chair of the University of Oxford Chemistry Department.

Volunteers can download United Devices' free software program and run it by following the installation instructions. Once installed, the two-megabyte program runs unobtrusively in the background. The program works on small parts of the overall problem, which has been divided and distributed to devices across the public grid. Project participants' machines are sent a "unit of molecules" to analyze, and each unit typically contains 100 molecules. The analysis is done with THINK (To Have Information aNd Knowledge) software, which attempts to generate up to 100 suitable derivatives of each molecule by making small changes to it, so a work unit covers a maximum of 10,100 structures (100 originals plus up to 10,000 derivatives). Starting with a database of 35 million molecules, this means that roughly 3.5 billion molecules can be analyzed.
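The work-unit arithmetic above can be checked with a short sketch. The figures come from the article; the function and constant names are illustrative assumptions and are not part of the THINK software or the UD Agent.

```python
# Figures quoted in the article; all names here are illustrative only.
MOLECULES_PER_UNIT = 100          # molecules downloaded in one work unit
DERIVATIVES_PER_MOLECULE = 100    # derivatives attempted per molecule
DATABASE_SIZE = 35_000_000        # molecules in the starting database


def max_structures_per_unit(molecules=MOLECULES_PER_UNIT,
                            derivatives=DERIVATIVES_PER_MOLECULE):
    """Upper bound for one work unit: the originals plus all derivatives."""
    return molecules + molecules * derivatives


def derivative_space(database_size=DATABASE_SIZE,
                     derivatives=DERIVATIVES_PER_MOLECULE):
    """Number of derivative molecules reachable from the whole database."""
    return database_size * derivatives


def work_units_needed(database_size=DATABASE_SIZE,
                      molecules=MOLECULES_PER_UNIT):
    """Work units required to cover the database (ceiling division)."""
    return -(-database_size // molecules)


if __name__ == "__main__":
    print(max_structures_per_unit())  # 10100
    print(derivative_space())         # 3500000000
    print(work_units_needed())        # 350000
```

Under these assumptions the 35-million-molecule database splits into 350,000 work units, which gives a sense of how much independent work the grid has to hand out.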

The software, developed by Keith Davies of Treweren Consultants and the University of Oxford, analyzes molecular data by creating a three-dimensional model of each molecule, which changes its shape, or conformation, as it attempts to dock into a protein binding site. When a conformation docks successfully, it triggers an interaction with the protein and registers as a "hit." The Cancer Research Project depends upon these hits, since any one hit may lead to a cure. All hits are recorded, ranked by strength of conformation, and filed for the project's next stage. Each resulting data set contains the three-dimensional molecular structures and the corresponding scores generated during screening. These data are important for the post-processing phase of the project.
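As a rough illustration of the record-and-rank step, the sketch below filters docking results down to hits and sorts them by score. It is not the THINK implementation: the `DockingResult` fields, the `rank_hits` helper, the threshold, and the assumption that a higher score means a stronger fit are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DockingResult:
    """One scored conformation (hypothetical record layout)."""
    molecule_id: str
    conformation: int
    score: float  # assumed: higher means a stronger fit


def rank_hits(results, hit_threshold):
    """Keep results at or above the threshold, strongest first."""
    hits = [r for r in results if r.score >= hit_threshold]
    return sorted(hits, key=lambda r: r.score, reverse=True)


if __name__ == "__main__":
    screened = [
        DockingResult("mol-a", 1, 0.42),
        DockingResult("mol-b", 3, 0.91),
        DockingResult("mol-c", 2, 0.77),
    ]
    for hit in rank_hits(screened, hit_threshold=0.5):
        print(hit.molecule_id, hit.conformation, hit.score)
```

Keeping the structure identifier alongside the score mirrors the article's point that both are needed for post-processing.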

Processing a work unit takes about a day. Once it is complete, the program sends the results back to a server and requests a new data packet. If the participant is not online when processing finishes, the computer waits and exchanges data packets the next time it connects to the Internet. The program runs only when computing resources are idle. If another application needs computing power, the research program backs down, so computing performance is unimpeded.
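The behavior described above (work only while the machine is idle, hold results until a connection is available) can be sketched as a small state machine. This is a toy model under stated assumptions, not the UD Agent: `VolunteerAgent`, `InMemoryServer`, and their methods are hypothetical, and a whole work unit is processed in a single step rather than over a day.

```python
class InMemoryServer:
    """Stand-in for the project server (hypothetical interface)."""

    def __init__(self, units):
        self.units = list(units)
        self.results = []

    def next_unit(self):
        return self.units.pop(0) if self.units else None

    def submit(self, result):
        self.results.append(result)


class VolunteerAgent:
    """Toy agent: yields to other work and stores results while offline."""

    def __init__(self, analyze):
        self.analyze = analyze
        self.work_unit = None
        self.pending_result = None

    def step(self, machine_idle, online, server):
        # Data is exchanged only when a connection exists.
        if online:
            if self.pending_result is not None:
                server.submit(self.pending_result)
                self.pending_result = None
            if self.work_unit is None:
                self.work_unit = server.next_unit()
        # Back down whenever another application needs the machine.
        if not machine_idle:
            return "yielding"
        if self.work_unit is None:
            return "waiting"
        self.pending_result = self.analyze(self.work_unit)
        self.work_unit = None
        return "processed"


if __name__ == "__main__":
    server = InMemoryServer([[1, 2, 3], [4, 5]])
    agent = VolunteerAgent(analyze=sum)
    print(agent.step(machine_idle=True, online=True, server=server))
    print(agent.step(machine_idle=True, online=False, server=server))
    print(agent.step(machine_idle=False, online=True, server=server))
    print(server.results)
```

Separating the network exchange from the compute step is one simple way to let a finished result wait for the next connection without blocking anything else.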

"When the Cancer Research project was originally planned, its goals were purposely kept flexible. The number of compounds to be screened could be ratcheted up or down depending upon projected volunteer participation. The project has produced results far beyond our expectations," said Potts.

The initial scope of the project was to take two protein targets related to different forms of cancer and screen them against a library of 200 million drug compounds. These compounds had previously been synthesized and evaluated for drug-like physical characteristics, such as whether they are likely to be soluble, reactive, or easily metabolized.

Project goals changed over time to keep pace with volunteer participation. To date, 12 protein targets have been screened against 3.5 billion molecules. Even at its original scope, it was the largest computational chemistry project ever undertaken and, as Potts notes, "a real world 'torture test' of the software." At United Devices, the project is managed internally by one full-time equivalent: a database administrator and a systems administrator split project responsibilities alongside their regular duties.

The project is now moving into its second phase, in which the "hits" (the molecules) from phase one are put through another virtual screening process. For this phase, LigandFit, a drug discovery software program designed by Accelrys Software, refines the data to produce a more manageable list of promising drug candidates for synthesis and testing. LigandFit helps researchers characterize therapeutic targets and identify and assess drug candidates by performing automated docking of flexible ligands to a protein's binding site. The application runs on project participants' computers as it evaluates the potential of a ligand library to interact with one of the protein targets.

This second phase of the Cancer Project is being run in parallel with the Smallpox Project. Volunteers can opt to run either project or both on their PCs. "These large Internet 'public grid' projects differ managerially from an intranet project managed internally at an enterprise," said Potts. In the latter case, the IT department would control what runs on the grid and who can access the data. IT would also work with internal management on project prioritization and resource allocation.

For the Cancer Project, it is common for volunteers who have a family member who is or was battling cancer to form a group and track the group's aggregate computing power over time. "The Internet has allowed any person who wants to make a genuine contribution to scientific research [to do so], and the size and scope of participation has been a welcome surprise to us," states Potts.

Massive not-for-profit Internet projects like the Cancer Research Project and the recently launched Smallpox Project (February 5, 2003), sponsored by the Department of Defense and IBM, are very important, but they do not make money for the company. United Devices generates revenue by selling its enterprise software, which enables grid computing behind corporate firewalls, to life science firms, one of the company's primary verticals. Drug development offers many types of problems to solve, including three-dimensional predictive protein folding and structure determination, virtual screening, and toxicity property prediction. For example, Novartis, an enterprise customer of United Devices, uses its idle cycle time to research computational structure requirements, and the magnitude of its distributed computing power rivals some of the world's largest supercomputers.

Several mechanisms are in place to measure and record computing power, ranging from measuring cycle time on individual PCs to measuring the aggregate power of groups of PCs for statistical tracking. Metrics such as the ligands processed, their structures, and the number of leads identified can be monitored and tracked for groups on the grid, and statistics can be measured in different ways. Beyond tracking cycle time, data mining tools can be used to customize reports.
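A minimal sketch of group-level tracking is shown below, assuming each PC reports a few counters. The report fields (`group`, `cpu_hours`, `ligands_processed`, `leads`) and the `aggregate_by_group` helper are hypothetical and do not reflect the actual Grid.org statistics schema.

```python
from collections import defaultdict


def aggregate_by_group(reports):
    """Sum per-PC counters into per-group totals (hypothetical fields)."""
    totals = defaultdict(
        lambda: {"cpu_hours": 0.0, "ligands_processed": 0, "leads": 0}
    )
    for report in reports:
        group_totals = totals[report["group"]]
        group_totals["cpu_hours"] += report["cpu_hours"]
        group_totals["ligands_processed"] += report["ligands_processed"]
        group_totals["leads"] += report["leads"]
    return dict(totals)


if __name__ == "__main__":
    reports = [
        {"group": "team-hope", "cpu_hours": 12.5, "ligands_processed": 900, "leads": 2},
        {"group": "team-hope", "cpu_hours": 7.5, "ligands_processed": 600, "leads": 0},
        {"group": "lab-42", "cpu_hours": 3.0, "ligands_processed": 250, "leads": 1},
    ]
    for group, stats in aggregate_by_group(reports).items():
        print(group, stats)
```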

Creating a grid by aggregating disparate PCs scattered across the world raises legitimate questions about protecting IT assets, intellectual property, and individual privacy. Thus, security audits of United Devices by potential sponsors and by volunteering businesses and individuals are justifiably rigorous. Security precautions used by United Devices include scanning all of its build environments for viruses and digitally signing information sent to the UD Agent. Files stored locally and files sent to the UD Agent are encrypted, and access to the UD servers is controlled biometrically. No personally identifiable information is required to run the UD Agent, though some location information is required to have points and CPU time included on some statistics pages.

Participants are told of every instance where their e-mail addresses might be used and can allow or forbid specific e-mail uses. Volunteers also decide what news and information they want to receive from United Devices, and they can view and change their preferences at any time. The UD Agent itself does not read information beyond its specific directory, except for occasional use of the Windows temporary directory during processing of work units. Beyond this, the only information taken from a volunteer's computer by the UD Agent is the system information required to determine the individual computer's contribution. All transactions involving this information exchange go through secure servers to protect the data.

Notes Potts, "As part of a security audit by Intel, we set up a conference call and found out during the call that the number of security experts from Intel on the line asking questions outnumbered all of the employees of United Devices." "We passed their audit," said Potts, "and both Intel and IBM are encouraging employees to download our software to run inside their firewalls for the Smallpox Project."


Cambridge Healthtech Institute  |  250 First Avenue  |  Suite 300   |   Needham,  MA  02494
Phone: 781-972-5400  |   Fax: 781-972-5425
chi@healthtech.com