#LoC: #CloudComputing; #AndrewWMellonFoundation; CCHC; #InformationDissemination
Washington/Canadian-Media: To better serve research and creative uses of Library of Congress (LoC) resources, the Andrew W. Mellon Foundation awarded a $1 million grant in 2019 for the Computing Cultural Heritage in the Cloud (CCHC) initiative, LoC reported.
Cloud Computing. Image credit: Unsplash
CCHC will use cloud-based technology to document what is required to support this work, from levels of staff support to the costs associated with serving and transforming digital materials.
LC Labs has partnered this year with three scholars, Lincoln Mullen, Lauren Tilton, and Andromeda Yelton, who will explore the Library’s digital collections by using cloud computing services in their individual research projects.
The projects vary impressively in their aims: Mullen will use machine learning to extract biblical quotations from across the Library’s collections; Tilton seeks to refine and design computer vision methods by examining approximately 250,000 early 20th-century images; and Yelton plans to cluster conceptually similar documents to create an interactive data visualization that supports users who have only a rough idea of the items they’re looking for.
In addition, each project will have a public humanities focus, engaging audiences in transforming access to knowledge.
Collectively, the projects will also inform LoC’s understanding of the benefits and challenges of using distributed computing environments in large-scale digital library settings.
Results from the individual projects will be documented and shared openly to complement the findings from the institution’s overarching investigation.
In an interview with Alice Goldfarb, who has joined the LC Labs team as an Innovation Specialist for CCHC, and Leah Weinryb-Grohsgal, an Innovation Specialist at the Library of Congress who also works on CCHC, Goldfarb said that her work at CCHC is to determine the requirements of a service model for supporting cloud-computing digital humanities research in the future. The aim, she said, is to further explore the changes required to disseminate collections to more people in more ways, and to build on and contribute to the work other people are doing.
Given the vast scale of the Library’s collections, Goldfarb said, making them available for cloud computing would be beneficial. Libraries already consider the ethics of this type of work, she added, and the team wants to extend that approach to digital work and learn to steward and share data systematically.
#AI; #ProteinStructure; #ScienceAndResearch; #CASP
New York/Canadian-Media: Proteins are the minions of life, working alone or together to build, manage, fuel, protect, and eventually destroy cells. To function, these long chains of amino acids twist and fold and intertwine into complex shapes that can be slow, even impossible, to decipher.
Image: A new artificial intelligence program readily predicts the structure of protein complexes, such as the immune signal interleukin-12 (blue) bound to its receptor. Image credit: Ian Haydon/Institute for Protein Design
Scientists have dreamed of simply predicting a protein’s shape from its amino acid sequence—an ability that would open a world of insights into the workings of life.
“This problem has been around for 50 years; lots of people have broken their head on it,” says John Moult, a structural biologist at the University of Maryland, Shady Grove. But a practical solution is in their grasp.
Several months ago, in a result hailed as a turning point, computational biologists showed that artificial intelligence (AI) could accurately predict protein shapes. That group describes its approach online in Nature today. Meanwhile, David Baker and Minkyung Baek at the University of Washington, Seattle, and their colleagues present their AI-based structure prediction approach online in Science. Their method works on not just simple proteins, but also complexes of proteins.
Baker’s and Baek’s method and computer code have been available for weeks, and the team has already used it to model more than 4500 protein sequences submitted by other researchers. Savvas Savvides, a structural biologist at Ghent University, had tried six times to model a problematic protein. He says Baker’s and Baek’s program, called RoseTTAFold, “paved the way to a structure solution.”
In fall of 2020, DeepMind, a U.K.-based AI company owned by Google, wowed the field with its structure predictions in a biennial competition. Called Critical Assessment of Protein Structure Prediction (CASP), the competition uses structures newly determined using laborious lab techniques such as x-ray crystallography as benchmarks. DeepMind’s program, AlphaFold2, did “really extraordinary things [predicting] protein structures with atomic accuracy,” says Moult, who organizes CASP.
But for many structural biologists, AlphaFold2 was a tease: “Incredibly exciting but also very frustrating,” says David Agard, a structural biophysicist at the University of California, San Francisco. In mid-June, 3 days after the Baker lab posted its RoseTTAFold preprint, Demis Hassabis, DeepMind’s CEO, tweeted that AlphaFold2’s details were under review at a publication and the company would provide “broad free access to AlphaFold for the scientific community.” Nature has now rushed to publish that paper to coincide with the Science paper. “It is appropriate that it is not coming out after ours, as our work is really based on their advances,” Baker says.
DeepMind’s 30-minute presentation at CASP had been enough to inspire Baek to develop her own approach. Like AlphaFold2, it uses AI’s ability to discern patterns in vast databases of examples, generating ever more informed and accurate iterations as it learns. When given a new protein to model, RoseTTAFold proceeds along multiple “tracks.” One compares the protein’s amino acid sequence with all similar sequences in protein databases. Another predicts pairwise interactions between amino acids within the protein, and a third compiles the putative 3D structure. The program bounces among the tracks to refine the model, using the output of each one to update the others. DeepMind’s approach involves just two tracks.
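As a rough illustration only, the back-and-forth among tracks can be sketched in Python. This toy is not RoseTTAFold's code: the real program uses trained neural networks, whereas the hand-written update rules, the function names (`update_seq`, `update_pair`, `update_coords`, `refine`), and the numbers below are all assumptions made up for this sketch. It shows only the control flow described above, in which each track's output is used to update the others on every round.

```python
import math


def update_seq(seq, pair):
    """1D track: blend each residue feature with the mean of its pair scores."""
    n = len(seq)
    return [0.5 * seq[i] + 0.5 * sum(pair[i]) / n for i in range(n)]


def update_pair(pair, seq, coords):
    """2D track: blend each pair score with sequence agreement and 3D closeness."""
    n = len(seq)
    new = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            agreement = 1.0 / (1.0 + abs(seq[i] - seq[j]))
            closeness = 1.0 / (1.0 + math.dist(coords[i], coords[j]))
            new[i][j] = (pair[i][j] + agreement + closeness) / 3.0
    return new


def update_coords(coords, pair, step=0.1):
    """3D track: nudge each residue toward the others, weighted by pair score."""
    n = len(coords)
    new = []
    for i in range(n):
        total = sum(pair[i][j] for j in range(n) if j != i)
        if total == 0:
            # No partners to move toward (e.g. a single residue): stay put.
            new.append(tuple(coords[i]))
            continue
        target = [
            sum(pair[i][j] * coords[j][k] for j in range(n) if j != i) / total
            for k in range(3)
        ]
        new.append(tuple(c + step * (t - c) for c, t in zip(coords[i], target)))
    return new


def refine(seq, pair, coords, n_rounds=3):
    """Pass information among the three toy tracks for a fixed number of rounds."""
    for _ in range(n_rounds):
        seq = update_seq(seq, pair)
        pair = update_pair(pair, seq, coords)
        coords = update_coords(coords, pair)
    return seq, pair, coords


if __name__ == "__main__":
    # Hypothetical three-residue input; the values carry no biological meaning.
    seq0 = [0.1, 0.9, 0.4]
    pair0 = [[0.0] * 3 for _ in range(3)]
    coords0 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    print(refine(seq0, pair0, coords0))
```

In this toy, the order of the three calls inside `refine` is arbitrary; the point is simply that every round feeds each track the latest output of the others.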
Gira Bhabha, a cell and structural biologist at New York University School of Medicine, says both methods work well. “Both the DeepMind and Baker lab advances are phenomenal and will change how we can use protein structure predictions to advance biology,” she says. A DeepMind spokesperson wrote in an email, “It’s great to see examples such as this where the protein folding community is building on AlphaFold to work towards our shared goal of increasing our understanding of structural biology.”
But AlphaFold2 solved the structures of only single proteins, whereas RoseTTAFold has also predicted complexes, such as the structure of the immune molecule interleukin-12 latched onto its receptor. Many biological functions depend on protein-protein interactions, says Torsten Schwede, a computational structural biologist at the University of Basel. “The ability to handle protein-protein complexes directly from sequence information makes it extremely attractive for many questions in biomedical research.”
Baker concedes that AlphaFold2’s structures are more accurate. But Savvides says the Baker lab’s approach better captures “the essence and particularities of protein structure,” such as identifying strings of atoms sticking out of the sides of the protein—features key to interactions between proteins. Last year, AlphaFold2 needed a lot of computing power to work, more than RoseTTAFold. “Now, it seems they’ve accelerated their method since CASP14, and it’s now comparable to RoseTTAFold,” Baek says.
On 1 June, Baker and Baek began to challenge their method by asking researchers to send in their most baffling protein sequences. Fifty-six head scratchers arrived in the first month, all of which now have predicted structures. Agard’s group sent in an amino acid sequence with no known similar proteins. Within hours, his group got a protein model back “that probably saved us a year of work,” Agard says. Now, he and his team know where to mutate the protein to test ideas about how it functions.
Because Baek’s and Baker’s group has released its computer code on the web, others can improve on it; the code has been downloaded 250 times since 1 July. “Many researchers will build their own structure prediction methods upon Baker’s work,” says Jinbo Xu, a computational structural biologist at the Toyota Technological Institute at Chicago. Hassabis says DeepMind’s computer code is now also open source. As a result of both groups’ work, progress should now be swift, Moult says: “When there’s a breakthrough like this, 2 years later, everyone is doing it as well if not better than before.”