Recent technological advances have revealed an enormous diversity of lifeforms by sequencing their genomes. There are now millions of available genomes, each comprising thousands of genes. The universe of newly discovered genes is expanding far faster than our ability to study them in the laboratory. Here, I will present how high-throughput computing is unlocking the function of novel genes...
Panel Discussion led by Miron Livny.
NSF NCAR's labs and programs collectively cover a breadth of research topics in Earth system science, from the effects of the Sun on Earth's atmosphere to the role of the ocean in weather and climate prediction, as well as supporting and training the next generation of Earth system scientists. However, with the current legacy 'download and analyze' model followed by most of our remote users,...
The Monitoring Infrastructure for Network and Computing Environment Research (MINCER) project aims to provide a foundation for in-depth insight and analysis of distributed heterogeneous computing environments, supporting and enhancing research and education in computer and network systems. Our approach is to work in conjunction with the Open Science Grid (OSG) by providing a set of MINCER...
The Algorithms Research and Development Group (ARDG) at the National Radio Astronomy Observatory (NRAO) has been using HTCondor and compute resources from the Open Science Grid (OSG) to improve throughput in radio astronomy imaging by up to two orders of magnitude in single imaging workflows, and we are now working to extend these imaging capabilities to...
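As a rough illustration of the fan-out pattern such imaging workflows rely on (this is not ARDG's actual pipeline; the script names and chunk count are assumptions), a partitioned imaging run can be expressed with HTCondor's Python DAG API:

    import htcondor
    from htcondor import dags

    # Hypothetical worker: grid one chunk of visibilities per node.
    grid = htcondor.Submit({
        "executable": "grid_chunk.sh",   # assumed script name
        "arguments": "$(chunk)",
        "output": "grid_$(chunk).out",
        "error": "grid_$(chunk).err",
        "log": "imaging.log",
    })
    # Hypothetical reducer: combine gridded chunks into one image.
    combine = htcondor.Submit({
        "executable": "combine.sh",      # assumed script name
        "output": "combine.out",
        "error": "combine.err",
        "log": "imaging.log",
    })

    dag = dags.DAG()
    grid_layer = dag.layer(
        name="grid",
        submit_description=grid,
        vars=[{"chunk": str(i)} for i in range(100)],  # assumed 100 chunks
    )
    grid_layer.child_layer(name="combine", submit_description=combine)
    dags.write_dag(dag, "imaging_dag")
    # Run with: condor_submit_dag imaging_dag/dagfile.dag

Because the gridding tasks are independent, the throughput gain scales with the number of slots the overlay pool can acquire.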
Computational notebooks have become a critical tool of scientific discovery by wrapping code, results, and visualization into a common package. However, moving notebooks between facilities is not easy: complex workflows require precise software stacks, access to large datasets, and substantial backend computational resources. The Floability project aims to connect these...
The Event Workflow Management System (EWMS) enables previously impractical scientific workflows by transforming how HTCondor is used for massively parallel, short-runtime tasks. This talk explores what’s now possible from a user’s perspective. Integrated into IceCube’s Realtime Alert pipeline and powered by OSG’s national-scale compute resources, EWMS’s debut application delivers directional...
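For context, the conventional way to fan short tasks into HTCondor is one job per task via the Python bindings; a minimal sketch is below (the per-event script and resource requests are illustrative assumptions). The per-job scheduling overhead of this pattern is exactly what makes very short runtimes impractical, which is the gap EWMS targets.

    import htcondor

    # One HTCondor proc per event: illustrative only, not the EWMS API.
    sub = htcondor.Submit({
        "executable": "reconstruct.sh",   # assumed per-event task
        "arguments": "$(event_id)",
        "output": "out/$(event_id).out",
        "error": "err/$(event_id).err",
        "log": "events.log",
        "request_cpus": "1",
        "request_memory": "512MB",
    })
    schedd = htcondor.Schedd()
    result = schedd.submit(
        sub,
        itemdata=iter([{"event_id": str(i)} for i in range(10_000)]),
    )
    print("submitted cluster", result.cluster())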
The MIT Tier-2 computing center, established in 2006, has been a long-standing contributor to CMS computing. As hardware ages and computing demands evolve, we are undertaking a major redesign of the center’s infrastructure. In this talk, we present a holistic cost analysis that includes not only hardware purchases but also power consumption, cooling, and rack space—factors often excluded from...
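As a toy version of such a holistic cost model (every number below is made up for illustration and is not an MIT Tier-2 figure), a per-core-year cost folds amortized hardware, power, cooling overhead, and rack space into a single comparable quantity:

    # Hypothetical cost model sketch; all parameters are assumptions.
    def cost_per_core_year(hw_cost, cores, lifetime_yr, watts,
                           usd_per_kwh, cooling_overhead, rack_cost_yr):
        # Annual electricity, inflated by a cooling overhead factor.
        power_yr = watts / 1000 * 8760 * usd_per_kwh * (1 + cooling_overhead)
        # Amortized hardware + power/cooling + rack space, per year.
        total_yr = hw_cost / lifetime_yr + power_yr + rack_cost_yr
        return total_yr / cores

    print(cost_per_core_year(hw_cost=8000, cores=128, lifetime_yr=5,
                             watts=700, usd_per_kwh=0.12,
                             cooling_overhead=0.4, rack_cost_yr=300))

A model of this shape makes it easy to see when longer hardware lifetimes or denser nodes dominate the purchase price.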
While there are perhaps hundreds of petabytes of datasets available to researchers, instead of swimming in seas of data they often find themselves sitting in a data desert: there is a mismatch between what sits in carefully curated repositories around the world and what is accessible at the computational resources available locally. The Pelican Project (https://pelicanplatform.org/) aims to...
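One sketch of the intended access pattern, assuming the pelicanfs fsspec client (the federation URL and object path below are placeholders): a client reads a federated object by name instead of pre-staging it locally.

    from pelicanfs.core import PelicanFileSystem  # assumes the pelicanfs package

    # Discover the federation's services, then read an object by path;
    # both the federation URL and the object path are placeholders.
    pelfs = PelicanFileSystem("pelican://osg-htc.org")
    data = pelfs.cat("/example/namespace/dataset.bin")
    print(len(data), "bytes")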
Experiences with running tape storage systems at ATLAS and CMS Tier-2s
We present the initial design and proposed implementation for a series of long-baseline, distributed inference experiments leveraging ARA, a platform for advanced wireless research that spans approximately 500 square kilometers near Iowa State University, including campus, the City of Ames, local research and producer farms, and neighboring rural communities in central Iowa. These...
Discuss tools and options for the next capacity challenge.
Discuss plans for SENSE/Rucio testing for USATLAS/USCMS
Discuss capacity and capability challenges. Which are we interested in pursuing? Who will participate? When to schedule?
Quick overview of AI/ML in WLCG so far
Needs for AI/ML for Infrastructure AND Infrastructure for AI/ML
Funding opportunities
Next steps: areas of common interest/effort?
HTCondor is the leading system for building a dynamic overlay batch scheduling system on resources managed by any scheduling system, by means of glideins. One fundamental property of these setups is the use of late binding of containerized user workloads. From a resource provider point of view, a compute resource is thus claimed before the user container image is selected. Kubernetes allows...
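To make the late-binding point concrete: in a glidein overlay pool the user's container choice travels with the job and is only acted on after a glidein has already claimed the resource. A minimal sketch of such a containerized user job via the Python bindings (the image name and script are assumptions):

    import htcondor

    # Containerized user job: the image is named at submit time but
    # only pulled on the matched resource, after the glidein claim.
    sub = htcondor.Submit({
        "universe": "container",
        "container_image": "docker://library/python:3.12",  # assumed image
        "executable": "analyze.sh",                          # assumed script
        "output": "analyze.out",
        "error": "analyze.err",
        "log": "analyze.log",
        "request_cpus": "1",
        "request_memory": "2GB",
    })
    cluster = htcondor.Schedd().submit(sub).cluster()

From the provider's perspective this ordering is the crux: Kubernetes, by contrast, expects the image to be known when the pod is created.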
Shared development and prototyping for AFs?
Storage technologies to support AFs (Carlos Gamboa?, 10 minutes)
Joint AFs: Can we “share” AFs between experiments: Belle II and ATLAS, or DUNE and CMS, etc.? (Hiro/Ofer/Lincoln?)
Can we agree on a minimum baseline for AFs?
Standard “login”
New requirements from HEP data analysis include limited access to login nodes, resources provided through the experiments' programs rather than the login nodes, and efficient data access for collaborative workflows. We have developed the Interactive aNalysis worKbench (INK), a web-based platform leveraging the HTCondor cluster. INK transforms traditional batch-processing resources into a...
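An illustrative sketch, not INK's actual implementation, of how a web backend can turn batch slots into interactive sessions: submit a notebook server as an HTCondor job on behalf of the authenticated user. Every name below is an assumption.

    import htcondor

    def launch_session(user: str) -> int:
        """Submit an interactive notebook server as a batch job;
        hypothetical helper, not part of INK."""
        sub = htcondor.Submit({
            "executable": "/usr/bin/jupyter",            # assumed path
            "arguments": "notebook --no-browser --port=8888",
            "request_cpus": "2",
            "request_memory": "4GB",
            "accounting_group_user": user,
            "output": f"{user}.out",
            "error": f"{user}.err",
            "log": f"{user}.log",
        })
        # Returns the cluster id the web frontend can track.
        return htcondor.Schedd().submit(sub).cluster()

The design point is that the session consumes an ordinary batch slot, so interactive and batch use draw from the same accounted pool.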
I describe the process of deploying the National Data Platform Endpoint (formerly Point of Presence / POP) on local infrastructure to provide a data streaming service for a published science dataset whose data origin is located in Hawaii. From the perspective of a software engineer, I will cover the process of deploying the endpoint into a Kubernetes cluster or using Docker Compose. I will...