Recent advances in genome sequencing have revealed an enormous diversity of lifeforms. There are now millions of available genomes, each comprising thousands of genes. The universe of newly discovered genes is expanding far faster than our ability to study them in the laboratory. Here, I will present how high-throughput computing is unlocking the function of novel genes at an unfathomable scale.
Fermilab is the first High Energy Physics institution to transition from X.509 user certificates to authentication tokens in production systems. All the experiments that Fermilab hosts now use JSON Web Token (JWT) access tokens in their grid jobs. The tokens follow the WLCG Common JWT Profile. Many software components have been created or updated for this transition, and the changes to those components are described; most of the software is available to others as open source. There have been some glitches and learning-curve issues, but in general the system has performed well and continues to be improved as operational problems are addressed.
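As background, a WLCG-profile access token is an ordinary JWT whose payload carries a small set of standardized claims. The sketch below shows how such a token's claims can be inspected in Python; the issuer URL and claim values are illustrative placeholders, not Fermilab's actual configuration, and real tokens must of course be signature-verified before use.

```python
# Illustrative sketch: inspect the claims of a WLCG-profile access token.
# The example payload is fabricated; production tokens are issued by the
# experiment's token issuer and verified by the consuming service.
import base64
import json

def decode_claims(token: str) -> dict:
    """Return the (unverified) claims from a JWT's payload segment."""
    payload_b64 = token.split(".")[1]
    # Restore the base64 padding stripped by JWT encoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# Claims typical of the WLCG Common JWT Profile (values are placeholders):
example_claims = {
    "wlcg.ver": "1.0",
    "iss": "https://example.org/token-issuer",      # hypothetical issuer URL
    "sub": "some-user-identifier",
    "aud": "https://wlcg.cern.ch/jwt/v1/any",
    "exp": 1700003600,
    "scope": "storage.read:/ compute.create",
    "wlcg.groups": ["/fermilab"],
}
print(json.dumps(example_claims, indent=2))
```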
Presentation on storage monitoring, analytics, and diagnostics; best practices for WebDAV and XRootD; and options for future storage.
We present the initial design and proposed implementation of a series of long-baseline, distributed inference experiments leveraging ARA, a platform for advanced wireless research that spans approximately 500 square kilometers in central Iowa, including the Iowa State University campus, the City of Ames, local research and producer farms, and neighboring rural communities. These experiments aim to demonstrate, characterize, and evaluate the use of distributed inference for computer vision tasks in rural and remote regions, where high-capacity, low-latency wireless broadband access and backhaul networks enable edge computing devices and sensors in the field to offload compute-intensive workloads to cloud and high-performance computing systems embedded throughout the edge-to-cloud continuum. In each experiment, a distributed implementation of the MLPerf Inference benchmarks for image classification and object detection will measure standard inference performance metrics for an ARA subsystem configuration under different workload scenarios. Real-time network and weather conditions will also be monitored throughout each experiment to evaluate their impact on inference performance. Here, we highlight the role of HTCondor as the common scheduler and workload manager used to distribute the inference workload across ARA and beyond. We also discuss some of the unique challenges in deploying HTCondor on ARA and provide an update on the current status of the project.
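To make HTCondor's workload-distribution role concrete, the hedged sketch below uses the HTCondor Python bindings to fan an inference benchmark out as a cluster of jobs. The wrapper script name, shard naming scheme, resource requests, and job count are illustrative assumptions, not the actual ARA/MLPerf configuration.

```python
# Minimal sketch: submit many inference jobs, one per data shard, via the
# HTCondor Python bindings. Names and resource requests are placeholders.
import htcondor

submit = htcondor.Submit({
    "executable": "run_mlperf_inference.sh",             # hypothetical wrapper script
    "arguments": "--scenario Offline --shard $(Process)",
    "request_cpus": "4",
    "request_memory": "8GB",
    "transfer_input_files": "images_shard_$(Process).tar",
    "output": "inference_$(Process).out",
    "error": "inference_$(Process).err",
    "log": "inference.log",
})

schedd = htcondor.Schedd()
result = schedd.submit(submit, count=32)  # one job per shard of the dataset
print("Submitted cluster", result.cluster())
```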
Pegasus is a widely used scientific workflow management system built on top of HTCondor DAGMan. This talk will highlight how Pegasus is deployed within the NSF ACCESS ecosystem and the NAIRR Pilot. We will cover access point deployments, including the hosted ACCESS Pegasus platform (Open OnDemand and Jupyter), workflow execution nodes in HPC environments, and a JupyterLab-based access point within the Purdue Anvil Composable Subsystem. On the execution side, we will discuss several provisioning strategies, including HTCondor Annex, custom virtual machines on the Jetstream2 cloud, simple glidein configurations for campus clusters, and a dynamically autoscaled TestPool environment designed for workflow development and testing.
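For readers unfamiliar with the Pegasus 5 Python API used from these access points, the sketch below composes and submits a tiny workflow. The transformation and file names are placeholders, and a real workflow additionally requires site, transformation, and replica catalogs configured for the target execution environment.

```python
# Minimal sketch of composing a workflow with the Pegasus 5 Python API,
# e.g. from a JupyterLab access point. Names are placeholders.
from Pegasus.api import Workflow, Job, File

wf = Workflow("example-analysis")

raw = File("input.dat")
processed = File("output.dat")

step = (
    Job("process_data")                # hypothetical transformation name
    .add_args("--in", raw, "--out", processed)
    .add_inputs(raw)
    .add_outputs(processed)
)

wf.add_jobs(step)

# Plan and submit through the local HTCondor access point (DAGMan underneath).
wf.plan(submit=True)
```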
Discuss tools and options for the next capacity challenge.
Discuss plans for SENSE/Rucio testing for USATLAS/USCMS
Discuss capacity and capability challenges. Which are we interested in pursuing? Who will participate? When to schedule?
Quick overview of AI/ML in WLCG so far
Needs for AI/ML for Infrastructure AND Infrastructure for AI/ML
Funding opportunities
Next steps: areas of common interest/effort?
Shared development and prototyping for AFs?
Storage technologies to support AFs (Carlos Gamboa ?, 10 minutes)
Joint AFs: Can we “share” AFs between experiments, e.g., Belle II and ATLAS, or DUNE and CMS? (Hiro/Ofer/Lincoln?)
Can we agree on a minimum baseline for AFs?
Standard “login”
I describe the process of deploying the National Data Platform Endpoint (formerly Point of Presence / POP) on local infrastructure to provide a data-streaming service for a published science dataset whose data origin is located in Hawaii. From the perspective of a software engineer, I will cover the process of deploying the endpoint into a Kubernetes cluster or with Docker Compose, explain how I integrated access to local data with the endpoint, and describe some useful lessons learned. Finally, from the perspective of a science user, I will demonstrate how to use the deployed endpoint to preprocess data and stream the results to a Jupyter notebook for visualization.
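As a rough illustration of the science-user step, the notebook-style sketch below fetches a preprocessed slice of data from a locally deployed endpoint and plots it. The URL, query parameters, response format, and column names are placeholders, not the actual National Data Platform Endpoint API.

```python
# Illustrative notebook cell: fetch a preprocessed slice of the dataset from a
# locally deployed endpoint and plot it. All names below are assumptions.
import pandas as pd
import requests

ENDPOINT = "http://localhost:8080/stream/my-dataset"    # hypothetical endpoint URL
params = {"start": "2024-01-01", "end": "2024-01-31"}    # hypothetical filter

with requests.get(ENDPOINT, params=params, stream=True, timeout=60) as resp:
    resp.raise_for_status()
    resp.raw.decode_content = True       # let urllib3 handle any content encoding
    df = pd.read_csv(resp.raw)           # assumes the endpoint streams CSV rows

df.plot(x="time", y="value", title="Streamed sample")    # column names are assumed
```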