Throughput Computing 2025

Timezone: America/Chicago
Howard Auditorium (Fluno Center on the University of Wisconsin-Madison Campus)
601 University Avenue, Madison, WI 53715-1035

    • 7:30 AM 9:00 AM
      Registration, Coffee, and Light Breakfast 1h 30m Fluno Atrium

    • 9:00 AM 10:20 AM
      OSG at 20: Growing the Throughput Community Howard Auditorium

    • 10:20 AM 10:50 AM
      Break 30m Fluno Atrium

    • 10:50 AM 12:30 PM
      Erik Wright Keynote Presentation, David Swanson Awardee Presentation Howard Auditorium (Fluno Center)

      • 10:50 AM
        Keynote Introduction 5m
        Speaker: Christina Koch (UW Madison)
      • 10:55 AM
        Erik Wright's Keynote Presentation: Biological Discovery at an Unfathomable Scale 1h

        Recent technological advances have revealed an enormous diversity of lifeforms by sequencing their genomes. There are now millions of available genomes, each comprised of thousands of genes. The universe of newly discovered genes is expanding far faster than our ability to study them in the laboratory. Here, I will present how high-throughput computing is unlocking the function of novel genes at an unfathomable scale.

        Speaker: Dr Erik Scott Wright (University of Pittsburgh)
      • 12:00 PM
        Introduction of David Swanson Award 5m
        Speaker: Ronda Swanson (University of Nebraska-Lincoln)
      • 12:05 PM
        David Swanson Awardee Presentation: Reconstructing Spider Webs from Behavioral Tracking. 25m
        Speaker: Brandi Pessman (University of Nebraska-Lincoln)
    • 12:30 PM 1:30 PM
      Lunch 1h Oros Executive Dining Room (Fluno Center)

    • 1:30 PM 2:45 PM
      Throughput Computing in the Biology & Life Sciences Community Howard Auditorium

      • 1:30 PM
        Using OSG to learn the rules of biological evolution (Remote Presentation) 20m
        Speaker: Oana Carja (Carnegie Mellon University)
      • 1:55 PM
        Computational challenges in metagenomics and small molecule biosynthesis 20m
        Speaker: Jason Kwan (UW-Madison)
      • 2:20 PM
        Understanding Gene Regulatory Networks 20m
        Speaker: Prakriti Garg (University of Wisconsin-Madison)
    • 2:45 PM 3:15 PM
      Break 30m Fluno Atrium

    • 3:15 PM 4:50 PM
      Reenvisioning Identity in Distributed Services Howard Auditorium

      • 3:15 PM
        Our Current Vision for Trustworthy Long-Term CILogon Operations 20m
        Speaker: Jim Basney (University of Illinois Urbana-Champaign)
      • 3:40 PM
        Placement Tokens 15m
        Speaker: Matyas Selmeci (UW-Madison CHTC)
      • 4:00 PM
        Can I have my data... please? Authorization in Pelican 20m
        Speaker: Justin Hiemstra (Morgridge Institute for Research)
      • 4:25 PM
        Fermilab’s Transition to Token Authentication 20m

        Fermilab is the first High Energy Physics institution to transition from X.509 user certificates to authentication tokens in production systems. All the experiments that Fermilab hosts are now using JSON Web Token (JWT) access tokens in their grid jobs. The tokens are defined using the WLCG Common JWT Profile. Many software components have been either created or updated for this transition, and the changes to those components are described. Most of the software is available to others as open source. There have been some glitches and learning curve issues but in general the system has been performing well and is being improved as operational problems are addressed.

        Speaker: Dave Dykstra (Fermilab)
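
        As an illustrative sketch (not code from the talk) of what the WLCG Common JWT Profile tokens mentioned above look like, the Python below decodes a bearer token's payload and prints the profile's standard claims. The token path follows the WLCG bearer-token discovery convention, and no signature verification is performed.

        import base64
        import json
        import os

        def b64url_decode(part: str) -> bytes:
            # JWT segments are URL-safe base64 without padding; restore padding first.
            return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))

        # WLCG bearer-token discovery: prefer $BEARER_TOKEN_FILE, else /tmp/bt_u<uid>.
        token_file = os.environ.get("BEARER_TOKEN_FILE", f"/tmp/bt_u{os.getuid()}")
        with open(token_file) as handle:
            token = handle.read().strip()

        # A JWT is <header>.<payload>.<signature>; only the payload is inspected here.
        _header, payload, _signature = token.split(".")
        claims = json.loads(b64url_decode(payload))

        # WLCG Common JWT Profile claims: issuer, subject, expiry, profile version,
        # and capability scopes such as "storage.read:/" or "compute.create".
        for key in ("iss", "sub", "exp", "wlcg.ver", "scope"):
            print(key, "=", claims.get(key))
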
    • 4:50 PM 5:05 PM
      CHTC Fellow Lightning Talks Howard Auditorium

    • 5:05 PM 5:15 PM
      Closing Remarks Howard Auditorium

    • 6:30 AM 7:30 AM
      Lakeshore Run 1h Depart from Fluno Lobby

    • 8:00 AM 9:00 AM
      Coffee, Pastries and Registration 1h Fluno Atrium

    • 9:00 AM 10:35 AM
      Campus Impacts on the National Science Community via the OSDF and OSPool Howard Auditorium

      • 9:00 AM
        NSF Campus Cyberinfrastructure (CC*) (Remote Presentation) 15m
        Speaker: Kevin Thompson (NSF)
      • 9:20 AM
        Translational Computer Science Panel Discussion 30m

        Panel discussion led by Miron Livny.

        Speakers: Brian Bockelman (Morgridge Institute for Research), Douglas Thain (University of Notre Dame), Ewa Deelman (USC ISI), Manish Parashar (University of Utah), Miron Livny (UW-Center for High Throughput Computing), Peter Couvares (LIGO Laboratory - Caltech)
      • 9:55 AM
        University of Montana and Contributing to the OSPool 15m
        Speaker: Michael Couso (University of Montana)
      • 10:15 AM
        Experiences of a Small, Primarily Undergraduate Institution in Servicing OSPool Compute Jobs 15m
        Speaker: Dr Stephen Wheat (Oral Roberts University)
    • 10:35 AM 11:00 AM
      Break 25m Fluno Atrium

    • 11:00 AM 12:30 PM
      HTC at Work at Campuses and in Research Howard Auditorium

      • 11:05 AM
        Scaling Up Research: Integrating CENVAL-ARC resources with OSG and Expanding User Access (Remote Presentation) 20m Howard Auditorium

        Speaker: Sarvani Chadalapaka (University of California - Merced)
      • 11:25 AM
        Supporting Research Computing @ Syracuse University 20m Howard Auditorium (Fluno Center)

        Speaker: Peter Pizzamenti (Syracuse University)
      • 11:50 AM
        Supporting Microbiology Research at Scale: Experiences and Perspectives 20m Howard Auditorium

        Speaker: Patricia Tran (UW-Madison)
      • 12:10 PM
        Integrating NSF NCAR’s data infrastructure with OSDF 20m Howard Auditorium

        NSF NCAR’s labs and programs collectively cover a breadth of research topics in Earth system science, from the effects of the Sun on Earth's atmosphere to the role of the ocean in weather and climate prediction, as well as supporting and training the next generation of Earth system scientists. However, with the current legacy ‘download and analyze’ model followed by most of our remote users, we are not realizing the full research potential of NCAR’s wealth of datasets. Our goal is to integrate NCAR’s curated data collections with the OSDF data and compute fabric to broaden access capabilities. In this talk, we present progress on this collaboration and demonstrate geoscience workflows that ingest data from NCAR’s Research Data Archive using pelicanFS via OSDF caches.

        Speaker: Harsha Hampapura (NSF National Center for Atmospheric Research)
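
        A minimal sketch of the access pattern described above, assuming the pelicanfs Python package; the federation discovery URL is the OSDF's, and the namespace path is a hypothetical placeholder for a published RDA object.

        from pelicanfs.core import PelicanFileSystem

        # Connect to the OSDF federation via its discovery URL.
        fs = PelicanFileSystem("pelican://osg-htc.org")

        # Hypothetical namespace path; substitute a real object published to the OSDF.
        path = "/ncar/rda/example/sample.nc"

        # List the namespace and stream the object through the nearest OSDF cache.
        print(fs.ls("/ncar/rda/example"))
        with fs.open(path, "rb") as handle:
            chunk = handle.read(1024)
            print(f"read {len(chunk)} bytes from {path}")
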
    • 12:30 PM 1:30 PM
      Lunch 1h Oros Executive Dining Room

    • 1:30 PM 2:00 PM
      Doors Open for 40 Years of The Condor Project: A Celebration (In-Person Event Only) 30m Great Hall (Memorial Union)

      800 Langdon St., Madison
    • 2:00 PM 6:30 PM
      40 Years of The Condor Project Commemorative Gathering (In-Person Event Only) Great Hall (Memorial Union on the UW-Madison Campus)

      • 2:00 PM
        Welcome to 40 Years of The Condor Project 5m
        Speaker: Todd Tannenbaum (University of Wisconsin)
      • 2:05 PM
        Reflections from Guri Sohi 10m
      • 2:15 PM
        Reflections from Rajesh Raman 10m
      • 2:25 PM
        Reflections from Ewa Deelman 10m
      • 2:35 PM
        Condor in Europe 15m
        Speaker: Helge Meinhard (CERN)
      • 2:50 PM
        Reflections from Manish Parashar 10m
      • 3:00 PM
        Reflections from Peter Couvares 10m
      • 3:10 PM
        Reflections from Jim Basney 10m
      • 3:20 PM
        Reflections from Brian Bockelman 10m
        Speaker: Brian Bockelman (Morgridge Institute for Research)
      • 3:30 PM
        Reflections from Remzi Arpaci-Dusseau 10m
      • 3:40 PM
        Reflections from Doug Thain 10m
      • 3:50 PM
        Reflections from the Floor 10m
      • 4:00 PM
        Reflections from Miron Livny 15m
      • 4:30 PM
        Reception 2h
    • 6:30 PM 8:30 PM
      Evening on the Memorial Union Terrace
    • 8:00 AM 9:00 AM
      Coffee, Pastries and Registration 1h Fluno Atrium

    • 9:00 AM 9:30 AM
      Advancing Technologies Through Community Howard Auditorium

      • 9:00 AM
        Building Communities Around OSDF and OSPool Contributors 30m
        Speakers: Christina Koch (UW Madison), Tim Cartwright (University of Wisconsin–Madison, OSG)
    • 9:30 AM 10:45 AM
      Campus Track: Campus/CC* Contributors (to OSPool & OSDF, current & future): Meet the Team and Stump the Experts 2nd Floor Meeting Room, Room 201 (Fluno Center)

    • 9:30 AM 10:45 AM
      Interactive Access and Administration Howard Auditorium (Fluno Center)

      • 9:30 AM
        Integration of MINCER with Open Science Grid 15m

        The Monitoring Infrastructure for Network and Computing Environment Research (MINCER) project aims to provide a foundation for in-depth insight and analysis of distributed heterogeneous computing environments, supporting and enhancing research and education in computer and network systems. Our approach is to work in conjunction with the Open Science Grid (OSG) by providing a set of MINCER containers for integration, including the necessary software to obtain measurements for different types of system experiments on available platforms. These containers will be accompanied by detailed instructions and examples illustrating their use, and will be submitted to the OSG repository for broader community access.
        To ensure portability and reproducibility, we will follow best practices for constructing lightweight containers, providing examples that illustrate how to robustly specify dependencies not directly included in the container environment. Our current focus includes containers that enable performance and power measurement capabilities for workflows running on GPUs. In parallel, we are developing a separate profiling tool that leverages cyPAPI to extract compute and memory metrics from Python applications, generating roofline models for performance analysis. This tool will also be containerized to facilitate streamlined deployment and analysis across varied platforms, further supporting comprehensive insights into GPU-based workflows.
        Furthermore, we are developing a container-based strategy for enabling OSG sites contributing to OSG’s OSPool to collect more detailed job-specific data using tools such as DCGM and LDMS. DCGM (Data Center GPU Manager) is an NVIDIA tool that tracks GPU metrics like utilization, memory use, and power. LDMS (Lightweight Distributed Metric Service) is a low-overhead system for collecting performance data from clusters. Using these tools will allow OSG sites to export more comprehensive data to the OSG dashboard if desired. They can also use this job-level data to better tune their systems to meet workload demands.

        Speaker: Irvin Lopez-Audetat (University of Texas at El Paso)
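
        As a simplified stand-in for the DCGM-style, job-level GPU metrics described above (this sketch uses NVML via the pynvml bindings rather than the MINCER containers, DCGM, or LDMS), a periodic sample might look like:

        import time
        import pynvml

        # NVML exposes the kinds of per-GPU metrics (utilization, memory, power)
        # that DCGM aggregates at data-center scale.
        pynvml.nvmlInit()
        handle = pynvml.nvmlDeviceGetHandleByIndex(0)

        for _ in range(5):
            util = pynvml.nvmlDeviceGetUtilizationRates(handle)
            mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
            power_w = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0  # NVML reports milliwatts
            print(f"gpu={util.gpu}% mem={mem.used / 2**20:.0f}MiB power={power_w:.1f}W")
            time.sleep(1)

        pynvml.nvmlShutdown()
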
      • 9:50 AM
        Monitoring and diagnostics to support scaling up radio astronomy imaging workflows (Remote Presentation) 15m

        The Algorithms Research and Development Group (ARDG) at the National Radio Astronomy Observatory (NRAO) has been using HTCondor and compute resources at the Open Science Grid (OSG) to significantly improve throughput in radio astronomy imaging by up to 2 orders of magnitude in single imaging workflows, and we are currently putting efforts towards extending these imaging capabilities to multiple imaging workflows. Besides developing and maintaining software to efficiently process data and manage the imaging workflows, we found that monitoring and diagnosing workflow executions is critical to achieve and maintain high throughput. We will present and discuss some of the tools and methodology that we use to assess workflow health and efficiency, which enable us to visualize and solve problems, eventually also contributing to optimizing and advancing capabilities of HTCondor and the OSG.

        Speaker: Felipe Madsen (National Radio Astronomy Observatory)
      • 10:10 AM
        Managing, Maintaining, and Monitoring a large HTC system 15m
        Speaker: Tom Smith (Brookhaven National Laboratory)
      • 10:30 AM
        A Tape Robot for the MIT Tier-2 Center 15m

        Since its establishment in 2006, the MIT Tier-2 computing center has been a long-standing contributor to CMS computing efforts. Recently, an opportunity arose to take part in the usage of a shared tape storage system operated by Harvard University. In the context of a pilot project to explore this system we acquired tape cartridges with a total capacity of 15 PB and successfully integrated them for use by CMS. A key challenge was the lack of direct access to the tape libraries. To address this, we developed a novel setup that is currently unique within the CMS infrastructure. In this talk, we present our technical approach and share insights that may benefit other sites considering participation in shared tape storage systems.

        Speaker: Maxim Goncharov (MIT)
    • 10:45 AM 11:15 AM
      Break 30m Fluno Atrium

    • 11:15 AM 12:30 PM
      Campus Track: Campus/CC* Technical Lightning Talks and Discussion 2nd Floor Meeting Room, Room 201 (Fluno Center)

    • 11:15 AM 12:30 PM
      Tutorials and Workflows Track: Building Your Toolbox: HTCSS Overviews Howard Auditorium (Fluno Center)

      • 11:15 AM
        Duct Tape, DAGs, and Determination: Snakemake at the Edge of HTCondor 20m
        Speaker: Justin Hiemstra (Morgridge Institute for Research)
      • 11:40 AM
        Wrangling Complex Notebook Workflows with Floability 20m

        Computational notebooks have become a critical tool of scientific discovery, by wrapping together code, results, and visualization into a common package. However, moving complex notebooks between different facilities is not so easy: complex workflows require precise software stacks, access to large data, and large backend computational resources. The Floability project aims to connect these two worlds, making it possible to specify, share, and execute computational workflows through the familiar notebook interface. This talk will introduce the Floability concept of a workflow "backpack" and demonstrate applications in high energy physics, machine learning, and geosciences.

        Speaker: Douglas Thain (University of Notre Dame)
      • 12:05 PM
        EWMS in Action: A User’s Guide to Adaptive, Extreme-Scale Workflows 20m

        The Event Workflow Management System (EWMS) enables previously impractical scientific workflows by transforming how HTCondor is used for massively parallel, short-runtime tasks. This talk explores what’s now possible from a user’s perspective. Integrated into IceCube’s Realtime Alert pipeline and powered by OSG’s national-scale compute resources, EWMS’s debut application delivers directional reconstructions of high-energy neutrinos within minutes. The system’s user-first design streamlines scientific workflows, while built-in tools provide reliable administrative control. This talk will showcase how EWMS is accelerating discovery today and explore how its capabilities could unlock new research across domains—from astrophysics to protein modeling, large-scale text mining, and beyond.

        Speaker: Ric Evans (UW-Madison / IceCube)
    • 12:30 PM 1:30 PM
      Lunch 1h Oros Executive Dining Room (Fluno Center)

    • 1:30 PM 2:45 PM
      Tutorials and Workflows Track: Workloads and Workflows: Debugging and Observability Howard Auditorium (Fluno Center)

      • 1:30 PM
        Monitoring in the OSDF 20m
        Speaker: Patrick Brophy (Morgridge)
      • 1:55 PM
        Disk Usage of Jobs at the EP 20m
        Speaker: Cole Bollig (University of Wisconsin-Madison)
      • 2:20 PM
        Holistic cost analysis of running a computing center 20m

        The MIT Tier-2 computing center, established in 2006, has been a long-standing contributor to CMS computing. As hardware ages and computing demands evolve, we are undertaking a major redesign of the center’s infrastructure. In this talk, we present a holistic cost analysis that includes not only hardware purchases but also power consumption, cooling, and rack space—factors often excluded from conventional cost models. Using power measurements under typical CMS workloads, we evaluate the cost-effectiveness of maintaining aging hardware versus timely replacement, and propose optimal hardware retirement and procurement policies.

        Speaker: Zhangqier Wang (Massachusetts Inst. of Technology (US))
    • 2:45 PM 3:05 PM
      Break 20m Fluno Atrium

    • 3:05 PM 4:50 PM
      Scientific Collaborations Track: Collaborations Group Discussions 2nd Floor Meeting Room, Room 201 (Fluno Center)

    • 3:05 PM 4:50 PM
      Tutorials and Workflows Track: Objects, Objects Everywhere: Using the OSDF in your Science Howard Auditorium

      • 3:05 PM
        Data Everywhere: Using and Sharing Scientific Data with Pelican 1h 20m

        While there are perhaps hundreds of petabytes of datasets available to researchers, instead of swimming in seas of data they often feel as though they are sitting in a data desert: there is a mismatch between what sits in carefully curated repositories around the world and what is accessible at the locally available computational resources. The Pelican Project (https://pelicanplatform.org/) aims to bridge the gap between repositories and compute by providing a software platform to connect the two sides. Pelican’s flagship instance, the Open Science Data Federation (OSDF), serves billions of objects and more than a hundred petabytes a year to national-scale resources. This tutorial, targeted at end-user data consumers and data providers, will cover the data access model of Pelican, guide participants through accessing and sharing data in an existing data federation, and consider how data movement via Pelican and the OSDF can enable their research computing.

        Speaker: Dr Andrew Owen (UW-Madison/CHTC)
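
        One concrete form of the data-access model covered in this tutorial is HTCondor's support for OSDF URLs in job file transfer. The sketch below, using the htcondor Python bindings, is illustrative only; the osdf:/// object path and the analyze.sh executable are hypothetical.

        import htcondor

        # Submit description that pulls an input object through the OSDF when the job starts.
        job = htcondor.Submit({
            "executable": "analyze.sh",
            "transfer_input_files": "osdf:///ospool/uc-shared/public/example/input.dat",
            "should_transfer_files": "YES",
            "when_to_transfer_output": "ON_EXIT",
            "request_cpus": "1",
            "request_memory": "2GB",
            "request_disk": "2GB",
            "output": "job.out",
            "error": "job.err",
            "log": "job.log",
        })

        schedd = htcondor.Schedd()
        result = schedd.submit(job)
        print("submitted cluster", result.cluster())
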
      • 4:30 PM
        Unbreaking the bird: Debugging Pelican client failures 20m
        Speaker: Brian Bockelman (Morgridge Institute for Research)
    • 4:50 PM 5:00 PM
      Closing Remarks Howard Auditorium

    • 6:00 PM 7:00 PM
      Evening Bike Ride 1h Depart from Fluno Lobby

    • 8:00 AM 9:00 AM
      Coffee, Pastries and Registration 1h Fluno Atrium

    • 9:00 AM 10:30 AM
      Joint CMS and US-ATLAS Session: Monitoring and Measuring Room 201 (Fluno Center)

      We have a common notes and action items document at:
      https://docs.google.com/document/d/1y3V4HKGQ8EMUTxH9_MDCqegtA8sGeNtu6UmFk2fRyLQ/edit?usp=sharing

      Conveners: Andrew Melo (Vanderbilt University), Shawn McKee (University of Michigan)
    • 9:00 AM 10:30 AM
      Running ML, AI and GPU Workflows Howard Auditorium

      • 9:00 AM
        GPU Access and AI workflows in CHTC and the OSPool 20m Howard Auditorium (Fluno Center)

        Speaker: Amber Lim (UW Madison)
      • 9:25 AM
        ARA Distributed Inference Experiments: Flying HTCondor Over a Field of Wireless Dreams 20m Howard Auditorium

        We present the initial design and proposed implementation for a series of long-baseline, distributed inference experiments leveraging ARA --- a platform for advanced wireless research that spans approximately 500 square kilometers near Iowa State University, including campus, the City of Ames, local research and producer farms, and neighboring rural communities in central Iowa. These experiments aim to demonstrate, characterize, and evaluate the use of distributed inference for computer vision tasks in rural and remote regions where high-capacity, low-latency wireless broadband access and backhaul networks enable edge computing devices and sensors in the field to offload compute-intensive workloads to cloud and high-performance computing systems embedded throughout the edge-to-cloud continuum. In each experiment, a distributed implementation of the MLPerf Inference benchmarks for image classification and object detection will measure standard inference performance metrics for an ARA subsystem configuration under different workload scenarios. Real-time network and weather conditions will also be monitored throughout each experiment to evaluate their impact on inference performance. Here, we highlight the role of HTCondor as the common scheduler and workload manager used to distribute the inference workload across ARA and beyond. We also discuss some of the unique challenges in deploying HTCondor on ARA and provide an update on the current status of the project.

        Speaker: Martin Kandes (San Diego Supercomputer Center)
      • 9:50 AM
        Pegasus WMS Deployments in ACCESS and NAIRR Pilot 20m Howard Auditorium

        Pegasus is a widely used scientific workflow management system built on top of HTCondor DAGMan. This talk will highlight how Pegasus is deployed within the NSF ACCESS ecosystem and the NAIRR Pilot. We will cover access point deployments, including the hosted ACCESS Pegasus platform (Open OnDemand and Jupyter), workflow execution nodes in HPC environments, and a JupyterLab-based access point within the Purdue Anvil Composable Subsystem. On the execution side, we will discuss several provisioning strategies, including HTCondor Annex, custom virtual machines on the Jetstream2 cloud, simple glidein configurations for campus clusters, and a dynamically autoscaled TestPool environment designed for workflow development and testing.

        Speaker: Mats Rynge (USC / ISI)
      • 10:15 AM
        Experiments and expansions: Leveraging PATh tools and NAIRR resources in ML workflows 15m Howard Auditorium

        Speaker: Ian Ross (U. Wisconsin)
    • 10:30 AM 11:00 AM
      Break 30m Fluno Atrium

    • 11:00 AM 12:30 PM
      AI-driven Science with PATh Services Howard Auditorium

      • 11:00 AM
        Contouring the Audio Fovea with Pinna Cues for Spatial Speech Perception (Remote Presentation) 20m
        Speaker: Bujji Selagamsetty (University of Wisconsin-Madison)
      • 11:25 AM
        Mapping the Zymomonas mobilis Interactome 20m
        Speaker: Sameer DCosta (GLBRC / WEI)
      • 11:50 AM
        Large Scale Dataset Curation and Model Evaluation 20m
        Speaker: John Peters (UW-Madison)
      • 12:10 PM
        Utilizing HTCondor, Pelican, and DAGMan workflows for high-throughput phenotyping in dairy cattle (Remote Presentation) 20m
        Speaker: Ariana Negreiro (University of Wisconsin-Madison)
    • 11:00 AM 12:30 PM
      Joint CMS and US-ATLAS Session: WLCG (Mini) Data Challenges Room 201 (Fluno Center)

      We have a common notes and action items document at:
      https://docs.google.com/document/d/1y3V4HKGQ8EMUTxH9_MDCqegtA8sGeNtu6UmFk2fRyLQ/edit?usp=sharing

      Conveners: Andrew Melo (Vanderbilt University), Shawn McKee (University of Michigan)
      • 11:00 AM
        Capacity Challenges 15m

        Discuss tools and options for the next capacity challenge.

        Speaker: Hironori Ito (Brookhaven National Laboratory)
      • 11:15 AM
        Capability Challenge: SENSE/Rucio 15m

        Discuss plans for SENSE/Rucio testing for USATLAS/USCMS

        Speakers: Diego Davila (UCSD), Justas Balcas (ESnet)
      • 11:30 AM
        Capability Challenge: Netbird/Wireguard 15m
        Speaker: Lincoln Bryant (University of Chicago)
      • 11:45 AM
        Discussion & Planning 45m

        Discuss capacity and capability challenges. Which are we interested in pursuing? Who will participate? When should we schedule them?

        Speakers: Andrew Melo (Vanderbilt University), Shawn McKee (University of Michigan)
    • 12:30 PM 1:30 PM
      Lunch 1h Oros Executive Dining Room (Fluno Center)

    • 1:30 PM 2:45 PM
      Joint CMS and US-ATLAS Session: AI/ML, Tools and Operations Room 201 (Fluno Center)

      We have a common notes and action items document at:
      https://docs.google.com/document/d/1y3V4HKGQ8EMUTxH9_MDCqegtA8sGeNtu6UmFk2fRyLQ/edit?usp=sharing

      Conveners: Andrew Melo (Vanderbilt University), Shawn McKee (University of Michigan)
      • 1:30 PM
        HEPCDN: Exploring NGINX for Content Distribution 15m
        Speaker: Andrew Melo (Vanderbilt University)
      • 1:45 PM
        CANCELED: Data Streaming as a Service 15m

        Presentation/discussion canceled.

        Speaker: Alexei Klimentov (Brookhaven National Lab)
      • 2:00 PM
        AI/ML Introduction and Discussion 45m

        Quick overview of AI/ML in WLCG so far
        Needs for AI/ML for Infrastructure AND Infrastructure for AI/ML
        Funding opportunities
        Next steps: areas of common interest/effort?

        Speaker: Ilija Vukotic (University of Chicago)
    • 1:30 PM 2:45 PM
      Tutorial: Mastering Debugging (for admins) Howard Auditorium

      • 1:30 PM
        Tracking HTCondor Uptime 20m

        While the DaemonStartTime and MonitorSelfAge attributes of HTCondor daemons provide a slice of insight as to the uptime and availability of the service, they're not well-suited for tracking longer-term up/down-time stats over the course of days, weeks, or months.

        One illustration of this limitation is that if a malfunctioning node or service restarts every five minutes, the values are reset to zero each time and there's no accumulation of the total uptime across the restarts.

        Longer-time-period uptime statistics are essential for contractual Service Level Agreement (SLA) management, and are an important aspect of monitoring the overall health of large HTCondor pools.

        Using straightforward scripting, the start daemon's Cron system, and an external file to store an uptime-centered ClassAd, long-term statistics can be maintained and easily queried.

        Speaker: Michael Pelletier
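
        A minimal sketch of the accumulation described above, assuming a Python script invoked periodically by the startd's cron mechanism (STARTD_CRON_JOBLIST plus the matching EXECUTABLE/PERIOD knobs); the attribute names and state-file path are illustrative.

        import json
        import os
        import time

        # Persistent state survives daemon restarts, unlike DaemonStartTime.
        STATE_FILE = "/var/lib/condor/uptime_ad.json"
        PERIOD = 300  # seconds between startd cron invocations

        now = int(time.time())
        state = {"AccumulatedUptime": 0, "LastSample": now}
        if os.path.exists(STATE_FILE):
            with open(STATE_FILE) as handle:
                state = json.load(handle)

        gap = now - state["LastSample"]
        # Credit uptime only for gaps close to the cron period; a larger gap means
        # the startd (or the node) was down and should not be counted.
        if 0 < gap <= 2 * PERIOD:
            state["AccumulatedUptime"] += gap
        state["LastSample"] = now

        with open(STATE_FILE, "w") as handle:
            json.dump(state, handle)

        # startd cron merges attribute assignments printed to stdout into the machine ad,
        # where they can be queried with condor_status.
        print(f"AccumulatedUptime = {state['AccumulatedUptime']}")
        print(f"UptimeLastUpdate = {now}")
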
      • 1:55 PM
        Rolling HTCondor Upgrades Without Rolling Over 20m
        Speaker: Tim Theisen (UW-Madison CHTC)
      • 2:20 PM
        Tighter HTCondor and Kubernetes interplay for better glideins 20m

        HTCondor is the leading system for building a dynamic overlay batch scheduling system, by means of glideins, on resources managed by any underlying scheduler. One fundamental property of these setups is the use of late binding of containerized user workloads: from a resource provider's point of view, a compute resource is claimed before the user container image is selected. Kubernetes allows for both multi-container requests and dynamic updates to the container image being used. In this talk, we show how HTCondor can exploit these features to increase both the effectiveness and the security of glideins running on top of Kubernetes-managed resources.

        Speakers: Igor Sfiligoi (University of California San Diego), Jaime Frey (Center for High-Throughput Computing)
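
        A minimal sketch of the late-binding step described above, assuming the official kubernetes Python client and a hypothetical glidein pod with a placeholder "payload" container whose image is re-pointed once the user workload is known.

        from kubernetes import client, config

        # Load credentials from the local kubeconfig (use load_incluster_config() inside a pod).
        config.load_kube_config()
        core = client.CoreV1Api()

        # Hypothetical names for a running glidein pod and its placeholder container.
        namespace = "osg-glideins"
        pod_name = "glidein-abc123"
        user_image = "docker.io/example/user-workload:latest"

        # Kubernetes allows in-place updates of a running pod's container image,
        # so the already-claimed resource can be re-pointed at the user's container.
        patch = {"spec": {"containers": [{"name": "payload", "image": user_image}]}}
        core.patch_namespaced_pod(name=pod_name, namespace=namespace, body=patch)
        print(f"re-pointed {pod_name} payload container to {user_image}")
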
    • 2:45 PM 3:05 PM
      Break 20m Fluno Atrium

    • 3:05 PM 3:55 PM
      HTCondor: New Things You Should Know Howard Auditorium

    • 3:05 PM 3:55 PM
      Joint CMS and US-ATLAS Session: Analysis Facilities (AFs) Room 201 (Fluno Center)

      We have a common notes and action items document at:
      https://docs.google.com/document/d/1y3V4HKGQ8EMUTxH9_MDCqegtA8sGeNtu6UmFk2fRyLQ/edit?usp=sharing

      Conveners: Hironori Ito (Brookhaven National Laboratory), Lincoln Bryant (University of Chicago)
      • 3:05 PM
        Introduction and Overview for AFs in WLCG 15m
        Speaker: Lincoln Bryant (University of Chicago)
      • 3:20 PM
        AF Discussion 35m

        Shared development and prototyping for AFs?
        Storage technologies to support AFs (Carlos Gamboa?, 10 minutes)
        Joint AFs: can we “share” AFs between experiments, e.g., Belle II and ATLAS, or DUNE and CMS? (Hiro/Ofer/Lincoln?)
        Can we agree on a minimum baseline for AFs?
        Standard “login”

        Speakers: Hironori Ito (Brookhaven National Laboratory), Lincoln Bryant (University of Chicago)
    • 3:55 PM 4:10 PM
      Break 15m Fluno Atrium

    • 4:10 PM 4:55 PM
      Tools for Production Pools Howard Auditorium

      • 4:10 PM
        From backend to interactive based on HTCondor 20m

        New requirements from HEP data analysis include limited access to login nodes, resources requested by the experiments' programs rather than by the login nodes, and efficient data access for collaborative workflows. We have developed the Interactive aNalysis worKbench (INK), a web-based platform built on top of an HTCondor cluster. INK transforms traditional batch-processing resources into a user-friendly, web-accessible interface, enabling researchers to leverage cluster computing and storage resources directly from their browsers.
        A loosely coupled architecture with token-based authentication ensures security, while fine-grained permission management allows customizable access for users and experimental applications. Universal public interfaces abstract away the heterogeneity of the underlying resources. Since the first version was released in March 2025, user feedback has been highly positive.

        Speaker: Jingyan Shi (INSTITUTE OF HIGH ENERGY PHYSICS, Chinese Academy of Science)
      • 4:35 PM
        Operating a Federated HTCondor Infrastructure: Monitoring and Management for CMS Computing 20m

        The Compact Muon Solenoid (CMS) experiment at CERN generates and processes vast volumes of data requiring significant computing capacity. To meet these demands, CMS has adopted a federated throughput computing model distributed across a global infrastructure based on HTCondor, the CMS Submission Infrastructure. Seamless integration of heterogeneous resources from multiple sites allows for operating a unified, virtualized pool. This infrastructure currently provides access to over 500,000 CPU cores, enabling CMS to efficiently execute a wide variety of data processing and simulation workloads.

        This federation, however, comes with substantial operational challenges, notably, the need for robust and scalable monitoring. To ensure reliability, performance, and rapid diagnosis of issues, we have developed a comprehensive monitoring ecosystem that spans job execution, resource availability, and system health across the entire pool. This talk will present the architecture of the CMS federated compute infrastructure, detail the role of HTCondor in enabling global workload distribution, and highlight recent developments in monitoring that are critical to operating such a large-scale system effectively.

        Speaker: Bruno Coimbra (Fermilab)
    • 4:55 PM 5:05 PM
      Closing Remarks Howard Auditorium

    • 6:00 PM 7:30 PM
      Sunset Kayak or Paddleboard 1h 30m Depart From Fluno Lobby

    • 8:45 PM 10:45 PM
      Karaoke Night 2h Depart from Fluno Lobby

    • 8:00 AM 9:00 AM
      Coffee, Pastries and Registration 1h Fluno Atrium

    • 9:00 AM 10:25 AM
      Throughput Computing Science Impacts Howard Auditorium

      • 9:00 AM
        High-resolution Imaging of the Multi-phase Interstellar Medium with CHTC 20m
        Speaker: Nickolas Pingel (University of Wisconsin-Madison)
      • 9:25 AM
        Processing Mouse Brain Data on CHTC using Research Drive Integration 15m
        Speaker: Aydan Bailey (UW-Madison)
      • 9:45 AM
        Visual proteomics, powered by CHTC 15m
        Speaker: Raison D'Souza (University of Wisconsin-Madison)
      • 10:05 AM
        High Throughput Computing for Comparative Genomics on Large Public Datasets 15m
        Speaker: Conor Bendett (University of Wisconsin-Madison)
    • 10:25 AM 10:45 AM
      Break 20m Fluno Atrium

    • 10:45 AM 12:30 PM
      Managed and Staging Objects: Caches and Execution Points Howard Auditorium

      • 10:45 AM
        Files Common Across Jobs and How To Transfer Them 20m
        Speaker: Todd Miller (CHTC)
      • 11:10 AM
        condor_who (are you) 15m
        Speaker: John Knoeller (University of Wisconsin, Madison)
      • 11:30 AM
        Kingfisher: Toward Explicit Space Management 20m
        Speaker: Justin Hiemstra (Morgridge Institute for Research)
      • 11:55 AM
        Using the National Data Platform Endpoint to Improve Access to Science Data 20m

        I describe the process of deploying the National Data Platform Endpoint (formerly Point of Presence / POP) on local infrastructure to provide a data streaming service for a published science dataset where the data origin is located in Hawaii. From the perspective of a software engineer I will cover the process of deploying the endpoint into a Kubernetes cluster or using Docker Compose. I will explain how I integrated access to local data with the endpoint and describe some useful lessons that were learned. Finally, from the perspective of a science user, I will demonstrate how to use the deployed endpoint to preprocess data and stream the results to a Jupyter notebook to visualize the data.

        Speaker: Curt Dodds (Institute for Astronomy, University of Hawaii)
      • 12:20 PM
        Closing Remarks 10m
        Speaker: Miron Livny (UW-Center for High Throughput Computing)
    • 12:30 PM 1:30 PM
      Lunch 1h Oros Executive Dining Room (Fluno Center)
