Jun 9 – 12, 2026
Fluno Center on the University of Wisconsin-Madison Campus
America/Chicago timezone

Session

Is AI for The Birds? Using AI tools to understand and debug HTCondor and Pelican

Jun 12, 2026, 11:00 AM
Howard Auditorium (Fluno Center on the University of Wisconsin-Madison Campus)

601 University Avenue, Madison, WI 53715-1035

  1. Khyathi Vagolu (UW-Madison)
    6/12/26, 11:00 AM

    When network disruptions or worker node failures occur, HTCondor relies on a static lease timeout, traditionally 40 minutes, before abandoning a job. This static window creates a costly trade-off: a window that is too long leaves machines idle after unrecoverable failures, while one that is too short kills jobs that could have reconnected successfully. Can we use AI to solve this?...

  2. Ilija Vukotic (University of Chicago)
    6/12/26, 11:25 AM
  3. Tom Smith (Brookhaven National Laboratory)
    6/12/26, 11:50 AM
  4. Ron Tapia (Penn State University)

    Discussion of metrics that Condor can provide about the performance of external services. Condor has a unique view of the performance of the services that it uses on behalf of jobs. Examples of external services include file transfer plugins and credmons. Individual failures are not very interesting to cluster administrators, but widespread failures affecting many jobs are. What sort of...
