Throughput Computing 2025

Name: Throughput Computing 2025
Start: 2025-06-02T06:30:00-05:00
End: 2025-06-06T23:15:00-05:00
Location: Fluno Center on the University of Wisconsin-Madison Campus

Jun 2 – 6, 2025

America/Chicago timezone

Questions about attending, speaking, accommodations, and other concerns

htc@path-cc.io

Tracking HTCondor Uptime

Jun 5, 2025, 1:30 PM

20m

Howard Auditorium (Fluno Center on the University of Wisconsin-Madison Campus)

601 University Avenue, Madison, WI 53715-1035

Tutorial: Mastering Debugging (for admins)

Michael Pelletier

While the DaemonStartTime and MonitorSelfAge attributes of HTCondor daemons give some insight into the uptime and availability of the service, they are not well suited to tracking up/down-time statistics over the course of days, weeks, or months.

One illustration of this limitation: if a malfunctioning node or service restarts every five minutes, both values start over with each restart (MonitorSelfAge returns to zero and DaemonStartTime moves to the new start time), so the total uptime across restarts is never accumulated.
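The effect can be seen in a toy model. The sketch below is not HTCondor code: `observe` is a hypothetical helper that assumes restarts are instantaneous and that a monitor samples the daemon's age on a fixed interval.

```python
def observe(restart_every, sample_every, duration):
    """Return (largest age ever sampled, seconds the daemon was actually up).

    Toy model, not HTCondor code: the daemon restarts instantly every
    `restart_every` seconds, so it is up for the entire `duration`.
    """
    max_age = 0
    for now in range(0, duration + 1, sample_every):
        age = now % restart_every  # seconds since the most recent start
        max_age = max(max_age, age)
    return max_age, duration


# One day, restarting every 5 minutes, sampled once a minute.
print(observe(300, 60, 86400))  # → (240, 86400)
```

Under these assumptions no sample ever reports an age above 240 seconds, even though the daemon was running for the full day, and nothing in the sampled values records how many restarts occurred.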

Long-term uptime statistics are essential for managing contractual Service Level Agreements (SLAs), and are an important aspect of monitoring the overall health of large HTCondor pools.

Using straightforward scripting, the start daemon's (startd's) Cron mechanism, and an external file that stores an uptime-centered ClassAd, long-term statistics can be maintained and easily queried.
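One way such an accumulator might look is sketched below. It is an illustration under stated assumptions, not the method presented in the talk: the function names, the attribute names (`UptimeTotalSeconds`, `UptimeRestartCount`, `UptimePercent`), and the use of a JSON state file in place of a stored ClassAd are all hypothetical. Time between the last sample of one daemon incarnation and the start of the next is counted as downtime.

```python
import json
import os
import tempfile


def update_uptime(state_path, daemon_start, now):
    """Fold one sample into the state file and return the updated state.

    daemon_start and now are Unix timestamps in seconds. The JSON layout
    is an assumption made for this sketch.
    """
    try:
        with open(state_path) as f:
            state = json.load(f)
    except FileNotFoundError:
        # First run: begin accounting at the current daemon's start time.
        state = {
            "first_seen": daemon_start,
            "last_seen": daemon_start,
            "last_start": daemon_start,
            "uptime": 0,
            "restarts": 0,
        }

    # A changed start time means the daemon restarted since the last sample.
    if daemon_start != state["last_start"]:
        state["restarts"] += 1
        state["last_start"] = daemon_start

    # Credit only the time the current incarnation was known to be running.
    credit_from = max(state["last_seen"], daemon_start)
    state["uptime"] += max(0, now - credit_from)
    state["last_seen"] = max(state["last_seen"], now)

    # Write to a temporary file and rename it into place, so an
    # interrupted run cannot leave a half-written state file.
    directory = os.path.dirname(os.path.abspath(state_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, state_path)
    return state


def classad_lines(state):
    """Format the state as 'Attribute = value' lines (hypothetical names)."""
    window = state["last_seen"] - state["first_seen"]
    percent = 100.0 * state["uptime"] / window if window > 0 else 100.0
    return [
        f"UptimeTotalSeconds = {state['uptime']}",
        f"UptimeRestartCount = {state['restarts']}",
        f"UptimePercent = {percent:.2f}",
    ]
```

A periodic job could call `update_uptime` with the daemon's DaemonStartTime and the current time, then print each line from `classad_lines` to standard output so the values can be picked up and queried alongside the daemon's other attributes. Because the running total lives in the external file, it survives restarts that reset the daemon's own counters.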

Tracking HTCondor Uptime TC25.pptx
