@patrikp:matrix.org
===================

2025-12-16 08:09:00
- F41 end-of-life tasks, which are now done.
- Updating the SOP for it.
- Meetings.

2025-12-10 15:01:00
- Housekeeping. Trying to get the number of open tracker issues down to a more reasonable level before the holidays.
- Meetings.
- Expense reimbursement.


@gwmngilfen:fedora.im
=====================

2025-12-16 09:13:00
- dentist visit
- small infra tickets (2x spam, fas groups)
- investigated ipa01, gave up and let Michal do it
- started looking at the rdu-cc hardware leftover list

2025-12-15 09:40:00
- holiday admin
- zabbix announcements
- looking through tickets
- catching up on lists/discourse

2025-12-12 10:37:00
- Wed:
  - admin, coding club, travel plans
  - got the ssl certs checked in Zabbix
- Yesterday:
  - Spent most of the day on Zabbix http checks
    - these are easy enough individually (the zabbix agent has a default item for getting a regex from a page)
    - however we have many, and they are scattered through the infra - and then manually collated in the nagios role
    - as such, it's not easy to determine which **application roles** need this adding to them
    - for now, I have done an example for the ipa http-internal check in the ipa/server role, and a bunch of handcrafted external checks in the proxy role
  - also had the infra meeting, where we agreed that [phase 1 of Zabbix is ready to go](https://meetbot.fedoraproject.org/meeting-3_matrix_fedoraproject-org/2025-12-11/infrastructure.2025-12-11-17.00.html) 🎉
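
For readers unfamiliar with this kind of http check, here is a minimal Python sketch of the idea: fetch a page and test its body against a regex. It is not the Zabbix agent item itself, and the `CHECKS` table, the check name, and the URL/pattern pair are hypothetical placeholders rather than anything from the actual roles.

```python
import re
import urllib.request


def body_matches(body: str, pattern: str) -> bool:
    """Return True if the regex is found anywhere in the page body."""
    return re.search(pattern, body) is not None


def check_url(url: str, pattern: str, timeout: float = 10.0) -> bool:
    """Fetch url and report whether the response body matches pattern."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read().decode("utf-8", errors="replace")
    return body_matches(body, pattern)


# Hypothetical check list; in practice each application role would
# carry its own URL and expected pattern.
CHECKS = {
    "example-app": ("https://example.org/", r"Example Domain"),
}

if __name__ == "__main__":
    for name, (url, pattern) in CHECKS.items():
        print(name, "OK" if check_url(url, pattern) else "FAIL")
```

Keeping the URL and pattern next to the application they belong to, rather than in one central list, is the property the per-role approach above is after.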

2025-12-10 09:15:00
- datanommer checks written and deployed, and added to Ansible
- initial research into cert checks
  - looks like we mainly have two types - things we can check on disk (openssl x509 ...) via an agent, and things we have to check via a connection (urls)
- 1:1 with kevin and family stuff
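
The two types of cert check mentioned above can be sketched roughly as follows. This is an illustrative Python sketch, not the deployed Zabbix checks: the function names are made up here, and it assumes the `openssl` CLI is on the path for the on-disk case.

```python
import socket
import ssl
import subprocess
import time
from typing import Optional


def days_left(not_after: str, now: Optional[float] = None) -> float:
    """Days until a notAfter time such as 'Jan  5 09:34:43 2018 GMT'."""
    if now is None:
        now = time.time()
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


def parse_enddate(line: str) -> str:
    """Extract the date from `openssl x509 -enddate` output ('notAfter=...')."""
    return line.strip().split("=", 1)[1]


def disk_cert_days_left(path: str) -> float:
    """Type 1: a certificate file on disk, read with the openssl CLI."""
    out = subprocess.run(
        ["openssl", "x509", "-noout", "-enddate", "-in", path],
        check=True, capture_output=True, text=True,
    ).stdout
    return days_left(parse_enddate(out))


def remote_cert_days_left(host: str, port: int = 443) -> float:
    """Type 2: the certificate a server presents over a TLS connection.

    The default context verifies the chain, so an already expired
    certificate raises an error here instead of returning a negative value.
    """
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return days_left(tls.getpeercert()["notAfter"])
```

Both paths end in the same `days_left` calculation, so one alert threshold could cover both types.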

2025-12-09 09:40:00
- 3 hours debugging issues with old data-analysis code for an Ansible colleague 😕 
- took rdu2 proxies out of DNS
- shutdown servers in RDU2
- deployed change to copr hosts to fix external access to zabbix
  - we can use this for other external hosts too


@zlopez:fedora.im
=================

2025-12-16 10:03:00
- Unstuck signing queue - I did that multiple times during the day
- Help Greg investigate NRPE alert for ipa01 - it was caused by some problems with LDAP, fixed by re-initialization of the DB from ipa02

2025-12-15 10:27:00
- Move the infra-docs to forge.fedoraproject.org ([ticket](https://pagure.io/fedora-infrastructure/issue/12854))
- Resolve the ticketkey alerts on proxies - [PR](https://pagure.io/fedora-infra/ansible/pull-request/3001)

2025-12-12 08:46:00
- Move infra-docs-fpo to forge.stg.fedoraproject.org ([ticket](https://pagure.io/fedora-infrastructure/issue/12854))
- Resolve buildvm-ppc64le-19.rdu3.fedoraproject.org read-only filesystem
- Resolve the big FMN queue - Aurelien knew what was going on, there was some bad library update

2025-12-11 08:37:00
- Investigate the bodhi staging messaging queue ([ticket](https://pagure.io/fedora-infrastructure/issue/12932)) - the queue is finally being consumed
- Update owners of blockerbugs and testdays apps ([PR](https://pagure.io/fedora-infra/ansible/pull-request/2998))
- Deal with spam on fedora users list ([ticket](https://pagure.io/fedora-infrastructure/issue/12973))
- Process PDR request
- Investigate the growing FMN queue

2025-12-10 08:41:00
- I&R weekly report
- Investigate the bodhi staging messaging queue ([ticket](https://pagure.io/fedora-infrastructure/issue/12932)) - trying strace now
- Process PDR request
- release-monitoring.org: Review Anitya [PR](https://github.com/fedora-infra/anitya/pull/1981)

2025-12-09 12:26:00
- Helping Greg with rdu-cc migration
- Dentist appointment in the morning
- Investigate the bodhi staging messaging queue ([ticket](https://pagure.io/fedora-infrastructure/issue/12932))


@t0xic0der:fedora.im
====================


@james:fedora.im
================


@arrfab:fedora.im
=================

2025-12-12 12:34:00
- All machines that moved from rdu2 to rdu3 are now reconfigured https://gitlab.com/CentOS/infra/tracker/-/issues/1822
  - traffic for the mirror network main distribution point is now redirected to rdu3
  - all hardware is up to date for firmware and alerting settings
  - machines are all reinstalled and added back into the pool
- Also had to do emergency work on some sponsored infra (hosting www.centos.org); a ticket is open with the sponsor (will work on that today)
- Modified restic role for new options (needed)

2025-12-11 09:11:00
- Found the issue for https://gitlab.com/CentOS/infra/tracker/-/issues/1820, so implemented a workaround and created a task to enhance the plugin (for a future sprint).
- RDU3 dc move : https://gitlab.com/CentOS/infra/tracker/-/issues/1822
  - init all hardware with new ipmi/oob mgmt ip addresses
  - reconfigure OS+ansible for backup machine (priority one), and modified restic/backup role for it (applied but verifying after)
  - setup basic LACP bond interfaces on migrated machines to just confirm network stack is working and machines are available

2025-12-10 12:33:00
- worked on DC move 
- Investigated the cbs.centos.org stuck task and identified plugin issue (to be rewritten)
- unblocked some stuck tasks, using a workaround for now

2025-12-09 10:02:00
- still investigating https://gitlab.com/CentOS/infra/tracker/-/issues/1820
- Prepared all the moving hardware for rdu3 (https://gitlab.com/CentOS/infra/tracker/-/issues/1815)
- setup vpn ci tunnel (https://gitlab.com/CentOS/infra/tracker/-/issues/1816)


@nirik:matrix.scrye.com
=======================

2025-12-12 17:55:00
- Fixed some issues on retrace03.
- Fedora infrastructure meeting - 2025-12-11 9am
- Increased max workers on proxies to see if it fixes https://pagure.io/fedora-infrastructure/issue/12975
- Put staging docs behind anubis to test, will roll to prod soon.
- Added all the rdu3 acls to internal requests
- vmhost-x86-iso02 configure and reinstall, but no 10G link. Waiting on dc folks
- vmhost-x86-iso04 configure and reinstall, one 10G link, worked with it to fix.
- Installed download-iso01, smtp-auth-iso01.
- redeployed proxy14.
- Ended rdu2-cc -> rdu3 outage, updated status and tickets.
- Cleanup/fixing nagios
- docs: fedocal updates ( https://pagure.io/infra-docs-fpo/pull-request/454 )

2025-12-11 19:03:00
- risc-v internal meeting - 2025-12-03 8:30am
- Working with datacenter folks on rdu2-cc machines in rdu3
- Upgraded all the firmware in retrace03 to try and get 10G nics working
- docs: Updated a bunch of infra docs tickets, closed a number.
- Infra daily meeting - 2025-12-10 noon
- Got retrace reconfigured and mostly ansiblized and back online
- Met up with local Red Hat/IBM/MS folks at a social gathering.

2025-12-10 19:14:00
- Global Engineering meeting - 2025-12-09 7am
- Fesco meeting - 2025-12-09 9am
- Filed schedule ticket about mass rebuild being wrong ( https://pagure.io/fedora-pgm/schedule/issue/219 )
- Fixed some mgmt interfaces that had wrong domain
- scrapers hitting pagure.io. Attempted to adjust.
- Fixed some dhcp overlaps with the rdu2-cc mgmt interfaces
- 1x1 with Greg - 2025-12-09 1pm
- Looked at buildvm-ppc64le-08.rdu3.fedoraproject.org, power cycled it, then reinstalled it.
- Looked at what mgmt I could get to for the rdu2 moved servers. 4/11 so far.
- docs: minor updates to departing admin sop ( https://pagure.io/infra-docs-fpo/pull-request/449 )
- Upgraded proxy110 to f43.


