Minutes from Wheel Meeting held 23rd of May 2020

Wheel Meeting Agenda - Saturday 2020-05-23 14:00

Meeting opened 14:08

Attendance

Present

  • [BOB]
  • [NTU]
  • [333]
  • [THA]
  • [MPT]
  • [TPG]
  • [CFE]
  • [MTL]
  • Wings (Guest)

Late

  • [TEC], not too late though. Just 30 minutes

Absent

  • Everyone else

Schedule next meeting

  • Schedule/delegate reminders of next meeting
  • Curate agenda.next
    • Much bikeshedding about how to automate meeting reminders

ACTION:
- [TPG] commits to hacking something up

ACTION:
- [TPG] to look after next meeting reminder

Standing items

Visibly reinduct members, new (and old?), with the "Wheel Group Ethical Guidelines"

  • Examining an Ethical Guideline, e.g. asking:
  • What's an example situation?
  • [NTU]: Forensics scenario? Disc space management?
  • [BOB]: Coming across user images in a set of recovered webcam images (from recovered filesystem)
  • Discussed "forgetting" private information that has been accessed

Status check: Regular updates, monitoring

  • e.g. Debian oldstable 9 "stretch" -> Debian stable 10 "buster"
  • discord-irc.ucc.asn.au could use a rebuild

ACTION:
- [333] to do in-place upgrade
- murasoi

ACTION:
- [MPT] to do dist-upgrade

  • Pay attention to firewalling (iptables vs nftables) and logging
  • molmol
  • [NTU] was hoping for benchmarking before and after, will help
  • No immediate volunteers

Mission Control (ocsinventory, uccmonitor)

  • See https://wiki.ucc.asn.au/MissionControl for an overview of ocsinventory and uccmonitor
  • Add molmol and/or get an NFS latency-under-load benchmark
  • [DAA]: thought I'd asked before but can someone take me out of the root crontab MAILTO on murasoi
  • Getting errors from the script that updates the rancid backup so perhaps it is directly defined there, I forget
  • murasoi:/etc/cron.hourly/11rancid errors if mussel is down
  • lard is responding to pings from murasoi but rancid cannot connect
  • abe is not responding to pings and rancid is also complaining
  • Who's watching hostperson?
  • Would other uccmonitor notifications work better? as well?
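
One way to quieten the hourly rancid mail when mussel is down is to skip the job if the host does not answer. The following is a minimal sketch, not the contents of 11rancid: the `run_if_up` helper, the `PROBE` override and the ping flags (Linux iputils) are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: skip a cron job quietly when a host it depends on is unreachable.
# PROBE is a hypothetical override hook so the logic can be tried without a network.
PROBE=${PROBE:-probe_ping}

# One ICMP echo with a 2 second timeout (Linux iputils ping flags; an assumption).
probe_ping() {
    ping -c 1 -W 2 "$1" >/dev/null 2>&1
}

# run_if_up HOST COMMAND...: run COMMAND only if HOST answers the probe.
# Returns 0 with no output when the host is down, so cron has nothing to mail.
run_if_up() {
    host=$1
    shift
    if "$PROBE" "$host"; then
        "$@"
    else
        return 0
    fi
}

# Example (the command line is illustrative only):
# run_if_up mussel rancid-run
```

A ping probe only covers the "mussel is down" case. It would not help with lard, which answers pings while rancid still cannot connect; that case would need a check against the actual service port.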

Active Directory logins

  • [MTL]: has been playing with the UCC grafana setup to add Active Directory logins.
  • UCC's AD doesn't return a mail attribute for LDAP queries.
  • [MTL] is going to investigate an alternative
  • Will set up an SSO system for UCC which can give the right information to grafana on signup
  • [MTL] will also set up a healthchecks.io UCC install on his VM
  • Once this is working, will look at migrating to UCC

Status check: Backups

  • [NTU] We should have done that offsite file-restore demo
  • Discussion about backup solutions
  • zzdailybackup going to ???somewhere??? in the ether
  • rsync.net
  • backblaze
  • [NTU] can we get a small monthly budget from committee for this sort of thing?
    • An ongoing budget for offsite cloud services of the order of $20 per month
  • Full disks
  • here comes murasoi:/var - live lvextend(8) demo?
  • Dead disks
  • As previously mentioned, mollitz can only take 2TB disks, so there's no capacity for expansion
  • Encourage members to undertake housekeeping of their home directories
  • Consensus that wheel wants the club to purchase 2x 7200RPM 12TB disks, with an approximate budget of $650 each
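
For the "Full disks" item, a small check like the one below could flag filesystems such as murasoi:/var before they fill. It is a sketch under assumptions: the `over_threshold` name and the 90% limit are illustrative, and it parses `df -P` output, so mount points containing spaces are not handled.

```shell
#!/bin/sh
# over_threshold PCT: read `df -P` output on stdin and print
# "mountpoint capacity" for every filesystem at or above PCT percent used.
over_threshold() {
    awk -v limit="$1" 'NR > 1 {
        use = $5               # Capacity column, e.g. "92%"
        sub(/%/, "", use)      # strip the percent sign
        if (use + 0 >= limit + 0) print $6, $5
    }'
}

# Usage:
# df -P | over_threshold 90
#
# Growing an LVM-backed filesystem online, as in the lvextend(8) demo idea
# (volume group and logical volume names here are hypothetical):
# lvextend -r -L +5G /dev/vg0/var
```

The `-r` flag asks lvextend to resize the filesystem along with the logical volume; whether that suits murasoi's layout would need checking on the host.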

Status check: Password/Key rotations

Expiring wheel keys in UCCPass

- [MPT] Editing files keeps failing when keys expire
- Fixing:
    - `cd /home/wheel/bin/uccpass/keys/wheel`
    - `gpg2 --fingerprint | grep -C2 <first 4 of fingerprint of failing key>`
    - `mv <failing>.gpg <failing>.gpg.expired`
    - `uccpass reload`
  • Keys of long-standing wheel members are starting to expire in quick succession
    - Regen keys as required
    - gpg --gen-key
    - gpg --export -a "John Hodge (UCC Wheel Group)" > uccpass.pub
    - cp uccpass.pub /home/wheel/bin/uccpass/keys/wheel/tpg.gpg
    - uccpass reload
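
To catch expiring keys before edits start failing, the expiry dates can be read from GnuPG's machine-readable listing. This is a sketch only: `list_expiring` is a hypothetical helper, it assumes the wheel keys are in the keyring that `gpg2` reads by default, and it assumes epoch-second timestamps as emitted by GnuPG 2.x.

```shell
#!/bin/sh
# list_expiring NOW_EPOCH WINDOW_DAYS: read `gpg --with-colons --list-keys`
# output on stdin and print "KEYID DAYS_LEFT" for each primary key whose
# expiry falls within WINDOW_DAYS of NOW_EPOCH (already expired keys included).
# In the colon format, field 5 is the key ID and field 7 the expiry time;
# field 7 is empty for keys that never expire.
list_expiring() {
    awk -F: -v now="$1" -v days="$2" '
        $1 == "pub" && $7 != "" {
            remaining = $7 - now
            if (remaining < days * 86400)
                print $5, int(remaining / 86400)
        }'
}

# Usage: keys expiring in the next 30 days
# gpg2 --with-colons --list-keys | list_expiring "$(date +%s)" 30
```

Run from cron, something like this could give a month's warning ahead of the regen steps above.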

...then New wheel members, additions, nominations

  • Welcome to wheel!
  • Read /home/wheel/docs/WelcomeToWheel

New Matters

Power Outage

  • Short power outage Thursday 2020-05-21T23:46
  • Team effort on remote powerup
  • 4G link came back, 5 minutes sooner than the fibre uplink?
  • mooneye didn't fully boot
    - Responded to pings, but connections refused on services (ports 22, 25, ...)
    - Long fsck? then failed to mount a filesystem?
    - Serial console not available?
    - Power cycled with ipmitool -> success!
    - Later: systemctl restart mailman
  • motsugo, molmol NFS servers not exporting properly - accidental DNS dependence?
  • Cluster hosts running, but most VMs, containers down: no nas-vmstore from molmol
    - Could not login to WebUI because of...
    - Samson AD server VM running, but needed: systemctl restart samba-ad-dc.service
    - Some VMs not set to "autostart"
    - Some VMs set for "autostart" needed manual start: needed nas-vmstore?
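
Since the ipmitool power cycle was what recovered mooneye, a thin wrapper could make the remote steps easier to repeat. The sketch below makes several assumptions: the BMC speaks IPMI 2.0 (`lanplus`), the username is in `IPMI_USER`, the password is in `IPMI_PASSWORD` (read by ipmitool's `-E` flag), and `mooneye-ipmi` is a hypothetical BMC hostname.

```shell
#!/bin/sh
# ipmi HOST ARGS...: run ipmitool against a BMC over the network.
# Set DRY_RUN=1 to print the command instead of running it.
ipmi() {
    host=$1
    shift
    ${DRY_RUN:+echo} ipmitool -I lanplus -H "$host" -U "$IPMI_USER" -E "$@"
}

# Examples (hostname is hypothetical):
# ipmi mooneye-ipmi chassis power status
# ipmi mooneye-ipmi chassis power cycle
# ipmi mooneye-ipmi sol activate    # serial console, once SoL is configured
```

With `DRY_RUN=1` the wrapper only echoes the command line, which is a cheap way to check it before pointing it at a real machine.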

Fixing UCC DNS

  • This will be an issue with the UWA network/firewall changes
  • The main issue is that mooneye didn't boot cleanly
  • Need to break out DNS between authoritative and resolvers
  • Need to go through UCC hosts to ensure they are all configured similarly, and also to use our local resolvers
  • Need to tease out things that use ucc.gu.uwa.edu.au and move things away to ucc.asn.au; use ucc.asn.au in configs
  • Avoid split horizon
  • Move to stuff we control
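
As a starting point for teasing out uses of ucc.gu.uwa.edu.au, a recursive search over each host's configuration could list the files to review. This is a sketch: `find_old_domain` is a hypothetical helper, and searching `/etc` is an assumption about where the references live.

```shell
#!/bin/sh
OLD_DOMAIN='ucc.gu.uwa.edu.au'

# find_old_domain DIR...: list files under the given directories that
# mention the old domain. -F treats the domain as a fixed string, so the
# dots are not regex wildcards; unreadable files are skipped silently.
find_old_domain() {
    grep -rlF -- "$OLD_DOMAIN" "$@" 2>/dev/null | sort
}

# Usage:
# find_old_domain /etc
```

The output is only a list of candidates; each hit still needs a human decision about whether ucc.asn.au is a drop-in replacement.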

ACTION:
- [333] to look into getting SoL (Serial over LAN) working for Mooneye's IPMI

ACTION:
- [MPT] and [333] to integrate new Magikarp/Mudkip disks into ceph and upgrade ceph
- https://pve.proxmox.com/wiki/Ceph_Luminous_to_Nautilus

Matters arising previously

Network Changes

  • Update on network nightmares with UCC/UWA DNS
  • [MPT] has an email saying the change is being delayed until June

Network Security

  • Current security of remote management was brought up in 2020-04-18_wheel.txt
  • Still need to decide whether we need to:
  • Lock down the 192.168.2.0 subnet further (i.e. no access from motsugo)
  • e.g. some management controllers are positively ancient, and are unpatched
  • Need to look into applying stricter firewalling between UCC's LAN + the management network
  • [333] moves to defer to next meeting

Annual account locking

  • If it has not happened by now, it's overdue
  • Are the rejoining member/password reset/new member account procedures going OK?
  • Pausing until clubroom is accessible?
  • What's the latest we can leave it and still get 2020 renewals in?
  • Also need to tidy deleted/obsolete accounts?
  • [BOB]: People will balk at paying new account fees early in 2021 if they are left too late in 2020
  • [MPT]: It's probably best if account locking happens after Semester 1 exams
  • Address next meeting

PREVIOUS ACTION: Purchase of 2x 1TB SSDs

  • [333][THA][TEC]: Buy 2x 1TB SSDs for magikarp+mudkip Ceph?
  • Passed by committee on 2019-10-04.txt
  • Done 2020-05-12 [NTU]
  • Installed 2020-05-22 [MPT]

mooneye, mailfish, UCC SOE

  • [MTL] gives update of status
  • [MTL] requests permission to move mail off mooneye
  • General consensus that's okay

4G Backup Link

  • [MPT]: 4G backup link maintenance
  • Signal issues, external antenna now hooked up
  • Comes back up automatically on power loss
  • Verified by fire on Saturday morning
  • Need to document and automate configuration (Ansible SOE)

Meeting closed 16:29


Current Action Items

Wheel Group Ethical Guidelines

  • Add example situations regarding ethics to the Guidelines
  • Or when inducting new/returning wheel members

[MPT] + [333]

  • To integrate new Magikarp/Mudkip disks into ceph and upgrade ceph

[MPT]

  • To do dist-upgrade

[333]

  • To do in-place upgrade (murasoi)
  • To look into getting SoL (Serial over LAN) up for Mooneye's IPMI

[TPG]

  • To look into setting up automated meeting reminders

[committee] + [wheel]

  • Emails about Account Locking to be sent by the end of the month
  • Should have a date for clubroom reopening by then
  • Account Locking to happen after Semester 1 exams