CIT systems support patches the OS on Red Hat systems on a six-month schedule, with updates around June/July and around December/January each year. Each round means an outage for arXiv.org, which can be kept to around 30 minutes with appropriate care.

The best weekday times for upgrades are either early in the morning (at the expense of European users) or between the freeze and mailing times (start at 6pm, finish before 8pm). Weekdays from 8pm onward are unsuitable because the mailing process starts at 8pm and runs for several hours.

Because of dependencies between machines, it is best to do all machines at once.

Preparation

  1. disable fsck for all large partitions (>=100GB) so that rebooting is not slowed by long fsck runs, which might take hours for the cache/ftp partitions (DeanX note 2008-12-16: do this both in fstab and via tune2fs)
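
A minimal sketch of the preparation step, assuming the large partitions are ext2/ext3 (which the tune2fs note suggests). The helper name list_fsck_enabled and the device /dev/sdX1 are placeholders and do not come from this page. The function lists fstab entries whose sixth field (fs_passno) is still non-zero; the comments show the tune2fs options that switch off the mount-count and time-based checks.

```shell
# list_fsck_enabled [FSTAB]
# Print "mountpoint passno" for every fstab entry whose sixth field
# (fs_passno) is non-zero, i.e. still checked by fsck at boot.
# FSTAB defaults to /etc/fstab; comment and short lines are skipped.
list_fsck_enabled() {
    awk '$1 !~ /^#/ && NF >= 6 && $6 != 0 { print $2, $6 }' "${1:-/etc/fstab}"
}

# For each large partition reported above (device name is a placeholder):
#   sudo tune2fs -c 0 -i 0 /dev/sdX1   # no mount-count or interval checks
#   sudo tune2fs -l /dev/sdX1 | grep -i -e 'mount count' -e 'check interval'
# and set the sixth field of its /etc/fstab line to 0.
```

The listing will normally include the root filesystem as well; only the large partitions (>=100GB) need to be changed.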

Patching

  1. put arXiv.org into "no-write" mode so that all database functions are disabled and the site comes back up in that mode after the reboot (arxiv1: sudo /users/e-prints/bin/rl no-write; the script may refuse while a submission is being processed, so retry every 10s until it succeeds)
  2. shut down webserver on arxiv2 (sudo /etc/init.d/httpd stop)
  3. shut down webserver on arxiv1 (sudo /users/e-prints/bin/rl stop)
  4. patch arxiv1, arxiv2, arxiv3
  5. best reboot order: arxiv3 and arxiv1, then arxiv2
  6. check that the NFS mounts from arxiv1 are present on arxiv2: /data/ftp, /data/orig, /home
  7. check web server on arxiv2: http://export.arxiv.org/
  8. check web server on arxiv1: http://arxiv.org/
  9. put arXiv.org back in normal mode with DB access (arxiv1: sudo /users/e-prints/bin/rl up)
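
The retry in step 1 and the mount check in step 6 can be sketched with two small helpers. This is an illustration only: the function names retry and is_mounted, and the limit of 30 attempts, are assumptions and not part of the existing procedure. The rl command and the mount points in the comments are the ones listed in the steps above.

```shell
# retry MAX DELAY CMD [ARGS...]
# Run CMD until it succeeds, at most MAX times, sleeping DELAY seconds
# between attempts. Returns 0 on success, 1 if every attempt failed.
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max failed: $*" >&2
        attempt=$((attempt + 1))
        if [ "$attempt" -le "$max" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# is_mounted DIR [MOUNTS_FILE]
# Succeed if DIR is listed as a mount point in MOUNTS_FILE
# (defaults to /proc/mounts).
is_mounted() {
    awk -v dir="$1" '$2 == dir { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Step 1, on arxiv1 (retry every 10s, give up after 30 tries):
#   retry 30 10 sudo /users/e-prints/bin/rl no-write
# Step 6, on arxiv2:
#   for d in /data/ftp /data/orig /home; do
#       is_mounted "$d" || echo "NOT MOUNTED: $d"
#   done
# Steps 7 and 8, from any host with curl:
#   curl -fsS -o /dev/null http://export.arxiv.org/ && echo "arxiv2 web OK"
#   curl -fsS -o /dev/null http://arxiv.org/ && echo "arxiv1 web OK"
```

Capping the number of attempts, rather than looping forever, keeps a stuck submission from silently eating the maintenance window; if retry returns non-zero, investigate by hand before continuing.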