FogBugz On Demand Infrastructure

We built the FogBugz On Demand infrastructure exclusively for FogBugz, specifically to give you a stable and secure environment. This page describes how that environment is designed for fault tolerance and disaster recovery.

The FogBugz On Demand infrastructure includes:

  • External and internal monitors for hard drives, backups, and bandwidth
  • Alarm emails from and to multiple addresses at multiple domains using different services
  • Monitoring by redundant, different companies at different physical locations
  • Two data centers (one in Los Angeles, one in New York City) with backups of each other's replicated databases
  • Regular backups, testing of backups, and testing of recovery from backups

Tiered Firewall Infrastructure Within A Data Center

There are three zones of security: red, yellow, and green.  These correspond to a funnel of three Virtual LANs, which are enforced by the network switch (filtering by MAC address) and by the gateway/router with its embedded firewall (filtering by IP address).  The red zone is the untrusted Internet.  The yellow zone has the machines that need to talk to the outside world through the mediating firewall: the webservers (behind a virtual IP that acts as a load balancer), the mail server, and the DNS server.  The green zone has the SQL Server database server, which never needs to talk to the outside world except through the mediation of a webserver in the yellow zone.  The green zone also contains the management consoles for the network switch and for the Dell Remote Access Controller (DRAC) cards, which allow emergency reboots and remote desktop control of all the machines in all zones.
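The zone model above can be sketched as a small allow-list of traffic directions. This is an illustration only, not our actual firewall configuration: the names Zone, ALLOWED_FLOWS, and is_flow_allowed are hypothetical, and the sketch works at zone level, so it leaves out the finer per-machine rules and the tunnel used to reach the management consoles.

```python
from enum import Enum


class Zone(Enum):
    RED = "red"        # the untrusted Internet
    YELLOW = "yellow"  # webservers, mail server, DNS server
    GREEN = "green"    # database server and management consoles


# Hypothetical allow-list of (source zone, destination zone) pairs.
# Anything not listed here is denied.
ALLOWED_FLOWS = {
    (Zone.RED, Zone.YELLOW),    # the Internet reaches public servers via the firewall
    (Zone.YELLOW, Zone.RED),    # public servers answer requests and send mail
    (Zone.YELLOW, Zone.GREEN),  # a webserver queries the database
    (Zone.GREEN, Zone.YELLOW),  # the database answers the webserver
}


def is_flow_allowed(src: Zone, dst: Zone) -> bool:
    """Return True if traffic may pass from src to dst.

    Traffic that stays inside one zone is allowed in this sketch.
    """
    return src == dst or (src, dst) in ALLOWED_FLOWS
```

The point of the funnel is visible in what is missing: there is no entry that lets the red zone reach the green zone directly, so is_flow_allowed(Zone.RED, Zone.GREEN) is False.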

Security

The only way our system administrator in the NYC headquarters can access the remote consoles is with a username and password over an IP-to-IP tunnel, through which all traffic is encrypted.  The gateways recognize the IP address of headquarters and grant it somewhat broader access, such as the ability to Remote Desktop into certain machines and to reach the DRAC and network switch management ports.  The firewall rules block all access to the SQL Server machine, including from the DNS and email servers, with two exceptions: the webserver (on certain ports and according to certain rules), and the encrypted IP-to-IP tunnel from headquarters or from the other data center (for database replication).
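As a rough illustration of the rules for the SQL Server machine, the sketch below uses a default-deny rule list. Everything specific in it is an assumption made for the example: the role labels, the Rule type, the may_reach_database function, and the port numbers (1433 and 3389 are the standard defaults for SQL Server and Remote Desktop; the ports we actually permit are not listed on this page).

```python
from dataclasses import dataclass

SQL_SERVER_PORT = 1433      # SQL Server default port (assumed for this example)
REMOTE_DESKTOP_PORT = 3389  # Remote Desktop default port (assumed for this example)


@dataclass(frozen=True)
class Rule:
    source: str            # hypothetical role label of the connecting machine
    port: int              # destination port on the database machine
    requires_tunnel: bool  # True if traffic must arrive over the encrypted tunnel


# Hypothetical allow rules; any connection that matches none of them is denied.
RULES = [
    Rule("webserver", SQL_SERVER_PORT, requires_tunnel=False),
    Rule("headquarters", REMOTE_DESKTOP_PORT, requires_tunnel=True),
    Rule("other_data_center", SQL_SERVER_PORT, requires_tunnel=True),
]


def may_reach_database(source: str, port: int, tunneled: bool) -> bool:
    """Return True only if some rule explicitly allows this connection."""
    for rule in RULES:
        if (rule.source == source
                and rule.port == port
                and (tunneled or not rule.requires_tunnel)):
            return True
    return False  # default deny
```

Because the list only contains allow rules, the DNS and email servers need no entries of their own: they are refused simply because nothing permits them.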

Stability

We have a web farm behind a load balancer for automatic redundancy.  Our DNS and email servers have tremendous excess capacity, and our architecture allows us to easily swap in more hardware as necessary.  We have a redundant gateway standing by as an understudy; if the primary fails, our system administrator will receive notice from monitoring within seconds and will remotely switch to the backup.  We also have an extra network switch standing by in case the primary fails; the data center employees, who are on hand 24/7, can swap that device in within minutes.  The DRAC cards in every computer let us recover from failures by letting our system administrator remotely power-cycle the machines or remotely control the machines' desktops to administer them.  The DRAC cards have their own operating systems and their own Ethernet ports connecting to the network switch, so even if a machine's connectivity drops off or it crashes hard, our sysadmin can still remotely restart it.
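One common way for a monitor to raise an alarm quickly without reacting to a single dropped probe is to wait for a few consecutive failures. The sketch below shows that idea; it is not a description of our monitoring software, and the function name first_alert_index and the threshold of three failures are assumptions chosen for the example.

```python
from typing import Iterable, Optional


def first_alert_index(checks: Iterable[bool],
                      failures_needed: int = 3) -> Optional[int]:
    """Return the index of the probe at which an alert would fire.

    checks is a sequence of probe results (True = gateway answered).
    An alert fires once failures_needed probes in a row have failed.
    Returns None if that never happens.
    """
    consecutive = 0
    for index, ok in enumerate(checks):
        # A successful probe resets the count; a failure extends the streak.
        consecutive = 0 if ok else consecutive + 1
        if consecutive >= failures_needed:
            return index
    return None
```

With probes a second or two apart, a threshold like this still notifies the system administrator within seconds of a real outage, while a single lost probe does not page anyone.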

Upgrades

We use "generations" to manage the rollout of FogBugz upgrades.  During each rollout, we can release new code to individual accounts, so different FogBugz sites/accounts can be on different generations at the same time.  First we do a live push that affects zero accounts and make sure that goes fine.  Then we select a few representative accounts, push the upgrade to them, and watch for problems.  Then we push to successively more accounts until all FogBugz On Demand sites have upgraded to the new generation.
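The staged rollout described above can be sketched as follows. This is a simplified model under stated assumptions, not our deployment tooling: plan_waves, roll_out, the doubling batch size, and the healthy callback are all hypothetical names and choices made for illustration.

```python
from typing import Callable, Dict, List


def plan_waves(accounts: List[str], canaries: List[str],
               first_batch: int = 2) -> List[List[str]]:
    """Split accounts into rollout waves.

    Wave 0 is empty (a live push that affects no accounts), wave 1 is the
    representative sample, and later waves double in size (an assumption).
    """
    waves: List[List[str]] = [[], list(canaries)]
    canary_set = set(canaries)
    remaining = [a for a in accounts if a not in canary_set]
    size = first_batch
    while remaining:
        waves.append(remaining[:size])
        remaining = remaining[size:]
        size *= 2
    return waves


def roll_out(generation_of: Dict[str, int], waves: List[List[str]],
             new_generation: int,
             healthy: Callable[[List[str]], bool]) -> bool:
    """Move accounts to new_generation wave by wave.

    Stops and returns False as soon as a wave fails its health check,
    leaving every later account on its old generation.
    """
    for wave in waves:
        for account in wave:
            generation_of[account] = new_generation
        if not healthy(wave):
            return False
    return True
```

For example, with eight accounts and two canaries, plan_waves produces an empty wave, the two canaries, then batches of two and four. If the canary wave fails its check, roll_out stops there and the other six accounts never leave the old generation.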
