Faculty of Environment

Purchasing large volume storage

Managed Storage

The Faculty offers a managed storage system for large-volume research data. The system is classed as managed because Faculty IT carries out all the background activities associated with data storage:

  • purchasing the raw hardware
  • setting up filesystems and making them available on the network
  • backing up the filesystems
  • monitoring and replacing faulty hard drives
  • resizing filesystems if more space is required in the future
  • maintaining the servers
  • etc.

Enterprise vs Non-Enterprise

There are different classes of managed storage system - scratch, Enterprise, non-Enterprise, etc. The University Policy on Safeguarding Data should be applied when deciding whether to store data on Enterprise or non-Enterprise systems. Enterprise systems are characterised by performance, resilience, high availability and comprehensive backups.

The Faculty system is a managed, non-Enterprise system.

Faculty Storage

The main features of the Faculty storage system are:

  • Relatively cheap (about 120 pounds per Tb)
  • Large-volume capacity on a managed system
  • Limited backups - a single mirror with 30 days of changes (see below for details)
  • Limited performance - the storage is on direct-attached SATA disks served from 1Gbps networked servers
  • Disaster recovery - in the event of a major incident, it could take some time to restore all data from backups (possibly several weeks)
  • Security - the servers are housed in locked Faculty server rooms and are rack-based

Funding and Data Lifetime

The funding for the Faculty storage system comes from individual research grants - the Faculty buys and maintains large servers (to achieve economies of scale) and passes on the exact cost per Tb of the whole system (including backups) to research groups. This cost changes from time to time, but is currently around 120 pounds per Tb for 5 years (depending on the backup policy - see below).

After 5 years, data on Faculty storage systems will move to read-only data repositories, which are maintained as part of the overall Faculty system. No time limit has currently been set on the lifetime of the repositories. Note that this could be an issue, because some funding bodies require data to be stored for at least 10 years - so at the time of purchase, research groups may want to request 10 years of full data storage at double the prices quoted here.
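As a rough illustration of the pricing described above, the following Python sketch estimates the cost of a purchase. The function name storage_cost and its parameters are hypothetical, chosen only for this example; the default price of 120 pounds per Tb is the approximate figure quoted above and should be confirmed with Faculty IT, since it changes over time and depends on the backup policy. The sketch assumes only the two terms mentioned in the text: 5 years at the quoted price, or 10 years at double that price.

```python
def storage_cost(size_tb: float, price_per_tb: float = 120.0, years: int = 5) -> float:
    """Estimate the purchase cost in pounds for a given amount of storage.

    Assumption: price_per_tb covers one 5-year term, and a 10-year term
    costs double, as stated in the text. Other terms are not described,
    so they are rejected rather than guessed.
    """
    if size_tb <= 0:
        raise ValueError("size_tb must be positive")
    if years not in (5, 10):
        raise ValueError("only 5-year and 10-year terms are described")
    terms = years // 5  # 1 term for 5 years, 2 terms for 10 years
    return size_tb * price_per_tb * terms


# Example: 4 Tb at the approximate current price.
print(storage_cost(4))            # cost for 5 years
print(storage_cost(4, years=10))  # cost for 10 years (double)
```

Passing a different price_per_tb allows the same estimate to be made for another backup level or for an updated price.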

Storage space can be purchased by contacting Faculty IT staff (foe-support@leeds.ac.uk).

Partition size

Each server in the system has a RAID data array which is split into partitions. The preferred minimum partition size which the Faculty re-sells to projects is 1Tb, although smaller partitions may be possible. The maximum size of a single partition is currently around 64Tb - this is set by the maximum size of the data array on a single server (so it could increase in the future).

Backup Policy

The RAID arrays are constantly monitored and can recover from individual disk failures. Current systems run RAID-6, which tolerates two simultaneous disk failures - so at least 3 disks have to fail simultaneously before a filesystem is lost.

We run 2 levels of backup. The correct level should be decided for each filesystem by the PI responsible for the associated data (with reference to the University Policy on Safeguarding Data):

  1. Scratch space - In this case, the live data is the only copy. There are no backups at all, and there's no possibility of recovering data from failed filesystems. Filesystems in scratch areas will always have the word scratch in their name. There's a possibility of losing data via user errors (if a user accidentally deletes or overwrites files). There's also a possibility of losing entire filesystems if enough disks in the live RAID array fail simultaneously, or if the array is affected by fire, theft, etc. For this level of data protection, we charge 170 pounds per Tb for 5 years.
  2. Mirrored data with increments - In this case, the live data is mirrored to another RAID array in a separate fileserver in a separate server room. The mirror is synchronised overnight, and all files which are changed or deleted during the synchronisation are kept. These incremental changes can be kept for either 7 days or 30 days (agreed between the PI and IT support, but the default is 30 days). This protects against disasters such as theft, flooding, etc. in the server room (at worst, 24 hours of work could be lost). It also protects against a critical number of disks failing in the RAID array, and against user errors - with the default setting, files which were deleted or changed up to 30 days ago can be restored. For this level of data protection, we charge 120 pounds per Tb for 5 years.
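The retention window for mirrored data can be illustrated with a short Python sketch. This is not the Faculty's backup tooling; the function name is_restorable and its parameters are hypothetical, and the sketch makes two simplifying assumptions: that the previous version of the file was captured by an overnight synchronisation, and that the window is inclusive of the final day.

```python
from datetime import date


def is_restorable(changed_on: date, today: date, retention_days: int = 30) -> bool:
    """Return True if a file changed or deleted on `changed_on` should
    still be covered by the kept increments on `today`.

    Assumptions (not confirmed by the text): the retention window is
    inclusive of its last day, and the earlier version of the file had
    already been mirrored by an overnight synchronisation.
    """
    if retention_days not in (7, 30):
        raise ValueError("increments are kept for either 7 or 30 days")
    age_in_days = (today - changed_on).days
    return 0 <= age_in_days <= retention_days


# A file deleted 19 days ago: covered by the default 30-day policy,
# but not by a 7-day policy.
deleted = date(2024, 1, 1)
now = date(2024, 1, 20)
print(is_restorable(deleted, now))                    # True
print(is_restorable(deleted, now, retention_days=7))  # False
```

Filesystems in scratch areas have no increments at all, so no equivalent check applies to them.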