Implementing TrueNAS definitely reduces the amount of technical knowledge you will need, but it doesn't eliminate it completely.
However, this does NOT mean that you can design, install, set up, manage on a day-to-day basis, and deal with any issues without any technical understanding - though you will certainly need less knowledge than you otherwise would.
HexOS purports to reduce the level of knowledge you need to set up and manage a TrueNAS installation; however, the Addams family share a concern that if things go wrong you will then have even less technical knowledge and experience to draw upon to help fix things.
This is a fairly opinionated list (and contributors should feel free to modify this list according to their own views):
TrueNAS installs itself into a pool called boot-pool, which contains TrueNAS itself and which you should not write files to yourself. To store your files, apps and virtualised operating systems you will need one or more storage pools that you create - each pool is a self-contained storage group, mounted under /mnt.
Within a pool, data is organised into datasets. Datasets can contain other datasets, zVolumes or normal directories and files. From a Linux perspective, each dataset is separately mounted - so copying or moving data between datasets will involve a physical copy of the files.
A pool is made up of one or more virtual devices or vDevs.
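This layout is visible with the standard OpenZFS command-line tools; a sketch, assuming a hypothetical user-created pool named tank:

```shell
# List pools: boot-pool holds TrueNAS; "tank" is a user-created data pool
zpool list

# List the datasets in "tank"; each one is separately mounted under /mnt/tank/...
zfs list -r tank

# Because datasets are separate mounts, moving a file between two datasets
# is a real copy of the data, not a cheap rename within one filesystem
mv /mnt/tank/docs/report.txt /mnt/tank/media/
```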
A vDev is formed from one or more physical disks (HDDs, SSDs, NVMe etc.) or disk partitions - and in almost all circumstances these should all be the same type and size of disk.
The primary type of vDev is the Data vDev - and every pool must have at least one of these.
There are various secondary types of vDev - SLOG, L2ARC, Dedup, Special Allocation - each of which is designed to solve a specific performance issue that occurs with specific use cases (more on these later in this wiki). When designing your pools, you should assume that you do not need any of these unless you have a specific use case that requires them - do not add them to your design just because you think they must be useful. If you don't have the use case, they will at best cost you money to do absolutely nothing, or at worst have a negative impact on performance rather than a positive one.
You can also have spare drives which are unused, ready to be switched in to replace drives which fail - these are not vDevs but they are configured in a similar way.
Data, SLOG, Dedup and Special Allocation vDevs are critical to the operation of the pool: if you lose complete access to any one of these vDevs, your pool (and all the data in it) is most likely gone forever (which is why you still need backups). The more vDevs you have, the more links in the chain that can experience problems - and thus the greater the need for each vDev to have some redundancy in order to survive individual hardware issues. With the exception of single-drive, single-vDev pools, unless you really know what you are doing and are prepared to fully accept the associated risks, you should always plan for redundant vDevs, and you should NOT under any circumstances stripe data across multiple single-drive vDevs. (L2ARC cache vDevs can be single-drive, as they can be lost without any impact on the data integrity of the pool.)
There are two basic ways to make a vDev redundant: Mirrors and RAIDZ:
Important: Redundancy provides more safety than simply recovering from the entire loss of a drive. In ZFS, each record has a checksum associated with it, and when ZFS detects a corrupted block because the checksum doesn't match, the redundant copies can be used to recover the record or file that is corrupted. This is another reason not to use non-redundant vDevs.
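As an illustration of the two redundancy approaches, the underlying OpenZFS commands look like this. (TrueNAS normally creates pools for you through the UI; the pool and device names here are hypothetical.)

```shell
# A pool with one 2-way mirror vDev: survives the loss of 1 of its 2 disks
zpool create tank mirror /dev/sda /dev/sdb

# A pool with one 6-wide RAIDZ2 vDev: survives the loss of any 2 of its 6 disks
zpool create tank raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf
```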
A ZFS scrub reads every block of data in a pool, and checks that the data matches the checksum. If not, ZFS attempts to correct the data using the vDev redundancy.
You should run a scrub every 1-3 months at a minimum.
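Scrubs are normally scheduled through the TrueNAS UI, but the equivalent OpenZFS commands (pool name hypothetical) are:

```shell
# Start a scrub of the pool "tank"
zpool scrub tank

# Check progress, and see any checksum errors that were found and repaired
zpool status -v tank
```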
A Snapshot is a way of freezing the contents of a dataset or zVolume so that you can go back to the contents at a later date. For datasets you can go back and find any individual file that may have been changed; for zVolumes you can only roll-back the entire zVolume.
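A hedged sketch of how this works at the command line, with hypothetical dataset and snapshot names:

```shell
# Take a snapshot of a dataset
zfs snapshot tank/docs@before-cleanup

# Old versions of individual files are visible read-only under the
# hidden .zfs directory of the dataset's mount point
ls /mnt/tank/docs/.zfs/snapshot/before-cleanup/

# Roll the whole dataset back (discards ALL changes made since the snapshot)
zfs rollback tank/docs@before-cleanup
```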
A Checkpoint is a specialised type of Snapshot that allows for greater recovery if there are integrity problems with the pool. It is not intended to be used as a replacement for regular Snapshots, but can be a useful tool for boot-pool during upgrades or for other pools when you are undertaking a major change.
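For illustration, the OpenZFS commands involved (pool name hypothetical) are:

```shell
# Take a pool-wide checkpoint before undertaking a major change
zpool checkpoint tank

# Once you are happy the change worked, discard the checkpoint
zpool checkpoint -d tank
```

If the change goes badly, the pool can be exported and then re-imported rewound to the checkpoint with `zpool import --rewind-to-checkpoint`, discarding everything written since it was taken.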
ZFS Replication is a way of copying a snapshot from a dataset in one pool to a different pool on the same or different server.
The key point about ZFS Replication is that it is incremental and therefore extremely efficient - ZFS knows what snapshot is at the destination dataset, and can match it to the same snapshot at the source and then only send any changes that have happened since.
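The incremental mechanism can be sketched with the underlying OpenZFS commands (dataset, snapshot and host names are all hypothetical; TrueNAS replication tasks drive this for you):

```shell
# First replication: send the whole dataset up to snap1 to another pool
zfs send tank/docs@snap1 | zfs recv backup/docs

# Later replications are incremental: only the blocks that changed
# between snap1 and snap2 are sent, here to a remote server over ssh
zfs send -i tank/docs@snap1 tank/docs@snap2 | ssh nas2 zfs recv backup/docs
```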
Normal Asynchronous writes are queued up for (normally) 5 seconds before being written to disk as a Transaction Group (TXG) with full pool integrity (i.e. either all writes succeed or all writes fail).
However whilst this maintains the integrity of the ZFS pool, it does not guarantee the integrity of the internal structure of virtual disks nor does it guarantee that business-critical database transactions have been written to disk.
If you need these integrities to be guaranteed as well, then you need to turn on synchronous writes. For these, in addition to queuing the writes for 5 seconds as above, ZFS immediately writes a second copy of the data to the ZFS Intent Log (ZIL), which can be used when TrueNAS starts to catch up on any delayed writes that might otherwise be lost, thus maintaining both virtual disk and database integrity.
Unfortunately, using synchronous writes has a very detrimental impact on performance, especially if the ZIL is being written to HDD - so you should either place the data itself on SSD (so that the ZIL is there also) or implement a special (mirrored) SLOG vDev to divert the ZIL writes from HDD to SSD. (Note: The Linux fsync issued at the end of writing a file to flush data to disk also creates a ZIL write, even for normal asynchronous writes; however, unless you are frequently writing small sequential files, the performance impact of this is usually limited.)
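The two measures described above can be sketched as follows (pool, dataset and device names hypothetical; on TrueNAS these changes are normally made through the UI):

```shell
# Force synchronous writes for a zVolume backing a virtual disk,
# guaranteeing its internal structure survives a crash or power loss
zfs set sync=always tank/vm-disk1

# Add a mirrored SLOG vDev so ZIL writes go to fast SSD instead of the HDDs
zpool add tank log mirror /dev/nvme0n1 /dev/nvme1n1
```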
The following vDev types are specialised secondary vDevs that can be used to address specific performance issues with specific use cases:
Even in the most correctly configured systems, things occasionally go wrong: your pool either becomes degraded but stays online, or ZFS takes it offline completely because its integrity has become suspect and further corruption needs to be avoided. ZFS expects your hardware to behave in a certain way, and if any part of your system is mis-configured, these sorts of issues become more likely.
If your pool becomes degraded, and you are technically knowledgeable about ZFS, then many recovery actions can be undertaken safely through the UI.
However if your pool has gone offline or you are not 100% confident about how to recover it to fully working order, then your first action should be to seek help from experts on the TrueNAS forums - to put things bluntly, if you try to resolve issues when you are guessing about what to do, you could make the corruption worse and turn a potentially recoverable pool into a completely irrecoverable one where your data becomes permanently inaccessible.
Resilvering is the process of restoring full redundancy when your system has lost one or more drives and become partially or fully degraded (but is still online).
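TrueNAS handles disk replacement through the UI, but as an illustration the underlying OpenZFS operation (pool and device names hypothetical) is:

```shell
# Replace a failed disk with a new one; ZFS resilvers the new disk
# from the vDev's remaining redundancy
zpool replace tank /dev/sdc /dev/sdf

# Watch resilver progress and confirm the pool returns to ONLINE
zpool status tank
```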
If you have a RAIDZ vDev, from ElectricEel onwards you can now add drives to the data part of the data vDev (retaining the same parity level) to increase your useable storage space.
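Under the hood this uses the OpenZFS RAIDZ expansion feature, which attaches a new disk to an existing RAIDZ vDev; a hypothetical sketch (pool, vDev and device names assumed):

```shell
# Grow the existing RAIDZ2 vDev "raidz2-0" in pool "tank" by one disk;
# parity level stays the same, usable space increases
zpool attach tank raidz2-0 /dev/sdg
```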
SMART is a technology whereby each disk can tell the operating system about its internal operating status. TrueNAS can automatically run regular SMART tests on your system, and you are recommended to run SMART Short Tests at least weekly and SMART Long Tests at least monthly.
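TrueNAS schedules these tests for you, but they can also be run and inspected manually with smartmontools (device name hypothetical):

```shell
# Run a short self-test on one disk (a few minutes)
smartctl -t short /dev/sda

# Run a long (full-surface) self-test (can take many hours)
smartctl -t long /dev/sda

# Review the test log and health attributes afterwards
smartctl -a /dev/sda
```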
@JoeSchmuck's Multi-Report script can be used to check on the status of your SMART tests, Scrubs and SMART attributes and inform you if any of your disks are starting to experience early signs of failure.
To access your NAS from your LAN, you will need to configure the server's network cards correctly, and you will need to have a basic understanding of TCP/IP network addressing to achieve this. (Teaching this is not within the scope of this Wiki - however there are lots of resources on the Internet for this.)
When you implement applications or containers or virtualised operating systems, these will typically also need network addresses and / or ports and a slightly greater understanding of TCP/IP networking will be needed to ensure that this is done correctly.
ZFS is designed to write data to pools in Transaction Groups (TXGs) in such a way that a TXG is either fully written to disk or effectively not written at all. Thus ZFS pools maintain integrity even after sudden power loss or O/S crash, and do not need e.g. a Windows chkdsk or Linux fsck to be run to restore integrity. However…
The definition of a backup is one where you can recover your important and irreplaceable data regardless of what happens to your primary copy (e.g. a fire guts your entire home) - which means off-site. Anything else is at best only a partial backup, if indeed it is a backup at all.
This is NOT to say that e.g. Snapshots don't have their benefits - they are very fast and low cost, and they can assist with recovering data that has been accidentally deleted or e.g. encrypted by ransomware, but whilst they may address some risks, they do not address many others.
Previous - 1.3 Which version of TrueNAS? | Index | Next - 1.5 The Structure of this Wiki |