Formatting/Making/Creating Filesystem Volumes

Overview
Disk testing

Some software used to perform hardware testing for a data storage device (such as a hard drive) may go much faster by using some options that are destructive to data. If a section of the disk is about to have prior data become unusable because a new filesystem volume is about to be created, then (properly controlled) data loss from hard drive testing may not be a real problem. Therefore, the time just before the creation of the filesystem may be a relatively convenient time to run such a test. Specifically, this may be a great time to check the portion of the disk that will hold the filesystem volume. If there is no disk layout created yet, it may be a great time to perform hardware testing on the entire data storage device.

The reason that such destructive write-enabled testing must often be avoided is simply because it is, well, destructive. Such a “destructive” test will cause loss of any data on the portion of the disk being tested (which may immediately cause even more data to be inaccessible, if the data that is lost includes information keeping track of where other data is stored). Even an empty volume would lose key structures that are needed, and so would need to be repaired (or, more probably, just re-created from scratch).

Such a check may be done on the drive after the file system is created, when the filesystem volume is unmounted, although at that time a non-destructive write test (e.g. with the “ -n ” parameter for badblocks in Unix, or the -cc parameter to e2fsck, which uses the same slower non-destructive test) is needed to prevent massive data loss (which would involve losing the entire volume).

During the time it takes to run a disk check that heavily involves writing to the disk, the filesystem volume's data is often unusable. (This may be because the testing software requires that the filesystem be unmounted. With some disk testing software, it may be because the software is designed to be run at system startup, instead of loading a more standard operating system.) Having data unavailable is generally something to try to minimize, and the non-destructive test may often be substantially slower, which is why now, when the section of the disk contains no data that anybody cares about, may be a great time to run such a “destructive” test.

On Linux systems that use the mkfs command, one way to perform the write test may be to use mkfs -cc (along with whatever other parameters are needed). In at least some implementations, this will run a test. (Perhaps this is true for many/most/all common implementations, but it is likely worth checking the manual pages for mkfs and, if Ext2 is being used, mkfs.ext2, just to make sure!) Hopefully (for the sake of speed) this will be a destructive test.

Of course, if the command to make a filesystem volume (e.g. if using newfs) does not seem to have an option to run the disk checker, then the decision may already be made. In other cases, why complicate things by embedding the test?

By simply running the test separately, first, any problems with either operation (testing the disk or subsequently making the file system) will be more localized. (There may be a reason not to run the test prior to the command to create a filesystem. For example, one might want to use mkfs -cc to determine the desired block size. However, if a destructive test is being used, it may be desirable to abort the test right after it starts, and then use the reported block size to just run the test manually.)

Note: although the advice which is generally safe to provide is to use non-destructive testing, disk testing may go faster if using destructive testing. If a new filesystem volume is about to be created, overwriting any data that may be on the drive already, then there may not really be a problem with using the method that is destructive to pre-existing data (including files as well as the filesystem structure). If there is an option to use destructive testing, consider using it for full speed. (There is probably not a reason to not use the destructive approach, in this case where a new filesystem volume is about to be created.) Of course, be careful to select the right device object to perform the testing on. (For example, do not test the entire disk when only one filesystem volume is supposed to be re-created, and end up losing data which wasn't intended to be lost.)
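As a rough sketch of the difference between the two kinds of test, the following shell function only prints a badblocks command rather than running it. The -n option (non-destructive read-write test) is the one mentioned above; -w (destructive write test), -s (show progress) and -v (verbose) are other badblocks options, and the local manual page should be checked before relying on any of them. The function name badblocks_cmd is made up for this example, and /dev/sdX1 is a placeholder, not a real device.

```shell
# badblocks_cmd MODE DEVICE
# Print (do not run) a badblocks command line.
#   MODE is "destructive" (-w) or "nondestructive" (-n).
badblocks_cmd() {
    mode=$1
    dev=$2
    case "$mode" in
        destructive)    flag=-w ;;   # write-mode test: erases existing data
        nondestructive) flag=-n ;;   # read-write test that preserves data
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    # -s shows progress, -v is verbose
    echo "badblocks $flag -s -v $dev"
}

# /dev/sdX1 is a placeholder; substitute the device that is about to be formatted.
badblocks_cmd destructive /dev/sdX1
```

Printing the command first, and reading the device name back before running it as root, is one way to reduce the chance of testing (and erasing) the wrong device.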

[#mkfspreq]: Pre-requisites
Know the type

Before creating a filesystem, know which filesystem type will be used.

After reading the general process for creating a filesystem (using whatever software that is going to be used), before actually creating the filesystem, see if there are additional details that are specific to that type of filesystem.

Destination
File (disk image)

Sometimes it may be desirable to store the image of a filesystem within a file. Details on this are in the section about disk images.

Optical media
If the media is something other than a standard hard drive, there may be specialized software that is commonly used when working with such media. For writing straight to an optical disc, the typical way is to write a disc image straight to the destination. To do that (like when creating an image of a hard drive), see the disk images section (which has details about creating and writing the data).
Floppy disks

File systems are placed onto a destination which is an already-defined data area. This defined data area could be an entire physical device, such as all of the writable area on a removable disk. (This is the typical implementation if using the obsolete technology known as a “floppy disk”.)

For floppies: Software which can create filesystems on hard drives may also recognize floppy disks and be able to work with that type of media. However, a lot of the standard tools do not fully utilize the media. There may be a way to fit even more data onto these things: the MTools documentation's section about “high capacity formats” documents various methods. However, the recommended way to handle floppy disks is to copy data off of the media ASAP and use other storage media. Information about this media is covered in the hardware section: section about floppy disks.

Fixed disks (and removable media that is treated similarly)

The already-defined data area where a filesystem gets placed may be a “destination” which uses only certain portions of writable media. This is common for “fixed” disks, such as hard drives. (The term “fixed” means that the media is not designed with the intent of being regularly removed from the drive.)

Before a file system is placed on a hard drive, the typical/common practice is to segment the data storage area into “partitions” (such as MBR-style partitions, which BSD operating systems might refer to as slices, or GPT partitions, or perhaps a BSDlabel/disklabel partition that might exist within the boundaries of another type of partition). For details about defining the boundaries of such a disk area, see the section about handling a disk layout. As an example, a primary partition on a disk using MBR-style partitioning could be used as a destination when creating a FAT32 partition, while the same computer could have an FFS partition go into a BSD disklabel partition that uses up only some of the space inside a different MBR-style partition.

[#fsdstnam]: Destination name

These destinations typically have names. In MS-DOS, they may be assigned drive letters. In Microsoft Windows, the destinations may have some sort of fancy NTFS device names that start with a backslash and probably \Device, but more commonly what end users end up seeing are drive letters similar to MS-DOS.

In Unix, the destination should be a virtual “device”, corresponding to an entry in the /dev/ subdirectory. This device may correspond to a physical device (such as a removable drive), or this virtual device may refer to a certain portion of the storage media's ability to store information. For instance, a hard drive's partition may have boundaries that indicate where the partition starts and ends, and then the device may correspond to that partition.

Before creating a filesystem on a device, know the name of the destination where the filesystem is going to be stored. To find out the name, here are some hints:

  • If the operating system is running, see what the names of other devices are. In Unix, mount points that are being used may be seen using the mount command with no parameters, and pre-configured (available) mount points may be seen by viewing the information stored in the file system table stored in the /etc/fstab file (described by OpenBSD Manual Page for the file system table stored in the /etc/fstab file)
  • Software that helps to create the disk layout may show the name of the destination, or the name of the hardware. The name of the hardware may end up being similar to the name of the destination. If a specific partition is being used, knowing details like the partition's size and type identifier may help to identify which partition to use.
  • Some clues may be provided by boot logs. (In Unix, run dmesg which should work, or view /var/run/dmesg.boot (which might exist).)
  • Here are some quick tips, specific to operating systems:
    BSD and Linux

    Perhaps try /dev/?d[0-9]* as detailed further in the following subsections:

    BSD

    The drives tend to be named after a driver being used. For instance, SCSI-like devices (including some SATA drives, perhaps those using AHCI) may use a driver called “sd”, and so the first detected hard drive may be called “sd0”. However, IDE drives may use a driver called wd, and so the first detected drive may be called “wd0”. BSDlabel/Disklabel entries tend to look similar to the hard drive names, but to have a letter added, such as sd0a. One letter, though, is reserved to refer to the whole device. In OpenBSD, this is the letter c. So the device may be referred to as the device name (e.g. sd0) or the disklabel entry (sd0c).

    The drive may be able to be referenced by name (e.g. sd0), or by its BSDlabel/disklabel (e.g. sd0c in OpenBSD). The hard drive's name may also be able to be referenced by prepending /dev/ (e.g. /dev/sd0) or by prepending /dev/r (e.g. /dev/rsd0). The BSDlabel/disklabel name may also have device nodes, e.g. in OpenBSD this may be /dev/sd0c and /dev/rsd0c. Note that in each of these examples, the example is based on the hard drive being named sd0. Some systems may have different drive names, such as sd1 or wd0. Also, other BSDs may use other letters for the disklabel, such as sd0d instead of sd0c.

    Painfully, it is not always clear which device name is most preferred for some software. In some cases, there may be multiple names that work. Fortunately, the harm of trying to use an incompatible name may be negligible: software generally ends fatally, abruptly, and with a message saying that the device could not be found or used. Then a different name may be tried.

    Linux
    Linux follows a pattern similar to BSD, but the names of the hard drives are not quite as dependent on which driver is being used. They may, however, still be dependent on the controller type. IDE devices may be named /dev/hd[a-z] (which means /dev/hd followed by a single lowercase letter). SCSI devices may be named /dev/sd[a-z]. After the name of the hard drive, a number may be appended to represent a partition name. This number may be one through four, which may correspond to an MBR partition. A “logical drive” inside an “extended partition” may use a number that is five or higher, even if not all the lower numbers are active. (The lower numbers may be reserved for potential future primary partitions.) Some more documentation on these device names may be seen by The Linux Documentation Project: Device Names, Ubuntu Device Names.

    Once the actual device name has been found, some disk utilities (such as fdisk) may support some abbreviated names, such as using sd0 instead of some other form of /dev/*sd0* because the software may be using a standard library call. Some further technical details may be seen by OpenBSD Manual page for “opendev”.

    DOS
    Devices are not typically given names used by various software programs, although hard drives may be given a disk number as shown in fdisk. Instead, devices are generally referred to by their mount points. Further details may be in the section about mount points.
    Microsoft Windows
    Disks are often referred to by mount points, which are very commonly drive letters similar to DOS. Devices might have some sort of Windows Object Device Name, which are commonly usable in operating systems derived from Windows NT code (including Windows 2000 and XP and anything newer than Windows XP). For details, see Troubleshooting: Device Namespace. Some programs, particularly if they are popular programs on Unix platforms, might be able to use a path similar to what is seen in Unix. If that is an option, details are often included in documentation for the Windows version of the program.
    Other/misc
    Perhaps (but perhaps not currently) see details in the Troubleshooting: Device Namespace section. Another method may be to read some documentation of a program that interacts with disks, such as Smartmontools Manual Page for smartctl (“Description” section) for some hints specific to various operating systems.
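The BSD and Linux naming patterns described in the tips above can be sketched in shell. This is only an illustration of the traditional patterns (sd0c and rsd0c on BSD; /dev/hd[a-z] and /dev/sd[a-z] with an optional partition number on Linux), not a reliable detector: real systems may use other names. The function names bsd_nodes and linux_part_kind are made up for this example.

```shell
# bsd_nodes DRIVE LETTER
# Print the block and raw device node paths for a BSD disklabel entry,
# e.g. drive "sd0" and letter "c" (the whole-disk letter on OpenBSD).
bsd_nodes() {
    echo "/dev/$1$2 /dev/r$1$2"
}

# linux_part_kind DEVICE
# Classify a traditional Linux name such as /dev/sda5 by its trailing number:
# none = whole disk, 1-4 = MBR primary partition slot, 5+ = logical drive.
linux_part_kind() {
    name=${1#/dev/}
    case "$name" in
        [hs]d[a-z])
            echo "whole disk" ;;
        [hs]d[a-z][1-4])
            echo "primary partition ${name#???}" ;;
        [hs]d[a-z][1-9]*)
            num=${name#???}
            case "$num" in
                *[!0-9]*) echo "unrecognized"; return 1 ;;
            esac
            echo "logical drive $num" ;;
        *)
            echo "unrecognized"; return 1 ;;
    esac
}

bsd_nodes sd0 c
linux_part_kind /dev/sda5
```

A name that does not fit either pattern is reported as unrecognized rather than guessed at, which matches the advice above to check the names that the running system actually reports.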
Size

Before making a partition, having an understanding of how big the filesystem is going to be is generally desirable. Typically, a filesystem will use up the entire size of the data area that the filesystem is written in. For a hard drive, that statement means that an entire partition is typically used. A key detail that needs to be decided is the partition size to use. In order to maximize free disk space, by reducing unnecessary overhead of clusters, some partition sizes may be better to use than other partition sizes. The details may be specific to the type of filesystem that is going to be used. Therefore, details of partition size may often be considered around the same time as decisions are made about what type of filesystem will be used.
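The cluster overhead mentioned above can be illustrated with a little arithmetic. A file occupies whole clusters, so the space between the end of the file and the end of its last cluster is wasted. The sketch below computes that wasted space; the function name slack_bytes and the cluster sizes used (4096 and 32768 bytes) are only illustrative assumptions, since the actual cluster size depends on the filesystem type and partition size.

```shell
# slack_bytes FILE_SIZE CLUSTER_SIZE
# Bytes allocated but unused when a file of FILE_SIZE bytes is stored
# in whole clusters of CLUSTER_SIZE bytes.
slack_bytes() {
    echo $(( ($2 - $1 % $2) % $2 ))
}

# The same 1000-byte file wastes far more space with larger clusters.
slack_bytes 1000 4096
slack_bytes 1000 32768
```

With many small files, this per-file waste adds up, which is why a partition size that leads to a smaller cluster size may leave more usable space.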

Content

Having some idea of the type of data that will be stored on the filesystem may be helpful. (In cases where the filesystem objects aren't actually standard data files, such as when using devfs or procfs or kernfs, this may be more of a requirement rather than an optional consideration.) If the data is only meant to be used by a single operating system, then choosing a filesystem that is better supported by that operating system may make sense.

Typically, at least traditionally with simple setups, there is one filesystem per mount point. In BSD, a disklabel may be used so that several mount points are commonly placed in one hard drive partition. Other Unix systems may be similar to BSD. In other cases, there is typically just one mount point per hard drive partition.

Finally, know that proceeding can often cause data loss which may not be very easy, or even be realistically possible, to recover from.

Post-creation changes

With at least some filesystem formats, at least under some operating systems, there may be some actions that are typically good to take after a filesystem instance/volume is created. After the process of creating the volume seems to be completed, look over the rest of this documentation to see if there are recommended adjustments to make. (As an example, it may be particularly true for Unix that some options exist.)

Making a filesystem volume in Unix
Creating the filesystem volume

These instructions assume that Prerequisites for making a filesystem have been addressed. It is recommended that this section be read and understood before running commands, as warnings may appear later in the text than the commands they apply to.

If using BSD (or perhaps anytime if using the UFS(2)/FFS(2) file systems), the preferred command may be the newfs command. If both newfs and mkfs exist, using newfs may be the safer route to go (to try to allow the standardized fsck command to work well). (A reason that newfs may be more desirable is simply that it may be less likely to use code from the e2fsprogs package, and so less likely to result in data loss from mixing e2fsprogs and other software suites.)

For most types of file systems, the generally preferred command to create the filesystem may be either newfs, if that exists, or else the mkfs command. (It may be that both commands exist in some operating systems? If so, plan to generally use newfs if it will do what is desired.)

Either of these commands may typically do little to nothing more than run another command. For example, to create an Ext2 drive, the commands that get run might be newfs_ext2fs or perhaps an executable named mkfs.ext2 may be used. Checking for manual pages related to those executables may provide information (such as command line parameters) that may be rather specific to creating the type of filesystem that is going to be created by that executable.

Several parameters may be used. The most impactful of these may be the -t parameter which can affect which executable is used to write to the disk. (The newfs command and the mkfs command may differ with some parameters, but both do support the same sort of “-t formatTYPEfs” syntax.) For instance, “ newfs -t msdos devname ” may cause newfs to run “ newfs_msdos devname ”.
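The way the -t parameter selects a helper executable can be sketched as a simple name mapping. This only mirrors the naming conventions seen in the examples above (newfs_msdos for newfs, mkfs.ext2 for mkfs); it is not how either front-end is actually implemented, and the function name helper_name is made up for this example.

```shell
# helper_name FRONTEND FSTYPE
# Print the name of the helper executable that a front-end is expected to run.
helper_name() {
    case "$1" in
        newfs) echo "newfs_$2" ;;   # e.g. newfs -t msdos  -> newfs_msdos
        mkfs)  echo "mkfs.$2" ;;    # e.g. mkfs -t ext2    -> mkfs.ext2
        *) echo "unknown front-end: $1" >&2; return 1 ;;
    esac
}

helper_name newfs msdos
helper_name mkfs ext2
```

Knowing the expected helper name makes it easier to find the right manual page (for instance, by running man on the printed name) before creating the filesystem.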

In very, very many cases, command line parameters beyond just the minimum ones required may be desirable. However, the software may often allow so much customization that it is (easily) possible to create a filesystem which may be suboptimal, and possibly fairly incompatible with filesystem drivers unless those drivers support such customizations. Even supporting filesystem drivers might not be able to reliably support some such customizations automatically, and so may need to be provided with details about the filesystem. If in doubt, it may be best to try to just stick with the minimum number of parameters to accomplish what is desired.

As a possible over-simplification, the simple command to make a new filesystem may be:

newfs devname

For more details, see documentation such as the operating system's manual page for the newfs and/or mkfs commands, documentation for commands to mount filesystems (e.g. run “apropos mount”; if that shows too much then try “apropos mount | grep mount_”), documentation for commands to adjust filesystems (e.g. run “apropos tune | grep fs”), and/or documentation about the filesystem format/type that is going to be used (e.g., creating an Ext2 partition).

[#unxonwfs]: Post-creation changes in Unix
Tuning

After the filesystem volume is made, it may be desirable to adjust/tune a filesystem that was just created. (Sometimes the options that may be set after the filesystem is created could also be set during the process of creating the filesystem. That may seem, and be, slightly faster.) Keep this in mind: after the filesystem volume is created, there may be further options to check out as described in the section about how to adjust/tune a filesystem.

Making lost+found

On ext2 drives, each volume generally has a directory called lost+found. To make sure that there are no problems from the directory not existing, mount the drive, and then make the directory if it doesn't yet exist.
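The check described above might look something like the following sketch. It assumes the volume is already mounted and takes the mount point as its argument; the function name ensure_lost_found is made up for this example, and the 700 permission mode is an assumption based on how the directory is commonly set up. The demonstration at the end uses a scratch directory rather than a real mount point.

```shell
# ensure_lost_found MOUNTPOINT
# Create MOUNTPOINT/lost+found if it is missing; leave it alone otherwise.
ensure_lost_found() {
    dir="$1/lost+found"
    if [ -d "$dir" ]; then
        echo "exists: $dir"
    else
        # Mode 700 is an assumption; adjust to local policy.
        mkdir "$dir" && chmod 700 "$dir" && echo "created: $dir"
    fi
}

# Harmless demonstration against a scratch directory instead of a real mount point.
scratch=$(mktemp -d)
ensure_lost_found "$scratch"
ensure_lost_found "$scratch"
rm -rf "$scratch"
```

On systems with e2fsprogs installed, the mklost+found utility may be an alternative that also pre-allocates space for the directory; check its manual page before using it.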

Randomize inodes

This may (insignificantly) help security. Security based on using this technique is pretty poor security, but doing this when there is little to no data on the filesystem volume will go faster than doing this later when there is more data on the volume.

See the documentation for fsirand before using the program. For example, OpenBSD's manual page for fsirand says that the newfs command “now does the equivalent of fsirand itself so it is no longer necessary to run fsirand by hand on a new filesystem” that is created with a recent version of OpenBSD's newfs command. Other operating systems have also included this command (e.g. OpenBSD's manual page for fsirand: “History” section states “The fsirand command appeared in SunOS 3.x.”)

Making a filesystem volume in MS-DOS and similar/compatible

This is generally done with the FORMAT command.

[#reqzfsct]: FORMAT requirements of an original sector

Apparently the first sector needs to have acceptable data which is either specific data accurate about the partition size, or bits cleared to zero so that FORMAT will know to create any such information that may be needed. Having incorrect data in the first part of the partition may cause problems.

A popular manual page for fdisk for Linux: section called “DOS mode DOS 6.x WARNING” states, “The DOS 6.x FORMAT command looks for some information in the first sector of the data area of the partition, and treats this information as more reliable than the information in the partition table. DOS FORMAT expects DOS FDISK to clear the first 512 bytes of the data area of a partition whenever a size change occurs. DOS FORMAT will look at this extra information even if the /U flag is given -- we consider this a bug in DOS FORMAT and DOS FDISK.” “The bottom line is that” for a partition to be compatible with the DOS FORMAT command, software must first “zero the first 512 bytes of that partition before using DOS FORMAT to format the partition.”

(The name of the section is “DOS mode and DOS 6.x WARNING” in what is presumed to be the latest version of the man page, found from the fdisk hyperlink of man7.org: section 7: Superuser and system administration commands which is hyperlinked from kernel.org man-pages. Earlier versions had a slightly smaller section, containing all of this text that this section quotes, in a section called “DOS 6.x WARNING”. e.g., Ubuntu man page for sfdisk: DOS 6.x WARNING, Linux Man Page for fdisk, archived by the Wayback Machine @ Archive.org (last archive of this page from 2015).)
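Zeroing the first 512 bytes, as the quoted text requires, may be done with dd. The sketch below wraps that in a function (the name zero_first_sector is made up for this example) and demonstrates it on a scratch file standing in for a partition, so that nothing real is overwritten. When pointing it at an actual partition device, be very sure of the device name first.

```shell
# zero_first_sector TARGET
# Overwrite the first 512 bytes of TARGET with zeros.
# conv=notrunc keeps the rest of a regular file intact
# (it should make no difference for a device node).
zero_first_sector() {
    dd if=/dev/zero of="$1" bs=512 count=1 conv=notrunc 2>/dev/null
}

# Harmless demonstration: a 1024-byte scratch file filled with "X" bytes.
img=$(mktemp)
dd if=/dev/zero bs=512 count=2 2>/dev/null | tr '\000' 'X' > "$img"
zero_first_sector "$img"

# Count the non-zero bytes left in the first sector and in the whole file.
echo "first sector: $(dd if="$img" bs=512 count=1 2>/dev/null | tr -d '\000' | wc -c | tr -d ' ')"
echo "whole file:   $(tr -d '\000' < "$img" | wc -c | tr -d ' ')"
rm -f "$img"
```

The demonstration should report no non-zero bytes in the first sector while the second 512 bytes of the scratch file remain untouched.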

This isn't a step that needs to be taken manually when using the supporting FDISK software built into MS-DOS, but realize that this does mean that supporting software may overwrite whatever data is in the start of a partition. This may include the FDISK that comes with an operating system, as well as similar programs. As an example, the Disk Management software which (uses a graphical interface and) comes in some versions of Microsoft Windows may do this, as mentioned by Petri.com's Win XP/2003 guide to recovering a deleted volume (when the article mentions a “boot sector” being overwritten).

Despite the first sentence of the quoted material (and the name of the section heading) in the Linux man pages referring to the “DOS 6.x” FORMAT command, the author of this text happens to suspect this behavior exists in many other versions of DOS FORMAT, and the related behavior in many other versions of DOS FDISK. So, unless you know of testing that shows otherwise, don't suppose that MS-DOS 5's FDISK won't do the very same thing.

Having the partition software needing to erase the first sector may seem like a design flaw, as it would be more sensible for a FORMAT command to take care of its own requirements instead of relying on the partition manipulating software to perform this effect of zeroing out certain data. (Perhaps the thought was that by having FORMAT use pre-existing information, FORMAT might save some time. Such a time savings would seem pretty insignificant, especially compared to the substantial amount of time dealing with unnecessary loss of data that can commonly be rather critical, even if such cases of data loss may be very rare.)

(Related discussion: MBR disks: Warning: Possible/certain destruction of data in the first sector references this section.)

Using FORMAT

Note that this is a fairly slow way of creating the filesystem. However, this is the way to do so using built-in software. The reason for the slowness is that some checks may be done on each sector. (A valid FAT filesystem could potentially be made without doing these checks. Some side notes about Linux: creating the filesystem quickly may be doable using Linux software. The process just involves writing certain key data at certain locations, so it is very quick and does not substantially increase in time as larger drives are made. One may think about that while waiting for the drive formatting to complete. However, do give due care when taking the approach of using Linux: the Linux software may offer several options which, if set incorrectly, can result in a filesystem which may be used by Linux but not DOS. This shouldn't be a problem if all options are set correctly.)

Some edition(s) of a FORMAT command may support a parameter to help determine what type of filesystem is created. For instance, OS/2 may support /FS:HPFS. Windows Vista may support /FS:FAT and /FS:FAT32 and /FS:exFAT and /FS:NTFS and /FS:UDF.

(At least if the desired filesystem type is not specified) FORMAT may auto-detect what filesystem to use based on some factors, including the partition type of the partition that is going to be formatted.

Formatting a disk in Microsoft Windows

Sometimes the creation of a disk may be much slower using the graphical tools compared to using the text-mode DOS-based tools. (This might be limited to floppy disks in some version of Win9x. More details may be added here at some point.)

Creating a filesystem volume during operating system installation

Filesystem volumes are frequently formatted as part of the steps of installing an operating system. For further details about formatting a drive in that way, perhaps see the information about installing the operating system. Currently, this section is focused more on creating the file system (manually), which can be done after the operating system is installed (to another partition).

Making FAT: For hard drives, and most types of drives: FORMAT, or possibly FORMAT /FS:FAT (OS/2?). In Unix: newfs_msdos or mformat (http://www.gnu.org/software/mtools/manual/mtools.html#mformat).