Terminology notice: Users of MS-DOS and similar operating systems may be used to seeing the term “logical drive” referring to a segment of disk space that gets referred to by using a drive letter. The documentation indicates that such a “logical drive” is simply part of a primary partition called the “extended partition”. Note that, despite the names being similar, this has absolutely nothing to do with the phrase “logical disk”. The phrase “logical disk”, as used by software such as LVM, is entirely different. (For more details about a “logical drive” as the term is used by MS-DOS documentation, see the section about the MBR partitioning scheme.) TechNet Library Archive: MS-DOS: Details related to
(and other topics), section about DrivParm.sys also references the term “logical drive” which appears to refer to an entirely different concept.
Simply be careful with such terminology: do not expect that the terms always refer to the same type of thing.
- Physical disks, outer logical disk boundaries
- [#raid]: RAID
- See the section about RAID.
- [#disklvm]: LVM (Logical Volume Manager/Management)
Some of this info may be moved/merged with choosing filesystem types: logical disks.
HP-UX, IBM AIX and OS/2 have Logical Volume Management.
Heinz Mauelshagen wrote Logical Volume Manager for Linux, basing the design on HP-UX (according to Wikipedia's page on “Logical Volume Manager (Linux)”). That is supported by several distributions including Debian (and Ubuntu) and SUSE. There may be a package called lvm2. (Further information on this disk layout software may be added here at a later time.)
- [#diskimgs]: Disk images
See the page about disk images for information such as:
- Optical Disc Formats
- ISO 9660
- Hard drive images may be covered in more detail in the virtualization documentation: disk image files.
- Reserved/Hidden Sectors
Sometimes, device manufacturers have created drives that have more capacity than the drives claim. Device manufacturers have figured out that much of the cost of being a major drive manufacturer is related to “research and development” (“R&D”). However, once the R&D process has already been mostly or fully completed for some drive models, manufacturing a drive with more capacity might not cost much more than manufacturing a drive with less capacity.
What some technology device manufacturers (including manufacturers of data storage devices like magnetic hard drives, and manufacturers of complete computers) have done is to sell drives that claim a smaller capacity, but actually include a larger capacity. By claiming a smaller capacity, those drives don't tend to compete with larger capacity drives in the marketplace. However, by having a larger capacity, a disk may be able to have some sectors which most end users won't be able to easily use.
The hidden data storage sectors may be used along with disk monitoring technology (which may be part of the Self-Monitoring, Analysis, and Reporting Technology “S.M.A.R.T.” standard) to detect when there seem to be some physical problems accessing a sector (or multiple sectors). Using technology like “Logical Block Addressing”, the disk can try to use one of the reserved sectors that hasn't been used as much.
The result could be that the drive successfully provides the amount of data storage capacity that was claimed when the drive was sold to a customer, which may result in fewer drives being returned (due to being faulty), fewer drives being replaced under warranty, and more customers who are satisfied.
Another possibility is that hidden sectors may be used to store some “recovery” software (which might enable an ability to re-install an operating system). Major computer manufacturers will likely be much happier to tell a customer that they can have an operating system be re-installed by selecting an option from the device's firmware, instead of needing to ship a computer back to a factory so that the computer may be repaired/replaced.
Here are some details related to some standards related to which sectors are commonly visible:
- Boundaries within a logical disk
- [#dskstrct]: Disk Structure Methods
- Compatibility with the boot sequence
One approach to how information gets laid out on a hard drive has been to rely primarily on a BIOS loading up a “Master Boot Record”, more commonly referred to by its sensible abbreviation “MBR”, in order to start finding partitions. This old approach of finding partitions, by relying primarily on a BIOS that loads up the Master Boot Record, has slowly started to be replaced. The replacement uses EFI to implement a GPT method.
- Support for the two methods
The hardware being used may determine whether to use the old standard BIOS/MBR method, or the newer EFI/GPT method, or perhaps some entirely different method (which is probably more likely on older computers, like computers from the 1980's).
To determine whether an operating system will boot from one method or the other, check out the section about operating systems. Many/most operating systems described there will have a section about what disk layouts are supported. (It seems possible that the section about system startup sequences may also mention this sort of detail.)
- Support for BIOS and MBR
The MBR-style partitioning scheme has been supported for decades and is universally supported by newer versions of even older software. MS-DOS supported using an MBR partition starting with version 2.0, which was the first version that started to support hard drives. (Any earlier MS-DOS version must have only supported floppy disks, predating the technological advance of hard drives.)
- Support for EFI and GPT
Compared to the support for BIOS code looking for MBR-based partitions, at the time of this writing, not as many operating systems support booting a GPT disk.
- The versions of Microsoft Windows that have supported booting from GPT at all have not supported booting from GPT if the computer is using PC/BIOS (as noted by Wikipedia's page on GPT: section on OS support). Instead, Microsoft's page on “Using GPT Drives” states, “Note: Windows only supports booting from a GPT disk on systems that contain Unified Extensible Firmware Interface (UEFI) boot firmware.”
- [#btpartsz]: Bootable partition size
OpenBSD FAQ on large drives notes that some computers have had limits regarding what portions of the drives are fully usable by the boot code. The FAQ notes, “To play it safe, the rule is simple:” “The entire” (root/boot/system) “partition must be within the computer's BIOS (or boot ROM) addressable space.” “Some non-i386 users think they are immune to this, however most platforms have some kind of boot ROM limitation on disk size.” (Documentation is needed.) “This is another good reason to partition your hard disk, rather than using one large partition. ” (Hyperlink was removed from quoted text.)
- [#mbrdisk]: MBR-Style Disk
See: MBR-Style Disk for many details about data on the hard drive that uses this format.
The TestDisk program refers to this type of disk layout as “Intel” and “Intel/PC”.
- Size limit
- Approx 2TB.
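The approximate 2TB figure follows from the MBR partition table storing sector addresses and sector counts in 32-bit fields. Here is a minimal sketch of that arithmetic, assuming the traditional 512-byte sector size (the function name is made up for this illustration):

```python
# MBR partition entries hold the starting sector and the sector count
# in 32-bit fields, so at most 2**32 sectors can be described.
SECTOR_BYTES = 512       # assumption: traditional 512-byte sectors
MAX_SECTORS = 2 ** 32


def mbr_limit_bytes(sector_bytes=SECTOR_BYTES):
    """Largest size (in bytes) that a 32-bit sector count can describe."""
    return MAX_SECTORS * sector_bytes


limit = mbr_limit_bytes()
print(limit)                        # 2199023255552
print(limit / 2 ** 40)              # 2.0 (binary terabytes)
print(round(limit / 10 ** 12, 2))   # 2.2 (decimal terabytes)
```

A drive that reports larger sectors would shift the limit upward by the same factor, which is why the figure is only approximate.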
- [#guidptbl]: GUID Partition Table (“GPT”)
The abbreviation of GPT stands for GUID Partition Table. That contained an abbreviation as well: GUID stands for “Globally Unique Identifier”.
The section about GPT has now been greatly expanded. See: GUID Partition Table.
Following is some older text that provided just a bit of details about GPT.
GPT may be supported more often by EFI (and/or UEFI). Although, EFI should not be required. Rod Smith's page on GPT says “GPT is part of the EFI specification.” (twice) Others have indicated that GPT is rather separate from EFI. Theo de Raadt has indicated that GPT is unhelpful (post, GPT support, GPT).
The GPT header, which is stored in the sector right after the protective MBR (LBA 1), starts with the string “EFI PART” (bytes 0x4546492050415254).
Also, the byte at 0x01C2 (which identifies the partition type) is set to 0xEE.
Archived Content from the Wayback Machine @ Archive.org: TechNet: Basic Disks and Volumes Technical Reference: “How Basic Disks and Volumes Work” is more specific: the bytes starting at 0x1BE are set to 0x00000200EEFFFFFF01000000FFFFFFFF. This has the effect of showing a single unbootable partition that is nearly 2TB large. (Might some other implementations show a smaller sized partition if the disk itself is less than 2TB in size?)
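To see why those sixteen bytes describe one unbootable partition of nearly 2TB, the entry can be decoded field by field. The following is a rough sketch using the standard MBR entry layout (status byte, CHS start, type byte, CHS end, 32-bit starting LBA, 32-bit sector count). The function name and the 512-byte sector size are assumptions made for this illustration.

```python
import struct

# The 16 bytes quoted above, as stored at offset 0x1BE of the first sector.
ENTRY = bytes.fromhex("00000200EEFFFFFF01000000FFFFFFFF")


def parse_mbr_entry(raw):
    """Decode one 16-byte MBR partition table entry (little-endian)."""
    status, chs_first, ptype, chs_last, lba_start, sector_count = struct.unpack(
        "<B3sB3sII", raw
    )
    return {
        "bootable": status == 0x80,   # 0x80 marks an active/bootable entry
        "type": ptype,                # 0xEE marks a GPT protective entry
        "lba_start": lba_start,
        "sector_count": sector_count,
    }


entry = parse_mbr_entry(ENTRY)
size_bytes = entry["sector_count"] * 512   # assumption: 512-byte sectors

print(hex(entry["type"]))    # 0xee
print(entry["bootable"])     # False
print(entry["lba_start"])    # 1
print(size_bytes)            # 2199023255040
```

The type byte decoded here is the same byte that sits at offset 0x01C2 of the sector (0x1BE plus 4).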
- Size limit
Microsoft KB 302873: FAQ about the GPT disk architecture says, “GUID Partition Table disks can grow to a very large size. As of July 2001, the Microsoft implementation supports a hard disk of up to 18 EB (512 KB LBAs).” (This article indicated it applied to Windows for Itanium systems, and specifically XP 64-bit, and both Enterprise and Datacenter editions of Windows Server 2003.)
- [#bsdlabel]: BSD disklabel (“bsdlabel”)
A BSD disklabel (sometimes called a “bsdlabel”) is stored at the beginning of a virtual “device”. Such a virtual device may correspond to the beginning of a partition in another format (such as an MBR partition), which is generally the case with modern BSD systems on hard drives. Such a virtual “device” could also be the very beginning of data storage media. This may be more common with some types of removable media, such as floppy disks. As noted by the page about starting OpenBSD: section about the location of the first stage boot loader, this could also have been used for hard drives.
command may be used.
There may be a maximum number of partitions supported by the operating system. For instance, OpenBSD has a sysctl called kern.maxpartitions which OpenBSD Manual: Section 3 (Subroutines): page about sysctl documents as not being changeable (at run-time... presumably it could be changed when creating a kernel, with the standard disclaimers about whether that is a good idea for OpenBSD).
- [#dskpterm]: Different types of partitions: terminology
OpenBSD FAQ on disks: Introduction has a section called “Partitioning” which states,
Due to historical reasons, the term "partition" is regularly used for two different things in OpenBSD and this leads to some confusion.
Perhaps this is most confusing because some documentation will simply use the phrase “partition” without discussing which type of partition is being discussed.
Understand that a disklabel record can exist on a drive at the same time as another partitioning scheme like a Master Boot Record (or, at least with FreeBSD, a GPT). The disklabel is a section on the hard drive that can store information about where different sections of the hard drive start and end. Also, another disk layout method, such as using an MBR's partition table, can be a different part of a hard drive which also stores information about where different sections of the hard drive start and end. If everything is set up to make the most sense, these two areas of the hard drive (the BSDLabel and another disk layout method like an MBR) will likely have matching details about where the different sections of the hard drive start and end. These two separate records do not necessarily need to have matching details, although things may be quite prone to be confusing if the details do not match.
So the term “partition” really refers to a general concept, which is having a hard drive broken up into multiple sections. When using or referring to the
command, the term “partition” generally refers to the sections of the hard drive as documented in the “BSD disklabel”/“bsdlabel”. Historically, when dealing with the
command, or with most other operating systems, a disk “partition” generally refers to the sections of the hard drive that are described in the MBR. (In more modern times, this may change as computers start using GPT rather than MBR partition tables.)
Some FreeBSD documentation (e.g. Installing FreeBSD 8x: Chapter 3: section 3.6 Allocating Disk Space section 3.6.2) refers to partitions on an MBR as “slices”.
This guide doesn't try to fully resolve the confusion by providing a specific definition for the usage of the term “partition”. Instead, the confusion is likely to be minimized after a person has a real solid understanding of what is actually being stored on the disk. This doesn't seem to be covered in any real straightforward way by the official documentation of any of the main BSD projects (OpenBSD, FreeBSD, and NetBSD). So, this guide will try to present some information that, in the end, will help to clarify what is being done by the various disk software.
An example may help. Let's say there is an MBR partition table on an 80GB drive. The MBR splits up the drive into 4 partitions:
- GB number zero through GB # 19: Partition type B (this is a FAT32 drive made in Win98, which takes up the first bunch of gigabytes)
- GB # 20 through GB # 39: Partition type 7: An installable filesystem (specifically an NTFS drive) which takes up twenty gigabytes
- GB # 40 through GB # 59: MBR partition type A6 (for OpenBSD-related data) which takes up the next twenty GB
- GB # 60 through GB # 79: MBR partition type 83 (for Linux-related data, which might be used by a single Ext2 drive) which takes up the final gigabytes (GB #60 - # 79).
The way this looks in software will depend on which partition editing software is being used.
A BSDlabel might have an entry that basically says something like this:
- a: takes up GB # 40 - GB # 41 and is used by /
- b: takes up GB # 42 - GB # 45 and is used by (swap)
- c: takes up GB # 0 - GB # 79 and is used by (the hard drive)
- d: takes up GB # 46 - GB # 48 and is used by /tmp
- e: takes up GB # 49 - GB # 51 and is used by /var
- f: takes up GB # 52 - GB # 55 and is used by /usr
- g: takes up GB # 56 through # 59 and is used by /home
- i: takes up GB # 0 through # 19 and is used by /srv/fatdrive
- j: takes up GB # 20 through # 39 and is used by /srv/ntfsdriv
- k: takes up GB # 60 through # 79 and is used by /srv/linuxext
(This is somewhat simplified from what would be in the BSD disklabel/bsdlabel. A real disklabel would show information about actual sectors. Also, depending on the command line parameters, the mount points might not show up in the
program, but may be in the /etc/fstab file. For now, we will ignore these complications so we can focus on the more basic concept of how a disklabel gets used.)
With OpenBSD, the disklabel entry called “c” covers the entire hard drive, which is shown by the third disklabel entry in this example. That is unchangeable, and we should ignore that when we say that we don't want any overlapping partitions. (NetBSD (I think? Or FreeBSD?) also does something similar with a different letter.)
Other than that, you'll notice that none of the drive partitions overlap. Also, the “BSD disklabel”/“bsdlabel” specifies exactly where the NTFS drive can be found. OpenBSD should be able to successfully access all of these areas of the disk.
In comparison, a more widely used standard such as the MBR may be what gets primarily used by most non-BSD operating systems. You'll notice that it also describes where the NTFS drive is. The MBR table entry does not specify the exact location of the BSD's /var mount point, but that is okay if Microsoft Windows is not going to be interacting with the BSD's /var mount point. What we need from Microsoft Windows is for Microsoft Windows to not affect any of the OpenBSD data. That will happen because Microsoft Windows will basically ignore everything from GB # 40 through GB # 59 because Microsoft Windows treats it as one giant section to be used by BSD.
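The claims in this example (no overlapping entries, and Microsoft Windows seeing the BSD area as one section) can be checked mechanically. Below is a simplified sketch that uses whole-gigabyte ranges just like the example does. The dictionary names and helper functions are hypothetical, the disklabel entries are keyed by mount point rather than by letter, and the whole-disk “c” entry is left out on purpose.

```python
# Ranges are inclusive (first GB, last GB), copied from the example above.
MBR = {
    "FAT32 (type 0B)": (0, 19),
    "NTFS (type 07)": (20, 39),
    "OpenBSD (type A6)": (40, 59),
    "Linux (type 83)": (60, 79),
}

# Disklabel entries keyed by mount point; the whole-disk "c" entry is omitted.
DISKLABEL = {
    "/": (40, 41), "swap": (42, 45), "/tmp": (46, 48),
    "/var": (49, 51), "/usr": (52, 55), "/home": (56, 59),
    "/srv/fatdrive": (0, 19), "/srv/ntfsdriv": (20, 39),
    "/srv/linuxext": (60, 79),
}


def overlaps(a, b):
    """True when two inclusive ranges share at least one gigabyte."""
    return a[0] <= b[1] and b[0] <= a[1]


def find_overlaps(table):
    """Return every pair of entries in the table that overlap."""
    names = sorted(table)
    return [(x, y) for i, x in enumerate(names)
            for y in names[i + 1:] if overlaps(table[x], table[y])]


def inside(inner, outer):
    """True when the inner range lies entirely within the outer range."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


print(find_overlaps(MBR))        # []
print(find_overlaps(DISKLABEL))  # []

# Entries that live inside the single MBR partition reserved for OpenBSD:
bsd_area = MBR["OpenBSD (type A6)"]
native = [mount for mount, rng in DISKLABEL.items() if inside(rng, bsd_area)]
print(native)  # ['/', 'swap', '/tmp', '/var', '/usr', '/home']
```

A real tool would compare sector numbers instead of gigabytes, but the comparison logic would be the same.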
- [#bsdlabad]: When disk layouts do not match
If the information from the disklabel did not match the information from the MBR, that might or might not cause any problems. Let's look at an example of how these disk layouts could affect things. For example, if the “BSD disklabel”/“bsdlabel” had the information shown above, and if that information was correct (perhaps because someone made a decision to give 20GB to each operating system), but if the MBR incorrectly said that:
- OpenBSD took up GB #40 - # 49
- Linux took up GB # 50 - 79
A user could install Microsoft Windows without any problems, and the user could install OpenBSD. Since we decided that each operating system should use 20 GB, and since that is what OpenBSD's “BSD disklabel”/“bsdlabel” says, the result is that using neither OpenBSD nor Microsoft Windows would cause problems. OpenBSD would be using correct information, so it would not cause problems. Microsoft Windows would be ignoring all of GB # 40 through GB # 79 and so there would be no negative consequences. Up to this point, there have been no noticeable problems to complain about.
However, if Linux used the information in this incorrect MBR, then Linux might start writing to space in the area from GB # 50 through # 59. Let's say, for example, that the user just told the Linux operating system to use all of the available disk space as specified by the MBR. That would overwrite the latter part of OpenBSD's /var and all of OpenBSD's /usr and all of OpenBSD's /home. The loss of the last partition means that users lost their data. The user could then boot between Linux and Microsoft Windows, and not really notice any problems as long as the user didn't bother trying to access the drives used by OpenBSD.
After this disaster happens, OpenBSD might still be able to boot somewhat, since OpenBSD's / (on GB # 40 and GB # 41) was not overwritten. However, OpenBSD may complain about some troubles with some of the data that was overwritten by Linux. If OpenBSD tries to use “
” then “
” may make whatever changes it wants to GB # 49 through GB # 51, which will probably wipe out the first part of the data that Linux started writing. The end result is that Microsoft Windows may continue to work, but now OpenBSD starts to work (after being fully repaired/re-installed) but then Linux seems to have broken terribly. (The reason is because Linux wrote information using the MBR which had wrong information. When Linux later tries to read information from the wrong spot on the disk, Linux will not find the data that Linux wants.)
So, an incorrect “BSD disklabel”/“bsdlabel” or an incorrect MBR may or may not have any noticeable problems. As another simpler example, if the “BSD disklabel”/“bsdlabel” simply did not mention the NTFS drive, the only impact is that OpenBSD would not be able to mount the NTFS drive. If OpenBSD wasn't going to mount the NTFS drive anyway, there would be no impact.
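The damage described in the mismatched-MBR example can be worked out by intersecting ranges. This is only an illustrative sketch: the names are hypothetical, sizes are whole gigabytes as in the example, and it assumes Linux writes across the entire area that the incorrect MBR assigns to it (GB # 50 through # 79).

```python
# OpenBSD's own disklabel entries (inclusive GB ranges) from the example.
DISKLABEL = {
    "/": (40, 41), "swap": (42, 45), "/tmp": (46, 48),
    "/var": (49, 51), "/usr": (52, 55), "/home": (56, 59),
}

# The incorrect MBR claims that Linux owns GB 50 through GB 79.
BAD_MBR_LINUX = (50, 79)


def overlap(a, b):
    """Return the inclusive range shared by a and b, or None."""
    first, last = max(a[0], b[0]), min(a[1], b[1])
    return (first, last) if first <= last else None


def damage_report(label, written):
    """Classify each disklabel entry by how much of it gets overwritten."""
    report = {}
    for mount, rng in label.items():
        hit = overlap(rng, written)
        if hit is None:
            report[mount] = "untouched"
        elif hit == rng:
            report[mount] = "fully overwritten"
        else:
            report[mount] = "partly overwritten (GB %d-%d)" % hit
    return report


for mount, status in damage_report(DISKLABEL, BAD_MBR_LINUX).items():
    print(mount, "->", status)
# / -> untouched
# swap -> untouched
# /tmp -> untouched
# /var -> partly overwritten (GB 50-51)
# /usr -> fully overwritten
# /home -> fully overwritten
```

This matches the narrative: / survives, so OpenBSD may still start to boot, while the latter part of /var and all of /usr and /home are lost.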
- [#msdyndsk]: Dynamic Disks/LDM Partitioning (used by Microsoft Windows Server/Pro)
- See the section about Dynamic Disks (used by Microsoft Windows Server/Pro).
- [#ftdskvol]: Fault-tolerant volumes
- TechNet Win2K RK: Fault-Tolerant Disk Management says, “Existing fault-tolerant volumes (Ftdisk sets) configured on Windows NT 4.0 can still be used on computers running Windows 2000 or converted to dynamic volumes.”
article on “GUID Partition Table” (“GPT”): “See also” section refers to others, including some or all of:
- Amiga's “rigid disk block” (“RDB”) format.
- Apple Partition Map (“APM”, unrelated to Advanced Power Management (“APM”), discussed in the hardware section's sub-section on electrical “power interfaces”, which is perhaps a more famous usage of the abbreviation “APM”)
- Boot Engineering Extension Record (“BEER”), which, along with “Protected Area Run Time Interface Extension Services” (“PARTIES”), is now covered by the section on Wikipedia's article for “Host Protected Area”
- Partition information
This section is about some information about partitions. As a brief quick overview, the boundaries of partitions are typically stored in the disk structure, and the section on filesystems contains details about how the files get laid out, and creating the ability to start storing files.
- Where the information is stored
- Details about how this information is stored on the disk are not extensively covered by this section. As a quick reference to materials that may cover that aspect: In many cases, the precise location of such information will be specific to the disk structure. Information about the disk layout might also be somehow referenced in a section with data about mount points. (For instance, the standard from Unix is to use information from the file system table stored in the /etc/fstab file.) These areas of the disk dictate how partition information is stored. This section is not as much about how the information is stored/referenced/accessed, but rather what information exists.
- [#partisiz]: Partition size
Limits are more frequently placed on the size of filesystem volumes, although in practice there might often be little difference between a size limit for a filesystem volume and a size limit related to a partition that the filesystem volume fills up.
A filesystem volume does not need to precisely match the size of a standard MBR partition. This is easily noticed by BSD operating systems where filesystem volumes may match the size of bsdlabel/disklabel entries, and so there may typically be multiple filesystem volumes within one MBR partition. However, thinking about partition sizes will often involve thinking about the desired size of the filesystem volume, especially, in the large number of cases where the size of the filesystem volume is just a little bit smaller than (and nearly identical to) the size of the partition.
Some generalizations: Allocating every single byte immediately is not always preferable. Most notably, future flexibility may be limited by a practice of allocating every single byte immediately. This practice can also increase the time of certain disk-intensive tasks such as filesystem testing. If space is going to be used, it is often most convenient to have few partitions. This naturally results in the desire to make large partitions. Generally it is best to make the partitions as large as possible except when there are benefits to keeping them small. However, keeping partitions slightly under the size of a power of two may sometimes result in smaller allocation units. (Making a 15.5GB partition may result in more space being free than making a 17GB partition.) There are some cases where certain versions of operating systems may not support a certain type of filesystem volume that is too large, even if another operating system (or version of an operating system) can access such a drive. (As an example, FAT16 can be up to nearly 16 GB in size as supported by a version of Microsoft Windows, although more common size limitations for FAT16 may be 4GB or 2GB.)
(Some of the following information may move to that section.)
Hardware section about fixed disks provides some references to limitations of hard drives. Some system startup processes (likely involving the software on the drive, but perhaps also related to system startup code) have sometimes led to a strange limitation of only being able to properly boot if a certain file is stored within the first part of a hard drive's storage area. (Then, once the needed code was successfully loaded, the operating system could then support all of the rest of the data on the hard drive.) Such requirements tend to be fairly old and may have been rather uncommon even back in the day, but in some cases could be relevant when deciding how large to make partitions.
(This information may have been placed here from some old notes, rather quickly. A review may be in order.)
- Overview: Why not to allocate every byte
Allocating all of the space immediately, even before the operating system is installed, is very often unnecessary and it does have one significant drawback: it limits options. Keeping space unallocated allows for easier changes to multiple things: partition IDs, file systems within the partitions, mount points, implementation of RAID features, and any volume-level settings like whether quotas or compression may be available options.
Many systems get by with allocating all of the bytes immediately and then living with whatever the consequences are. Having space allocated, so that the space is readily available to use, can be convenient. However, despite these truths, if there is tons of space available, why limit options? In some cases, such as when multiple file systems will be used to optimize support for multiple operating systems, these limits can be impactful. In other cases, there may not be an immediate need to unnecessarily introduce such limits when there's no compelling need.
(The following paragraph should be checked for redundancy, within itself and also with other nearby paragraphs of text.) Another factor is that certain disk-wide operations, such as creating and later testing the filesystem, may go faster if the filesystem is smaller. (This might be less true with some filesystem types that mark large sections as being unused until they need to be used, but this is certainly true with some implementations of filesystem support.) If large amounts of data can be placed on separate partitions, then partition-wide operations such as searching for a file might be sped up by not needing to be run on partitions that aren't likely to have the file. (Again, though, this impact on speediness may be substantially lessened on a drive with little data.) Having an operating system separated from data might allow for some easier operating system upgrades. It may be nice to keep the operating system's partition a bit small, particularly if that operating system uses a different partition than the one where most of the data will be stored. This can sting if the partition is made too small.
OpenBSD FAQ 4: section on Partitioning notes a “commonly forgotten fact: you do not have to allocate all space on a drive when you set the system up! Since you will now find it a challenge to buy a new drive smaller than 20G, it can make sense to leave a chunk of your drive unallocated. If you outgrow a partition,” easy options can be available with unallocated space. (That quote came from OpenBSD 4.3's FAQ 4 (Archived): section on disklabel. Newer versions of the FAQ may reference a larger size, like 100G, as the size that new drives don't tend to be smaller than.) Also, OpenBSD FAQ 3: “Getting started with OpenBSD”, #FirstSys section titled “What is an appropriate "first system" to learn OpenBSD on?” notes, “don't feel the obligation to allocate all the disk initially -- there is nothing wrong with leaving 72G of an 80G hard disk unallocated if all you need is 8G.”
Keeping this key thing in mind can allow different options down the road. Why lock yourself into a certain path when there is really no reason to do so?
Is it practical to be leaving space unallocated? That depends on the scenario, and doing so may be less practical on smaller drives. However, there are many instances where leaving the space free would be very easy to do. As an example of where it isn't needed, OpenBSD 4.3 FAQ: disklabel section shows how to install the operating system onto a 17GB drive. The example notes “over 17G available for OpenBSD. That's a lot of space, and it isn't likely we will need most of it.” By the end of the example, “over 6G of space is unused”. Actually, 6GB of that is dedicated to /home which, for a server that does not hold user data, could be much smaller (like 100MB and still be largely unused). Two thirds of that drive could be easily left unallocated, despite the size of that example drive being less than 1% of the size of newer drives currently on the market. (That comparison is with a modern operating system: comparisons that are even more mathematically ridiculous could be made if using older operating systems that were much smaller in size.)
OpenBSD FAQ 4: (more) disklabel information notes, “With modern monstrously huge drives,” allocating all remaining available space as a default “is usually a bad idea. If you know you will never use it, don't allocate it, and save it for some future use.” Leaving the space free is not done in many cases where it would be very easy to leave the space free. The tendency to just allocate every byte can prohibit people from being able to perform certain tasks, so don't lock the system into being forced to do things a certain way when it is totally unnecessary and just as easy to allow flexibility.
Details on how to make the desired partitions are available. (Details may be provided in the sections that are specific to individual disk layout methods. For example, software to manipulate partitions on a disk using the MBR partitioning scheme.) For setting up the disk layout, software is generally included in full operating systems, and is frequently handled during the process of installing an operating system. However, before proceeding with the step of trying to start making changes to desired partitions, determine what partitions to make. Here are some guidelines to consider.
(Some similar discussion may be in the section about virtual machine drive sizes.)
- Determining some sizes
Very often, partitions of some type (such as an MBR partition or a BSDlabel/disklabel partition) are created with the intent of storing certain types of data. The partitions are then given different mount points. The section about mount points may discuss where various types of data are commonly found. This sort of information may be considered when deciding upon an organization method for the data.
Some hardware and software, most notably the system startup code, operating systems, and drivers (in that order), may lead to certain limitations about not being able to use a partition for a certain purpose if the partition is too large and/or located in an unsupported location. For a brief overview of these limits prior to the 2TB limit, see “Partition types: Properties of partition tables”, section 2.11: Hard drive limitations. The section discussing bootable partition size mentions another such limit.
OpenBSD 4.3 FAQ 4 (Archived): section on disklabel notes, “We would rather have a few hundred megabytes of unused space than a kilobyte too little.” Making partitions too small can be inconvenient or, worse, problematic. (Not being able to put a file on a drive when desired can be inconvenient. Having a software installation unexpectedly fail, or having a running program be unable to write data that it needs to keep functioning properly, can be problematic.)
With modern operating systems, virtual memory is likely to be supported, and so planning where data goes may involve some consideration about where the swapped/paged data may go. For some operating systems, that may involve making a separate partition for ideal performance (although swapping to a file might also be a possibility). Even if swapping to a file, make sure the partition will be large enough to store that expected amount of data. There are some different opinions about how much swap space is an ideal amount. See the section on selecting the ideal swap/page (file/partition) size.
Determine how much space is needed. For an already existing business, an initial estimate may be as simple as looking at the size of the data currently being used. Another method is to simply look at the size of the data being backed up, although more space may be needed if some data, such as the operating system installation, isn't included in the backups.
- Considering data growth
People (and especially people who like to be organized, including people running organizations) like to be able to store not only their current data, but to also be able to suitably handle the foreseeable data needs for the near future.
Unfortunately, OpenBSD FAQ 3: “Getting started with OpenBSD”, #FirstSys section titled “What is an appropriate "first system" to learn OpenBSD on?” has stated, “you weren't going to get the requirements estimate right, no one ever does”. So people can make some educated guesses, but in practice estimates tend to either be not very precise or not very accurate.
Instead, what the OpenBSD team suggests is to not try hard to get just one storage system that will seem to offer a guarantee of sufficient storage for many years. Instead, the team recommends flexibility, allowing data to be stored in multiple areas if needed, and saving money initially by buying smaller amounts of storage. Then, use that money to buy more storage when the storage starts to run out. By buying additional storage more frequently, the later purchases of storage will likely be able to buy larger amounts of storage for less money. Whether this advice works out in practice may vary, possibly depending on details like how budgets become available, but this philosophy might prove to be the most effective way of spending dollars to maximize available storage capacity (as storage capacity needs increase over a period of time).
Backup software often produces logs, and “incremental” backup solutions may even be able to restore the data as it looked months ago. Either may be able to show what the data usage was like months ago, which allows calculations to see how fast the data has been growing. If such records are not immediately available, then very generalized guesses/assumptions may be made, such as that the data probably won't double in less than two years unless the business has specific plans to use more data, or unless the business experiences a major expansion (in which case it will likely also be getting enough money that it can afford more disk space). Such initial projections can then be re-verified by recording the amount of disk space used, and then comparing that to how much disk space is used months later.
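The growth calculation described above can be sketched in Python. This is a minimal example that assumes growth compounds at a steady monthly rate, which real data may not do. The function names and the sample figures (400 GB six months ago, 500 GB today, 1000 GB of capacity) are hypothetical.

```python
import math

def monthly_growth_rate(old_bytes, new_bytes, months_between):
    """Compound monthly growth rate implied by two usage measurements."""
    return (new_bytes / old_bytes) ** (1.0 / months_between) - 1.0

def months_until_full(current_bytes, capacity_bytes, monthly_rate):
    """Months until usage reaches capacity if the rate stays constant."""
    if current_bytes >= capacity_bytes:
        return 0.0
    if monthly_rate <= 0:
        return math.inf  # usage is flat or shrinking
    return math.log(capacity_bytes / current_bytes) / math.log(1.0 + monthly_rate)

# Hypothetical measurements, in GB (any unit works if used consistently).
rate = monthly_growth_rate(400, 500, 6)
print(f"Estimated monthly growth: {rate:.1%}")
print(f"Months until 1000 GB is full: {months_until_full(500, 1000, rate):.1f}")
```

Re-running the same calculation with a fresh measurement a few months later is one way to re-verify the initial projection.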
- Size limits: Cluster sizes
Keep in mind the file systems that will be used. Some older file systems have size limitations that could be impactful. For example, a FAT16 volume may be practically limited to 2GB in size (the limit reached when using 32KB clusters). Using such a small size may not allow for easy modifications to a 4.7GB image of a DVD. Also, such a small size could result in more than a thousand partitions being needed to utilize a large drive. Some operating systems may be able to use fewer than 25 of those (and even fewer if the plan is to use more partitions on other hard drives, an optical drive, and network drives), and such a number would be impractical even with other operating systems.
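The arithmetic behind these figures can be checked with a short Python sketch. It assumes the commonly documented FAT16 ceiling of 65,524 data clusters and a 32 KiB cluster size. The 4 TB drive is a hypothetical example, and the calculation ignores the space taken by the FAT and other on-disk structures.

```python
import math

MAX_FAT16_CLUSTERS = 65_524   # FAT16 cluster-count ceiling (16-bit cluster numbers, some reserved)
CLUSTER_BYTES = 32 * 1024     # 32 KiB clusters, the assumption used here

# Largest data area such a volume can address: just under 2 GiB.
max_volume_bytes = MAX_FAT16_CLUSTERS * CLUSTER_BYTES

def partitions_needed(drive_bytes, partition_bytes=max_volume_bytes):
    """Number of maximum-size partitions required to cover the whole drive."""
    return math.ceil(drive_bytes / partition_bytes)

dvd_image_bytes = 4_700_000_000   # a "4.7GB" DVD image
hypothetical_drive = 4 * 10**12   # a 4 TB drive, chosen only for illustration

print(f"Largest volume: {max_volume_bytes / 2**30:.4f} GiB")
print(f"DVD image fits on one volume: {dvd_image_bytes <= max_volume_bytes}")
print(f"Partitions needed for the drive: {partitions_needed(hypothetical_drive)}")
```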
The size of a “cluster” (using FAT16 terminology) or a “block” (Linux terminology) can have some impact on space wastage. In general, this is a minor consideration. However, each cluster size has a limit on the partition size it supports, so it may be worthwhile to check whether the anticipated size is anywhere near such a limit. If it is just over a limit, consider reducing the size to allow for the smaller cluster size. Linux Partition HOWTO notes, “In general, you waste on average one half of a block for every file, so matching block size to the average size of your files is important if you have many files.”
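To make the space wastage concrete, the following Python sketch computes how much space is lost to rounding each file up to a whole number of blocks. The function names and the list of file sizes are hypothetical, and the model is simplified: it ignores filesystem features such as storing very small files inline or packing file tails together.

```python
import math

def allocated_size(file_bytes, block_bytes):
    """Space a file occupies when it is stored in whole blocks."""
    return math.ceil(file_bytes / block_bytes) * block_bytes

def total_slack(file_sizes, block_bytes):
    """Total bytes wasted by rounding every file up to a block boundary."""
    return sum(allocated_size(size, block_bytes) - size for size in file_sizes)

# Hypothetical file sizes in bytes.
sizes = [100, 5_000, 70_000, 1_000_000]

for block in (4 * 1024, 32 * 1024):
    wasted = total_slack(sizes, block)
    print(f"{block // 1024:>2} KiB blocks: {wasted} bytes wasted across {len(sizes)} files")
```

With many small files, the larger block size wastes noticeably more space, which is the effect the HOWTO quotation describes.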
With older equipment and file systems, cluster sizes can also be impactful. Details may be provided about individual filesystems in the section about filesystem formats. (For instance, FAT16 size limits are discussed.) (For those interested in the last half-kilobyte of usable space on a partition, see: partitions at start of cylinders for a FAT16-specific example, although this might commonly apply to some other filesystem formats too. However, with hard drives now exceeding a terabyte in size, going after every last half-K on modern systems is more of trivial academic interest than of substantial practical gain.)
Generally, categorize the data (whether already existing or anticipated) into separate categories. One category of data, generally the most valuable by far, is the “unique data” which is likely to be rather unique to individual people and/or organizations. This data may be the most difficult to re-obtain (re-producing it as needed) if it wasn't backed up. Other categories of data include the operating system, installed programs, and unneeded data such as temporary files and swap data (which may go to a “page file” or a “swap partition”).
- [#partityp]: Partition Type Identifier
The convention is to select a partition type identifier that reflects the type of information stored in the partition.
There are type identifiers related to “extended partitions” which should be used if an “extended partition” is used. Even an extended partition will involve having data segments (called “logical drives”) within the partition that will require type identifiers based on the content. For most other cases, the list of type identifiers recommended to use will depend primarily on what filesystem will be used when storing information in a specific section of the disk. (In probably most cases, there is just one partition type for a specific filesystem. However, there have been some exceptions, as noted in the section about Win9x partition types.) Another usage of a type identifier may be to specify that data is stored without the overhead of using a filesystem, and that the section of the disk may be overwritten at will for swapping virtual memory.
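As an illustration, the following Python sketch maps a few widely used MBR partition type identifiers to descriptions. The list is a small, non-exhaustive selection, and the helper names `describe` and `is_extended` are hypothetical. Note that FAT32 and the extended partition each appear with more than one identifier, which matches the exceptions mentioned above.

```python
# A small, non-exhaustive selection of commonly cited MBR partition type IDs.
MBR_PARTITION_TYPES = {
    0x05: "Extended partition (CHS addressing)",
    0x06: "FAT16",
    0x07: "NTFS / exFAT / HPFS",
    0x0B: "FAT32 (CHS addressing)",
    0x0C: "FAT32 (LBA addressing)",
    0x0F: "Extended partition (LBA addressing)",
    0x82: "Linux swap (no filesystem; used for virtual memory)",
    0x83: "Linux native filesystem",
    0xA6: "OpenBSD",
}

def describe(type_id):
    """Return a description for a type ID, or a marker for unlisted values."""
    return MBR_PARTITION_TYPES.get(type_id, f"unknown (0x{type_id:02X})")

def is_extended(type_id):
    """True if the ID marks an extended partition holding logical drives."""
    return type_id in (0x05, 0x0F)

for type_id in (0x0C, 0x0F, 0x82, 0x42):
    print(f"0x{type_id:02X}: {describe(type_id)}")
```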
- [#filesys]: Filesystems
See the section about filesystems.
- Info to look into: