Coding

At some point, ][Cyber Pillar][ may provide some resources to help people get up to speed quickly to be able to learn programming languages. However, in the meantime, here is a little bit of information.

TOOGAM's Software Archive: Page About Programming Languages has some information about some languages.

The difference between a full-blown computer language and a scripting language may be fuzzily defined. One characteristic that may help to influence this is whether the language creates compiled machine code, or if it creates some sort of compiled byte code that isn't quite as far from machine code as raw source code is, or if the source code itself is typically run through an interpreter. However, those rules may not be hard and fast: The BASIC computer programming language can be interpreted, or it can be used by QuickBASIC to create an executable that is run directly from the operating system's command prompt. So, these categories are just based on some perception, and not any strong rules. (Some people may have some disagreement over whether a language better fits in one of these categories or a different category.)

[#codestyl]: Style guide

There can be various styles available related to how code is written (such as whether to use a “for” statement, or a “while” statement), amounts of comments, and indenting styles. Adding clarity, compared to using more language-specific abbreviations for terseness, can also be a matter of style. For example, in C, both of these do the same thing:

if(++x) single-line-of-code;

... and ...

x = x + 1;
if(x!=0) /* Code runs unless x was -1 before add */
{
single-line-of-code;
}

A more middle-of-the-road approach can also be used. The following uses a C++ method of commenting, but the main point is showing how the statement is formatted within the block.

if(++x) // Will only be true if prior job was not -1
{
single-line-of-code;
}

All of those statements are valid code that perform the same task. There is really not just one single universal answer on which style is considered to be “best”. Opinions vary.
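As a rough check that the terse and the spelled-out forms really behave alike, the following C sketch wraps each one in a small function. The names terse_form and verbose_form, and the runs counter that stands in for single-line-of-code, are made up for this illustration.

```c
/* Terse form: increment x, then run the statement unless x became 0.
 * The counter pointed to by runs stands in for single-line-of-code. */
int terse_form(int x, int *runs)
{
    if (++x) (*runs)++;
    return x;
}

/* Verbose form: the same steps, spelled out one at a time. */
int verbose_form(int x, int *runs)
{
    x = x + 1;
    if (x != 0) /* Code runs unless x was -1 before add */
    {
        (*runs)++;
    }
    return x;
}
```

Calling either function with x equal to -1 leaves the counter untouched; any other ordinary starting value bumps the counter once. Which form reads better remains a matter of taste.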

People who intend to do programming for others, such as people who want to get paid by a company or people who wish to participate in an existing group development effort, may receive demands to make their code fit one particular style or another.

Curly Brace handling
Name

Curly bracket, brace, curly brace, or even “handlebars” (referencing how they may look like the handlebars of a bicycle or motorcycle).

Location

Here is a contrast of some standards.

Horstmann style
if(++x) /* Only runs if prior job didn't result in -1 */
{ first-line-of-code;
second-line-of-code;
}

That method of handling curly braces and white space is referred to as “Horstmann style” by Wikipedia's article on “Indent style”: Horstmann style, and features rather condensed white space as well as easy visual balancing that can be verified by scanning a single vertical column.

Compared to the Allman style, the Horstmann style uses up less space vertically. Vertical space can be quite significant in some environments, such as a text mode environment that only provides 24 usable rows, so the Horstmann style can be easier to read there. Saving vertical space can also be useful when writing code on a whiteboard. (Using up less space in useless ways results in more space available to write other content that may be useful.)

It is a great style, and is recommended for code that will be read by other people, especially if the code will not be getting edited nearly as frequently as it is viewed. Unfortunately, the style is not supported by the automatically generated code that is produced by some popular programming environments.

Allman style
if(++x) /* Only runs if prior job didn't result in -1 */
{
first-line-of-code;
second-line-of-code;
}

This is similar to Horstmann, but uses up an entire line for the left curly bracket. This may be a bit easier to edit than Horstmann (because the first line of code does not require pressing “Right Arrow” twice, or Ctrl-“Right Arrow” twice, after the cursor is placed at the start of the desired line). However, it does use up more space vertically, while providing very little benefit in making the code easier to read.

Some people are used to the Allman style because of support by Microsoft Visual Studio. Even such people may benefit from Horstmann style in other circumstances, like writing on a white board, where computer interface is a non-issue and where vertical space may be useful.

K&R
if(++x) { /* Only runs if prior job didn't result in -1 */
first-line-of-code;
second-line-of-code;
}

Notice how the left curly brace is on the same line as the code just before it.

Although this style is considered a relic by some, other professionals prefer it. This style does increase the challenge in quickly locating the left curly brace. The preferred solution is probably for people to get out of the habit of looking for a left curly brace. Instead, match the right curly brace with the line of code that is indented by the same amount.

(To be clear, K&R C refers to more than just this style of where the curly braces go.)

Comments
What to document

As a generalization, many people (and organizations) feel that too many coders write code with too few comments. Documentation should generally be used to describe the following:

Function declarations
All noteworthy functions should document the purpose of every parameter and also should document what gets returned. (This might not be needed for a simple “get” function that returns an object's private variable, if the variable is named similar to the function and if the variable is documented. Functions that tend to impact things are typically more complex than that extremely simple type of function, and so they should be documented.)
Variables passed by reference

Actually, all variables should be noted.

Documenting every single variable is a guideline that experienced coders might, sometimes, sensibly decide should not be followed in certain circumstances. However, if a passed value is passed by reference, and so the original data may be modified, that should certainly be noted.

Coding Culture Commentary: Contrasting passing “by reference” to passing “by value”

Actually, in C, passing a pointer is typically a more straightforward method. That requires that the calling function passes a pointer, and programmers should generally suspect possible changes to memory when the required parameter is a pointer to a simple data type. Every time the pointer is used in code, the syntax will generally make it clear that there is potential for modifying the original data.

C++ does allow variables that are passed “by reference” instead of “by value”, and some people (including instructors) think that makes the code a bit easier for novice programmers to understand. However, there are many programmers that do not think that passing the variables “by reference” makes things easier, because there is a constant need to remember that a certain variable is affecting memory outside of the function. This is not something that needs to be memorized when the function is documented as receiving a pointer instead of a value that is passed “by reference”.

The author of this text does thoroughly agree that using values “by reference” reduces clarity. However, the typical alternative, which is to pass pointers like what was done in C, is an approach that has a bit of a higher learning curve. People who intend to become serious programmers will likely benefit from just gaining enough experience to get past that learning curve, and then the notation of using pointers will typically be the more clear way of specifying memory changing in the original function.

Some experienced programmers do disagree.
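To make the contrast concrete, here is a small C sketch of the two approaches that C itself offers. The function names add_ten_by_value and add_ten_in_place are invented for this example.

```c
/* Pass by value: the function receives a copy, so the caller's
 * variable cannot be changed from here. */
int add_ten_by_value(int n)
{
    n = n + 10;
    return n;
}

/* Pass a pointer: the parameter type, and the * at each use, both
 * signal that the caller's memory may be modified. */
void add_ten_in_place(int *n)
{
    *n = *n + 10;
}
```

At the call site, add_ten_in_place(&count) needs the & operator, which is one more visible hint that count may change. A C++ reference parameter would hide that hint, which is the clarity concern described above.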

Boundaries

Boundaries of acceptable/possible values should be noted unless they match the boundaries of the data types being used. For instance, boolean values may safely be understood to have just two possible values. If a data type was called “octet”, then a variable's boundaries might not absolutely need to be documented if the variable's boundaries range from 0 through 255.

However, be aware that some implementations may have different boundaries. For instance, in the 1990s some code was migrated from creating a “16-bit” executable file to creating a “32-bit” executable. When that was done, the size of an “int” (integer) was often increased, so that instead of matching the size of a “short” it would match the size of a “long”.

As another example (which might not be specific to computer programming), the size of a “character” has been referred to as 6 bits (BCDIC), 7 bits (ASCII), 8 bits (EBCDIC, octets, “Extended ASCII”), 10 bits (referred to in communications where an octet is framed with start and stop bits), and 16 bits (early Unicode).

Perhaps primarily for this reason (of possible future changes in how a data type is treated), but also to help any other coders who might view the code but be more familiar with other languages, documenting the permitted ranges may be a good idea to do even when the ranges are bound by a data type.
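A short C sketch of what such documentation might look like follows. The type name percent_t and both function names are hypothetical; the point is only that the comment states the permitted range, and that code can check what the platform actually provides rather than assuming it.

```c
#include <limits.h>
#include <stdint.h>

/* Percentage of a task that is complete.
 * Permitted range: 0 through 100 (narrower than the type allows,
 * since a uint8_t can hold 0 through 255). */
typedef uint8_t percent_t;

/* Returns 1 if value is within the documented range, else 0. */
int percent_is_valid(unsigned int value)
{
    return value <= 100u;
}

/* Returns 1 if an int on this platform can hold 2147483647, else 0.
 * The answer differs between implementations, which is why the
 * range is worth writing down instead of assuming. */
int int_holds_32_bits(void)
{
    return INT_MAX >= 2147483647L;
}
```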

Long branches

Particularly if a conditional or loop has code that exceeds about 20 lines (the size of some old text mode screens), or perhaps even half of that size, it can be nice to comment the bottom of the branch. The comment might even be as simple as documenting the condition that was used to determine whether the branch executes. Such simple documentation can help in case the function ends up having nested branches (either immediately, or in the future after the code is further modified).
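As an illustration, a C function with closing comments on its branches might look like the sketch below. The function count_positive is invented for this example, and the branches are kept short here; the technique matters more when many lines separate the braces.

```c
/* Counts the positive entries in values[0] through values[count-1]. */
int count_positive(const int *values, int count)
{
    int found = 0;
    int i;
    for (i = 0; i < count; i++)
    {
        if (values[i] > 0)
        {
            found++;
            /* ...imagine a few dozen more lines here... */
        } /* end if (values[i] > 0) */
    } /* end for (i < count) */
    return found;
}
```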

Complex cases

These types of comments are generally of increased importance for complex functions (including functions of any substantial length, such as being several lines or longer) and those that take many parameters. There may be very few exceptions, such as the following function:

var moveFile(string srcFileName, string destFileName)
{ return copyFile(srcFileName,destFileName) && delFile(srcFileName);
/* The above is effectively the same as the following, assuming
 * that copyFile and delFile return non-zero (true) on success:
 *   var results;
 *   results = copyFile(srcFileName,destFileName);
 *   if (results!=0)
 *   { results = delFile(srcFileName);
 *   }
 *   return results;
 */
}

In such an example, the purpose of the variables is so abundantly clear that adding a comment might be truly unnecessary. (People familiar with copy and move commands will typically understand that such commands generally involve a source and a destination.) However, the logic (of using &&) might actually be advanced enough to require some study and thought for some less experienced coders to correctly understand what is happening. If there is a reason to use the abbreviated form, then a comment to help others may be useful. Such sufficiently documented code might often be enough for people to understand the code even if they have never studied the nuances of the programming language that is being used.
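One possible C rendering of the moveFile idea is sketched below. It assumes that the helpers return 1 on success and 0 on failure; the bodies of copyFile and delFile are illustrative stand-ins built on stdio.h, not code from any particular library.

```c
#include <stdio.h>

/* Illustrative helper: copies src to dest.
 * Returns 1 on success, 0 on failure. */
int copyFile(const char *src, const char *dest)
{
    FILE *in, *out;
    int ch, ok;
    in = fopen(src, "rb");
    if (in == NULL) return 0;
    out = fopen(dest, "wb");
    if (out == NULL) { fclose(in); return 0; }
    while ((ch = fgetc(in)) != EOF)
    {
        if (fputc(ch, out) == EOF) break;
    }
    ok = !ferror(in) && !ferror(out);
    fclose(in);
    if (fclose(out) != 0) ok = 0;
    return ok;
}

/* Illustrative helper: deletes a file.
 * Returns 1 on success, 0 on failure. */
int delFile(const char *name)
{
    return remove(name) == 0;
}

/* && evaluates its right side only when the left side is non-zero,
 * so the source is deleted only after a successful copy. */
int moveFile(const char *srcFileName, const char *destFileName)
{
    return copyFile(srcFileName, destFileName) && delFile(srcFileName);
}
```

The comment above moveFile is the kind of one-line note that can spare a less experienced reader from having to work out the short-circuit behavior alone.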

How to comment
Lengthy comments

Like any other element of coding style, there may be some different preferences. However, there may be some definite “don't”s.

As an example: a student once turned in some homework assignment that had a lengthy comment. The comment was written with just a few words per line, spread out over multiple comment blocks placed on different lines of code. Just as written text may “flow” around an image, this comment “flowed” around code.

That might look a bit nice, but it has some drawbacks. One drawback is that such a commenting style is a bit difficult to implement. Another drawback is that making any changes to the comments, or even the code, may require some substantial work. Requiring that work is not a very good style.

Short comments can be placed near the code that they relate to. The best place for longer comments may often be a different location. For example, a great place may be in some paragraphs of text that shows up in the documentation that can be read by the user. If this documentation is public, then a URL could be referenced by the code. URLs, unlike lengthier paragraphs, can be quite short (especially if using a website that provides a URL-shortening service by redirecting short URLs to longer URLs).

As another example, the start of a function will likely provide some standard documentation, like the purpose of expected variables and what results occur when the function is called. (Commonly, the “results” are that a value gets “returned”.) After that typical critical documentation, a programmer could make a lengthy footnote. Then, inside the function, the relevant section of code could include a comment that says “see function footnote 2”. By doing this, the documentation/comment can be changed (slightly or extensively) without needing to adjust the code. Also, once a person fully understands the comment, they can read large amounts of code without needing to scroll past a comment that they understand.
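The footnote approach might be sketched in C as follows. The function midpoint and the wording of its footnote are invented for illustration; the layout (standard documentation first, footnote after it, and a brief pointer inside the body) is the part that matters.

```c
/* midpoint: returns the value halfway between low and high,
 * rounded down.
 *
 * Parameter low:  lower bound; must not exceed high.
 * Parameter high: upper bound.
 * Returns: low + (high - low) / 2.
 *
 * Footnote 1: The obvious formula, (low + high) / 2, is avoided
 * because the sum of two large unsigned values can wrap around
 * and give a result that is far too small. Taking the difference
 * first keeps every intermediate value between low and high.
 */
unsigned int midpoint(unsigned int low, unsigned int high)
{
    return low + (high - low) / 2u; /* see function footnote 1 */
}
```

The footnote can later be expanded or reworded without touching the line of code that refers to it.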

Of course, there is another line of thought: maybe the code should be made simpler so that it doesn't require such elaborate commentary. This might not be easily or sensibly avoidable in some cases, but there are cases where some further thought may lead to substantial simplification, and produce a much nicer result. Certainly such results are more pleasant, and are often worth the effort. At minimum, complicated code is usually worth a bit of thought to consider whether there is a simpler way to accomplish a task.

Function length

(This text may reflect unresearched personal opinion.)

Function length is, like other style aspects, probably fairly subjective. Many/most coders will agree that a function can be “too long”, simply meaning that the function might be simpler to understand if it is broken up into multiple functions that do less. However, the length before people get uneasy may vary between different people.

“If your” function “includes more than a few hundred lines of code, you may want to consider” re-designing “it into more than one” (quoting from OCA Java SE 7 Programmer I Study Guide (Exam 1Z0-803) by Liguori, Robert; Finegan, Edward (2012-09-18) (Oracle Press) (Kindle Locations 4871-4872 out of 13310). McGraw-Hill. Kindle Edition.)

However, other coders might prefer to be seeing functions that are no longer than 75 lines long, or even a smaller number... perhaps even about 20 lines long.

On the other hand, if functions are split too frequently (and, therefore, are being split into too many smaller functions), some people may refer to the resulting code as “spaghetti code”. Code which seems to jump around in all sorts of directions, in a very unstructured fashion, may be given this derogatory description. (The term refers to the concept of how a single strand of cooked spaghetti may typically turn in rather unpredictable ways.) Some people may be prone to use the term to describe code that jumps around a lot, paying little attention to how structured or unstructured the code is.

Some examples of documentation related to style may be found as a sub-topic of the section about the C language. (See: Style standards for C programming.)

Full-blown computer programming languages
Languages using compiled code

Granted, categorizing languages based on whether their code is generally compiled may be a bit murky. For example, QuickBASIC can create compiled code, even though BASIC has been thought of as an interpreted language more frequently than a compiled language. So, this distinction may get to be ambiguous at some point.

C (and derivatives)
C
C Standards
Functional Standards

There have been some different standards related to C, including K&R C (named after the last names of Brian Wilson Kernighan, who claimed simply to help write some documentation for the C language, and Dennis MacAlistair Ritchie, who created the language). In 1989, a more formal standard called “ANSI C” was ratified by the standards body called ANSI, and it was then published in 1990. This standard is sometimes called C89. Also in 1990, the standards body called ISO ratified the standard using the full name of “ISO/IEC 9899:1990”. This standard is sometimes called C90. The GCC web page about standards has these and other details, noting “There were no technical differences between these publications” (C89 and C90). This is also covered by Wikipedia's article on ANSI C.

A newer standard called C99 was published by ISO in 1999 (as ISO/IEC 9899:1999) and was adopted by ANSI in 2000. (Prior to being called C99, it had been known as C9X.)

A newer standard “known as C11, was published in 2011 as ISO/IEC 9899:2011.” (This was noted by GCC web page about standards which also notes that “While in development, drafts of this standard version were referred to as C1X.”)

[#clngstyl]: Style standards
KNF (Kernel Normal Form, a very different concept than K&R, which is named after Brian Kernighan and Dennis Ritchie) is described by the “style” man page of BSD operating systems, e.g.: OpenBSD's manual page for Kernel source file style guide (KNF); NetBSD guide; NetBSD guide for KNF (latest file).
Compatibility

ANSI C (C89) is extremely functional and is one of the most compatible/portable standards in the world.

C was largely developed alongside Unix. Unix has a shell called “C shell” which may have some similarities to this language. Support for the DOS operating system platform has also been provided on a commercial level. Support for DOS and older versions of Windows may be commented on by TOOGAM's software archive: C and similar/derived languages.

Functionality

Use of ANSI C (C89), rather than C++, is often characterized by the lack of objects, and by the use of standard libraries, including those declared in stdio.h, such as printf. Although objects are not supported, a struct is. A struct is a collection of variables, identical in concept to a class/object that contains properties. The one big difference between a struct and a class/object is simply that a class/object can also contain functions (which are called methods), while a classical struct only contains variables.
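A brief C sketch may help show the struct approach. The struct point and the two functions that work on it are invented for this example; printf comes from stdio.h as mentioned above.

```c
#include <stdio.h>

/* A struct groups related variables under one name. */
struct point
{
    int x; /* horizontal position */
    int y; /* vertical position */
};

/* In C, behavior lives in ordinary functions that receive the struct
 * (here by pointer, so that the caller's copy is updated). */
void point_move(struct point *p, int dx, int dy)
{
    p->x += dx;
    p->y += dy;
}

/* Writes the point using printf; returns whatever printf returns
 * (the number of characters written, or a negative value on error). */
int point_print(const struct point *p)
{
    return printf("(%d, %d)\n", p->x, p->y);
}
```

In C++, point_move and point_print could instead be declared inside the class, so that they would be called as p.move(3, -2) rather than point_move(&p, 3, -2).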

C++
Functionality
C++ introduced objects.
Compatibility

Like C, very widespread.

C#
Name
C# is pronounced as “C Sharp”, implying that the # is the musical symbol called “sharp”.
Compatibility

This may largely be a Microsoft-pushed standard.

Functionality

MSDN: Hello World in Visual Basic (example documentation by MSDN) notes, “This code is almost the same as that for Visual C#.”

Some of the niftier tricks, such as assigning a variable in a condition ( e.g. “ if(x=y) ”), have been outlawed because there was deemed to be too much potential for confusion (and resulting mistakes). Even assuming a non-zero variable's truth may not work, so code such as “ if(x!=0) ” may be needed, rather than the most concise “ if(x) ”.

Misc notes

Microsoft has released products called “Quick C”, and “Visual C”, and its successor “Visual C++”. None of these names refers to a substantially new language. Instead, they are simply product names. (This is a different story than “Visual Basic”, which is notably different than its predecessor “Quick Basic”/QBASIC, which was a substantial enhancement to BASIC.)

BASIC (and related/successor code)
BASIC

Some details have been released at TOOGAM's software archive: BASIC programming language. That also discusses QuickBASIC and the very similar QBASIC.

Visual Basic
Working with files

There appear to be multiple approaches.

Open

This may be the oldest method. However, MSDN Documentation on Open does not document much about what happens if there is an error. No return value for Open is documented. Perhaps an object/variable called Err is set? (It may be worthwhile to first run On Error Resume Next?)

(Until this is clarified, this approach may not be the best to take. Plan to experiment/determine what happens.)

Opening the file

MSDN Documentation on Open indicates that accessing a non-existing file will create the file. So it may be sensible to first check if the file exists.

Dim iFileHandle As Integer
Dim sPathName As String
sPathName = "example.txt"
iFileHandle = FreeFile
Open sPathName For Binary Access Read Write As #iFileHandle

In theory, the filename does not need to be stored in a variable first, but the alternative would be to specify a path at a constant location, which is usually not the most desirable way to code things.

The word “Binary” could be some other values. Although not listed as being optional, MSDN Documentation on Open does note, “If unspecified, the file is opened for Random access.”

The multi-word combination of “Read Write” permits both activities. An alternative is to use only one of those words.

Between the “Access” mode (e.g. “Read Write”) and the file handle, there is another optional parameter regarding file locking. That was not shown in the above example.

Before the reference to the file handle, the language supports an optional character #. That character seems to have no effect whatsoever, and is just used to help clarify that this is a file.

The file handle should be stored in a variable for later use (to close the file properly). An alternative could be to reference a constant that has an integer value, although that would seem to be less solid code.

After any necessary file access is performed, the file handle should then be released by closing the file.

File access
Reading from the file

Choose from:

Input

Input #iFileNumber, variableName

MSDN documentation for input notes that instead of variableName, a comma-separated list of variable names may be supplied.

Get

MSDN Documentation of Get

Line Input
If the file was opened using the “Sequential” style of access, then another option may be Line Input #iFileNumber, sStringValue. (The parts after the optional # are example variable names. The file handle could potentially be a constant, although that is not generally the best practice.) MSDN documentation for Line Input #
Writing to a file
See: MSDN documentation for Put.
FileOpen

This has existed at least as early as Visual Studio 2005. MSDN: “Choosing Among File I/O Options in Visual Basic .NET” noted, “ In Visual Basic .NET, FilePut and FileGet map to the Put and Get functions;

First, identify an available file handle. This is done by calling FreeFile and storing the resulting integer into a variable.

Then, see: MSDN: Opening and Closing Files for Binary Access or, for even more options related to opening a file, MSDN: FileOpen.

Once the file is opened, still hang onto the integer value that represents that file handle.

When done with the file, run FileClose (passing it the integer that represents the used file handle).

MSDN: File Access with Visual Basic Run-Time Functions

Accessing files
MSDN documentation: “Choosing Among File I/O Options in Visual Basic .NET” notes about file accessing methods, “the vast majority are the same: Dir, Input, Print, Seek, Write, and so forth.”
System.IO

MSDN documentation: “Choosing Among File I/O Options in Visual Basic .NET” refers to using a StreamWriter class, and has some example code.

MSDN: File Access Through BinaryReader and BinaryWriter Classes

(The following examples are not very good... they might not pre-check if a file exists, and do not do any other error checking.)

StreamWriter and StreamReader objects can be used to interact with “sequential” files, meaning files that are intended to have all data read in sequence, or written out in sequence. Examples for opening such files:

Dim objSwBrandNewOutputFile As IO.StreamWriter
Dim sFileName As String
Dim objSwExistingOutputFile As IO.StreamWriter
Dim objSrExistingInputTextFile As IO.StreamReader ' reading uses a StreamReader

' CreateText, AppendText, and OpenText are shared methods of IO.File
sFileName = "newfile.txt"
objSwBrandNewOutputFile = IO.File.CreateText(sFileName)
objSwBrandNewOutputFile.Write("File is " & sFileName & vbCrLf)

sFileName = "oldfile.txt"
If IO.File.Exists(sFileName) Then
    objSwExistingOutputFile = IO.File.AppendText(sFileName)
    objSwExistingOutputFile.Write("File is " & sFileName & vbCrLf)
    objSwExistingOutputFile.WriteLine("Repeat: File is " & sFileName)
    objSwExistingOutputFile.Close()
End If ' End If IO.File.Exists(sFileName)

objSwBrandNewOutputFile.Close()

If IO.File.Exists(sFileName) Then
    objSrExistingInputTextFile = IO.File.OpenText(sFileName)
    If objSrExistingInputTextFile.Peek() <> -1 Then
        sFileName = objSrExistingInputTextFile.ReadLine()
    End If ' input file has data to read
    objSrExistingInputTextFile.Close()
End If ' input file exists
FileSystemObject

MSDN documentation: “Choosing Among File I/O Options in Visual Basic .NET” cites the unfortunate part about using this approach: “ it works only with text files. To manipulate binary files, you must use pointers to an address in memory, or byte arrays, which are not supported by the object.” MSDN: Accessing Files with FileSystemObject says “To manipulate binary files, use the FileOpen Function with the Binary keyword.”.

This is provided by a file named Scrrun.dll and combines multiple functions.

As a possible benefit, Windows Script Host (used by VBScript/JavaScript) may use similar code?

DataReader/DataWriter
MSDN: File Access Through BinaryReader and BinaryWriter Classes says “The BinaryReader and BinaryWriter classes may be more familiar to Visual Basic users as DataReader and DataWriter. Although the names have been changed for the System.IO Namespace model, the underlying functionality remains the same.”
Misc
MSDN: Directory class of Visual Studio .NET, MSDN: System.IO Namespace

IO.File.Exists(sFilename)

VBScript

VBScript is an alternative to JavaScript, supporting similar functionality but having language rules that are more similar to Visual Basic. Some further details about VBScript are available at section about VBScript.

For now, ][Cyber Pillar][ may have little to no information about some of the other options. Some further details may be mentioned at TOOGAM's Software Archive: Page About Programming Languages.

Command line shells / scripting languages
Unix shell scripts

See: batch/script files. For details about program flow (conditionals and looping), read the manual page for the shell that is going to be used. (This guide may someday have more details, but currently does not.)

Pash/“Monad Shell” (“MSH”)/PowerShell
See: Command line shells / scripting languages
[#winscrho]: Windows Scripting Host
See: Command line shells / scripting languages: section about Windows Scripting Host
[#vbscript]: VBScript
See: Command line shells / scripting languages: section about VBScript
PERL

PERL has quite a bit of support for processing text, such as supporting the usage of a “regular expression” (“RegEx”). At the time of this writing, PERL 5 is also built into a number of Unix operating systems. See: PERL.

[#inptstak]: Input stuffing

Some approaches to automation involve “keystroke stuffing” (or “keyboard stuffing”). This may work fine for simple keyboard entry. Some people may try to do more elaborate things, such as trying to read data.

In general, this is often an inferior method of automation. At least in theory, instead of sending keystrokes to a program, an automated task could just do what the program does. Trying to send keystrokes runs some risks, such as a multi-tasking operating system sending another program to the foreground, thereby eating a keystroke (or more than one keystroke), and thereby disrupting the expected result. Another drawback to just using keystroke stuffing is that there may not be a lot of error checking.

All that said, sometimes this can be useful. The solution may be fairly fast and easy to implement. This can be one way to automatically interface with “closed source” software, when code isn't otherwise available.

Other languages/syntaxes
[#html]: Hypertext Markup Language

Note: There are a number of other sections that provide resources related to web development. As specific examples, sections about some related content may include the section about making a web site, and the languages supported by web browsers (JavaScript (and similar) and also Microsoft Internet Explorer's support for VBScript).

A quick guide to basic HTML is available at: Hypertext Markup Language. (That guide used to be here, but has been moved to a separate section.)

Common Information Model (“CIM”)

See: Common Information Model (“CIM”).

CIM may be used by other standards: Wikipedia's article on Common Information Model (computing): section titled “Scheme and specifications” notes, “CIM is the basis for most of the other DMTF standards (e.g. WBEM or SMASH). It is also the basis for the SMI-S standard for storage management.”

[#mswinwmi]: So, for “Web-Based” Enterprise Management (“WBEM”), including Windows Management Instrumentation (“WMI”) for Microsoft Windows, see the new location in the section about Common Information Model.

Misc info

(Information which was in this section has since been categorized.)

Making code

[#codoptmz]: Optimizing code

There are a number of techniques for optimizing code. Some techniques are designed to optimize code's speed, while other techniques may optimize some other criteria, such as reducing the total number of instructions that get executed. For example, if a processor can add notably faster than it can multiply, code designed for speed might choose to add multiple times, in a loop, while code designed for minimizing instructions might not do that.

Some techniques have more impact than others.

Many times, compilers can use techniques much more effectively than the simple manual efforts that many humans do.

Test outside of loops when possible

This is often one of the simplest techniques, yet this can often have a notable impact on the system's speed.

James O'Neill's blog on Hyper-V's API notes, “An If inside a” ... “loop ? filter before looping”. (In other words, finding an If inside of a loop is worthy of raising a question. The natural response is: perform the filtering “conditional” test before the start of the loop.) The example being cited places a conditional statement directly inside of a loop. It would be more efficient to place the loop in the conditional statement. That way, the test is done just once, instead of uselessly both processing the looping logic and then (re-)performing the test with each loop iteration.
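A minimal C sketch of the idea follows. The function names and the enabled flag are made up for this illustration; the point is that the flag does not change while the loop runs, so the test can be performed once before the loop instead of on every pass.

```c
/* Test inside the loop: the flag is re-checked on every iteration,
 * even though it never changes while the loop runs. */
long sum_if_enabled_slow(const int *values, int count, int enabled)
{
    long total = 0;
    int i;
    for (i = 0; i < count; i++)
    {
        if (enabled)
        {
            total += values[i];
        }
    }
    return total;
}

/* Test outside the loop: the flag is checked once, and the loop is
 * skipped entirely when the flag is zero. */
long sum_if_enabled_fast(const int *values, int count, int enabled)
{
    long total = 0;
    int i;
    if (enabled)
    {
        for (i = 0; i < count; i++)
        {
            total += values[i];
        }
    }
    return total;
}
```

Both functions return the same totals. Some optimizing compilers may perform this rearrangement on their own, so the measurable gain from doing it by hand can vary.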

Tasks
[#iveofcn]: Including (debug) output

This section describes a function called “iveo”, which stands for “if verbose enough, [then] output”. (An older version of this code used the name “oive”, for “output if verbose enough”, but then it was decided to call the function “iveo” so that there may be variations, like “ivel” to log. The alternative abbreviation for “log if verbose enough” would look like the word “live”, which is pronounced either with the vowel sound of “lip” or like the last syllable of “alive”.)

Certainly variations could exist, such as an iveo function that takes another (optional?) parameter that affects how information is output (e.g. to a console screen, or by making a GUI box, or by writing to a file, or by outputting to some other logging service that might make a logged entry in the operating system's centralized logging). No sample code for the actual iveo function is provided because there are so many possible variations, and because the code for a working iveo function can often be an extremely simple function to write.

The basic concept is to compare a passed parameter to another value (which may be a global value, or another variable that keeps getting passed). Also, a message (some text) gets passed. Then, the function will actually output the message if one value (e.g., a global value) is at least the value of the first numeric passed parameter. This way, verbosity can be adjusted (and even eliminated entirely) simply by setting the value of a variable which gets passed as the first parameter.

Often, a programmer may want to output some extra information. One possible reason for doing this may be to help debug, with no particular intent to keep such statements in the code after debugging is complete. There may be a nicer way to handle that sort of situation, which is to use debugging software.

OpenBSD's manual page for Kernel source file style guide (KNF) states, “Use err” ... “or warn” ... “don't roll your own!” In other words, if there is a standard reporting method, use that rather than using your own code. At least, make sure that the final production code (which will be used by others) will use the standard reporting method.

However, in some cases there might not be a well-known standard reporting method. Many people do simply “roll” (meaning: create) their own quick output statements.

It may be good to get into the habit of using a function that only outputs if verbosity variables are set high enough. First, create a global value called globalVerbose. This way, all such debugging statements can be shut off quickly when a final version of the program is ready for “production use”. Second, in each function that outputs such messages, have another value called lclVerbose. This way, there is an easy turn-off switch for a specific function, while not affecting the output level of other code within the program. Third, have a function called oive (output if verbose enough). The oive function should accept at least two values, which are the required verbosity level to output, and the message to output. The oive function may accept some other (perhaps optional) parameters. A third parameter could be a string that identifies the name of the function that needs to output, and perhaps also identifies a portion of code from within that function. A fourth parameter could be the calling function's local verbosity level.

Depending on which language is used, the code used by another function might look something like the following example. This example involves code from a function called filetester that outputs a debug message if needed.

lclVerbose = 7;
/* More code gets inserted here... */
oive(4,"My details","filetester: check if file is too large",lclVerbose);

The filetester() function is only expecting this message to be output if the used verbosity level is at least 4. The local verbosity level is currently set to 7 by the time that the oive function is called.

The oive function compares the passed local verbosity level to the global value, and then uses whichever one is smaller. If that effective verbosity level (e.g. perhaps 7 in the above example, if the global value is at least 7) is at least the passed threshold (e.g. 4 in the above example), then the message gets output.
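Although no official version of the function is provided, one possible minimal form might look like the following C sketch. The helper name oive_should_output, the choice of standard error as the destination, and the exact output format are assumptions made for this illustration. Many other variations are possible.

```c
#include <stdio.h>

/* Hypothetical global switch; setting it to 0 silences every message. */
int globalVerbose = 7;

/* Returns 1 if a message with this required level should be shown. */
int oive_should_output(int required, int lclVerbose)
{
    /* Use whichever of the global and local levels is smaller. */
    int effective = (lclVerbose < globalVerbose) ? lclVerbose : globalVerbose;
    return effective >= required;
}

/* Output if verbose enough. Returns 1 if the message was written. */
int oive(int required, const char *message, const char *where, int lclVerbose)
{
    if (!oive_should_output(required, lclVerbose)) {
        return 0;
    }
    fprintf(stderr, "[%d] %s: %s\n", required, where, message);
    return 1;
}
```

With these definitions, a call such as oive(4,"My details","filetester: check if file is too large",lclVerbose); writes its message when lclVerbose is 7 and globalVerbose is at least 4. Setting either value below 4 suppresses the message.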

For instance, a scale like the following may be used:

1: Fatal/Critical/Severe Error

An extremely important message, likely explaining why the entire program is going to be stopping.

2: Impacting Error

A pretty important message, explaining why a specific section of code is going to be stopping.

4: Warning

A warning. This may be something that an end user may want to know about. It will probably be good for a programmer to know that this sort of condition occurred.

8: Important

A message which might explain a key decision about what the program is doing. For instance, the value of an impactful variable that explains how the code is going to operate.

16: Details

Perhaps show even more details.

32: Function changes

Often used at the start of a function to mention that it is being started. Perhaps also show some key variables, and show the name of the calling function (if detected, or if that was passed as a string). This may help an observer see how a program is being called.

64: Verbose/Debug

A heavier amount of debugging, which might only be of interest when a lot of details are being requested.

(The reason why these example values doubled is so that the values can be added together, allowing the values to be combined in case combinations might be even more meaningful for the programmer. This is a simple technique that can often be good for coders. For instance, if a program assigns a power of two for each type of error it comes across, and returns a value of 11 (8 + 2 + 1), then a person reviewing the error code might instantly be able to see that there were three types of errors encountered, and know which types of errors those were.)
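The returned value of 11 from that example can be taken apart with bit-wise operations. The following C sketch uses made-up flag names (ERR_FATAL and so on) purely for illustration. It also assumes each type of error is recorded at most once, which is why a bit-wise OR is a safer way to combine the values than plain addition.

```c
/* Hypothetical error types; each one is a distinct power of two. */
enum {
    ERR_FATAL   = 1,
    ERR_IMPACT  = 2,
    ERR_WARNING = 4,
    ERR_NOTICE  = 8
};

/* Count how many different flags are set in a combined code. */
int count_flags(unsigned code)
{
    int count = 0;
    while (code != 0) {
        count += (int)(code & 1u);
        code >>= 1;
    }
    return count;
}

/* Test whether one particular flag is part of a combined code. */
int has_flag(unsigned code, unsigned flag)
{
    return (code & flag) != 0;
}
```

For the value 11, count_flags reports three flags, and has_flag is true for the values 8, 2 and 1 but false for 4.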

Important messages, such as warnings, may be given a lower required value. Fluff messages may be more prone to only be shown if the current verbosity level is set higher.

Perhaps early in the program, a special verbosity command line parameter may be used to set a global value. This way, the end user can alter how many details are provided, without needing to change the code. If too many, or not enough, messages are provided, the end user can alter the reporting level. Of course, the programmer can adjust code to make even more drastic changes about what gets recorded. To do this, all the programmer needs to do is to change a function's local variable that affects verbosity. If, later on, the programmer again decides that details are needed about the code, then the programmer can just turn the verbosity value up. Just lowering a verbosity level can be a lot simpler than trying to yank out (a.k.a. delete) debugging code that isn't needed (while not accidentally deleting other code). Re-increasing a single verbosity level can be far quicker than trying to re-insert valuable debugging code that was deleted because the code seemed more verbose than what was desired at an earlier time. (Not only would the reporting code need to be re-typed, but decisions may need to be made again about exactly where the code should be inserted to be more useful. Such decisions may be less easy to remember/decide at a later time when the code hasn't been recently worked on.)
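Reading such a command line parameter might be done along the lines of the following C sketch. The “-v” option name, the function name parse_verbosity, and the upper limit of 1000 are all arbitrary choices made for this example. A real program may prefer whatever option-parsing method is standard on its platform.

```c
#include <stdlib.h>
#include <string.h>

/* Scan the arguments for a hypothetical "-v N" option and return N.
   If the option is absent, or N is not a number from 0 to 1000,
   return the fallback value instead. */
int parse_verbosity(int argc, char *argv[], int fallback)
{
    for (int i = 1; i + 1 < argc; i++) {
        if (strcmp(argv[i], "-v") == 0) {
            char *end = NULL;
            long value = strtol(argv[i + 1], &end, 10);
            /* Accept only a complete, in-range number. */
            if (end != argv[i + 1] && *end == '\0' &&
                value >= 0 && value <= 1000) {
                return (int)value;
            }
            return fallback;
        }
    }
    return fallback;
}
```

The result could then be stored in the global verbosity value near the start of the program, for example by calling parse_verbosity(argc, argv, 4) from main and assigning what it returns.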

Debugging

For those with access to source code, one common method is to include commands that will output (most commonly to the screen or to a file, or both, although sending information to a remote host may be another option). A way to cover this may be to use the “only if verbose enough” function.

In addition, there is “debugging software” that may be useful both for those with source code and those who may not have access to a closed-source program's source code. Debugging software may be able to provide the value of a variable of a running program, and might even be able to alter the value of the variable, or even run another function using the variable. Another common ability is to be able to have a “breakpoint” which will pause (and later, when requested, resume) execution at a specific point in the program. The ability to temporarily halt execution can be great to combine with the previously-mentioned ability of examining a variable. Another common ability is to set up a “watch” which will automatically examine and report the value of a variable, and may automatically cause a breakpoint to be triggered if the value does something interesting (like becoming equal to, or numerically less than, some other specific constant or variable value). This way, a loop that executes hundreds of times before a problem occurs can do so without being stopped by a breakpoint, and then the program will be paused when a problem starts to occur. These abilities can be very helpful.

The first two steps to using debugging software are to know about the debugging software, and to have it installed.

Debugging information may be captured, even by non-programmers. Some details about this may be in the section about handling crashes. Being knowledgeable about that process may be a great first step. The details provided by that section are simply scratching the surface: those who spend more time programming may find benefit in learning how to get even more results from those debugging tools.

Other tools that may be helpful are strace and the other instrumentation tools mentioned by Wikipedia's page on Strace: section listing some other tools. For instance, BSD systems have ktrace and many other systems have truss.

Forum post by Darren Reed notes that programs may be different, though similar. It may be that the same task can be accomplished using multiple methods. (However, some methods may be easier than others.)

Having given that overview, this guide does not provide a lot of details about making actual use of the programs. The following may be some brief guidelines: further experimentation and/or documentation may be helpful to find much use out of these programs.

ktrace

Before enabling the kernel-level tracing program, it may help to know that, as noted by Forum post about tracing, “You can also use "kdump -l" to watch the tracefile while ktrace is appending to” the tracefile. “No need to wait for the traced program to exit.”

[#mkcodsig]: Code signing

At least for now, see: code signing.

[#bitcmprs]: Data compression
No guarantees

Certain types of files are more likely to compress than others. However, it is not true that all files are compressible. Compression FAQ part 1 (MaximumCompression.com mirror of compression FAQ) provides a simple explanation. For any given number of bits, there is a fixed number of unique files of that size. It is not possible for every single one of those files to become a unique compressed file with fewer bits. If a program were made to reduce the number of bits of every input, at least two input files would result in the same output file. Therefore, you cannot have a unique output file for every input file, so decompressing could not regenerate all of the unique input files. This is the “pigeon-hole principle”. See also: Wikipedia's article on Lossless data compression: “Mathematical background” section.
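The counting behind this argument can be checked with a short C sketch. The function names are made up for this illustration, and the code assumes sizes below 64 bits so that the counts fit in an unsigned long long.

```c
/* Number of distinct bit strings that are exactly n bits long: 2^n. */
unsigned long long files_of_length(unsigned n)
{
    return 1ULL << n;
}

/* Number of distinct bit strings shorter than n bits (lengths 0 to n-1):
   2^0 + 2^1 + ... + 2^(n-1), which equals 2^n - 1. */
unsigned long long files_shorter_than(unsigned n)
{
    unsigned long long total = 0;
    for (unsigned len = 0; len < n; len++) {
        total += 1ULL << len;
    }
    return total;
}
```

For 3 bits there are 8 possible inputs, but only 7 possible shorter outputs (counting the empty file). There is always one fewer shorter output than there are inputs, so at least one input cannot shrink without colliding with another.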

This isn't to say that some people haven't tried to make hoaxes. Such fraudsters often produce no product. One program is known to hide data. Such hidden data might even be in an area that the operating system would identify as free/deleted/available space. The operating system will not report that space as being used, and so the output does not seem to take up much space. The danger of this problematic approach is that necessary data may be overwritten. (This is also mentioned by bit compression tutorial.)

Better Archiver with Recursive Functionality (BARF) is meant as a joke or educational tool.

SHARND may make data which is difficult to impossible to compress (unless a certain key is made).

Compression libraries
General data

For general data, TOOGAM's software archive: Archivers has a section called “Compression Libraries”. There are also open source solutions for some specific types of data. (Search for the source code for some of the solutions mentioned by bit compression tutorial.)

BCL

Programmers may find Basic Compression Library (Basic Compression Library site at SourceForge) to be fairly easy to implement. Multiple compression algorithms are supported. Files may be made even smaller by using multiple algorithms, such as first RLE and then LZ and usually then HUFF. (If this is done, it may make sense to have a decompression program that keeps re-decompressing as long as the output is a BCL1 file. This might only make best sense when it is known that the original file would never be a BCL1 file. Otherwise, some other sort of marker may be needed.)

7-Zip

As noted by the tutorial on compressing data, 7-Zip does a fairly effective job at compressing bits while retaining widespread compatibility (especially when it uses the Zip file format), and does not tend to require extreme amounts of hardware. Uncompressing the zip files may have even lower hardware requirements. Implementation may be a bit more challenging, but may provide superior results to what BCL accomplishes.

The 7-Zip download page has a hyperlink to a “7z Library” (and also has a hyperlink to 7-Zip Source code). Also, the LZMA SDK (which is a “subset of source code of 7-Zip”) page's “License” section simply states:

LZMA SDK is placed in the public domain.

(The 7z format uses LZMA and LZMA2 compression.)

Variations: lib7zip, SevenZipLib, SevenZipSharp

oberhumer offerings

Presumably meaning Ultimate Compression Library, UCL is used by the product UPX (Ultimate Packer for eXecutables), which is known to be highly portable. Claimed features include being made in ANSI C (so it's extremely portable), extremely low memory requirements, and fast decompression. Those wishing to make a product that may use closed source code, but who do wish to support the creator of this library, may wish to use the commercial Not really Vanished library. This may be some relatively older code: LZO Professional is likewise a newer commercial variation on OpenSource LZO GPL. (The company uses the phrase “Space-Grade Technology”, which is justified from NASA's use of the code. The company identifies itself as oberhumer.com GmbH, and identifies itself as “a small company located in Linz, Austria, specializing on applied information theory in the form of data compression and cryptography.”)

zlib

zlib.net's main page notes that this is “Not Related to the Linux zlibc Compressing File-I/O Library”. The page notes, “Not surprisingly, the compression algorithm used in zlib is essentially the same as that in gzip and Zip, namely, the `deflate' method that originated in PKWARE's PKZIP 2.x.” The people who created zlib have been key to the Info-ZIP project. (Hyperlink removed.) See also: RFC 1950: ZLIB Compressed Data Format Specification version 3.3, which also credits Info-ZIP.

Others

libbzip2

Graphics
Libpng
Outputting source code

This can be a bit tricky. This can be done, however. Surely the copyright mentioned in the text file describing 1994 IOCCC's smr refers to the documentation, because the actual source code surely has tons of files that would qualify as “prior art”. More information about this subject is discussed in Wikipedia's article about “Quine (computing)”.

Bin2Char (a program designed for real-mode DOS) shows how to output source code, as well as another data file. (Perhaps some other programs by Second Millennium Software may also use this technique.) One issue with this technique is that it writes the file from RAM, so the data needs to be loaded from disk into RAM before it may then be written to disk. This may impose some memory requirements which are higher than necessary. On the plus side, it doesn't take extremely long to implement.

Swapping values in memory

Many people with beginning programming skill will know how to swap the values of variables by using a third variable. There is a process to do this without requiring a third variable space. This can be done with a single bit: it also works with any sequence of bits stored in variables as long as both variables are the same size. (This might not work with variable types of arbitrary length, such as strings or linked lists.)

Use XOR on the variables, and store the result in one of the variables. Then XOR both of the variables (note that one of the variables may have been changed by this point, so make sure that the XOR is happening with the new values), and store the result in the second variable. Then XOR both of the variables (both of which are now storing newly assigned values, which might or might not be the same as earlier values), and store the result in the first variable that was modified.

An important part of this algorithm is that when one variable gets updated, the other one needs to not be updated at the same time. FAQ about working with bits indicates there may be some danger with this technique if the same variable is used for both sides of the swap. For example, if a swapping function expects someone to pass swap(&varone,&vartwo) but the coder ends up passing swap(&varone,&varone) then both of the two items to be swapped are the same physical memory. The expected results of such a swap would be that varone ends up remaining unchanged. The problem is that both variables may end up getting updated when a single one gets updated. So checking for this danger first may be worthwhile. (This may not be an issue when both variables are known to use different memory, but could be more of an issue if using some sort of template-like system that accepts passed values and uses already-existing memory.)
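The three XOR steps, along with the check for both sides being the same memory, might be sketched in C as follows. The function name xor_swap and the use of the unsigned type are choices made for this illustration.

```c
/* Swap two unsigned values in place with XOR, without a third variable.
   If both pointers refer to the same memory, do nothing: without this
   check, the first XOR would set the shared value to zero. */
void xor_swap(unsigned *first, unsigned *second)
{
    if (first == second) {
        return;
    }
    *first  ^= *second;  /* first  = a ^ b           */
    *second ^= *first;   /* second = b ^ (a ^ b) = a */
    *first  ^= *second;  /* first  = (a ^ b) ^ a = b */
}
```

Note that the check compares addresses, not values. Two different variables that happen to hold equal values are swapped correctly by the three XOR steps without any special handling.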

Wikipedia's page called “XOR swap algorithm”: “Variables” section shows the source code for just one variation, called addSwap. It involves first adding the numbers, and being able to store the result in one of the variables. (This method may not work for all possible values of fixed-size variables, because the sum can overflow; whether that matters depends on how the language handles overflow.) (This method has not been personally verified by the author of this text, at the time of this writing.) After storing the results in one variable, subtract the value of the second variable from the value of the first variable. Store that difference/result into the memory used by the second variable. Then, again follow the process of subtracting the value of the second variable from the value of the first variable. Store that difference/result as the new value of the first variable that was modified, and the swap completes.
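The add-and-subtract steps described above might look like the following C sketch. This is written for this text rather than copied from the Wikipedia page, and the name add_swap is made up. It deliberately uses unsigned integers: in C, unsigned arithmetic wraps around, so an overflowing sum still leads to a correct swap, whereas overflow of signed integers is undefined behavior.

```c
/* Swap two unsigned values using addition and subtraction.
   Unsigned arithmetic in C wraps around, so an overflowing
   intermediate sum does not spoil the result. */
void add_swap(unsigned *first, unsigned *second)
{
    if (first == second) {
        return;  /* same memory: the steps below would zero the value */
    }
    *first  = *first + *second;  /* first  = a + b           */
    *second = *first - *second;  /* second = (a + b) - b = a */
    *first  = *first - *second;  /* first  = (a + b) - a = b */
}
```

In languages or variable types where overflow is trapped or undefined, the concern about overflow still applies.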

There may be other methods, such as just asking the processor to perform a swap (using whatever method the processor decides), possibly by using a processor instruction called XCHG.

Other bit-wise operations
FAQ about working with bits
Getting input from standard input
[#cligtky]: Using a command from the OS's command line
MS-DOS
MS-DOS 6 has a CHOICE command.
read in Unix

One may try something like

read -t 5 RDResults

This works with bash. Also, Forum post (by “jim mcnamara”) about “ksh read timeout” states, “This works in the newer versions of the Korn shell like Korn93.”

Otherwise, the following technique has been found to work in OpenBSD's ksh. This is a slight modification (so that it is working better) from Forum post by moderator “vgersh99” about reading with timeout which itself was just an adaptation from Stephane Chazelas's post which noted “There are race conditions” with this code (which was provided as an untested example). (The general idea here is that these commands get placed into a script file, which may then be run.)

Updated code
#!/bin/ksh

# This is meant for the ksh in OpenBSD.
# Other environments may require adaptation.

timedrdterm() {
kill "${$}" 2> /dev/null
tmdrdhst=DidTerm
# The above value may effectively get lost.  Do not expect it stays set.
}
timedrdusr() {
tmdrdhst=DidSIGUSR
}
timedrd() {
trap 'timedrdusr' USR1
(sleep "${1}" && kill -USR1 "${$}" && timedrdterm ) & sleepPID=${!}
tmdrdhst=NoTermYet
trap 'echo TERM signal received - reading is stopped. > /dev/null' TERM
read "${2}"
# ${?} - results of read command - is zero.
# echo Currently after the read command

# simply unset any special handling
trap - TERM
trap - USR1
# echo Term History ${tmdrdhst}
if [ Text"${tmdrdhst}" = Text"DidSIGUSR" ] ; then
echo Did Terminate > /dev/null
# recommend returning 142 (implies reason for exit was SIGALRM)
# to be like bash when using read -t
tmdrdret=142
else
# echo About to term sleep PID
kill "${sleepPID}" 2> /dev/null
# echo Did not Terminate
# recommend returning zero
tmdrdret=0
fi
trap - EXIT
return "${tmdrdret}"
}

timdrswt=${1:-10}
# The above is a fancy way for ksh and/or bash to implement:
# if [ ! X"$1" = X"" ] ; then timdrswt="$1";else timdrswt=10;fi

timedrd ${timdrswt} timedrdResults

echo Obtained varresult: \[${timedrdResults}\]

return ${tmdrdret}

Usage notes: This may still be considered to be a race condition, as the system is not using any sort of centralized locking before it proceeds to send either of the SIGTERMs (which is done with the kill command). However, it is likely not worth trying to improve this code by adding such complexity; it would be more sensible to see if the code of the ksh implementation can be modified.

This may be called by using timedrds.sh which will wait 10 seconds, or timedrds.sh 3 which will only wait three seconds. The exit code will specify if there was a line of text entered before the timeout.

All variables are lost, except that ${tmdrdret} is effectively returned as an exit code. This is true whether the program is called directly, or sourced. If the value of the text is needed, copy these functions into the script that needs them, and call them as the example shows.

Code note: The first two functions could simply have their first line of code integrated into the location where the function is called.

Older versions
Slightly modified code

Perhaps this code is a bit simpler, and may be worth trying if the updated code does not work well.

#!/bin/ksh

# This is meant for the ksh in OpenBSD.
# Other environments may require adaptation.

# WARNING: Treating this as an executable file will likely be easier. Sourcing
# it, by running ". filename", may end parent scripts when a timeout occurs.

timedrd() {
# ${1} and ${2} refer to parameters to this function.
trap : USR1
trap 'kill "${sleepPID}" 2> /dev/null' EXIT
(sleep "${1}" && kill -USR1 "${$}" && kill "${$}" 2>/dev/null && \
echo Stopped reading ) & sleepPID=${!}
read "${2}"
readRet=${?}
kill "${sleepPID}" 2> /dev/null
trap - EXIT
return "${readRet}"
}

timdrswt=${1:-10}

timedrd ${timdrswt} timedrdResults

echo Obtained result: \[${timedrdResults}\]

That last line is simply showing the results of this code.

This behaves slightly differently depending on whether it is sourced. If this file is given an executable attribute, and is run as “./timedrds.sh” then the first echo command will never be displayed. Also, environment variables, including the one that contains the entered text, get lost when the script ends. However, the entered text may be output first, unless the time out occurs, in which case the output may simply be the word “Terminated”. Also, the return value may be 143 (perhaps safer to just check for anything over 128) if the timeout occurs. Some of this sounds like it may be shell-dependent, so check before trusting expected behavior in any new environment.

If sourced, by being run as “. ./timedrds.sh”, then the first echo command will be displayed if the timeout occurs. This sounds all good, although know that if a timeout occurs then a parent script may also be terminated.

Based on some differences from Stephane's example, it seems quite likely that at least some part of this (namely the references to USR1) may be at least a bit OS-specific.

For environments other than running ksh in OpenBSD, plan to possibly need some tinkering. For example, Forum post by Perderabo notes that some environments may need to use:

ans=$(line)

... instead of using “ read ans ”. (In both cases, the “ans” is simply an example variable name.) This method would require using the line command (for systems that may not have the read command). Note that this is simply an alternative that may work better on some systems, but worse on others: OpenBSD's ksh comes with an internal “read” command but not an internal “line” command.

Do not name the file as the same name as the function. (Doing so will actually break the function.)