The Language Agnostic Anatomy of a Source Code File

Article Purpose

The purpose of this article is to explicitly define the language agnostic anatomy of a source code file. I may fail in defining a one-size-fits-all solution, but that is my goal. As part of this goal, I want to explicitly identify where and in what order documentation and specific section divider labels should reside within a file.

This is not rocket ~~appliances~~ science. This is disciplined and consistent code sectionalization regardless of programming language. Ideally, implementing this approach results in a DX boost.

Do you know of something like this that already exists? Let me know @derekknox.

Problem

Many developers (in one form or another) already sectionalize each code file they author. I notice, however, that some developers do not (my old-developer-self included). This lack of sectionalization irks me. It irks me more when I switch to authoring in another language where the sectionalization is either absent or inconsistent with that of the previous language. In times like these, I pay a greater "mental tax" than that which is inherent to the language context switch itself.

I have thought about this inherent "mental tax" (or "mental burden") that is endured by other developers who write in more than one language daily (or at least often). This "tax" is a function of how dissimilar the mechanics, authoring environment, and syntax are of the languages in question. I want to reduce this tax.

I doubt I am the first person to attempt this, and I would almost guarantee that many programmers do something similar to what I propose below. Do you? Is there a standard you know of?

It is worth noting that my programming heavily involves the UI layer. The proposed approach below is likely overkill or irrelevant for certain programming tasks.

Solution

The implementation of my proposal is dead simple. Insert eight single-line comments, in order, with the following section divider labels into a source file:

Documentation
Dependencies
Definition
Cache
Initialization
Hooks
Methods
Handlers

The comment style for each section divider is naturally dependent on the language-in-question's single-line comment syntax. The consistent section divider labels are the language agnostic aspect.
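
To make the split between comment syntax and labels concrete, here is a minimal sketch. The helper name `sectionDividers` and the `commentPrefix` parameter are my own illustrative assumptions, not part of any tool; the only thing taken from the proposal is the ordered list of eight labels.

```typescript
// The eight labels, in the proposed order. These never change per language.
const SECTION_LABELS = [
  "Documentation",
  "Dependencies",
  "Definition",
  "Cache",
  "Initialization",
  "Hooks",
  "Methods",
  "Handlers",
] as const;

// Hypothetical helper: renders one single-line comment per label.
// `commentPrefix` is the only language-specific input ("//", "#", "--", ...).
function sectionDividers(commentPrefix: string): string[] {
  return SECTION_LABELS.map((label) => `${commentPrefix} ${label}`);
}
```

Calling `sectionDividers("//")` would suit C#, Java, or TypeScript, while `sectionDividers("#")` would suit Python or Ruby. The labels and their order are identical in both cases.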

Additionally, there are eight corresponding questions to ask yourself to help determine what code belongs within each section of a given file (if it isn't already obvious):

What is the purpose of the code?
What does the code rely on that is external to it?
What constitutes the definition of the code?
What identifiers are referenced in multiple sections & what are their default values?
What does initialization look like?
What environment-specific functions or lifecycle hooks should be leveraged?
What functionality may be executed and/or provided?
What should occur in response to internal and/or external stimuli?

Anatomy

These eight section divider labels can be subgrouped:

Shell

Documentation
Dependencies
Definition

Core

Cache
Initialization
Hooks
Methods
Handlers

Shell

In all honesty, I go back and forth regarding the need to explicitly insert the first three section dividers in a file: Documentation, Dependencies, and Definition. This is because every source code file I have seen, regardless of language, follows the same pattern implicitly (documentation being less of a given in my experience). I identify them here for completeness. Their insertion in practice may be considered noise, but their acknowledgment here is important.

Core

The real meat is nested within the Definition. Below are the remaining section divider labels accompanied by some notes regarding order and purpose:

Cache

Explicitly define any and all identifiers to communicate the intent of this file's shared state. In other words, define the identifiers that are referenced in any of the remaining four sections.
The Cache acts as a signpost communicating:

"Each identifier is referenced (not necessarily accessed at runtime) by more than one function below."
"Each identifier's name hints at the capability of the function(s) below."
"Each identifier has or lacks a specific default value."

I have yet to discover a language agnostic organization for identifiers within the Cache or within a function's local cache for that matter. What do you do? Alphabetically?

Initialization

Ensure all Cache identifiers have had the opportunity to receive their default values before they are referenced.
Ensure object construction and initialization adhere to the separation of concerns.
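
Here is one way the Cache and Initialization sections might look together. This is a sketch under my own assumptions: the `Counter` class, its fields, and the `init` method are hypothetical, and keeping `init` apart from construction is just one reading of the separation of concerns mentioned above.

```typescript
// Definition
class Counter {
  // Cache
  // Shared state referenced by more than one function below.
  private count = 0;                    // has a default value
  private label: string | null = null;  // deliberately lacks one until init()
  private initialized = false;

  // Initialization
  // Kept separate from construction so setup can happen when the caller is ready.
  init(label: string): void {
    this.label = label;
    this.initialized = true;
  }

  // Methods
  increment(): number {
    if (!this.initialized) {
      throw new Error("init() must run before increment()");
    }
    this.count += 1;
    return this.count;
  }

  describe(): string {
    return `${this.label ?? "unnamed"}: ${this.count}`;
  }
}
```

Reading only the Cache section already tells you that the file tracks a count and a label, and which of them starts with a meaningful default.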

Hooks

Author callback bodies that are leveraged specifically by the framework, engine, or environment your program runs in. Think of Android Activity Lifecycle, AngularJS Lifecycle Hooks, or Unity Execution Order as examples.
The Hooks act as a signpost communicating:

"This file's code depends on environment-provided stimuli or stimulus."
"The environment itself may provide Initialization, Methods, and/or Handlers functions, but the Hooks section should parent them if so."

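A framework-free sketch of the Hooks section follows. Everything here is an assumption made for illustration: `LifecycleHooks` stands in for whatever contract your environment defines, `runLifecycle` plays the role of the framework or engine, and `Ticker` is a hypothetical class. The environment-provided callbacks live under Hooks and delegate to the Methods below them.

```typescript
// Hypothetical stand-in for an environment's lifecycle contract.
interface LifecycleHooks {
  onInit(): void;
  onDestroy(): void;
}

// Definition
class Ticker implements LifecycleHooks {
  // Cache
  private ticks = 0;
  private running = false;

  // Hooks
  // Called by the environment, never by this file's own code.
  onInit(): void {
    this.start();
  }

  onDestroy(): void {
    this.stop();
  }

  // Methods
  start(): void {
    this.running = true;
  }

  stop(): void {
    this.running = false;
  }

  isRunning(): boolean {
    return this.running;
  }

  getTicks(): number {
    return this.ticks;
  }

  // Handlers
  handleTick(): void {
    if (this.running) {
      this.ticks += 1;
    }
  }
}

// Hypothetical host that plays the role of the framework or engine.
function runLifecycle(target: LifecycleHooks, work: () => void): void {
  target.onInit();
  try {
    work();
  } finally {
    target.onDestroy();
  }
}
```

In a real project the interface and the host would come from the framework or engine itself, and only the class would live in your file.
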
Methods

The Methods section acts as a signpost communicating:

"These are the functional capabilities of this file's Definition and/or instances of it."
"Ensure proper encapsulation of the functions that require it."
"Provide this file's API, if one exists."

The order for the functions themselves within each section also lacks a standard organization. I strive for a universal solution. What do you do? Alphabetically? Probable program flow?

Handlers

The Handlers section acts as a signpost communicating:

"Each function below reacts to specific stimuli."
"Stimuli are triggered elsewhere in this file's code and/or the code external to it."

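Putting all eight dividers together, a small file might look like the sketch below. The `Toggle` class and the `Listener` type are hypothetical and exist only to show where each kind of code would sit. Sections with nothing in them keep their divider, which is one of the two options I go back and forth on.

```typescript
// Documentation
// Toggle: tracks an on/off state and notifies listeners when it changes.

// Dependencies
// (none in this sketch; imports would go here)

// Definition
type Listener = (isOn: boolean) => void;

class Toggle {
  // Cache
  private isOn = false;
  private listeners: Listener[] = [];

  // Initialization
  constructor(initiallyOn = false) {
    this.isOn = initiallyOn;
  }

  // Hooks
  // (none: this sketch does not run inside a framework or engine)

  // Methods
  subscribe(listener: Listener): void {
    this.listeners.push(listener);
  }

  getState(): boolean {
    return this.isOn;
  }

  // Handlers
  // Reacts to a stimulus triggered by code external to this file.
  handleClick(): void {
    this.isOn = !this.isOn;
    this.listeners.forEach((listener) => {
      listener(this.isOn);
    });
  }
}
```

The Shell (Documentation, Dependencies, Definition) wraps the file, and the Core (Cache through Handlers) nests inside the Definition.
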
Examples

So far there has been a lot of textual description. This has been intentional. I wanted to detail my thoughts regarding this simple idea rather than only presenting examples. My hope is that doing so encourages fellow developers to provide input on my thought process. For those wanting examples, your time has come: C# (Unity3D) and TypeScript (Angular 2).

The samples below have been updated with my eight aforementioned section divider labels. Prior to the update, they did indeed contain section dividers, but the labels and count were inconsistent across languages.

C# (Unity) Example

TypeScript (Angular 2) Example

Conclusion

I know the likelihood is slim that the eight proposed section divider labels will suffice for all programming languages (there are a shit-ton of languages). That does not stop me from trying. I want to grasp whether this is a feasible effort or whether it should tighten its scope.

I do know that the languages I've authored in most heavily during my career thus far (ActionScript 2/3, JavaScript, CoffeeScript, Java, and C#) are accounted for with this approach. It is worth mentioning that some or many labeled sections will lack code depending on what the file is trying to accomplish. My thinking right now is to include the divider labels even if it makes the file a bit noisy. I go back and forth.

All in all, I am looking for the input of others with regard to their:

Sectionalization practice (if any)
Feedback on the labels I've proposed (Too many? Missing any?)
Experience

If you have improvement ideas or any other thoughts, just reach out on Twitter @derekknox or email me derek [at] derekknox.com. I look forward to any input and I hope you enjoyed the read.