Repository Practices

For VMS development, there are more complex repository issues than for most projects. That is because multiple types of development are being done with the same repositories, each using them in different ways:

  • developing an application, in a given VMS-based language
  • developing a new VMS-based language
  • developing a new version of VMS
  • developing a version of an existing VMS language for new hardware
  • porting an existing version of VMS to new hardware.

Hence, the repository structure has to support all these uses, and keep development organized, for productivity. For any one developer, many aspects of the repository structure will appear odd, or wrong, or overly conservative. But those aspects are put there to support other kinds of developers. Hence, the repository practices will not be ideal for *any* of the developers, but rather ideal when averaged across all types of development.

As of Jan 2012, a meeting held to talk through the issues and decide on good practices produced the following:

First, from a developer's standpoint, the structure of the *working* directory, in the most general case, would be:

    VMS_MC_SS_shared__x86_64_projects/  VMS_MC_MS_shared__PPC_projects/  ..

Each directory holds all projects configured for a specific VMS hardware class and specific hardware within that class. So, "MC_SS_shared__x86_64" means the projects in descendant directories are all configured for the multi-core, single-socket, shared-environment VMS interface, on the x86 64-bit ISA.

Within one of those, say VMS_MC_SS_shared__x86_64_projects/:

    SSR/ Vthread/ pthread/ HWSim/ ..

Within one of those, say Vthread/:

    MatrixMultiply_Vthread_project/ KMeans_Vthread_project/ ..

These are project repositories. Each one holds sub-repositories, organized in such a way that all sub-repositories are leaves. The project repository tracks the state of the sub-repositories, so that when it is cloned or pulled, it remembers the versions of all sub-repositories.

Every project repository has the same directory structure:

    Application/ C_Libraries/ VMS_Implementations/

where Application/ is its own sub-repository.

The directory C_Libraries/ contains libraries written in C, such as:

    DynArray/ Histograms/ Hash_impl/ ..

These are each their own sub-repository -- note that none should in turn contain their own sub-repositories. All repositories below the project repository should be leaves.
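The leaf rule can be checked mechanically. The following is a minimal sketch, not part of the project's actual tooling: it assumes a directory is a repository exactly when it contains a `.hg` directory, and the function name `find_nested_repos` is hypothetical.

```python
import os


def find_nested_repos(project_root):
    """Return sub-repositories that sit inside another sub-repository.

    Assumption: a directory is a repository if it contains a `.hg` directory.
    The project root is expected to be a repository and is never reported.
    Paths are returned relative to the project root, sorted.
    """
    project_root = os.path.abspath(project_root)
    repos = []
    for dirpath, dirnames, _ in os.walk(project_root):
        if ".hg" in dirnames:
            dirnames.remove(".hg")  # do not descend into repository metadata
            if dirpath != project_root:
                repos.append(dirpath)

    nested = []
    for repo in repos:
        # A sub-repository violates the leaf rule if another sub-repository
        # is one of its ancestors.
        if any(repo.startswith(other + os.sep) for other in repos if other != repo):
            nested.append(os.path.relpath(repo, project_root))
    return sorted(nested)
```

An empty result means every repository below the project repository is a leaf; any path returned points at a sub-repository that should be moved up or flattened.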

The VMS_Implementations/ directory contains implementations of the language and VMS:

    Vthread_impl/ VMS_impl/

These are each their own separate repository. The contents depend on which VMS interface and which target hardware the project is targeting.

Project Branches

The project repository has a different branch for each combination of interface and architecture. For example, MC_SS_shared_\_x86_64 is one such combination -- it stands for "Multi-core, Single Socket, shared-environment, x86_64 ISA". Therefore, many different directories could exist on a given hard-drive, that all clone the same project repository, but each contains different code in the Vthread_impl/ and VMS_impl/ directories, depending on which branch of the project that directory has been updated to.

Someone developing an application would only have one directory for a given project repository. That working directory should be updated to the branch that corresponds to the developer's hardware. If they have a 4-socket PowerPC machine, and want the shared-environment version of VMS (as opposed to the split-environment), then they should update to the "MC_MS_shared_\_PPC" branch, for "multi-core, multi-socket, shared-environment PowerPC ISA". But if they have a one-socket SandyBridge machine, they would update to the "MC_SS_shared_\_x86_64" branch.

These branch names follow a strict naming scheme (see [Project Branch Naming](Project Branch Naming)) that must be enforced when a new branch is created. Scripts rely on this naming scheme to automatically clone and update to the appropriate branch. These scripts are used by application developers, for example, who shouldn't have to know about the complex repository structure. The scripts only work if the branches are named the way the scripts expect.
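As an illustration of what such a script might do, here is a minimal sketch that only builds the Mercurial command lines for cloning a project and updating it to a hardware branch. It is not one of the actual scripts. The function names and the example URL are hypothetical, and the working-directory name is assumed to be `VMS_<branch>_projects`, following the directory examples earlier on this page.

```python
import os


def working_dir_name(branch):
    """Directory that collects all projects for one branch (assumed layout)."""
    return "VMS_{}_projects".format(branch)


def clone_and_update_commands(repo_url, language, project, branch):
    """Build, but do not run, the hg commands for one project.

    repo_url, language, and project are supplied by the caller; nothing here
    contacts a server or touches the file system.
    """
    target = os.path.join(working_dir_name(branch), language, project)
    return [
        ["hg", "clone", repo_url, target],           # clone the project repository
        ["hg", "--cwd", target, "update", branch],   # switch to the hardware branch
    ]


if __name__ == "__main__":
    # Hypothetical URL, for illustration only.
    for cmd in clone_and_update_commands(
        "https://example.invalid/MatrixMultiply_Vthread_project",
        "Vthread",
        "MatrixMultiply_Vthread_project",
        "MC_SS_shared__x86_64",
    ):
        print(" ".join(cmd))
```

Running the commands (for example with `subprocess.run`) and creating any parent directories is left to the caller, so the sketch can be inspected without side effects.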

The naming scheme is:


where the interface is named according to: \<Main architecture class\>\_\<variation on main class\>\_\<interface_type\>. A standard table of these names is kept in [Project Branch Naming](Project Branch Naming).
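A script that depends on these names needs to split them into their parts. The sketch below is one possible way to do that, under assumptions inferred from the examples on this page rather than from the standard table: the interface and the ISA are separated by a double underscore (as in MC_SS_shared__x86_64), and none of the three interface fields contains an underscore. The names `BranchName` and `parse_project_branch` are hypothetical.

```python
from collections import namedtuple

# Field names mirror the interface pattern described above, plus the ISA.
BranchName = namedtuple("BranchName", "arch_class variation interface_type isa")


def parse_project_branch(name):
    """Split a project branch name into its components.

    Assumptions (inferred from examples, not from the official table):
      * interface and ISA are separated by the first "__";
      * the interface is exactly three non-empty, underscore-free fields.
    Raises ValueError if the name does not fit that shape.
    """
    interface, sep, isa = name.partition("__")
    if not sep or not isa:
        raise ValueError("expected <interface>__<ISA>, got %r" % name)
    parts = interface.split("_")
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected three interface fields in %r" % name)
    return BranchName(parts[0], parts[1], parts[2], isa)
```

For example, `parse_project_branch("MC_MS_shared__PPC")` yields the fields `MC`, `MS`, `shared`, and `PPC`. Rejecting malformed names early is one way to enforce the scheme when a new branch is created.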

Using Project Repositories

Project repositories should be committed only when there is a complete working version of the project, and every commit should be pushed. These repositories don't contain any files that change: the only state committed is the versions that the sub-repositories are on. Within a project, all files that get modified as part of development must belong to a sub-repository.

Hence, a commit to a project repository is equivalent to a release of the project (or a milestone). Such commits happen very rarely, and always result in tested, working projects. Mercurial supports this, in that a commit of the project forces commits of all sub-repositories if they have outstanding uncommitted changes in them. So, a coherent snapshot of the project is taken, and can later be restored. (Restore a snapshot by updating to the commit that represents it.)

So, when such a project repository is cloned, all sub-repositories are also cloned, and then updated to the version saved in the last project-commit of the default branch.

The default branch of a project should be empty, containing only a file named "please_update_to_branch_for_your_hardware_and_desired_VMS_interface.txt".

Higher Level Collections of Projects

Scripts may be written that automatically get and update all projects existing for a given interface and architecture.

Branches within Sub-repositories

Sub-repositories, such as the VMS_impl repository, must maintain separate branches for each interface and ISA combination. Any further branches meant to test particular concepts must be grouped by the interface-ISA combination. So, a naming scheme exists for branches within sub-repositories:


For example, if you are developing VMS on the multi-socket, multi-core, split-environment interface on x86\_64 hardware and want to create a branch to test an alternative Malloc implementation, name it:


Also, each interface-hardware combination must have a "default" branch, named as follows:


Just before committing a project repository, if any sub-repositories are on a non-default branch, that branch should be merged into the default one for that interface-hardware combination. That way, when the project is committed, all sub-repositories are on default branches.