GnuCOBOL

GnuCOBOL Programmer’s Guide

1. Introduction

This document describes the syntax, semantics and use of the COBOL programming language as implemented by GnuCOBOL, formerly known as OpenCOBOL.

The original principal developers of GnuCOBOL were Keisuke Nishida and Roger While. Since then, many members of the community have been involved in its development.

This document is intended to serve as a fully functional reference and user’s guide, suitable both as a training tool for readers learning COBOL for the first time and as a reference for those already familiar with another dialect of COBOL.

A separate manual — containing only the details of the GnuCOBOL implementation and designed for experienced COBOL programmers — has been taken from this guide. That document (GnuCOBOL Quick Reference) contains no training subject matter.

Other documents that should be read are gnucobol.pdf, found in the doc directory of the compiler sources, and the file NEWS, supplied in the top-level directory of the GnuCOBOL compiler source code. There you will find the latest COBOL language features that have been added, some of which may not be in this document due to time constraints. If you find any such omissions, please report them as bugs against the Programmer’s Guide so that they can be fixed.

Another must-read document that delves deeper into the compiler is the FAQ, available via the GnuCOBOL Manuals and Guides, although it could do with a wee clean-up to make it easier to read and to find the required information.

1.1. Additional Reference Sources

For those wishing to learn COBOL for the first time, Gary can strongly recommend the following resources.

If you like to hold a book in your hands, I strongly recommend Murach’s Structured COBOL, by Mike Murach, Anne Prince and Raul Menendez (2000) - ISBN 9781890774059. Mike Murach and his various writing partners have been writing outstanding COBOL textbooks for decades. It’s an excellent book for those familiar with the concepts of programming in other languages, but unfamiliar with COBOL.

Would you prefer a web-based tutorial? Try the University of Limerick (Ireland) - COBOL web site.

In addition there is the GnuCOBOL FAQ — which has now exceeded 1,400 pages — available as HTML or a downloadable .pdf file.

Along with every release of the compiler sources is the file NEWS. It contains up-to-the-minute updates regarding the compiler and additional COBOL language elements which may not yet be included in this manual.

1.2. Introducing COBOL

If you already know a programming language other than COBOL, chances are that language is Java, C or C++. You will find COBOL much different from those; sometimes the differences are a good thing and sometimes they aren’t. The thing to remember about COBOL is this: it was designed to solve business problems.

COBOL, first introduced to the programming public in 1959, was the very first programming language to become standardized (in 1960). This meant that a standard-compliant COBOL program written on computer “A” made by company “B” would be able to be compiled and executed on computer “X” made by company “Y” with very few, if any, changes. This may not seem like such a big deal today, but it was a radical departure from all programming languages that came before it and even many that came after it.

The name COBOL actually says it all — COBOL is an acronym that stands for “(CO)mmon (B)usiness (O)riented (L)anguage”. Note the fact that the word “common” comes before all others. The word “business” is a close second. Therein lies the key to COBOL’s success.

1.2.1. Why YOU Should Learn COBOL

Despite statements from industry “insiders”, the COBOL programming language is not dead, even though newer and so-called “modern” languages like Java, C#, .NET, Ruby on Rails and so on appear to have become the languages of choice in the Information Technology world. These languages have become popular because they address the following desired requirements for “modern” programming:

Just because COBOL doesn’t traditionally support objects, classes, and the like doesn’t mean that its “procedural” approach to computing isn’t valuable. After all, it runs 70% of the world’s business transactions, and does so:

  • Using programs that, for the most part, are much more self-documenting than would be the case with any other programming language.
  • Effortlessly providing arithmetic accuracy to 31 digits, with performance approaching that of well-written assembly-language programs. Don’t think this isn’t critically important to banks, investment houses and any business interested in tracking revenues, expenses and profits (duh - like ALL of them).
  • Integrating well with non-COBOL infrastructures such as XML, SOA, MQ, almost any DBMS, Transaction Processing platforms, Queue-Management facilities and other programming languages.
  • Running on almost as many different computing platforms as Java can. You can’t run COBOL programs on your smartphone, but desktops, workstations, midframes/servers, mainframes and supercomputers are all fair game.

Today’s IT managers and business leaders are faced with a challenging dilemma — how do you maintain the enormous COBOL code base that is still running their businesses when academia has all but abandoned the language they need their people to use to keep the wheels rolling? The problem is compounded by the fact that those programmers that are skilled in COBOL are retiring and taking their knowledge with them. In some markets, this appears to be having an inflationary effect on the cost of resources (COBOL programmers) whose supply is becoming smaller and smaller. The pressure to update applications to make use of more up-to-date graphical user interfaces is also perceived as a reason to abandon COBOL in favour of GUI-friendly languages such as Java.

Businesses are addressing the COBOL challenge in different ways:

  1. By undertaking so-called “modernization projects”, where existing applications are either rewritten in “modern” languages or replaced outright with purchased packages. Most of these businesses are using such activities as an excuse to abandon “expensive” mainframes in favour of (presumably) less-expensive “open systems” (midframe/server) solutions.

     Many times these businesses are finding the cost of the system/networking engineering, operational management and monitoring, and risk management (i.e. disaster recovery) infrastructures necessary to support truly mission-critical applications to be so high that the “less-expensive” solution really isn’t; in these cases the mainframe may remain the best option, thus leaving COBOL in play and businesses seeking another solution for at least part of their application base.

  2. By training their own COBOL programmers. Since colleges, universities and technical schools have lost interest in doing so, many businesses have undertaken the task of “growing their own” new crop of COBOL programmers. Fear of being pigeon-holed into a niche technology is a factor inhibiting many of today’s programmers from willingly volunteering for such training.
  3. By moving the user-interface onto the desktop; such efforts involve running modern-language front-end clients on user desktops (or laptops or smartphones, etc.) with COBOL programs providing server functionality on mainframe or midframe platforms, providing all the database and file “heavy lifting” on the back-end. Solutions like this provide users with the user-interfaces they want/need while still leveraging COBOL’s strengths on (possibly) downsized legacy mainframe or midframe systems.

It’s probably true that an IT professional can no longer afford to allow COBOL to be the only wrench in their toolbox, but with a massive code base still in production now and for the foreseeable future, adding COBOL to a multi-lingual curriculum vitae (CV) and/or resume (yes, they ARE different) is not a bad thing at all. Knowing COBOL as well as the language du jour will make you the smartest person in the room when the discussion of migrating the current “legacy” environment to a “modern” implementation comes around.

You’ll find COBOL an easy language to learn and a FAR EASIER language to master than many of the “modern” languages.

The whole reason you’re reading this is that you’ve discovered GnuCOBOL, another implementation of COBOL in addition to those mentioned earlier. The distinguishing characteristic of GnuCOBOL versus those others is that GnuCOBOL is FREE, open-source software and therefore FREE to obtain and use. It is community-enhanced and community-supported. Later in this document (see So What is GnuCOBOL?), you’ll begin to learn more about this COBOL implementation’s capabilities.

1.2.2. Programmer Productivity

Throughout the history of computer programming, the search for new ways to improve the productivity of programmers has been a major consideration. Other than for hobbyists, programming is an activity performed for money, and businesses abhor spending anything more than is absolutely necessary; even government agencies try to spend as little money on projects as is absolutely necessary.

The amount of programming necessary to accomplish a given task, including the rework made necessary by any errors found during testing, is the measure of programmer productivity. (Testing is sometimes jokingly defined as that time during which an application is actually in production, allowing users to discover the problems.) Anything that reduces that effort will therefore reduce the time spent on such activities, and with it the expense. When the expense of programming is reduced, programmer productivity is increased.

Sometimes the quest for improved programmer productivity (and therefore reduced programming expense) has taken the form of introducing new features in programming languages, or even new languages altogether. Sometimes it has resulted in new ways of using the existing languages.

While many technological and procedural developments have made evolutionary improvements to programmer productivity, each of the following three events has been responsible for revolutionary improvements:

  • The development of so-called “higher-level” programming languages that enable a programmer to specify in a single statement of the language an action that would have required many more separate statements in a prior programming language. The standardization of such languages, making them usable on a wide variety of computers and operating systems, was a key aspect of this development. COBOL was a pioneering development in this area, being a direct descendant of the very first higher-level language (FLOW-MATIC, developed by US Naval Lieutenant Grace Hopper) and the first to become standardized.
  • The establishment of programming techniques that make programs easier to read and therefore easier to understand. Not only do such techniques reduce the amount of rework necessary simply to make a program work as designed, but they also reduce the amount of time a programmer needs to study an existing program in order to determine how best to adapt it to changing business requirements. The foremost development in this area was structured programming. Introduced in the late 1970s, this approach to programming spawned new programming languages (PASCAL, ALGOL, PL/1 and so forth) designed around it. With the ANSI 85 standard, COBOL embraced the principles espoused by structured programming mavens as well as any of the languages designed strictly around it.
  • The establishment of programming techniques AND the introduction of programming language capabilities to facilitate the re-usability of program code. Anything that supports code re-usability can have a profound impact on the amount of time it takes to develop new applications or to make significant changes to existing ones. In recent years, object-oriented programming (OOP) has been the industry “poster child” for code re-usability. By enabling program logic and the data structures that logic manipulates to be encapsulated into easily stored and retrieved (and therefore “reusable”) modules called classes, the object-oriented languages such as Java, C++ and C# have become the favourites of academia. Since students are being trained in these languages and only these, by and large, it’s no surprise that, today, object-oriented programming languages are the darlings of the industry.

    The reality is, however, that good programmers have been practising code re-usability for more than a half-century. Up until recently, COBOL programmers have had some of the best code re-usability tools available; they’ve been doing it with copybooks and subprograms rather than classes, methods and attributes, but the net results have been similar. With the COBOL2002 and COBOL2014 standards, the COBOL programming language has become just as “object-oriented” as the “modern” languages, while preserving the ability to support, modify, compile and execute “legacy” COBOL programs as well.

While GnuCOBOL supports few of the OOP constructs defined by the COBOL2002 and COBOL2014 standards, it supports virtually every aspect of the ANSI 85 standard and therefore fully meets the needs of the first two points above. With its supported feature set (see So What is GnuCOBOL?), it provides significant programmer productivity capabilities.

1.3. So What is GnuCOBOL?

GnuCOBOL is a free and open-source COBOL compiler and runtime environment, written using the C programming language. GnuCOBOL is typically distributed in source-code form, and must then be built for your computer’s operating system using the system’s C compiler and loader. While originally developed for the UNIX and Linux operating systems, GnuCOBOL has also been successfully built for computers running OSX and Windows utilizing the UNIX-emulation features of such tools as Cygwin and MinGW. Also see the GNU website for more information.

The MinGW approach is a personal favourite with the author of this manual because it creates a GnuCOBOL compiler and runtime library that require only a single MinGW DLL to be available for the GnuCOBOL compiler, runtime library and user programs. That DLL is freely distributable under the terms of the GNU General Public License. A MinGW build of GnuCOBOL fits easily on and runs from a 128MB flash drive with no need to install any software onto the Windows computer that will be using it. Some functionality of the language, dealing with the sharing of files between concurrently executing GnuCOBOL programs and record locking on certain types of files, is sacrificed, however, as the underlying operating system routines needed to implement them aren’t available to Windows and aren’t provided by MinGW. The current MinGW build, along with builds for various other platforms, is available from the GnuCOBOL download website.

GnuCOBOL has also been built as a truly native Windows application utilizing Microsoft’s freely-downloadable Visual Studio Express package to provide the C compiler and linker/loader. This approach does not lend itself well to a “portable” distribution.

The GnuCOBOL compiler generates C code from your COBOL programs; that C code is then automatically compiled and linked using your system’s C compiler (typically, but not limited to, gcc).

GnuCOBOL supports nearly all of the ANSI 85 standard for COBOL (the only major exclusion is the Communications Module) and also supports some of the components of the COBOL2002 and COBOL2014 standards, such as the SCREEN SECTION (see SCREEN SECTION), table-based SORT (see Table SORT) and user-defined functions. There are others, with more being added almost weekly.

2. COBOL Fundamentals

This chapter describes the syntax, semantics and usage of the COBOL programming language as implemented by the current version of GnuCOBOL. For the rest of this document the language name is written as COBOL to ease reading, while the compiler name retains the mixed case of GnuCOBOL.

This document is intended to serve as a full-function reference and user’s guide, suitable both as a training tool for readers learning COBOL for the first time and as a reference for those already familiar with other dialects of the COBOL language.

A separate manual, taken from this guide, contains just the details of the COBOL grammar as implemented in GnuCOBOL and is designed strictly for experienced COBOL programmers. It does NOT contain any training subject matter.

The extra manuals are GnuCOBOL Quick Reference, a short document containing just the COBOL semantics and grammar, and GnuCOBOL Sample Programs, which shows detailed example COBOL programs with an indication of the syntax used in each program.

For each implementation of the GnuCOBOL compiler, the supplied NEWS file should also be read for any last-minute updates, along with the README and INSTALL files for building the compiler.

2.1. The COBOL Language - The Basics

2.1.1. Language Reserved Words

COBOL programs consist of a sequence of words and symbols. Words, which consist of sequences of letters (upper- and/or lower-case), digits, dashes (‘-’) and/or underscores (‘_’), may have a pre-defined, specific meaning to the compiler or may be invented by the programmer for his/her purposes.

The GnuCOBOL language specification defines over 1130 Reserved Words, that is, words to which the compiler assigns a special meaning. This number applies to the default list, which covers many implementations. It is possible to limit the list to a specific implementation via -std=xyz[-strict], or to manually unreserve words if they are used in existing sources as user-defined words.

Programmers may use a reserved word as part of a word they are creating themselves, but may not create their own word as an exact duplicate (without regard to case) of a COBOL reserved word. Note that the term reserved word here covers all classes of word: intrinsic function names, mnemonic names, system routine names and the reserved words themselves. The list of reserved words can be changed by adding or removing specific words, either for a given compile or as a default, by use of the -std and -conf options. See the specific config files, which by default are held in /usr/local/share/gnucobol/config. Using the option ‘FUNCTION ALL INTRINSIC’ will also add another 100+ reserved words. The config files can be modified to match the requirements of a business or project team, but be warned that they are updated when a new version of the compiler is built, so it may be more prudent to create your own configuration, based on an existing one but with a different name.

In addition, you can add or remove reserved words by passing one of these options to cobc: -freserved-words=value or -freserved=word to add, or -fnot-reserved=word to remove. You can also use -freserved=word:alias to create an alias for a word, and -fregister=word or -fnot-register=word to add or remove a special register word.

See Appendix B - Reserved Word List, for a complete list of GnuCOBOL reserved words.

For any given version of GnuCOBOL you can also list the full current set of reserved words by running cobc with --list-reserved, --list-intrinsic, --list-system and --list-mnemonics. Again, the output is subject to variation depending on the -std option used.

2.1.2. User-Defined Words

When you write GnuCOBOL programs, you’ll need to create a variety of words to represent various aspects of the program, the program’s data and the external environment in which the program will run. This will include internal names by which data files will be referenced, data item names and names of executable logic procedures.

User-defined words may be composed from the characters ‘A’ through ‘Z’ (upper- and/or lower-case), ‘0’ through ‘9’, dash (‘-’) and underscore (‘_’). User-defined words may neither start nor end with hyphen or underscore characters.

Other programming languages provide the programmer with a similar capability of creating their own words (names) for parts of a program; COBOL is somewhat unusual when compared to other languages in that user-defined words may start with a digit.

With the exception of logic procedure names, which may consist entirely of nothing but digits, user-defined words must contain at least one letter.
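The naming rules above can be illustrated with a minimal sketch. The program name, data names and values below are hypothetical, and the code assumes free-format source (for example, compiled with cobc -x -free):

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. UserWords.
DATA DIVISION.
WORKING-STORAGE SECTION.
*> A data name may start with a digit, provided that it
*> contains at least one letter.
01 1st-quarter-sales  PIC 9(5) VALUE 12500.
*> Dashes and underscores may appear inside a name, but not
*> as its first or last character.
01 region_code        PIC X(2) VALUE "NE".
PROCEDURE DIVISION.
*> Procedure names are the one exception: they may consist
*> entirely of digits.
100.
    DISPLAY "Region " region_code ", sales " 1st-quarter-sales.
200.
    DISPLAY "Paragraph 200 is reached by falling through from 100."
    STOP RUN.
```

A name such as -sales or sales_ would be rejected, since user-defined words may neither start nor end with a dash or underscore.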

2.1.3. Case Insensitivity

All COBOL implementations allow the use of both upper and lower case letters in program coding. GnuCOBOL is completely insensitive to the case used when writing reserved words or user-defined names. Thus, AAAAA, aaaaa, Aaaaa and AaAaA are all the same word as far as GnuCOBOL is concerned.

The only time the case used does matter is within quoted character strings, where character values will be exactly as coded.

By convention throughout this document, COBOL reserved words will be shown entirely in UPPER-CASE while those words that were created by a programmer will be represented by tokens in mixed or lower case.

This isn’t a bad practice to use in actual programs, as it leads to programs where it is much easier to distinguish reserved words from user-defined ones!
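As a small illustration of case insensitivity, the following sketch deliberately ignores the upper-case convention just described. The program and data names are made up for the example, and free-format source is assumed (cobc -x -free):

```cobol
identification division.
program-id. CaseDemo.
data division.
working-storage section.
01 Customer-Name pic x(10) value "Smith".
procedure division.
    *> All three statements refer to the same data item.
    display CUSTOMER-NAME
    Display customer-name
    DISPLAY Customer-Name
    *> Inside quoted strings, case is kept exactly as coded.
    display "Smith" " and " "SMITH" " are different values."
    stop run.
```

The first three DISPLAY statements all show the same value, because the compiler treats the three spellings as one word.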

2.1.4. Readability of Programs

Critics of COBOL frequently focus on the wordiness of the language, often citing the case of a so-called “Hello World” program as the “proof” that COBOL is so much more tedious to program in than more “modern” languages. This tedium is cited as such a significant impact to programmer productivity that, in their opinions, COBOL can’t go away quickly enough.

Here are two different “Hello World” applications, one written in Java and the second in GnuCOBOL. First, the Java version:

    class HelloWorld {
        public static void main(String[] args) {
            System.out.println("Hello World!");
        }
    }

And here is the same program, written in GnuCOBOL:

    IDENTIFICATION DIVISION.
    PROGRAM-ID. HelloWorld.
    PROCEDURE DIVISION.
        DISPLAY "Hello World!".

Both of the above programs could have been written on a single line, if desired, and both languages allow a programmer to use (or not use) indentation as they see fit to improve program readability. Sounds like a tie so far.

Let’s look at how much more “wordy” COBOL is than Java. Count the characters in the two programs. The Java program has 95 (not counting carriage returns and any indentation). The COBOL program has 89 (again, not counting carriage returns and indentation)! Technically, it could have been only 65 because the IDENTIFICATION DIVISION. header is actually optional. Clearly, “Hello World” doesn’t look any more concise in Java than it does in COBOL.

Let’s look at a different problem. Surely a program that asks a user to input a positive integer, generates the sum of all positive integers from 1 to that number and then prints the result will be MUCH shorter and MUCH easier to understand when coded in Java than in COBOL, right?

You can be the judge. First, the Java version:

    import java.util.Scanner;
    public class sumofintegers {
        public static void main(String[] arg) {
            System.out.println("Enter a positive integer");
            Scanner scan=new Scanner(System.in);
            int n=scan.nextInt();
            int sum=0;
            for (int i=1;i<=n;i++) {
                sum+=i;
            }
            System.out.println("The sum is "+sum);
        }
    }

And now for the COBOL version:

    IDENTIFICATION DIVISION.
    PROGRAM-ID. SumOfIntegers.
    DATA DIVISION.
    WORKING-STORAGE SECTION.
    01 n   BINARY-LONG.
    01 i   BINARY-LONG.
    01 sum BINARY-LONG VALUE 0.
    PROCEDURE DIVISION.
    DISPLAY "Enter a positive integer"
    ACCEPT n
    PERFORM VARYING i FROM 1 BY 1 UNTIL i > n
        ADD i TO sum
    END-PERFORM
    DISPLAY "The sum is " sum.

My familiarity with COBOL may be prejudicing my opinion, but it doesn’t appear to me that the Java code is any simpler than the COBOL code. In case you’re interested in character counts, the Java code comes in at 278 (not counting indentation characters). The COBOL code is 298 (274 without the IDENTIFICATION DIVISION. header).

Despite what you’ve seen here, the more complex the programming logic being implemented, the more concise the Java code will appear to be, even compared to 2002-standard COBOL. That conciseness comes with a price though — program code readability. Java (or C or C++ or C#) programs are generally intelligible only to trained programmers. COBOL programs can, however, be quite understandable by non-programmers. This is actually a side-effect of the “wordiness” of the language, where COBOL statements use natural English words to describe their actions. This inherent readability has come in handy many times throughout my career when I’ve had to learn obscure business (or legal) processes by reading the COBOL program code that supports them.

The “modern” languages, like Java, also have their own “boilerplate” infrastructure overhead that must be coded in order to write the logic that is necessary in the program. Take for example the public static void main(String[] arg) and import java.util.Scanner; statements. The critics tend to forget about this when they criticize COBOL for its structural “overhead”.

When it was first developed, COBOL’s easily-readable syntax made it profoundly different from anything that had been seen before. For the first time, it was possible to specify logic in a manner that was, at least to some extent, comprehensible even to non-programmers. Take, for example, the following code written in FORTRAN, a language developed only a couple of years before COBOL:

    EXT = PRICE * IQTY
    INVTOT = INVTOT + EXT

With its original limitation on the length of variable names (one- to six-character names comprised of a letter followed by up to five letters and/or digits), its implicit rule that variables were automatically created as real (floating-point) unless their name started with a letter in the range I-N, and its use of algebraic notation to express actions being taken, FORTRAN wasn’t a particularly readable language, even for programmers. Compare this with the equivalent COBOL code:

    MULTIPLY price BY quantity GIVING extended-amount
    ADD extended-amount TO invoice-total

Clearly, even a non-programmer could at least conceptually understand what was going on! Over time, languages like FORTRAN evolved more robust variable names, and COBOL introduced a more formula-based syntactical capability for arithmetic operations, but FORTRAN was never as readable as COBOL.
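For comparison, the formula-based style mentioned above is available through the COMPUTE statement. The sketch below reuses the data names from the previous example; the PICTURE clauses, the initial values and the program name are assumptions made for illustration, and free-format source is assumed (cobc -x -free):

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. InvoiceMath.
DATA DIVISION.
WORKING-STORAGE SECTION.
*> Sample values only; real programs would read these from input.
01 price            PIC 9(5)V99 VALUE 19.99.
01 quantity         PIC 9(3)    VALUE 3.
01 extended-amount  PIC 9(7)V99 VALUE 0.
01 invoice-total    PIC 9(9)V99 VALUE 100.00.
PROCEDURE DIVISION.
    *> Same logic as the MULTIPLY and ADD statements, written
    *> as arithmetic expressions.
    COMPUTE extended-amount = price * quantity
    COMPUTE invoice-total = invoice-total + extended-amount
    DISPLAY "Extended amount: " extended-amount
    DISPLAY "Invoice total:   " invoice-total
    STOP RUN.
```

Both styles may be mixed freely in one program; which one reads better is largely a matter of how complex the calculation is.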

Because of its inherent readability, I would MUCH rather be handed an assignment to make significant changes to a COBOL program about which I know nothing than to be asked to do the same with a C, C++, C# or Java program.

Those that argue that it is too boring / wasteful / time-consuming / insulting (pick one) to have to code a COBOL program “from scratch” are clearly ignorant of the following facts:

  • Many systems have program-development tools available to ease the task of coding programs; those tools that concentrate on COBOL are capable of providing templates for much of the “overhead” verbiage of any program…
  • Good programmers have — for decades — maintained their own skeleton “template” programs for a variety of program types; simply load a template into a text editor and you’ve got a good start to the program…
  • Legend has it that there’s actually only been ONE program ever written in COBOL, and all programs ever “written” thereafter were simply derivatives of that one. Although this is clearly intended as a (probably) bad joke, it is nevertheless close to the very simple truth that many programmers “reuse” existing COBOL programs when creating new ones. There’s certainly nothing preventing this from happening with programs written in other languages, but it does seem to happen more in COBOL shops. It’s ironic that “code re-usability” is one of the arguments used to justify the existence of the “modern” languages.

2.1.5. Divisions Organize Programs

COBOL programs are structured into four major areas of coding, each with its own purpose. These four areas are known as divisions.

Each division may consist of a variety of sections and each section consists of one or more paragraphs. A paragraph consists of sentences, each of which consists of one or more statements.

This hierarchical structure of program components standardizes the composition of all COBOL programs. Much of this manual describes the various divisions, sections, paragraphs and statements that may comprise any COBOL program.
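A minimal sketch of this hierarchy follows, showing the four divisions (IDENTIFICATION, ENVIRONMENT, DATA and PROCEDURE) with a section, two paragraphs and a two-statement sentence. All user-defined names are hypothetical, and free-format source is assumed (cobc -x -free):

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. FourDivisions.

ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
REPOSITORY.
    *> Lets intrinsic functions be used without the FUNCTION keyword.
    FUNCTION ALL INTRINSIC.

DATA DIVISION.
WORKING-STORAGE SECTION.
01 greeting PIC X(20) VALUE "hello from cobol".

PROCEDURE DIVISION.
main-logic SECTION.
show-greeting.
    *> One sentence made of two statements; the period ends it.
    MOVE UPPER-CASE(greeting) TO greeting
    DISPLAY greeting.
end-of-job.
    *> A second paragraph in the same section.
    STOP RUN.
```

Not every program needs all four divisions; the “Hello World” program shown earlier omits the ENVIRONMENT and DATA divisions entirely.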

2.1.6. Copybooks

A Copybook is a segment of program code that may be utilized by multiple programs simply by having those programs use the COPY statement to import that code. This code may define files, data structures or procedural code.

Today’s programming languages have a statement (usually named “import”, “include” or “#include”) that performs this same function. What makes the COBOL copybook feature different from the “include” facility in newer languages, however, is the fact that the COPY statement can edit the imported source code as it is being copied. This capability makes copybook libraries extremely valuable for making code reusable. Also see section 3. Compiler Directing Facility commands COPY and REPLACE.

2.1.7. Structured Data

A contiguous area of storage within the memory space of a program that may be referenced, by name, in a COBOL program is referred to as a Data Item. Other programming languages use the term variable, property or attribute to describe the same thing.

COBOL introduced the concept of structured data. The principle of structured data in COBOL is based on the idea of being able to group related and contiguously-allocated data items together into a single aggregate data item, called a Group Item. For example, a 35-character ’Employee-Name’ group item might consist of a 20-character ’Last-Name’ followed by a 14-character ’First-Name’ and a 1-character ’Middle-Initial’.

A data item that isn’t itself formed from other data items is referred to in COBOL as an Elementary Item. In the previous example, ’Last-Name’, ’First-Name’ and ’Middle-Initial’ are all elementary items.
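The ‘Employee-Name’ example can be sketched as follows. The level numbers, the sample values and the program name are assumptions chosen for illustration, and free-format source is assumed (cobc -x -free):

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. GroupDemo.
DATA DIVISION.
WORKING-STORAGE SECTION.
*> Group item: it has no PICTURE of its own; its size is the
*> sum of its elementary items (20 + 14 + 1 = 35 characters).
01 Employee-Name.
   05 Last-Name      PIC X(20).
   05 First-Name     PIC X(14).
   05 Middle-Initial PIC X(1).
PROCEDURE DIVISION.
    *> Elementary items are filled individually.
    MOVE "Smith" TO Last-Name
    MOVE "Pat"   TO First-Name
    MOVE "Q"     TO Middle-Initial
    *> The group item is referenced as one contiguous area.
    DISPLAY "[" Employee-Name "]"
    DISPLAY "Length: " FUNCTION LENGTH(Employee-Name)
    STOP RUN.
```

Referencing the group name treats all 35 characters as a single alphanumeric value, which is what makes moving or writing whole records so convenient.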

2.1.8. Files

One of COBOL’s strengths is the wide variety of data files it is capable of accessing. GnuCOBOL programs, like those created with other COBOL implementations, need to have the structure of any files they will be reading and/or writing described to them. The highest-level characteristic of a file’s structure is defined by specifying the organization of the file, as follows:

ORGANIZATION LINE SEQUENTIAL

These are files with the simplest of all internal structures. Their contents are structured simply as a series of identically- or differently-sized data records, each terminated by a special end-of-record delimiter character. An ASCII line-feed character (hexadecimal 0A) is the end-of-record delimiter character used by any UNIX or pseudo-UNIX (MinGW, Cygwin, OSX) GnuCOBOL build. A truly native Windows build would use a carriage-return, line-feed (hexadecimal 0D0A) sequence.

Records must be read from or written to these files in a purely sequential manner. The only way to read (or write) record number 100 would be to have read (or written) records number 1 through 99 first.

When the file is written to by a GnuCOBOL program, the delimiter sequence will be automatically appended to each data record as it is written to the file. A WRITE (see WRITE) to this type of file will be done as if a BEFORE ADVANCING 1 LINE clause were specified on the WRITE, if no ADVANCING clause is coded.

When the file is read, the GnuCOBOL runtime system will strip the trailing delimiter sequence from each record. The data will be padded (on the right) with spaces if the data just read is shorter than the area described for data records in the program. If the data is too long, it will be truncated and the excess will be lost.

These files should not be defined to contain any exact binary data fields because the contents of those fields could inadvertently have the end-of-record sequence as part of their values — this would confuse the runtime system when reading the file, and it would interpret that value as an actual end-of-record sequence.

LINE ADVANCING

These are files with an internal structure similar to that of a line sequential file. These files are defined (without an explicit ORGANIZATION specification) using the LINE ADVANCING clause on their SELECT statement (see SELECT).

When this kind of file is written to by a GnuCOBOL program, an end-of-record delimiter sequence will be automatically added to each data record as it is written to the file. A WRITE to this type of file will be done as if an AFTER ADVANCING 1 LINE clause were specified on the WRITE, if no ADVANCING clause is coded.

Like line sequential files, these files should not be defined to contain any exact binary data fields because the contents of those fields could inadvertently have the end-of-record sequence as part of their values — this would confuse the runtime system when reading the file, and it would interpret that value as an actual end-of-record sequence.

ORGANIZATION SEQUENTIAL

These files also have a simple internal structure. Their contents are structured simply as an arbitrarily-long sequence of data characters. This sequence of characters will be treated as a series of fixed-length records simply by logically splitting the sequence of characters up into fixed-length segments, each as long as the maximum record size defined in the program. There are no special end-of-record delimiter characters in the file and when the file is written to by a GnuCOBOL program, no delimiter sequence is appended to the data.

Records in this type of file are all the same physical length, except possibly for the very last record in the file, which may be shorter than the others. If variable-length logical records are defined to the program, each physical record in the file will occupy the space described by the longest record description in the program.

So, if a file contains 1275 characters of data, and a program defines the structure of that file as containing 100-character records, then the file contents will consist of twelve (12) 100-character records with a final record containing only 75 characters.

It would appear that it should be possible to locate and process any record in the file directly, simply by calculating its starting character position from the program-defined record size. Even so, records must still be read from or written to these files in a purely sequential manner. The only way to read (or write) record number 100 would be to have read (or written) records number 1 through 99 first.

When the file is read, the data is transferred into the program exactly as it exists in the file. In the event that a short record is read as the very last record, that record will be padded (to the right) with spaces.

Care must be taken that programs reading such a file describe records whose length is exactly the same as that used by the program that created the file. For example, the following shows the contents of a SEQUENTIAL file created by a program that wrote five 6-character records to it. The ‘A’, ‘B’, … values reflect the records that were written to the file:

AAAAAA
BBBBBB
CCCCCC
DDDDDD
EEEEEE

Now, assume that another program reads this file, but describes 10-character records rather than 6. Here are the records that program will read:

AAAAAABBBB
BBCCCCCCDD
DDDDEEEEEE

There may be times where this is exactly what you were looking for. More often than not, however, this is not desirable behaviour. Suggestion: use a copybook to describe the record layouts of any file; this guarantees that multiple programs accessing that file will “see” the same record sizes and layouts by coding a COPY statement (see COPY) to import the record layout(s) rather than hand-coding them.
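
The re-splitting effect described above can be imitated without any file at all. The following is only a sketch: it holds the 30 characters in a working-storage item (File-Image, a name invented here) and uses reference modification to carve them into 10-character pieces, which is an analogy for what a reader declaring 10-character records would see, not the runtime's actual read logic.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SPLITDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * The 30 characters written as five 6-character records.
       01  File-Image              PIC X(30)
               VALUE "AAAAAABBBBBBCCCCCCDDDDDDEEEEEE".
       01  Rec-Start               PIC 9(2).
       PROCEDURE DIVISION.
      * Carve the same characters into 10-character pieces, the way
      * a reader declaring 10-character records would see them.
           PERFORM VARYING Rec-Start FROM 1 BY 10
                   UNTIL Rec-Start > 30
               DISPLAY File-Image(Rec-Start:10)
           END-PERFORM
           STOP RUN.
```

Each DISPLAY shows one of the three 10-character "records" listed above.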

These files can contain exact binary data fields. Because there is no character sequence that constitutes an end-of-record delimiter, the contents of record fields are irrelevant to the reading process.

ORGANIZATION RELATIVE

The contents of these files consist of a series of fixed-length data records prefixed with a four-byte record header. The record header contains the length of the data, in bytes. The byte-count does not include the four-byte record header.

Records in this type of file are all the same physical length. If variable-length logical records are defined to the program, each physical record in the file will occupy the maximum possible space, and the logical record length field will contain the number of bytes of data in the record that are actually in use.

This file organization was defined to accommodate either sequential or random processing. With a RELATIVE file, it is possible to read or write record 100 directly, without having to have first read or written records 1-99. The GnuCOBOL runtime system uses the program-defined maximum record size to calculate a relative byte position in the file where the record header and data begin, and then transfers the necessary data to or from the program.

When the file is written by a GnuCOBOL program, no delimiter sequence is appended to the data, but a record-length field is added to the beginning of each physical record.

When the file is read, the data is transferred into the program exactly as it exists in the file.

Care must be taken that programs reading such a file describe records whose length is exactly the same as that used by the programs that created the file. It won’t end well if the GnuCOBOL runtime library interprets a four-byte ASCII character string as a record length when it transfers data from the file into the program!

Suggestion: use a copybook to describe the record layouts of any file; this guarantees that multiple programs accessing that file will “see” the same record sizes and layouts by coding a COPY statement (see COPY) to import the record layout(s) rather than hand-coding them.

These files can contain exact binary data fields. The contents of record fields are irrelevant to the reading process as there is no end-of-record delimiter.

ORGANIZATION INDEXED

This is the most advanced file structure available to GnuCOBOL programs. It’s not possible to describe the physical structure of such files because that structure will vary depending upon which advanced file-management facility was included into the GnuCOBOL build you will be using (Berkeley Database [BDB], VBISAM, etc.). We will — instead — discuss the logical structure of the file.

There will be multiple structures stored for an INDEXED file. The first will be a data component, which may be thought of as being similar to the internal structure of a relative file. Data records may not, however, be directly accessed by their record number as would be the case with a relative file, nor may they be processed sequentially by their physical sequence in the file.

The remaining structures will be one or more index components. An index component is a data structure that (somehow) enables the contents of a field, called a primary key, within each data record (a customer number, an employee number, a product code, a name, etc.) to be converted to a record number so that the data record for any given primary key value can be directly read, written and/or deleted. Additionally, the index data structure is defined in such a manner as to allow the file to be processed sequentially, record-by-record, in ascending sequence of the primary key field values. Whether this index structure exists as a binary-searchable tree structure (b-tree), an elaborate hash structure or something else is pretty much irrelevant to the programmer; the behaviour of the structure will be as it was just described. The actual mechanism used will depend upon the advanced file-management package that was included into your GnuCOBOL implementation when it was built.

The runtime system will not allow two records to be written to an indexed file with the same primary key value.

The capability exists for an additional field to be defined as what is known as an alternate key. Alternate key fields behave just like primary keys, allowing both direct and sequential access to record data based upon the alternate key field values, with one exception. That exception is the fact that alternate keys may be allowed to have duplicate values, depending upon how the alternate key field is described to the GnuCOBOL compiler.

There may be any number of alternate keys, but each key field comes with a disk space penalty as well as an execution time penalty. As the number of alternate key fields increases, it will take longer and longer to write and/or modify records in the file.

These files can contain exact binary data fields. The contents of record fields are irrelevant to the reading process as there is no end-of-record delimiter.

All files are initially described to a GnuCOBOL program using a SELECT statement (see SELECT). In addition to defining a name by which the file will be referenced within the program, the SELECT statement will specify the name and path by which the file will be known to the operating system along with its organization, locking and sharing attributes.

A file description in the FILE SECTION (see FILE SECTION) will define the structure of records within the file, including whether or not variable-length records are possible and, if so, what the minimum and maximum length might be. In addition, the file description entry can specify file I/O block sizes.
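
To show how the SELECT statement and the file description fit together, here is a small sketch that writes two records to a LINE SEQUENTIAL file and reads them back. The names Demo-File, Demo-Rec and Demo-Status and the file name demo.txt are assumptions chosen for the example, and error handling through the file status field is left out for brevity.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LSEQDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * SELECT names the file and states its organization.
           SELECT Demo-File ASSIGN TO "demo.txt"
               ORGANIZATION IS LINE SEQUENTIAL
               FILE STATUS IS Demo-Status.
       DATA DIVISION.
       FILE SECTION.
      * The file description defines the record structure.
       FD  Demo-File.
       01  Demo-Rec                PIC X(20).
       WORKING-STORAGE SECTION.
       01  Demo-Status             PIC XX.
       01  EOF-Flag                PIC X VALUE "N".
       PROCEDURE DIVISION.
           OPEN OUTPUT Demo-File
           MOVE "first record" TO Demo-Rec
           WRITE Demo-Rec
           MOVE "second" TO Demo-Rec
           WRITE Demo-Rec
           CLOSE Demo-File
      * Read the records back; short lines are padded with spaces.
           OPEN INPUT Demo-File
           PERFORM UNTIL EOF-Flag = "Y"
               READ Demo-File
                   AT END
                       MOVE "Y" TO EOF-Flag
                   NOT AT END
                       DISPLAY "[" Demo-Rec "]"
               END-READ
           END-PERFORM
           CLOSE Demo-File
           STOP RUN.
```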

2.1.9. Table Handling

Other programming languages have arrays; COBOL has tables. They’re basically the same thing. There are two special statements that exist in the COBOL language — SEARCH and SEARCH ALL — that make finding data in a table easy.

SEARCH searches a table sequentially, stopping only when either a table entry matching one of any number of search conditions is found, or when all table entries have been checked against the search criteria and none matched any of those criteria.

SEARCH ALL performs an extremely fast search against a table sorted by a key field contained in each table entry. The algorithm used for such a search is a binary search. The algorithm ensures that only a small number of entries in the table need to be checked in order to find a desired entry or to determine that the desired entry doesn’t exist in the table. The larger the table, the more effective this search becomes. For example, a binary search of a table containing 32,768 entries will locate a particular entry or determine the entry doesn’t exist by looking at no more than sixteen (16) entries! The algorithm is explained in detail in the documentation of the SEARCH ALL statement (see SEARCH ALL).

Finally, COBOL has the ability to perform in-place sorts of the data that is found in a table.
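
The following sketch shows one way a small table might be declared for use with SEARCH ALL. The table contents and all data names (State-Data, State-Entry, State-Code, State-Name, State-Idx) are invented for the example; the essential parts are the ASCENDING KEY and INDEXED BY clauses on the table definition and the fact that the data really is in key sequence.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SRCHDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Table contents, already in ascending State-Code order.
       01  State-Data.
           05  FILLER              PIC X(8) VALUE "AKAlaska".
           05  FILLER              PIC X(8) VALUE "OHOhio  ".
           05  FILLER              PIC X(8) VALUE "TXTexas ".
           05  FILLER              PIC X(8) VALUE "UTUtah  ".
       01  State-Table REDEFINES State-Data.
           05  State-Entry OCCURS 4 TIMES
                   ASCENDING KEY IS State-Code
                   INDEXED BY State-Idx.
               10  State-Code      PIC X(2).
               10  State-Name      PIC X(6).
       PROCEDURE DIVISION.
      * SEARCH ALL sets State-Idx itself; no SET is needed first.
           SEARCH ALL State-Entry
               AT END
                   DISPLAY "TX is not in the table"
               WHEN State-Code(State-Idx) = "TX"
                   DISPLAY "TX is " State-Name(State-Idx)
           END-SEARCH
           STOP RUN.
```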

2.1.10. Sorting and Merging Data

The COBOL language includes a powerful SORT statement that can sort large amounts of data according to arbitrarily complex key structures. This data may originate from within the program or may be contained in one or more external files. The sorted data may be written automatically to one or more output files or may be processed, record-by-record, in the sorted sequence.

A companion statement — MERGE — can combine the contents of multiple files together, provided those files are all pre-sorted in a similar manner according to the same key structure. The resulting output will consist of the contents of all of the input files, merged together and sequenced according to the common key structure(s). The output generated by a MERGE statement may be written automatically to one or more output files or may be processed internally by the program.

A special form of the SORT statement also exists just to sort the data that resides in a table. This is particularly useful if you wish to use SEARCH ALL against the table.

2.1.11. String Manipulation

There have been programming languages designed specifically for the processing of text strings, and there have been programming languages designed for the sole purpose of performing high-powered numerical computations. Most programming languages fall somewhere in the middle.

COBOL is no exception, although it does include some very powerful string manipulation capabilities; GnuCOBOL actually has even more string-manipulation capabilities than many other COBOL implementations. The following summarizes GnuCOBOL’s string-processing capabilities:

Concatenate two or more strings
Conversion of a numeric time or date to a formatted character string
Convert a binary value to its corresponding character in the program’s character set
  • CHAR intrinsic function (see CHAR). Add 1 to argument before invoking the function; the description of the CHAR intrinsic function presents a technique utilizing the MOVE statement that will accomplish the same thing without the need of adding 1 to the numeric argument value first.
Convert a character string to lower-case
  • LOWER-CASE intrinsic function (see LOWER-CASE).
  • C$TOLOWER built-in system subroutine (see C$TOLOWER).
  • CBL_TOLOWER built-in system subroutine (see CBL_TOLOWER).
Convert a character string to upper-case
  • UPPER-CASE intrinsic function (see UPPER-CASE).
  • C$TOUPPER built-in system subroutine (see C$TOUPPER).
  • CBL_TOUPPER built-in system subroutine (see CBL_TOUPPER).
Convert a character string to only printable characters
  • C$PRINTABLE built-in system subroutine (see C$PRINTABLE).
Convert a character to its numeric value in the program’s character set
  • ORD intrinsic function (see ORD). Subtract 1 from the result; the description of the ORD intrinsic function presents a technique utilizing the MOVE statement that will accomplish the same thing without the need to subtract 1 from the result afterwards.
Count occurrences of sub strings in a larger string
  • INSPECT statement (see INSPECT) with the TALLYING clause.
Decode a formatted numeric string back to a numeric value
  • NUMVAL intrinsic function (see NUMVAL).
  • NUMVAL-C intrinsic function (see NUMVAL-C).
Determine the length of a string or data-item capable of storing strings
  • LENGTH intrinsic function (see LENGTH).
  • BYTE-LENGTH intrinsic function (see BYTE-LENGTH).
Extract a sub string from a string based on its starting character position and length
Format a numeric item for output, including thousands-separators (‘,’ in the USA), currency symbols (‘$’ in the USA), decimal points, credit/debit symbols, leading or trailing sign characters
  • MOVE statement (see MOVE) with picture-symbol editing applied to the receiving field.
Justification (left, right or centred) of a string field
  • C$JUSTIFY built-in system subroutine (see C$JUSTIFY).
Monoalphabetic substitution of one or more characters in a string with different characters
Parse a string, breaking it up into sub strings based upon one or more delimiting character sequences
Removal of leading or trailing spaces from a string
  • TRIM intrinsic function (see TRIM).
Substitution of a single sub string with another of the same length, based upon the sub string’s starting character position and length
Substitution of one or more sub strings in a string with replacement sub strings of the same length, regardless of where they occur
  • INSPECT statement (see INSPECT) with a REPLACING clause.
  • SUBSTITUTE intrinsic function (see SUBSTITUTE).
  • SUBSTITUTE-CASE intrinsic function (see SUBSTITUTE-CASE).
Substitution of one or more sub strings in a string with replacement sub strings of a potentially different length, regardless of where they occur
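
A few of the capabilities listed above are sketched in the short program below: the UPPER-CASE and TRIM intrinsic functions, the INSPECT statement with the TALLYING clause, and extraction of a sub string by starting position and length. The data names Raw-Text and O-Count and the sample text are assumptions made for the example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. STRDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  Raw-Text                PIC X(20)
               VALUE "  gnu cobol guide   ".
       01  O-Count                 PIC 9(4) VALUE 0.
       PROCEDURE DIVISION.
      * Convert to upper-case.
           DISPLAY "[" FUNCTION UPPER-CASE(Raw-Text) "]"
      * Remove leading and trailing spaces.
           DISPLAY "[" FUNCTION TRIM(Raw-Text) "]"
      * Count occurrences of the letter "o".
           INSPECT Raw-Text TALLYING O-Count FOR ALL "o"
           DISPLAY "Occurrences of o: " O-Count
      * Extract 3 characters starting at position 3.
           DISPLAY "[" Raw-Text(3:3) "]"
           STOP RUN.
```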

2.1.12. Screen Formatting Features

The COBOL2002 standard formalizes extensions to the COBOL language that allow for the definition and processing of text-based screens, as is a typical function on mainframe and midframe computers as well as on many point-of-sale (i.e. “cash register”) systems. GnuCOBOL implements virtually all the screen-handling features described by COBOL2002.

These features allow fields to be displayed at specific row/column positions, various colors and video attributes to be assigned to screen fields and the pressing of specific function keys (F1, F2, …) to be detectable. All of this takes place through the auspices of the SCREEN SECTION (see SCREEN SECTION) and special formats of the ACCEPT statement (see ACCEPT) and the DISPLAY statement (see DISPLAY).

The COBOL2002 standard, and therefore GnuCOBOL, only covers textual user interface (TUI) screens (those comprised of ASCII characters presented using a variety of visual attributes) and not the more-advanced graphical user interface (GUI) screen design and processing capabilities built into most modern operating systems. There are subroutine-based packages available that can do full GUI presentation — most of which may be called by GnuCOBOL programs, with a moderate research time investment (Tcl/Tk, for example) — but none are currently included with GnuCOBOL.

2.1.12.1. A Sample Screen

A Sample Screen Produced by a GnuCOBOL Program:

================================================================================
 GCic (2014/01/02 11:24) GnuCOBOL 2.1 23NOV2013 Interactive Compilation
+------------------------------------------------------------------------------+
: Filename: GCic.cbl                                                           :
: Folder:   E:\Programs\GCic\2013-11-23                                        :
+------------------------------------------------------------------------------+
 Set/Clr Switches Via F1-F9; Set Config Via F12; ENTER Key Compiles; ESC Quits
+------------------------------------------------------------------------------+
: F1  Assume WITH DEBUGGING MODE  F6 >"FUNCTION" Is Optional      : Current    :
: F2  Procedure+Statement Trace   F7 >Enable All Warnings         : Config:    :
: F3  Make a Library (DLL)        F8  Source Is Free-Format       : DEFAULT    :
: F4  Execute If Compilation OK   F9 >No COMP/BINARY Truncation   :            :
: F5  Listing Off                                                 :            :
+------------------------------------------------------------------------------+
 Extra "cobc" Switches, If Any ("-save-temps=xxx" Prevents Listings):
+------------------------------------------------------------------------------+
: ____________________________________________________________________________ :
: ____________________________________________________________________________ :
+------------------------------------------------------------------------------+
 Program Execution Arguments, If Any:
+------------------------------------------------------------------------------+
: ____________________________________________________________________________ :
: ____________________________________________________________________________ :
+------------------------------------------------------------------------------+
 GCic for Windows/MinGW Copyright (C) 2009-2014, Gary L. Cutler, GPL
================================================================================

The above screen was produced by the GnuCOBOL Interactive Compiler, or GCic. See GCic in GnuCOBOL Sample Programs for the source and cross-reference listing of this program. PDF versions of this document will include an actual graphical image of this sample screen.

Screens are defined in the screen section of the data division. Once defined, screens are used at run-time via the ACCEPT and DISPLAY statements.

2.1.12.2. Color Palette and Video Attributes

GnuCOBOL supports the following visual attribute specifications in the SCREEN SECTION (see SCREEN SECTION):

Color

Eight (8) different colors may be specified for both the background (screen) and foreground (text) color of any row/column position on the screen. Colors are specified by number, although a copybook supplied with all GnuCOBOL distributions (screenio.cpy) defines COB-COLOR-xxxxxx names for the various colors so they may be specified as a more meaningful name rather than a number. The eight colors, by number, with the constant names defined in screenio.cpy, are as follows:

0  Black    COB-COLOR-BLACK
1  Blue     COB-COLOR-BLUE
2  Green    COB-COLOR-GREEN
3  Cyan     COB-COLOR-CYAN
4  Red      COB-COLOR-RED
5  Magenta  COB-COLOR-MAGENTA
6  Yellow   COB-COLOR-YELLOW
7  White    COB-COLOR-WHITE

Text Brightness

There are three possible brightness levels supported for text — lowlight (dim), normal and highlight (bright). Not all GnuCOBOL implementations will support all three (some treat lowlight the same as normal). The deciding factor as to whether two or three levels are supported lies with the version of the curses package that is being used. This is a utility screen-IO package that is included into the GnuCOBOL run-time library when the GnuCOBOL software is built.

As a general rule of thumb, Windows implementations support two levels while Unix ones support all three.

Blinking

This too is a video feature that is dependent upon the curses package built into your version of GnuCOBOL. If blinking is enabled in that package, text displayed in fields defined in the screen section as being blinking will endlessly cycle between the brightest possible setting (highlight) and an “invisible” setting where the text color matches that of the field background color. A Windows build, which generally uses the PDCurses package, will use a brighter-than-normal background color to signify “blinking”.

Reverse Video

This video attribute simply swaps the foreground and background colors and display options.

Field Outlining

It is possible, if supported by the curses package being used, to draw borders on the top, left and/or bottom edges of a field.

Secure Input

If desired, screen fields used as input fields may be defined as “secure” fields, where each input character (regardless of what was actually typed) will appear as an asterisk (*) character. The actual character whose key was pressed will still be stored into the field in the program, however. This is very useful for password or account number fields.

Prompt Character

Input fields may have any character used as a fill character. These fill characters provide a visual indication of the size of the input field, and will automatically be transformed into spaces when the input field is processed by the program. If no such character is defined for an input field, an underscore (‘_’) will be assumed.
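
The sketch below combines several of the attributes just described: colors taken from screenio.cpy, a highlighted input field and a secure input field. It assumes that the compiler can locate screenio.cpy through its normal copybook search path, and that the program is run on a terminal the curses package can drive. The screen and data names (Login-Screen, User-Name, User-Password) are invented for the example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SCRDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * screenio.cpy supplies the COB-COLOR-xxxxxx constants.
       COPY screenio.
       01  User-Name               PIC X(20).
       01  User-Password           PIC X(12).
       SCREEN SECTION.
      * Colors given on the group apply to the fields beneath it.
       01  Login-Screen BLANK SCREEN
               FOREGROUND-COLOR COB-COLOR-WHITE
               BACKGROUND-COLOR COB-COLOR-BLUE.
           05  LINE 2 COLUMN 5  VALUE "Name:".
           05  LINE 2 COLUMN 15 PIC X(20) USING User-Name
               HIGHLIGHT.
           05  LINE 3 COLUMN 5  VALUE "Password:".
      * SECURE hides whatever is typed into this field.
           05  LINE 3 COLUMN 15 PIC X(12) USING User-Password
               SECURE.
       PROCEDURE DIVISION.
           DISPLAY Login-Screen
           ACCEPT Login-Screen
           STOP RUN.
```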

2.1.13. Report Writer Features

GnuCOBOL includes an implementation of the Report Writer Control System, or RWCS, which has been fully implemented since version 3.0. RWCS is a standardized, optional add-on feature to the COBOL language which automates much of the mechanics involved in the generation of printed reports by:

  1. Controlling the pagination of reports, including:
    1. The automatic production of a one-time notice on the first page of the report (report heading).
    2. The production of zero or more header lines at the top of every page of the report (page heading).
    3. The production of zero or more footer lines at the bottom of every page of the report (page footing).
    4. The automatic numbering of printed pages.
    5. The formatting of those report lines that make up the main body of the report (detail).
    6. Full awareness of where the “pen” is about to “write” on the current page, automatically forcing an eject to a new page, along with the automatic generation of a page footer to close the old page and/or a page header to begin the new one.
    7. The production of a one-time notice at the end of the last page of a report (report footing).
  2. Performing special reporting actions based upon the fact that the data being used to generate the report has been sorted according to one or more key fields:
    1. Automatically suppressing the presentation of one or more fields of data from the detail group when the value(s) of the field(s) duplicate those of the previously generated detail group. Fields such as these are referred to as group-indicate fields.
    2. Automatically causing suppressed detail group-indicate fields to re-appear should a detail group be printed on a new page.
    3. Recognizing when control fields on the report — fields tied to those that were used as SORT statement (see SORT) keys — have changed. This is known as a control break. The RWCS can automatically perform the following reporting actions when a control break occurs:
      • Producing a footer, known as a control footing after the detail lines that shared the same old value for the control field.
      • Producing a header, known as a control heading before the detail lines that share the same new value for the control field.
  3. Performing data summarization, as follows:
    1. Automatically generating subtotals in control and/or report footings, summarizing values of any fields in the detail group.
    2. Automatically generating crossfoot totals in detail groups. These would be sums of two or more values presented in the detail group.

The REPORT SECTION (see REPORT SECTION) documentation explores the description of reports and the PROCEDURE DIVISION (see PROCEDURE DIVISION) chapter documents the various language statements that actually produce reports. Before reading these, you might find it helpful to read Report Writer Usage, which is dedicated to putting the pieces together for you.

2.1.14. Data Initialization

There are three ways in which data division data gets initialized.

  1. When a program or subprogram is first executed, much of the data in its data division will be initialized as follows:
    • Alphanumeric and alphabetic (i.e. text) data items will be initialized to SPACES.
    • Numeric data items will be initialized to a value of ZERO.
    • Data items with an explicit VALUE (see VALUE) clause in their definition will be initialized to that specific value.

    The various sections of the data division each have their own rules as to when the actions described above will occur — consult the documentation on those sections for additional information.

    These default initialization rules can vary quite substantially from one COBOL implementation to another. For example, it is quite common for data division storage to be initialized to all binary zeros except for those data items where VALUE clauses are present. Take care when working with applications originally developed for another COBOL implementation to ensure that GnuCOBOL’s default initialization rules won’t prove disruptive.

  2. A programmer may use the INITIALIZE statement (see INITIALIZE) to initialize any group or elementary data item at any time. This statement provides far more initialization options than just the simple rules stated above.
  3. When the ALLOCATE statement (see ALLOCATE) is used to allocate a data item or to simply allocate an area of storage of a size specified on the ALLOCATE, that allocation may occur with or without initialization, as per the programmer’s needs.
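
As a sketch of the first two mechanisms, the program below displays a group item as initialized by its VALUE clauses, then after a plain INITIALIZE, and finally after INITIALIZE with the ALL TO VALUE option. The data names (Order-Rec, Cust-Name, Order-Qty) are made up for the example; see the INITIALIZE documentation for the full set of options.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. INITDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  Order-Rec.
           05  Cust-Name           PIC X(10) VALUE "UNKNOWN".
           05  Order-Qty           PIC 9(3)  VALUE 1.
       PROCEDURE DIVISION.
      * On entry the VALUE clauses have already been applied.
           DISPLAY "[" Cust-Name "] [" Order-Qty "]"
      * Plain INITIALIZE: text items to spaces, numeric items to zero.
           INITIALIZE Order-Rec
           DISPLAY "[" Cust-Name "] [" Order-Qty "]"
      * ALL TO VALUE re-applies each item's VALUE clause instead.
           INITIALIZE Order-Rec ALL TO VALUE
           DISPLAY "[" Cust-Name "] [" Order-Qty "]"
           STOP RUN.
```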

2.1.15. Syntax Diagram Conventions

Syntax of the GnuCOBOL language will be described in special syntax diagrams using the following syntactical-description techniques:

MANDATORY-RESERVED-WORD
~~~~~~~~~~~~~~~~~~~~~~~

Reserved words of the COBOL language will appear in UPPER-CASE. When they appear underlined, as this one is, they are required reserved words.

OPTIONAL-RESERVED-WORD

When reserved words appear without underlining, as this one is, they are optional; such reserved words are available in the language syntax merely to improve readability — their presence or absence has no effect upon the program.

ABBREVIATION
~~~~

When only a portion of a reserved word is underlined, it indicates that the word may either be coded in its full form or may be abbreviated to the portion that is underlined.

substitutable-items

Generic terms representing user-defined substitutable items will be shown entirely in lower-case in syntax diagrams. When such items are referenced in text, they will appear as substitutable-items.

Complex-Syntax-Clause

Items appearing in Mixed Case within a syntax diagram represent complex clauses of other syntax elements that may appear in that position. Some COBOL syntax gets quite complicated, and using a convention such as this significantly reduces the complexity of a syntax diagram. When such items are referenced in text, they will appear as Complex-Syntax-Clause.

[ ]

Square bracket meta characters on syntax diagrams document language syntax that is optional. The [] characters themselves should not be coded. If a syntax diagram contains ‘a [b] c’, the ‘a’ and ‘c’ syntax elements are mandatory but the ‘b’ element is optional.

|

Vertical bar meta characters on syntax diagrams document simple choices. The | character itself should not be coded. If a syntax diagram contains ‘a|b|c’, exactly one of the items ‘a’, ‘b’ or ‘c’ must be selected.

{ xxxxxx }
{ yyyyyy }
{ zzzzzz }

A vertical list of items, bounded by multiple brace characters, is another way of signifying a choice between a series of items where exactly one item must be selected. This form is used to show choices when one or more of the selections is more complex than just a single word, or when there are too many choices to present horizontally with ‘|’ meta characters.

| xxxxxx |
| yyyyyy |
| zzzzzz |

A vertical list of items, bounded by multiple vertical bar characters, signifies a choice between a series of items where one or more of the choices could be selected.

...

The ... meta character sequence signifies that the syntax element immediately preceding it may be repeated. The ... sequence itself should not be coded. If a syntax diagram contains ‘a b... c’, syntax element ‘a’ must be followed by at least one ‘b’ element (possibly more) and the entire sequence must be terminated by a ‘c’ syntax element.

{ }

The brace (‘{’ and ‘}’) meta characters may be used to group a sequence of syntax elements together so that they may be treated as a single entity. The {} characters themselves should not be coded. These are typically used in combination with the ‘|’ or ‘...’ meta characters.

$*^()-+=:"'<,>./

Any of these characters appearing within a syntax diagram are to be interpreted literally, and are characters that must be coded — where allowed — in the statement whose format is being described. Note that a ‘.’ character is a literal character that must be coded on a statement whereas a ‘...’ symbol is the meta character sequence described above.

2.1.16. Format of Program Source Lines

Prior to the COBOL2002 standard, source statements in COBOL programs were structured around 80-column punched cards. This means that each source line in a COBOL program consisted of five different “areas”, defined by their column number(s).

As of the COBOL2002 standard, a second mode now exists for COBOL source code statements — in this mode of operation, COBOL statements may each be up to 255 characters long, with no specific requirements as to what should appear in which columns.

Of course, in keeping with the long-standing COBOL tradition of maintaining backwards compatibility with older standards, programmers (and, of course, compliant COBOL compilers) are capable of working in either mode. It is even possible to switch back and forth in the same program. The terms Fixed Format Mode and Free Format Mode are used to refer to these two modes of source code formatting.

The GnuCOBOL compiler (cobc) supports both of these source line format modes, defaulting to Fixed Format Mode lacking any other information.

The compiler can be instructed to operate in either mode in any of the following four ways:

  1. Using a compiler option switch — use the -fixed switch to start in Fixed Format Mode (remember that this is the default) or the -free switch to start in Free Format Mode.
  2. You may use the SOURCEFORMAT AS FIXED and SOURCEFORMAT AS FREE clauses of the >>SET CDF directive (see >>SET) within your source code to switch to Fixed or Free Format Mode, respectively.
  3. You may use the >>FORMAT IS FIXED and FORMAT IS FREE clauses of the >>DEFINE CDF directive (see >>DEFINE) within your source code to switch to Fixed or Free Format Mode, respectively.
  4. You may use the >>SOURCE CDF directive (see >>SOURCE) to switch to Free Format Mode (>>SOURCE FORMAT IS FREE) or Fixed Format Mode (>>SOURCE FORMAT IS FIXED).

Using methods 2-4 above, you may switch back and forth between the two formats at will.

The last three options above are all equivalent; all three are supported by GnuCOBOL so that source code compatibility may be maintained with a wide variety of other COBOL implementations. With all three, if the compiler is currently in Fixed Format Mode, the >> must begin in column 8 or beyond, provided no part of the directive extends past column 72. If the compiler is currently in Free Format Mode, the >> may appear in any column, provided no part of the directive extends past column 255.
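As a rough illustration of method 4, the sketch below starts in the default Fixed Format Mode, switches to Free Format Mode and then switches back. The program name FMTDEMO and the displayed text are invented for this example; the placement of each directive follows the column rules just described.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FMTDEMO.
       PROCEDURE DIVISION.
      *> Fixed Format Mode: the directive starts in column 8 or beyond
           DISPLAY "Fixed Format Mode"
       >>SOURCE FORMAT IS FREE
*> Free Format Mode: code and directives may start in any column
DISPLAY "Free Format Mode"
>>SOURCE FORMAT IS FIXED
           DISPLAY "Fixed Format Mode again"
           GOBACK.
```

Switching formats in the middle of a program is mostly useful when copying in source that was written for the other mode; new programs usually stay in one mode throughout.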

Depending upon which source format mode the compiler is in, you will need to follow various rules for the format mode currently in effect. These rules are presented in the upcoming paragraphs.

The following discussion presents the various components of every GnuCOBOL source line record when the compiler is operating in Fixed Format Mode. Remember that this is the default mode for the GnuCOBOL compiler.

1-6

Sequence Number Area

Historically, back in the days when punched-cards were used to submit COBOL program source to a COBOL compiler, this part of a COBOL statement was reserved for a six-digit sequence number. While the contents of this area are ignored by COBOL compilers, it existed so that a program actually punched on 80-character cards could — if the card deck were dropped on the floor — be run through a card sorter machine and restored to its proper sequence. Of course, this isn’t necessary today; if truth be told, it hasn’t been necessary for a long time.

See Marking Changes in Programs, for discussion of a valuable use to which the sequence number area may be put today.

7

Indicator Area

Column 7 serves as an indicator in which one of five possible values will appear — space, D (or d), - (dash), / or *. The meanings of these characters are as follows:

space

No special meaning — this is the normal character that will appear in this area.

D/d

The line contains a valid GnuCOBOL statement that is normally treated as a comment unless the program is being compiled in debugging mode.

*

The line is a comment.

/

The line is a comment that will also force a page eject in the compilation listing. While GnuCOBOL will honour such a line as a comment, it will not form-feed any generated listing.

-

The line is a continuation of the previous line. These are needed only when an alphanumeric literal (quoted character string), reserved word or user-defined word is being split across lines.

8-11

Area A

Language DIVISION, SECTION and paragraph headers must begin in Area A, as must the level numbers 01 and 77 in data description entries and the FD and SD (file and sort description) headers.

12-72

Area B

All other COBOL programming language components are coded in these columns.

73-80

Program Name Area

This is another obsolete area of COBOL statements. This part of every statement also hails back to the day when programs were punched on cards; it was expected that the name of the program (or at least the first 8 characters of it) would be punched here so that — if a dropped COBOL source deck contained more than one program — that handy card sorter machine could be used to first separate the cards by program name and then sort them by sequence number. Today’s COBOL compilers (including GnuCOBOL) simply ignore anything past column 72.

See Marking Changes in Programs, for discussion of a valuable use to which the program name area may be put today.
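The column areas described above can be seen in a small fixed-format program. This is only an illustrative sketch: the program name AREADEMO, the data item GREETING and the sequence numbers in columns 1-6 are invented for the example.

```cobol
      *> Columns: 1-6 sequence, 7 indicator, 8-11 Area A, 12-72 Area B
000100 IDENTIFICATION DIVISION.
000200 PROGRAM-ID. AREADEMO.
000300 DATA DIVISION.
000400 WORKING-STORAGE SECTION.
000500 01  GREETING            PIC X(20) VALUE "Hello from Area B".
000600* A full-line comment: asterisk in column 7
000700 PROCEDURE DIVISION.
000800 MAIN-PARA.
000900     DISPLAY GREETING
001000D    DISPLAY "Debug line: only compiled in debugging mode"
001100     GOBACK.
```

Division, section and paragraph headers and the 01 level number start in column 8 (Area A), while the statements start in column 12 (Area B). The line with D in column 7 is treated as a comment unless the program is compiled in debugging mode.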

2.1.17. Program Structure

Complete GnuCOBOL Program Syntax

 [ IDENTIFICATION DIVISION. ]
   ~~~~~~~~~~~~~~~~~~~~~~~
   PROGRAM-ID|FUNCTION-ID.  name-1 [ Program-Options ] .
   ~~~~~~~~~~ ~~~~~~~~~~~
 [ ENVIRONMENT DIVISION. ]
   ~~~~~~~~~~~ ~~~~~~~~
 [ CONFIGURATION SECTION. ]
   ~~~~~~~~~~~~~ ~~~~~~~
 [ SOURCE-COMPUTER.         Compilation-Computer-Specification . ]
   ~~~~~~~~~~~~~~~
 [ OBJECT-COMPUTER.         Execution-Computer-Specification . ]
   ~~~~~~~~~~~~~~~
 [ REPOSITORY.              Function-Specification... . ]
   ~~~~~~~~~~
 [ SPECIAL-NAMES.           Program-Configuration-Specification . ]
   ~~~~~~~~~~~~~
 [ INPUT-OUTPUT SECTION. ]
   ~~~~~~~~~~~~ ~~~~~~~
 [ FILE-CONTROL.            General-File-Description... . ]
   ~~~~~~~~~~~~
 [ I-O-CONTROL.             File-Buffering-Specification... . ]
   ~~~~~~~~~~~
 [ DATA DIVISION. ]
   ~~~~~~~~~~~~~
 [ FILE SECTION.            Detailed-File-Description... . ]
   ~~~~~~~~~~~~
 [ WORKING-STORAGE SECTION. Permanent-Data-Definition... . ]
   ~~~~~~~~~~~~~~~ ~~~~~~~
 [ LOCAL-STORAGE SECTION.   Temporary-Data-Definition... . ]
   ~~~~~~~~~~~~~ ~~~~~~~
 [ LINKAGE SECTION.         Subprogram-Argument-Description... . ]
   ~~~~~~~ ~~~~~~~
 [ REPORT SECTION.          Report-Description... . ]
   ~~~~~~ ~~~~~~~
 [ SCREEN SECTION.          Screen-Layout-Definition... . ]
   ~~~~~~ ~~~~~~~
   PROCEDURE DIVISION [ { USING Subprogram-Argument...      } ]
   ~~~~~~~~~ ~~~~~~~~   { ~~~~~                             }
                        { CHAINING Main-Program-Argument... }
                          ~~~~~~~~
                      [   RETURNING identifier-1 ] .
 [ DECLARATIVES. ]        ~~~~~~~~~
   ~~~~~~~~~~~~
 [ Event-Handler-Routine... . ]
 [ END DECLARATIVES. ]
   ~~~ ~~~~~~~~~~~~
   General-Program-Logic
 [ Nested-Subprogram... ]
 [ END PROGRAM|FUNCTION name-1 ]
   ~~~ ~~~~~~~ ~~~~~~~~

Each program consists of up to four Divisions (major groupings of sections, paragraphs and descriptive or procedural coding that all relate to a common purpose), named Identification, Environment, Data and Procedure.

  1. Not all divisions are needed in every program, but they must be specified in the order shown when they are used.
  2. The following points pertain to the identification division:
    • The IDENTIFICATION DIVISION. header is always optional.
  3. The following points pertain to the environment division:
    • If both optional sections of this division are coded, they must be coded in the sequence shown.
    • Each of these sections consists of a series of specific paragraphs (SOURCE-COMPUTER and OBJECT-COMPUTER, for example). Each of these paragraphs serves a specific purpose. If no code is required for the purpose one of the paragraphs serves, the entire paragraph may be omitted.
    • If none of the paragraphs within one of the sections are coded, the section header itself may be omitted.
    • The paragraphs within each section may only be coded in that section, but may be coded in any order.
    • If none of the sections within the environment division are coded, the ENVIRONMENT DIVISION. header itself may be omitted.
  4. The following points pertain to the data division:
    • The data division consists of six optional sections — when used, those sections must be coded in the order shown in the syntax diagram.
    • Each of these sections consists of code which serves a specific purpose. If no code is required for the purpose one of those sections serves, the entire section, including its header, may be omitted.
    • If none of the sections within the data division are coded (a highly unlikely, but theoretically possible circumstance), the DATA DIVISION. header itself may be omitted.
  5. The following points pertain to the procedure division:
    • As with the other divisions, the procedure division may consist of sections and those sections may — in turn — consist of paragraphs. Unlike the other divisions, however, section and paragraph names are defined by the programmer, and there may not be any defined at all if the programmer so wishes.
    • Each Event-Handler-Routine will be a separate section devoted to trapping a particular run-time event. If there are no such sections coded, the DECLARATIVES. and END DECLARATIVES. lines may be omitted.
  6. A single file of COBOL source code may contain:
    • A portion of a program; these files are known as copybooks.
    • A single program. In this case, the END PROGRAM or END FUNCTION statement is optional.
    • Multiple programs, separated from one another by END PROGRAM or END FUNCTION statements. The final program in such a source code file need not have an END PROGRAM or END FUNCTION statement.
  7. Subprogram ‘B’ may be nested inside program ‘A’ by including program B’s source code at the end of program A’s procedure division without an intervening END PROGRAM A. or END FUNCTION A. statement. For now, that’s all that will be said about nesting. See Independent vs Contained vs Nested Subprograms, for more information.
  8. Regardless of how many programs comprise a single GnuCOBOL source file, only a single output executable program will be generated from that source file when the file is compiled.
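To make the ordering rules above concrete, here is a minimal sketch of a complete program that uses all four divisions in the required sequence. The program name SKELETON and the data item RUN-COUNT are hypothetical; the optional sections and paragraphs that this program does not need are simply omitted.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SKELETON.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       REPOSITORY.
           FUNCTION ALL INTRINSIC.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  RUN-COUNT           PIC 9(4) VALUE 1.
       PROCEDURE DIVISION.
       MAIN-PARA.
           DISPLAY "Run number " RUN-COUNT
           GOBACK.
       END PROGRAM SKELETON.
```

Because this source file holds a single program, the END PROGRAM line could be left out; it becomes necessary only when another program follows in the same file.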

2.1.18. Comments

The following information describes how comments may be embedded into GnuCOBOL program source to provide documentation.

Each comment type is described below for both source format modes (FIXED and FREE).

Blank Lines

FIXED: Blank lines may be inserted as desired.

FREE: Blank lines may be inserted as desired.

Full-line comments

FIXED: An entire source line will be treated as a comment (and will be ignored by the compiler) by coding an asterisk (‘*’) in column seven (7).

FREE: An entire source line will be treated as a comment (and will be ignored by the compiler) by coding the sequence ‘*>’, starting in any column, as the first non-blank characters on the line.

Full-line comments with form-feed

FIXED: An entire source line will be treated as a comment by coding a slash (‘/’) in column seven (7). Many COBOL compilers will also issue a form-feed in the program listing so that the ‘/’ line is at the top of a new page. The GnuCOBOL compiler does not support this form-feed behaviour.

The GnuCOBOL Interactive Compiler, or GCic, does support this form-feed behaviour when it generates program source listings! See GCic in GnuCOBOL Sample Programs, for the source and cross-reference listing (produced by GCic) of this program; you can see the effect of ‘/’ there.

FREE: There is no Free Source Mode equivalent to ‘/’.

Partial-line comments

FIXED: Any text following the character sequence ‘*>’ on a source line will be treated as a comment. The ‘*’ must appear in column seven (7) or beyond.

FREE: Any text following the character sequence ‘*>’ on a source line will be treated as a comment. The ‘*’ may appear in any column.

Comments that may be treated as code, typically for debugging purposes

FIXED: By coding a ‘D’ in column 7 (upper- or lower-case), an otherwise valid GnuCOBOL source line will be treated as a comment by the compiler.

FREE: By specifying the character sequence ‘>>D’ (upper- or lower-case) as the first non-blank characters on a source line, an otherwise valid GnuCOBOL source line will be treated as a comment by the compiler.

Debugging statements may be compiled either by specifying the -fdebugging-line switch on the GnuCOBOL compiler or by adding the WITH DEBUGGING MODE clause to the SOURCE-COMPUTER paragraph.
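The Free Format Mode comment styles can be sketched as follows. The program name CMTDEMO and the displayed text are assumptions made for the example, and the first line is a directive coded in column 8 so that a compiler starting in Fixed Format Mode switches to Free Format Mode.

```cobol
       >>SOURCE FORMAT IS FREE
*> A full-line comment in Free Format Mode
IDENTIFICATION DIVISION.
PROGRAM-ID. CMTDEMO.

PROCEDURE DIVISION.
    DISPLAY "Always compiled"      *> a partial-line comment
>>D DISPLAY "Compiled only as a debugging line"
    GOBACK.
```

Compiled normally, the >>D line should be skipped as a comment; with the -fdebugging-line switch it should be compiled like any other statement.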

2.1.19. Literals

Literals are constant values that will not change during the execution of a program. There are two fundamental types of literals — numeric and alphanumeric.

2.1.19.1. Numeric Literals

A numeric literal may take any of the following forms:

  • Integers such as 1, 56, 2192 or -54.
  • Non-integer fixed point values such as 1.317 or -2.95.
  • Floating-point values using ‘Enn’ notation such as 9.92E25, representing 9.92 x 10^25 (10 raised to the 25th power) or 5.7E-14, representing 5.7 x 10^-14 (10 raised to the -14th power). Both the mantissa (the number before the ‘E’) and the exponent (the number after the ‘E’) may be explicitly specified as positive (with a ‘+’), negative (with a ‘-’) or unsigned (and therefore implicitly positive). A floating-point literal’s value must be within the range -1.7 x 10^308 to +1.7 x 10^308 with no more than 15 decimal digits of precision.
  • Hexadecimal numeric literals
  • Null terminated literals
  • Raw C string using L"characters".
  • Binary using B# followed by the digits 0 or 1.
  • Octal using O# followed by the digits 0 - 7. (That is the letter ‘O’.)
  • Hexadecimal number using H# or X# followed by the digits ‘0’ - ‘F’.
  • Boolean Literals (Standard) B" character ".
  • Boolean Literals (Hexadecimal) BX" hex character ".
  • National Literals (Standard) N" character " or NC" character ".
  • National Literals (Hexadecimal) NX" character ".

2.1.19.2. Alphanumeric Literals

An alphanumeric literal is not valid for use in arithmetic expressions unless it is first converted to its numeric computational equivalent. Three numeric conversion intrinsic functions built into GnuCOBOL can perform this conversion: NUMVAL (see NUMVAL), NUMVAL-C (see NUMVAL-C) and NUMVAL-F (see NUMVAL-F).
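A minimal sketch of such a conversion is shown here, using NUMVAL on an alphanumeric data item before doing arithmetic with it. The program name NUMVALDEMO and the data items WS-TEXT and WS-AMOUNT are hypothetical names chosen for the example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. NUMVALDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-TEXT             PIC X(8)     VALUE " 123.45 ".
       01  WS-AMOUNT           PIC 9(5)V99  VALUE ZERO.
       PROCEDURE DIVISION.
      *> NUMVAL converts the character string to a numeric value
           COMPUTE WS-AMOUNT = FUNCTION NUMVAL (WS-TEXT) + 1
           DISPLAY WS-AMOUNT
           GOBACK.
```

After the COMPUTE, WS-AMOUNT should hold 123.45 plus 1; coding WS-TEXT directly in the arithmetic expression would not be valid.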

Alphanumeric literals may take any of the following forms:

  • A sequence of characters enclosed by a pair of single-quote (‘'’) or double-quote (‘"’) characters constitutes a string literal.
  • A literal formed according to the same rules as for a string literal (above), but prefixed with the letter ‘Z’ (upper- or lower-case), constitutes a zero-delimited string literal. These literals differ from ordinary string literals in that they will be explicitly terminated with a byte of hexadecimal value 00.
  • A Hexadecimal Alphanumeric Literal.

Alphanumeric literals too long to fit on a single line may be continued to the next line in one of two ways:

  1. If you are using Fixed Format Mode, the alphanumeric literal can be run right up to and including column 72. The literal may then be continued on the next line anywhere after column 11 by coding another quote or apostrophe (whichever was used to begin the literal originally). The continuation line must also have a hyphen (-) in column 7, the indicator area:
         1         2         3         4         5         6         7   
1234567890123456789012345678901234567890123456789012345678901234567890123

       01  LONG-LITERAL-VALUE-DEMO     PIC X(60) VALUE "This is a long l
      -                                                "ong literal that
      -                                                " must be continu
      -                                                "ed.".
  2. Regardless of whether the compiler is operating in Fixed or Free Format Mode, GnuCOBOL allows alphanumeric literals to be broken up into separate fragments. These fragments have their own beginning and ending quote/apostrophe characters and are “glued together” at compilation time using the ‘&’ character:
         1         2         3         4         5         6         7   
1234567890123456789012345678901234567890123456789012345678901234567890123

       01  LONG-LITERAL-VALUE-DEMO     PIC X(60) VALUE "This is a" &
                                        " long literal that must " &
                                                    "be continued.".

If your program is using Free Format Mode, there’s less need to continue long alphanumeric literals because statements may be as long as 255 characters.

Numeric literals may be split across lines just as alphanumeric literals are, using either of the above techniques and both reserved and user-defined words can be split across lines too (using the first technique). The continuation of numeric literals and user-defined/reserved words is provided merely to provide compatibility with older COBOL versions and programs, but should not be used with new programs — it just makes for ugly-looking programs.

2.1.19.3. Figurative Constants

Figurative constants are reserved words that may be used as literals anywhere the figurative constant’s value could be interpreted as an arbitrarily long sequence of the characters in question. When a specific length is required, such as would be the case with an argument to a subprogram, a figurative constant may not be used. Thus, the following are valid uses of figurative constants:


05 FILLER                PIC 9(10) VALUE ZEROS.
   ...
MOVE SPACES TO Employee-Name

But this is not:


CALL "SUBPGM" USING SPACES

The following are the GnuCOBOL figurative constants and their respective equivalent values.

ZERO

This figurative constant has a value of numeric 0 (zero). ZEROS and ZEROES are both synonyms of ZERO.

SPACE

This figurative constant has a value of one or more space characters. SPACES is a synonym of SPACE.

QUOTE

This figurative constant has a value of one or more double-quote characters ("). QUOTES is a synonym of QUOTE.

LOW-VALUE

This figurative constant has a value of one or more of whatever character occupies the lowest position in the program’s collating sequence as defined in the OBJECT-COMPUTER (see OBJECT-COMPUTER) paragraph or — if no such specification was made — in whatever default character set the program is using (typically, this is the ASCII character set). LOW-VALUES is a synonym of LOW-VALUE.

When the character set in use is ASCII with no collating sequence modifications, the LOW-VALUES figurative constant value is the ASCII NUL character. Because character sets can be redefined, however, you should not rely on this fact. Use the NULL figurative constant instead.

HIGH-VALUE

This figurative constant has a value of one or more of whatever character occupies the highest position in the program’s collating sequence as defined in the OBJECT-COMPUTER paragraph or — if no such specification was made — in whatever default character set the program is using (typically, this is the ASCII character set). HIGH-VALUES is a synonym of HIGH-VALUE.

NULL

A character comprised entirely of zero-bits (regardless of the program’s collating sequence).

Programmers may create their own figurative constants via the SYMBOLIC CHARACTERS (see Symbolic-Characters-Clause) clause of the SPECIAL-NAMES (see SPECIAL-NAMES) paragraph.

2.1.20. Punctuation

A comma (‘,’)

The use of comma characters can cause confusion to a COBOL compiler if the DECIMAL-POINT IS COMMA clause is used in the SPECIAL-NAMES (see SPECIAL-NAMES) paragraph, as might be the case in Europe. Consider the following statement, which calls a subroutine passing it two arguments (the numeric constants 1 and 2):

CALL "SUBROUTINE" USING 1,2

With DECIMAL-POINT IS COMMA in effect, this would actually be interpreted as a subroutine call with 1 argument (the non-integer numeric literal whose value is 1 and 2 tenths). For this reason, it is best to always follow a comma with a space.
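The same effect can be sketched without a subprogram by using ADD. The program name COMMADEMO and the data item WS-TOTAL are invented for this example, which assumes the comma-as-decimal-point clause is in effect.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. COMMADEMO.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SPECIAL-NAMES.
           DECIMAL-POINT IS COMMA.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-TOTAL            PIC 99V9 VALUE ZERO.
       PROCEDURE DIVISION.
      *> Comma followed by a space: two literals, 1 and 2
           ADD 1, 2 TO WS-TOTAL
      *> No space after the comma: one literal, 1 and 2 tenths
           ADD 1,2 TO WS-TOTAL
           DISPLAY WS-TOTAL
           GOBACK.
```

Read this way, the first ADD contributes 3 to WS-TOTAL and the second contributes only 1 and 2 tenths, which is why the space after the comma matters.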

The period character (‘.’)

The rules for where and when periods are needed in the procedure division are somewhat complicated. See Use of Periods, for the details.

2.1.21. Interfacing to Other Environments

Through the CALL statement, COBOL programs may invoke other COBOL programs serving as subprograms. This is quite similar to cross-program linkage capabilities provided by other languages. In GnuCOBOL’s case, the CALL facility is powerful enough to be tailored to the point where a GnuCOBOL program can communicate with operating system, database management and run-time library APIs, even if they weren’t written in COBOL themselves. See GnuCOBOL Main Programs CALLing C Subprograms, for an example of how a GnuCOBOL program could invoke a C-language subprogram, passing information back and forth between the two.

The fact that GnuCOBOL supports a full-featured two-way interface with C-language programs means that — even if you cannot access a library API directly — you could always do so via a small C “wrapper” program that is CALLed by a GnuCOBOL program.

2.2. The COBOL Language - Advanced Techniques

2.2.1. Table References

COBOL uses parentheses to specify the subscripts used to reference table entries (tables in COBOL are what other programming languages refer to as arrays).

For example, observe the following data structure which defines a 4 column by 3 row grid of characters:

01  GRID.
     05 GRID-ROW OCCURS 3 TIMES.
        10 GRID-COLUMN OCCURS 4 TIMES.
            15 GRID-CHARACTER       PIC X(1).

If the structure contains the following grid of characters:

A B C D
E F G H
I J K L

Then GRID-CHARACTER (2, 3) references the ‘G’ and GRID-CHARACTER (3, 2) references the ‘J’.

Subscripts may be specified as numeric (integer) literals, numeric (integer) data items, data items created with any of the picture-less integer USAGE (see USAGE) specifications, USAGE INDEX data items or arithmetic expressions resulting in a non-zero integer value.

In the above examples, a comma is used as a separator character between the two subscript values; semicolons (;) are also valid subscript separator characters, as are spaces! The use of a comma or semicolon separator in such a situation is technically optional, but by convention most COBOL programmers use one or the other. The use of no separator character (other than a space) is not recommended, even though it is syntactically correct, as this practice can lead to programmer-unfriendly code. It isn’t too difficult to read and understand GRID-CHARACTER(2 3), but it’s another story entirely when trying to comprehend GRID-CHARACTER(I + 1 J / 3) (instead of GRID-CHARACTER(I + 1, J / 3)). The compiler accepts it, but too much of this would make my head hurt.
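The grid example can be tried out with a short program along these lines. The program name GRIDDEMO is an assumption; the table is filled row by row by moving a 12-character literal to the group item.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. GRIDDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  GRID.
           05 GRID-ROW OCCURS 3 TIMES.
              10 GRID-COLUMN OCCURS 4 TIMES.
                 15 GRID-CHARACTER   PIC X(1).
       PROCEDURE DIVISION.
      *> Rows become ABCD, EFGH and IJKL
           MOVE "ABCDEFGHIJKL" TO GRID
      *> Row 2, column 3 is the G
           DISPLAY GRID-CHARACTER (2, 3)
      *> Row 3, column 2 is the J
           DISPLAY GRID-CHARACTER (3, 2)
           GOBACK.
```

The first subscript selects the GRID-ROW occurrence and the second selects the GRID-COLUMN occurrence within that row.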

2.2.2. Qualification of Data Names

COBOL allows data names to be duplicated within a program, provided references to those data names may be made in such a manner as to make those references unique through a process known as qualification.

To see qualification at work, observe the following segments of two data records defined in a COBOL program:

01  EMPLOYEE.                     01  CUSTOMER.
    05 MAILING-ADDRESS.               05 MAILING-ADDRESS.
       10 STREET        PIC X(35).       10 STREET        PIC X(35).
       10 CITY          PIC X(15).       10 CITY          PIC X(15).
       10 STATE         PIC X(2).        10 STATE         PIC X(2).
       10 ZIP-CODE.                      10 ZIP-CODE.
          15 ZIP-CODE-5 PIC 9(5).           15 ZIP-CODE-5 PIC 9(5).
          15 FILLER     PIC X(4).           15 FILLER     PIC X(4).

Now, let’s deal with the problem of setting the CITY portion of an EMPLOYEE’s MAILING-ADDRESS to ‘Philadelphia’. Clearly, MOVE 'Philadelphia' TO CITY cannot work because the compiler will be unable to determine which of the two CITY fields you are referring to.

In an attempt to correct the problem, we could qualify the reference to CITY as MOVE 'Philadelphia' TO CITY OF MAILING-ADDRESS.

Unfortunately, that too is insufficient because it still does not specify which CITY is being referenced. To truly identify which specific CITY you want, you’d have to code MOVE 'Philadelphia' TO CITY OF MAILING-ADDRESS OF EMPLOYEE.

Now there can be no confusion as to which CITY is being changed. Fortunately, you don’t need to be quite so specific; COBOL allows intermediate and unnecessary qualification levels to be omitted. This allows MOVE 'Philadelphia' TO CITY OF EMPLOYEE to do the job nicely.

If you need to qualify a reference to a table, do so by coding something like identifier-1 OF identifier-2 ( subscript(s) ).

The reserved word IN may be used in lieu of OF.
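The following sketch pulls the qualification rules together in a trimmed-down version of the two records. The program name QUALDEMO and the second city name are made up for the example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. QUALDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  EMPLOYEE.
           05 MAILING-ADDRESS.
              10 CITY          PIC X(15).
       01  CUSTOMER.
           05 MAILING-ADDRESS.
              10 CITY          PIC X(15).
       PROCEDURE DIVISION.
      *> Fully qualified reference
           MOVE "Philadelphia" TO CITY OF MAILING-ADDRESS OF EMPLOYEE
      *> Intermediate level omitted; IN used in lieu of OF
           MOVE "Pittsburgh" TO CITY IN CUSTOMER
           DISPLAY CITY OF EMPLOYEE
           DISPLAY CITY OF CUSTOMER
           GOBACK.
```

An unqualified reference to CITY in this program would be rejected as ambiguous, since two data items share that name.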

2.2.3. Reference Modifiers

Reference Modifier (Format 1) Syntax

 identifier-1 [ OF|IN identifier-2 ] [ (subscript...) ] (start:[ length ])
                ~~ ~~

Reference Modifier (Format 2) Syntax

 intrinsic-function-reference (start:[ length ])

The COBOL 1985 standard introduced the concept of a reference modifier to facilitate references to only a portion of a data item; GnuCOBOL fully supports reference modification.

The start value indicates the starting character position being referenced (character position values start with 1, not 0 as is the case in some programming languages) and length specifies how many characters are wanted.

If no length is specified, a value equivalent to the remaining character positions from start to the end of identifier-1 or to the end of the value returned by the function will be assumed.

Both start and length may be specified as integer numeric literals, integer numeric data items or arithmetic expressions with an integer value.

Here are a few examples:

CUSTOMER-LAST-NAME (1:3)

References the first three characters of CUSTOMER-LAST-NAME.

CUSTOMER-LAST-NAME (4:)

References all character positions of CUSTOMER-LAST-NAME from the fourth onward.

FUNCTION CURRENT-DATE (5:2)

References the current month as a 2-digit number in character form. See CURRENT-DATE, for more information.

Hex-Digits (Nibble + 1:1)

Assuming that Nibble is a numeric data item with a value in the range 0-15, and Hex-Digits is a PIC X(16) item with a value of 0123456789ABCDEF, this converts that numeric value to a hexadecimal digit.

Table-Entry (6) (7:5)

References characters 7 through 11 (5 characters in total) in the 6th occurrence of Table-Entry.

Reference modification may be used anywhere an identifier is legal, including serving as the receiving field of statements like MOVE (see MOVE), STRING (see STRING) and ACCEPT (see ACCEPT), to name a few.
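A few of these reference modifiers are exercised in the sketch below. The program name REFMODDEMO and the initial values are assumptions chosen so that the results are easy to follow.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. REFMODDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  CUSTOMER-LAST-NAME  PIC X(10) VALUE "Anderson".
       01  HEX-DIGITS          PIC X(16) VALUE "0123456789ABCDEF".
       01  NIBBLE              PIC 99    VALUE 11.
       PROCEDURE DIVISION.
      *> First three characters: "And"
           DISPLAY CUSTOMER-LAST-NAME (1:3)
      *> Fourth character onward: "erson" plus trailing spaces
           DISPLAY CUSTOMER-LAST-NAME (4:)
      *> Character 12 of HEX-DIGITS, the hex digit for 11: "B"
           DISPLAY HEX-DIGITS (NIBBLE + 1:1)
      *> Reference modification as a receiving field
           MOVE "XYZ" TO CUSTOMER-LAST-NAME (1:3)
           DISPLAY CUSTOMER-LAST-NAME
           GOBACK.
```

The final MOVE changes only the first three characters, leaving the rest of CUSTOMER-LAST-NAME untouched.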

2.2.4. Arithmetic Expressions

Arithmetic-Expression Syntax

 Unary-Expression-1 { **|^ } Unary-Expression-2
                    {  *|/ }
                    {  +|- }

Unary-Expression Syntax

 { [ +|- ] { ( Arithmetic-Expression-1 )          } }
 {         { [ LENGTH OF ] { identifier-1       } } }
 {         {   ~~~~~~ ~~   { literal-1          } } }
 {         {               { Function-Reference } } }
 { Arithmetic-Expression-2                          }

Arithmetic expressions are formed using four categories of operations — exponentiation, multiplication & division, addition & subtraction, and sign specification.

In complex expressions composed of multiple operators and operands, a precedence of operation applies whereby those operations having a higher precedence are computed first before operations with a lower precedence.

As is the case in almost any other programming language, the programmer is always free to use pairs of parentheses to enclose sub-expressions of complex expressions that are to be evaluated before other sub-expressions rather than let operator precedence dictate the sequence of evaluation.

In highest to lowest order of precedence, here is a discussion of each category of operation:

Level 1 (Highest) — Unary Sign Specification (+ and - with a single argument)

The unary “minus” (-) operator returns the arithmetic negation of its single argument, effectively returning as its value the product of its argument and -1.

The unary “plus” (+) operator returns the value of its single argument, effectively returning as its value the product of its argument and +1.

Level 2 — Exponentiation (** or ^)

The value of the left argument is raised to the power indicated by the right argument. Non-integer powers are allowed. The ^ and ** operators are both supported to provide compatibility with programs written for other COBOL implementations.

Level 3 — Multiplication (*) and division (/)

The * operator computes the product of the left and right arguments while the / operator computes the value of the left argument divided by the value of the right argument. If the right argument has a value of zero, expression evaluation will be prematurely terminated before a value is generated. This may cause program failure at run-time.

A sequence of multiple 3rd-level operations (A * B / C, for example) will evaluate in strict left-to-right sequence if no parentheses are used to control the order of evaluation.

Level 4 — Addition (+) or subtraction (-)

The + operator calculates the sum of the left and right arguments while the - operator computes the value of the right argument subtracted from that of the left argument.

A sequence of multiple 4th-level operations (A - B + C, for example) will evaluate in strict left-to-right sequence if no parentheses are used to control the order of evaluation.

The syntactical rules of COBOL, allowing a dash (-) character in data item names, can lead to some ambiguity.

01  C        PIC 9 VALUE 5.
01  D        PIC 9 VALUE 2.
01  C-D      PIC 9 VALUE 7.
01  I        PIC 9 VALUE 0.
…
COMPUTE I=C-D+1

The COMPUTE (see COMPUTE) statement will evaluate the arithmetic expression C-D+1 and then save that result in I.

What value will be stored in I? The number 4, which is the result of subtracting the value of D (2) from the value of C (5) and then adding 1? Or, will it be the number 8, which is the value of adding 1 to the value of data item C-D (7)?

The right answer is 8 — the value of data item C-D plus 1! Hopefully, that was the intended result.

The GnuCOBOL compiler actually went through the following decision-making logic when generating code for the COMPUTE Statement:

  1. Is there a data item named C-D defined? If so, use its value for the character sequence C-D.
  2. If there is no C-D data item, then are there C and D data items? If not, the COMPUTE statement is in error. If there are, however, then code will be generated to subtract the value of D from C and add 1 to the result.

Had there been at least one space to the left and/or the right of the -, there would have been no ambiguity — the compiler would have been forced to use the individual C and D data items.

To avoid any possible ambiguity, as well as to improve program readability, it’s considered good COBOL programming practice to always code at least one space to both the left and right of every operator in arithmetic expressions as well as the = sign on a COMPUTE.
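The difference that the spaces make can be seen by running both forms side by side. This sketch reuses the four data items from the example above; only the program name DASHDEMO is new.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DASHDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  C        PIC 9 VALUE 5.
       01  D        PIC 9 VALUE 2.
       01  C-D      PIC 9 VALUE 7.
       01  I        PIC 9 VALUE 0.
       PROCEDURE DIVISION.
      *> No spaces: C-D is taken as the data item C-D (7 + 1 = 8)
           COMPUTE I = C-D + 1
           DISPLAY I
      *> Spaces around the operator: C minus D plus 1 (5 - 2 + 1 = 4)
           COMPUTE I = C - D + 1
           DISPLAY I
           GOBACK.
```

Both statements compile cleanly, which is exactly why consistent spacing around operators is worth the habit.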

Here are some examples of how the precedence of operations affects the results of arithmetic expressions (all examples use numeric literals, to simplify the discussion).

Expression | Result | Notes
3 * 4 + 1 | 13 | * has precedence over +.
4 * 2 ^ 3 - 10 | 22 | 2 ^ 3 is 8 (^ has precedence over *), times 4 is 32, minus 10 is 22.
(4 * 2) ^ 3 - 10 | 502 | Parentheses provide for a recursive application of the arithmetic expression rules, effectively allowing you to alter the precedence of operations. 4 times 2 is 8 (the use of parentheses “trumps” the exponentiation operator, so the multiplication happens first); 8 ^ 3 is 512, minus 10 is 502.
5 / 2.5 + 7 * 2 - 1.15 | 14.85 | Integer and non-integer operands may be freely intermixed: 5 / 2.5 is 2, 7 * 2 is 14, and 2 plus 14 minus 1.15 is 14.85.

Of course, arithmetic expression operands may be numeric data items (any USAGE except POINTER or PROGRAM-POINTER) as well as numeric literals.
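The precedence rules can be checked with a short program such as the following sketch, which evaluates three integer expressions with COMPUTE. The program name PRECDEMO and the data item WS-RESULT are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PRECDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-RESULT           PIC 9(4) VALUE ZERO.
       PROCEDURE DIVISION.
      *> Multiplication before addition: 12 + 1 = 13
           COMPUTE WS-RESULT = 3 * 4 + 1
           DISPLAY WS-RESULT
      *> Exponentiation first: 4 * 8 - 10 = 22
           COMPUTE WS-RESULT = 4 * 2 ** 3 - 10
           DISPLAY WS-RESULT
      *> Parentheses first: 8 ** 3 - 10 = 502
           COMPUTE WS-RESULT = (4 * 2) ** 3 - 10
           DISPLAY WS-RESULT
           GOBACK.
```

The ** operator is used here, but ^ could be substituted without changing the results.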

2.2.5. Conditional Expressions

Conditional expressions are expressions which identify the circumstances under which a program may take an action or cease taking an action. As such, conditional expressions produce a value of TRUE or FALSE.

There are seven types of conditional expressions, as discussed in the following sections.

2.2.5.1. Condition Names

These are the simplest of all conditions. Observe the following code:

05  SHIRT-SIZE               PIC 99V9.
    88 TINY                  VALUE 0 THRU 12.5.
    88 XS                    VALUE 13 THRU 13.5.
    88 S                     VALUE 14, 14.5.
    88 M                     VALUE 15, 15.5.
    88 L                     VALUE 16, 16.5.
    88 XL                    VALUE 17, 17.5.
    88 XXL                   VALUE 18, 18.5.
    88 XXXL                  VALUE 19, 19.5.
    88 VERY-LARGE            VALUE 20 THRU 99.9.

The condition names TINY, XS, S, M, L, XL, XXL, XXXL and VERY-LARGE will have TRUE or FALSE values based upon the values within their parent data item (SHIRT-SIZE).

A program wanting to test whether or not the current SHIRT-SIZE value can be classified as XL could have that decision coded as a combined condition (the most complex type of conditional expression), as either:

IF SHIRT-SIZE = 17 OR SHIRT-SIZE = 17.5

- or -

IF SHIRT-SIZE = 17 OR 17.5

Or it could simply utilize the condition name XL as follows:

IF XL
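
The following is a small, self-contained sketch of condition names in use. The program name sizedemo is made up for illustration, and only two of the condition names are repeated. The SET statement shown is not discussed in this section; it stores the first VALUE of a condition name into the parent data item.

       >>SOURCE FORMAT IS FREE
IDENTIFICATION DIVISION.
PROGRAM-ID. sizedemo.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  SHIRT-SIZE               PIC 99V9.
    88 L                     VALUE 16, 16.5.
    88 XL                    VALUE 17, 17.5.
PROCEDURE DIVISION.
    MOVE 17.5 TO SHIRT-SIZE
    IF XL                    *> TRUE: 17.5 is one of the XL values
        DISPLAY "Size is XL"
    END-IF
    SET L TO TRUE            *> moves 16 (first VALUE of L) to SHIRT-SIZE
    IF NOT XL                *> TRUE: 16 is not an XL value
        DISPLAY "Size is no longer XL"
    END-IF
    STOP RUN.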

2.2.5.2. Class Conditions

Class-Condition Syntax

 identifier-1 IS [ NOT ] { NUMERIC          }
                   ~~~   { ~~~~~~~          }
                         { ALPHABETIC       }
                         { ~~~~~~~~~~       }
                         { ALPHABETIC-LOWER }
                         { ~~~~~~~~~~~~~~~~ }
                         { ALPHABETIC-UPPER }
                         { ~~~~~~~~~~~~~~~~ }
                         { OMITTED          }
                         { ~~~~~~~          }
                         { class-name-1     }

Class conditions evaluate the type of data that is currently stored in a data item.

  1. The NUMERIC class test considers only the characters ‘0’, ‘1’, … , ‘9’ to be numeric; only a data item containing nothing but digits will pass a NUMERIC class test. Spaces, decimal points, commas, currency signs, plus signs, minus signs and any other characters except the digit characters will all fail NUMERIC class tests.
  2. The ALPHABETIC class test considers only upper-case letters, lower-case letters and spaces to be alphabetic in nature.
  3. The ALPHABETIC-LOWER and ALPHABETIC-UPPER class conditions consider only spaces and the respective type of letters to be acceptable in order to pass such a class test.
  4. The NOT option reverses the TRUE/FALSE value of the condition.
  5. Note that what constitutes a “letter” (or upper/lower case too, for that matter) may be influenced through the use of CHARACTER CLASSIFICATION specifications in the OBJECT-COMPUTER (see OBJECT-COMPUTER) paragraph.
  6. Only data items whose USAGE (see USAGE) is either explicitly or implicitly defined as DISPLAY may be used in NUMERIC or any of the ALPHABETIC class conditions.
  7. Some COBOL implementations disallow the use of group items or PIC A items with NUMERIC class conditions and the use of PIC 9 items with ALPHABETIC class conditions. GnuCOBOL has no such restrictions.
  8. The OMITTED class condition is used when it is necessary for a subprogram to determine whether or not a particular argument was passed to it. In such class conditions, identifier-1 must be a linkage section item defined on the USING clause of the subprogram’s PROCEDURE DIVISION header. See PROCEDURE DIVISION USING, for additional information.

The class-name-1 option allows you to test for a user-defined class. Here’s an example. First, assume the following SPECIAL-NAMES (see SPECIAL-NAMES) definition of the user-defined class ‘Hexadecimal’:

SPECIAL-NAMES.
    CLASS Hexadecimal IS '0' THRU '9', 'A' THRU 'F', 'a' THRU 'f'.

Now observe the following code, which will execute the 150-Process-Hex-Value procedure if Entered-Value contains nothing but valid hexadecimal digits:

    IF Entered-Value IS Hexadecimal
        PERFORM 150-Process-Hex-Value
    END-IF
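
The built-in class tests can be tried out in the same way. This is only a sketch; the program name classdemo and the sample values are invented for illustration.

       >>SOURCE FORMAT IS FREE
IDENTIFICATION DIVISION.
PROGRAM-ID. classdemo.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  Entered-Value            PIC X(5).
PROCEDURE DIVISION.
    MOVE "12345" TO Entered-Value
    IF Entered-Value IS NUMERIC            *> digits only: test passes
        DISPLAY "12345 is numeric"
    END-IF
    MOVE "12.45" TO Entered-Value
    IF Entered-Value IS NOT NUMERIC        *> the decimal point fails the test
        DISPLAY "12.45 is not numeric"
    END-IF
    MOVE "AB CD" TO Entered-Value
    IF Entered-Value IS ALPHABETIC-UPPER   *> upper-case letters and spaces
        DISPLAY "AB CD is upper-case alphabetic"
    END-IF
    STOP RUN.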

2.2.5.3. Sign Conditions

Sign-Condition Syntax

 identifier-1 IS [ NOT ] { POSITIVE }
                   ~~~   { ~~~~~~~~ }
                         { NEGATIVE }
                         { ~~~~~~~~ }
                         { ZERO     }
                           ~~~~

Sign conditions evaluate the numeric state of a data item defined with a PICTURE (see PICTURE) and/or USAGE (see USAGE) that supports numeric values.

  1. A POSITIVE or NEGATIVE sign condition will be TRUE only if the value of identifier-1 is strictly greater than or less than zero, respectively.
  2. A ZERO sign condition can be passed only if the value of identifier-1 is exactly zero.
  3. The NOT option reverses the TRUE/FALSE value of the condition.
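
A brief sketch of the three sign conditions follows. The program name signdemo and the data item Balance are assumptions made for the example.

       >>SOURCE FORMAT IS FREE
IDENTIFICATION DIVISION.
PROGRAM-ID. signdemo.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  Balance                  PIC S9(5)V99 VALUE -12.50.
PROCEDURE DIVISION.
    IF Balance IS NEGATIVE           *> TRUE: -12.50 is less than zero
        DISPLAY "Account is overdrawn"
    END-IF
    MOVE ZERO TO Balance
    IF Balance IS ZERO               *> TRUE: the value is exactly zero
        DISPLAY "Account is empty"
    END-IF
    IF Balance IS NOT POSITIVE       *> TRUE: zero is not strictly greater than zero
        DISPLAY "Nothing to withdraw"
    END-IF
    STOP RUN.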

2.2.5.4. Switch-Status Conditions

In the SPECIAL-NAMES paragraph, an external switch name can be associated with one or more condition names. These condition names may then be used to test the ON/OFF status of the external switch.

Here are the relevant sections of code in a program named testprog, which is designed to simply announce if SWITCH-1 is on:

…
ENVIRONMENT DIVISION.
SPECIAL-NAMES.
    SWITCH-1 ON STATUS IS Switch-1-Is-ON.
…
PROCEDURE DIVISION.
…
    IF Switch-1-Is-ON
        DISPLAY "Switch 1 Is On"
    END-IF
…

The following are two different command window sessions — the left on a Unix/Cygwin/OSX system and the right on a Windows system — that will set the switch on and then execute the testprog program. Notice how the message indicating that the program detected the switch was set is displayed in both examples:

$ COB_SWITCH_1=ON           C:>SET COB_SWITCH_1=ON
$ export COB_SWITCH_1       C:>testprog
$ ./testprog                Switch 1 Is On
Switch 1 Is On              C:>
$

2.2.5.5. Relation Conditions

Relation-Condition Syntax

 { identifier-1            } IS [ NOT ] RelOp { identifier-2            }
 { literal-1               }      ~~~         { literal-2               }
 { arithmetic-expression-1 }                  { arithmetic-expression-2 }
 { index-name-1            }                  { index-name-2            }

RelOp Syntax

 { EQUAL TO                 }
 { ~~~~~                    }
 { EQUALS                   }
 { ~~~~~~                   }
 { GREATER THAN             }
 { ~~~~~~~                  }
 { GREATER THAN OR EQUAL TO }
 { ~~~~~~~      ~~ ~~~~~    }
 { LESS THAN                }
 { ~~~~                     }
 { LESS THAN OR EQUAL TO    }
 { ~~~~      ~~ ~~~~~       }
 { =                        }
 { >                        }
 { >=                       }
 { <                        }
 { <=                       }

These conditions evaluate how two different values "relate" to each other.

  1. When comparing one numeric value to another, the USAGE (see USAGE) and number of significant digits in either value are irrelevant as the comparison is performed using the actual algebraic values.
  2. When comparing strings, the comparison is made based upon the program’s collating sequence. When the two string arguments are of unequal length, the shorter is assumed to be padded (on the right) with a sufficient number of spaces as to make the two strings of equal length. String comparisons take place on a corresponding character-by-character basis, left to right, until the TRUE/FALSE value for the relation test can be established. Characters are compared according to their relative position in the program’s COLLATING SEQUENCE (as defined in SPECIAL-NAMES (see SPECIAL-NAMES)), not according to the bit-pattern values the characters have in storage.
  3. By default, the program’s COLLATING SEQUENCE will, however, be based entirely on the bit-pattern values of the various characters.
  4. There is no functional difference between using the wordy version (IS EQUAL TO, IS LESS THAN, …) versus the symbolic version (=, <, …) of the actual relation operators.
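
The first two points can be illustrated with a short sketch. All names here (reldemo, Short-Name, Long-Name, Packed-Amt, Display-Amt) are hypothetical.

       >>SOURCE FORMAT IS FREE
IDENTIFICATION DIVISION.
PROGRAM-ID. reldemo.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  Short-Name               PIC X(3) VALUE "ABC".
01  Long-Name                PIC X(6) VALUE "ABC".
01  Packed-Amt               PIC S9(5)V99 COMP-3 VALUE 5.
01  Display-Amt              PIC 9(3) VALUE 5.
PROCEDURE DIVISION.
    *> The shorter string is treated as if padded with spaces on the right.
    IF Short-Name = Long-Name
        DISPLAY "The strings compare equal"
    END-IF
    *> USAGE and digit counts differ, but the algebraic values are both 5.
    IF Packed-Amt IS EQUAL TO Display-Amt
        DISPLAY "The numbers compare equal"
    END-IF
    STOP RUN.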

2.2.5.6. Combined Conditions

Combined Condition Syntax

 [ ( ] Condition-1 [ ) ] { AND } [ ( ] Condition-2 [ ) ]
                         { ~~~ }
                         { OR  }
                         { ~~  }

A combined condition is one that computes a TRUE/FALSE value from the TRUE/FALSE values of two other conditions (which could themselves be combined conditions).

  1. If either condition has a value of TRUE, the result of ORing the two together will result in a value of TRUE. ORing two FALSE conditions will result in a value of FALSE.
  2. In order for AND to yield a value of TRUE, both conditions must have a value of TRUE. In all other circumstances, AND produces a FALSE value.
  3. When chaining multiple, similar conditions together with the same operator (OR/AND), and left or right arguments have common subjects, it is possible to abbreviate the program code. For example:
    IF ACCOUNT-STATUS = 1 OR ACCOUNT-STATUS = 2 OR ACCOUNT-STATUS = 7
    

    Could be abbreviated as:

    IF ACCOUNT-STATUS = 1 OR 2 OR 7
    
  4. Just as multiplication takes precedence over addition in arithmetic expressions, so does AND take precedence over OR in combined conditions. Use parentheses to change this precedence, if necessary. For example:
    FALSE AND FALSE OR TRUE AND TRUE

    Evaluates to TRUE

    (FALSE AND FALSE) OR (TRUE AND TRUE)

    Evaluates to TRUE (since AND has precedence over OR) - this is identical to the previous example

    (FALSE AND (FALSE OR TRUE)) AND TRUE

    Evaluates to FALSE
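
Here is the same idea applied to real data items, as a sketch only. The program name combdemo and the initial values are chosen purely for illustration.

       >>SOURCE FORMAT IS FREE
IDENTIFICATION DIVISION.
PROGRAM-ID. combdemo.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  Account-Status           PIC 9 VALUE 7.
01  Balance                  PIC S9(5)V99 VALUE 0.
PROCEDURE DIVISION.
    *> Abbreviated form of three OR-ed relation conditions.
    IF Account-Status = 1 OR 2 OR 7
        DISPLAY "Status is 1, 2 or 7"
    END-IF
    *> AND binds first: (FALSE AND FALSE) OR TRUE is TRUE.
    IF Account-Status = 1 AND Balance > 0 OR Account-Status = 7
        DISPLAY "Default precedence: TRUE"
    END-IF
    *> Parentheses regroup it: FALSE AND (FALSE OR TRUE) is FALSE.
    IF Account-Status = 1 AND (Balance > 0 OR Account-Status = 7)
        DISPLAY "With parentheses: TRUE"
    ELSE
        DISPLAY "With parentheses: FALSE"
    END-IF
    STOP RUN.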

2.2.5.7. Negated Conditions

Negated Condition Syntax

 NOT Condition-1
 ~~~

A condition may be negated by prefixing it with the NOT operator.

  1. The NOT operator has the highest precedence of all logical operators, just as a unary minus sign (which “negates” a numeric value) is the highest precedence arithmetic operator.
  2. Parentheses must be used to explicitly signify the sequence in which conditions are evaluated and processed if the default precedence isn’t desired. For example:
    NOT TRUE AND FALSE AND NOT FALSE

    Evaluates to FALSE AND FALSE AND TRUE which evaluates to FALSE

    NOT (TRUE AND FALSE AND NOT FALSE)

    Evaluates to NOT (FALSE) which evaluates to TRUE

    NOT TRUE AND (FALSE AND NOT FALSE)

    Evaluates to FALSE AND (FALSE AND TRUE) which evaluates to FALSE
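
A sketch using condition names shows the same effect in a program. The names negdemo, Account-State, Review-Flag, Is-Open and Is-Flagged are invented for this example.

       >>SOURCE FORMAT IS FREE
IDENTIFICATION DIVISION.
PROGRAM-ID. negdemo.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  Account-State            PIC 9 VALUE 1.
    88 Is-Open               VALUE 1.
01  Review-Flag              PIC X VALUE "N".
    88 Is-Flagged            VALUE "Y".
PROCEDURE DIVISION.
    *> NOT applies to Is-Open only: (NOT TRUE) AND FALSE is FALSE.
    IF NOT Is-Open AND Is-Flagged
        DISPLAY "First test: TRUE"
    ELSE
        DISPLAY "First test: FALSE"
    END-IF
    *> NOT applies to the whole group: NOT (TRUE AND FALSE) is TRUE.
    IF NOT (Is-Open AND Is-Flagged)
        DISPLAY "Second test: TRUE"
    END-IF
    STOP RUN.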

2.2.6. Use of Periods

All COBOL implementations distinguish between sentences and statements in the procedure division. A Statement is a single executable COBOL instruction. For example, these are all statements:

MOVE SPACES TO Employee-Address
ADD 1 TO Record-Counter
DISPLAY "Record-Counter=" Record-Counter

Some COBOL statements have a scope of applicability associated with them where one or more other statements can be considered to be part of or related to the statement in question. An example of such a situation might be the following, where the interest on a loan is being calculated and displayed at 4% interest if the loan balance is under $10,000, and 4.5% otherwise. (WARNING: the following code has an error!):

IF Loan-Balance < 10000
    MULTIPLY Loan-Balance BY 0.04 GIVING Interest
ELSE
    MULTIPLY Loan-Balance BY 0.045 GIVING Interest
DISPLAY "Interest Amount = " Interest

In this example, the IF statement actually has a scope that can include two sets of associated statements: one set to be executed when the IF (see IF) condition is TRUE, and another if it is FALSE.

Unfortunately, there’s a problem with the above. A human being looking at that code would probably infer that the DISPLAY (see DISPLAY) statement, because of its lack of indentation, is to be executed regardless of the TRUE/FALSE value of the IF condition. Unfortunately, the compiler (any COBOL compiler) won’t see it that way because it really couldn’t care less what sort of indentation, if any, is used. In fact, any COBOL compiler would be just as happy to see the code written like this:

IF Loan-Balance < 10000 MULTIPLY Loan-balance
BY 0.04 GIVING Interest ELSE MULTIPLY
Loan-Balance BY 0.045 GIVING Interest DISPLAY
"Interest Amount = " Interest

How then do we inform the compiler that the DISPLAY statement is outside the scope of the IF?

That’s where sentences come in.

A COBOL Sentence is defined as any arbitrarily long sequence of statements, followed by a period (.) character. The period character is what terminates the scope of a set of statements. Therefore, our example should have been coded like this:

IF Loan-Balance < 10000
    MULTIPLY Loan-Balance BY 0.04 GIVING Interest
ELSE
    MULTIPLY Loan-Balance BY 0.045 GIVING Interest.
DISPLAY "Interest Amount = " Interest

See the period at the end of the second MULTIPLY (see MULTIPLY)? That is what terminates the scope of the IF, thus making the DISPLAY statement’s execution completely independent of the TRUE/FALSE status of the IF.

2.2.7. Use of VERB/END-VERB Constructs

Prior to the 1985 COBOL standard, using a period character was the only way to signal the end of a statement’s scope.

Unfortunately, this caused some problems. Take a look at this code:

IF A = 1
    IF B = 1
        DISPLAY "A & B = 1"
ELSE *> This ELSE has a problem!
    IF B = 1
        DISPLAY "A NOT = 1 BUT B = 1"
    ELSE
        DISPLAY "NEITHER A NOR B = 1".

The problem with this code is that indentation — so critical to improving the human-readability of a program — can provide an erroneous view of the logical flow. An ELSE is always associated with the most-recently encountered IF; this means the ELSE flagged with the comment will be associated with the IF B = 1 statement, not the IF A = 1 statement as the indentation would appear to imply.

This sort of problem led to a band-aid solution being added to the COBOL language: the NEXT SENTENCE clause:

IF A = 1
    IF B = 1
        DISPLAY "A & B = 1"
    ELSE
        NEXT SENTENCE
ELSE
    IF B = 1
        DISPLAY "A NOT = 1 BUT B = 1"
    ELSE
        DISPLAY "NEITHER A NOR B = 1".

NEXT SENTENCE informs the compiler that if the B = 1 condition is false, control should fall into the first statement that follows the next period.

With the 1985 standard for COBOL, a much more elegant solution was introduced. Any COBOL Verb (the first reserved word of a statement) that needed such a thing was allowed to use an END-verb construct to end its scope without disrupting the scope of any other statement it might have been in. Any COBOL 85 compiler would have allowed the following solution to our problem:

IF A = 1
    IF B = 1
        DISPLAY "A & B = 1"
    END-IF
ELSE
    IF B = 1
        DISPLAY "A NOT = 1 BUT B = 1"
    ELSE
        DISPLAY "NEITHER A NOR B = 1".

This new facility made the period almost obsolete, as our program segment would probably be coded like this today:

IF A = 1
    IF B = 1
        DISPLAY "A & B = 1"
    END-IF
ELSE
    IF B = 1
        DISPLAY "A NOT = 1 BUT B = 1"
    ELSE
        DISPLAY "NEITHER A NOR B = 1"
    END-IF
END-IF

COBOL (GnuCOBOL included) still requires that each procedure division paragraph contain at least one sentence if there is any executable code in that paragraph, but a popular coding style is now to simply code a single period right before the end of each paragraph.

The standard for the COBOL language shows the various END-verb clauses are optional because using a period as a scope-terminator remains legal.

If you will be porting existing code over to GnuCOBOL, you’ll find it an accommodating facility capable of conforming to whatever language and coding standards that code is likely to use. If you are creating new GnuCOBOL programs, however, I would strongly counsel you to use the END-verb structures in those programs.
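
IF is not the only verb with a scope terminator. The following sketch (the names enddemo, Main-Para, Pass-Num and Run-Total are illustrative) nests an ADD with its SIZE ERROR phrases inside an inline PERFORM, ends each with its own END-verb, and uses a single period to close the paragraph.

       >>SOURCE FORMAT IS FREE
IDENTIFICATION DIVISION.
PROGRAM-ID. enddemo.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  Pass-Num                 PIC 9(2).
01  Run-Total                PIC 9(2) VALUE 0.
PROCEDURE DIVISION.
Main-Para.
    PERFORM VARYING Pass-Num FROM 1 BY 1 UNTIL Pass-Num > 5
        ADD 30 TO Run-Total
            ON SIZE ERROR        *> taken once the sum no longer fits in PIC 9(2)
                DISPLAY "Run-Total would overflow on pass " Pass-Num
            NOT ON SIZE ERROR
                DISPLAY "Run-Total is now " Run-Total
        END-ADD                  *> ends the ADD without ending the PERFORM
    END-PERFORM
    STOP RUN
    .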

2.2.8. Concurrent Access to Files

The manipulation of data files is one of the COBOL language’s great strengths. There are features built into COBOL to deal with the possibility that multiple programs may be attempting to access the same file concurrently. Multiple program concurrent access is dealt with in two ways — file sharing and record locking.

Not all GnuCOBOL implementations support file sharing and record-locking options. Whether they do or not depends upon the operating system they were built for and the build options that were used when the specific GnuCOBOL implementation was generated.

2.2.8.1. File Sharing

GnuCOBOL controls concurrent-file access at the highest level through the concept of file sharing, enforced when a program attempts to open a file. This is accomplished via a UNIX operating-system routine called fcntl. That module is not currently supported by Windows and is not present in the MinGW Unix-emulation package. GnuCOBOL builds created using a MinGW environment will be incapable of supporting file-sharing controls — files will always be shared in such environments. A GnuCOBOL build created using the Cygwin environment on Windows would have access to fcntl and therefore will support file sharing. Of course, actual Unix builds of GnuCOBOL, as well as OSX builds, should have no issues because fcntl should be available.

Any limitations imposed on a successful OPEN (see OPEN) will remain in place until your program either issues a CLOSE (see CLOSE) against the file or the program terminates.

File sharing is controlled through the use of a SHARING clause:

    SHARING WITH { ALL OTHER }
    ~~~~~~~      { ~~~       }
                 { NO OTHER  }
                 { ~~        }
                 { READ ONLY }
                   ~~~~ ~~~~

This clause may be used either in the file’s SELECT statement (see SELECT), on the OPEN statement (see OPEN) which initiates your program’s use of the file, or both. If a SHARING option is specified in both places, the specifications made on the OPEN statement will take precedence over those from the SELECT statement.

Here are the meanings of the three options:

ALL OTHER

When your program opens a file with this sharing option in effect, no restrictions will be placed on other programs attempting to OPEN the file after your program did. This is the default sharing mode.

NO OTHER

When your program opens a file with this sharing option in effect, your program announces that it is unwilling to allow any other program to have any access to the file as long as you are using that file; OPEN attempts made in other programs will fail with a file status of 37 (PERMISSION DENIED) until such time as you CLOSE (see CLOSE) the file.

READ ONLY

Opening a file with this sharing option indicates you are willing to allow other programs to OPEN the file for input while you have it open. If they attempt any other OPEN, theirs will fail with a file status of 37. Of course, your program may fail if someone else got to the file first and opened it with a sharing option that imposed file-sharing limitations.

If the SELECT of a file is coded with a FILE STATUS clause, OPEN failures — including those induced by sharing failures — will be detectable by the program and a graceful recovery (or at least a graceful termination) will be possible. If no such clause was coded, however, a runtime message will be issued and the program will be terminated.
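
As a hedged sketch of the above, the program below codes the SHARING clause on the SELECT and uses a FILE STATUS item to recover gracefully from a failed OPEN. The program name sharedemo, the file name customer.dat and all data names are assumptions, and whether the sharing option is actually enforced depends on the GnuCOBOL build, as described earlier.

       >>SOURCE FORMAT IS FREE
IDENTIFICATION DIVISION.
PROGRAM-ID. sharedemo.
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
    SELECT Customer-File ASSIGN TO "customer.dat"   *> hypothetical file
        ORGANIZATION IS LINE SEQUENTIAL
        SHARING WITH READ ONLY
        FILE STATUS IS Customer-Status.
DATA DIVISION.
FILE SECTION.
FD  Customer-File.
01  Customer-Record          PIC X(80).
WORKING-STORAGE SECTION.
01  Customer-Status          PIC XX.
PROCEDURE DIVISION.
    OPEN INPUT Customer-File
    IF Customer-Status = "00"
        DISPLAY "File opened; others may still open it for input"
        CLOSE Customer-File
    ELSE
        *> Any OPEN failure, sharing-related or not, arrives here.
        DISPLAY "OPEN failed, file status " Customer-Status
    END-IF
    STOP RUN.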

2.2.8.2. Record Locking

Record-locking is supported by advanced file-management software built into the GnuCOBOL implementation you are using. This software provides a single point-of-control for access to files — usually ORGANIZATION INDEXED files. One such runtime package capable of doing this is the Berkeley Database (BDB) package — a package frequently used in GnuCOBOL builds to support indexed files.

The various I/O statements your program can execute are capable of imposing limitations on access by other concurrently-executing programs to the file record they just accessed. These limitations are syntactically imposed by placing a lock on the record using a LOCK clause. Other records in the file remain available, assuming that file-sharing limitations imposed at the time the file was opened didn’t prevent access to the entire file.

  1. If the GnuCOBOL build you are using was configured to use the Berkeley Database (BDB) package for indexed file I/O, record locking will be available by using the DB_HOME run-time environment variable.
  2. If the SELECT (see SELECT) statement or file OPEN (see OPEN) specifies SHARING WITH NO OTHER, record locking will be disabled.
  3. If the file’s SELECT contains a LOCK MODE IS AUTOMATIC clause, every time a record is read from the file, that record is automatically locked. Other programs may access other records within the file, but not a locked record.
  4. If the file’s SELECT contains a LOCK MODE IS MANUAL clause, locks are placed on records only when a READ statement executed against the file includes a LOCK clause (this clause will be discussed shortly).
  5. If the LOCK ON clause is specified in the file’s SELECT, locks (either automatically or manually acquired) will continue to accumulate as more and more records are read, until they are explicitly released. This is referred to as multiple record locking.

    Locks acquired via multiple record locking remain in-effect until the program holding the lock…

    • …terminates, or …
    • …executes a CLOSE statement (see CLOSE) against the file, or …
    • …executes an UNLOCK statement (see UNLOCK) against the file, or …
    • …executes a COMMIT statement (see COMMIT) or …
    • …executes a ROLLBACK statement (see ROLLBACK).
  6. If the LOCK ON clause is not specified, then the next I/O statement your program executes, except for START (see START), will release the lock. This is referred to as single record locking.
  7. A LOCK clause, which may be coded on a READ (see READ), REWRITE (see REWRITE) or WRITE statement (see WRITE) looks like this:
        { IGNORING LOCK    }
        { ~~~~~~~~ ~~~~    }
        { WITH [ NO ] LOCK }
        {        ~~   ~~~~ }
        { WITH KEPT LOCK   }
        {      ~~~~ ~~~~   }
        { WITH IGNORE LOCK }
        {      ~~~~~~ ~~~~ }
        { WITH WAIT        }
               ~~~~
    

    The WITH [ NO ] LOCK option is the only one available to REWRITE or WRITE statements.

    The meanings of the various record locking options are as follows:

    IGNORING LOCK
    WITH IGNORE LOCK

    These options (which are synonymous) inform GnuCOBOL that any locks held by other programs should be ignored.

    WITH LOCK

    Access to the record by other programs will be denied.

    WITH NO LOCK

    The record will not be locked. This is the default for all statements.

    WITH KEPT LOCK

    When single record locking is in effect, as a new record is accessed, locks held for previous records are released. By using this option, not only is the newly accessed record locked (as WITH LOCK would do), but prior record locks will be retained as well. A subsequent READ without the KEPT LOCK option will release all “kept” locks, as will the UNLOCK statement.

    WITH WAIT

    This option informs GnuCOBOL that the program is willing to wait for a lock held (by another program) on the record being read to be released.

    Without this option, an attempt to read a locked record will be immediately aborted and a file status of 51 will be returned.

    With this option, the program will wait for a preconfigured time for the lock to be released. If the lock is released within the preconfigured wait time, the read will be successful. If the preconfigured wait time expires before the lock is released, the read attempt will be aborted and a 51 file status will be issued.
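
The following sketch puts manual record locking together. It assumes a GnuCOBOL build with indexed-file support and an existing indexed file; the program name lockdemo, the file name customer.idx, the key value and all data names are hypothetical.

       >>SOURCE FORMAT IS FREE
IDENTIFICATION DIVISION.
PROGRAM-ID. lockdemo.
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
    SELECT Customer-File ASSIGN TO "customer.idx"   *> hypothetical file
        ORGANIZATION IS INDEXED
        ACCESS MODE IS RANDOM
        RECORD KEY IS Customer-Id
        LOCK MODE IS MANUAL
        FILE STATUS IS Customer-Status.
DATA DIVISION.
FILE SECTION.
FD  Customer-File.
01  Customer-Record.
    05 Customer-Id           PIC 9(5).
    05 Customer-Name         PIC X(30).
WORKING-STORAGE SECTION.
01  Customer-Status          PIC XX.
PROCEDURE DIVISION.
    OPEN I-O Customer-File
    IF Customer-Status NOT = "00"
        DISPLAY "OPEN failed, file status " Customer-Status
        STOP RUN
    END-IF
    MOVE 12345 TO Customer-Id            *> hypothetical key value
    READ Customer-File WITH LOCK         *> lock is requested explicitly
    EVALUATE Customer-Status
        WHEN "00"
            DISPLAY "Locked record for " Customer-Name
            UNLOCK Customer-File         *> release the lock when done
        WHEN "51"
            DISPLAY "Record is locked by another program"
        WHEN OTHER
            DISPLAY "READ failed, file status " Customer-Status
    END-EVALUATE
    CLOSE Customer-File
    STOP RUN.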

3. CDF - Compiler Directing Facility

The Compiler Directing Facility, or CDF, is a means of controlling the compilation of GnuCOBOL programs. CDF provides a mechanism for dynamically setting or resetting certain compiler switches, introducing new source code from one or more source code libraries, making dynamic source code modifications and conditionally processing or ignoring source statements altogether. This is accomplished via a series of special CDF statements and directives that will appear in the program source code.

When the compiler is operating in Fixed Format Mode, all CDF statements must begin in column eight (8) or beyond.

There are two types of supported CDF statements in GnuCOBOL — Text Manipulation Statements and Compiler Directives.

The CDF text manipulation statements COPY and REPLACE are used to introduce new code into programs either with or without changes, or may be used to modify existing statements already in the program. Text manipulation statements are always terminated with a period.

CDF directives, denoted by the presence of a >> character sequence as part of the statement name itself, influence the process of program compilation.

Compiler directives are never terminated with a period.

The compiler command-line option -D offers additional control (see cobc - The GnuCOBOL Compiler).

3.1. >>CALL-CONVENTION

CDF >>CALL-CONVENTION Syntax

 >>CALL-CONVENTION    { COBOL   }
 ~~~~~~~~~~~~~~~~~    { EXTERN  }
                      { STDCALL }
                      { STATIC  } 

This directive instructs the compiler how to treat references to program names and may be used to determine other details for interacting with a function or program. There are four options with COBOL being the default.

COBOL

The program name is treated as a COBOL word that maps to the externalised name of the program to be called, cancelled or referenced in the program-address-identifier, applying the same mapping rules as for a program name for which no AS phrase is specified. (This is the default.)

EXTERN

The program name is treated as an external reference.

STDCALL

[more info needed]

STATIC

The program name is called as an included element rather than dynamically, which is the normal default.

3.2. COPY

CDF COPY Statement Syntax

 COPY copybook-name
 ~~~~
 [ IN|OF library-name ]
   ~~ ~~
 [ SUPPRESS PRINTING ]
   ~~~~~~~~
 [ REPLACING { Phrase-Clause | String-Clause }... ] .
   ~~~~~~~~~

CDF COPY Phrase-Clause Syntax

 { ==pseudo-text-1== } BY { ==pseudo-text-2== }
 { identifier-1      } ~~ { identifier-2      }
 { literal-1         }    { literal-2         }
 { word-1            }    { word-2            }

CDF COPY String-Clause Syntax

 [ LEADING|TRAILING ] ==partial-word-1== BY ==partial-word-2==
   ~~~~~~~ ~~~~~~~~                      ~~
  1. COPY statements are used to import copybooks (see Copybooks) into a program.
  2. COPY statements may be used anywhere within a COBOL program where the code contained within the copybook would be syntactically valid.
  3. The optional SUPPRESS clause (with or without the optional PRINTING reserved word) is valid syntactically but is non-functional. It is supported to facilitate compatibility with source code written for other versions of COBOL.
  4. There is no difference between the use of the word IN and the word OF — use the one you prefer.
  5. A period is absolutely mandatory at the end of every COPY statement, even if the statement occurs within the scope of another one where a period might appear disruptive, such as within the scope of an IF (see IF) statement. This mandatory period at the end of the statement does not, however, affect the statement scope in which the COPY occurs.
  6. Both pseudo-text-2 and partial-word-2 may be null.
  7. All COPY statements are located and the contents of the corresponding copybooks inserted into the program source code before the actual compilation process begins. If a copybook contains a COPY statement, the copybook insertion process will be repeated to resolve the embedded COPY. This will continue until no unresolved COPY statements remain. At that point, actual program compilation will begin.
  8. See Locating Copybooks, for the specific rules on how copybooks are located by the compiler.
  9. The optional REPLACING clause allows for one or more of either of the following kinds of text replacements to be made:
    Phrase-Clause

    Replacement of one or more complete reserved words, user-defined identifiers or literals; the following points apply to this option:

    • This option cannot be used to replace part of a word, identifier or literal.
    • Whatever precedes the BY will be referred to here as the search string.
    • Single-item search strings can be specified by coding the identifier-1, literal-1 or word-1 being replaced.
    • Multiple-item search strings can be specified using the ==pseudo-text-1== option. For example, to replace all occurrences of UPON PRINTER, you would specify ==UPON PRINTER==.
    • The replacement string, which follows the BY, may be specified using any of the four options.
    • If the replacement string is a multiple-item phrase or is to be deleted altogether, you must use the ==pseudo-text-2== option. If pseudo-text-2 is null (in other words, the replacement text is specified as ====), all encountered occurrences of the search string will be deleted.
    String-Clause

    Using this, you may replace character sequences that occur at the beginning (see LEADING) or end (see TRAILING) of reserved or user-defined words. For example, to change all words of the form "0100-xxxxxx" to "020-xxxxxx", code LEADING ==0100-== BY ==020-==. To simply remove all "0100-" prefixes from words, code LEADING ==0100-== BY ====.
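
The String-Clause lends itself to reusing one record layout under several prefixes. The sketch below is illustrative only: the copybook custrec.cpy, its contents and the program name copydemo are all hypothetical, and it assumes the copybook sits where the compiler will find it and is read in the same source format as the program.

Contents of the hypothetical copybook custrec.cpy:

01  PFX-Record.
    05 PFX-Id                PIC 9(5).
    05 PFX-Name              PIC X(30).

A program that imports it twice:

       >>SOURCE FORMAT IS FREE
IDENTIFICATION DIVISION.
PROGRAM-ID. copydemo.
DATA DIVISION.
WORKING-STORAGE SECTION.
*> Each COPY yields its own record: Cust-Record and Vend-Record.
COPY "custrec.cpy" REPLACING LEADING ==PFX-== BY ==Cust-==.
COPY "custrec.cpy" REPLACING LEADING ==PFX-== BY ==Vend-==.
PROCEDURE DIVISION.
    MOVE 100 TO Cust-Id
    MOVE 200 TO Vend-Id
    DISPLAY Cust-Id " " Vend-Id
    STOP RUN.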

3.3. REPLACE

CDF REPLACE Statement (Format 1) Syntax

 REPLACE [ ALSO ] { Phrase-Clause | String-Clause }... .
 ~~~~~~~   ~~~~

CDF REPLACE Statement (Format 2) Syntax

 REPLACE [ LAST ] OFF .
 ~~~~~~~   ~~~~   ~~~

CDF REPLACE Phrase-Clause Syntax

 { ==pseudo-text-1== } BY { ==pseudo-text-2== }
                       ~~

CDF REPLACE String-Clause Syntax

 [ LEADING|TRAILING ] ==partial-word-1== BY ==partial-word-2==
   ~~~~~~~ ~~~~~~~~                      ~~
  1. The REPLACE statement provides a mechanism for changing all or part of one or more GnuCOBOL statements.
  2. A period is absolutely mandatory at the end of every REPLACE statement (either format), even if the statement occurs within the scope of another one where a period might appear disruptive, such as within the scope of an IF (see IF) statement. The period will not, however, affect the statement scope in which the REPLACE occurs.
  3. The following points apply to Format 1 of the REPLACE statement:
    1. Format 1 of the REPLACE statement can be used to make changes to program source code in much the same way as the REPLACING option of the COPY statement can, via these options:
      Phrase-Clause

      Replace one or more complete reserved words, user-defined identifiers or literals; the following points apply to this option:

      • This option cannot be used to replace part of a word, identifier or literal.
      • Whatever precedes the BY will be referred to here as the search string.
      • Search strings on REPLACE are always specified using the ==pseudo-text-1== option. For example, to replace all occurrences of UPON PRINTER, you would specify ==UPON PRINTER==.
      • The replacement string, which follows the BY, is specified using the ==pseudo-text-2== option. If pseudo-text-2 is null (in other words, the replacement text is specified as ====), all encountered occurrences of the search string will be deleted.
      String-Clause

      Using this, you may replace character sequences that occur at the beginning (see LEADING) or end (see TRAILING) of reserved or user-defined words. For example, to change all words of the form "0100-xxxxxx" to "020-xxxxxx", code LEADING ==0100-== BY ==020-==. To simply remove all "0100-" prefixes from words, code LEADING ==0100-== BY ====.

    2. Once a Format 1 REPLACE statement is encountered in the currently-compiling source file, Replace Mode becomes active, and the change(s) specified by that statement will be automatically made on all subsequent source statements the compiler reads from the file.
    3. Replace Mode remains in-effect — continuing to make source code changes — until another Format 1 REPLACE is encountered, the end of the currently-compiling program source file is reached or a Format 2 REPLACE statement is encountered.
    4. When a Format 1 REPLACE statement with the ALSO keyword is encountered without Replace Mode being currently active, the effect will be as if the ALSO had not been specified. If Replace Mode already was in effect, the effect will be to “push” the current change specification(s) onto the top of a stack and add the specification(s) of the new statement to those that were already in effect.
    5. When a Format 1 REPLACE without the ALSO keyword is encountered, any stacked change specification(s) will be discarded and the currently in-effect change specification(s), if any, will be replaced by those of the new statement.
    6. When the end of the currently-compiling source file is reached, Replace Mode is deactivated and any stacked replace specifications will be discarded — compilation of the next source file (if any) will begin with Replace Mode inactive and no change specification(s) on the stack.
  4. The following points apply to Format 2 of the REPLACE statement:
    1. If Replace Mode is currently inactive, the Format 2 REPLACE statement will be ignored.
    2. If Replace Mode is currently active, a REPLACE OFF. will deactivate Replace Mode and discard any replace specification(s) on the stack. The compiler will henceforth operate as if no REPLACE had ever been encountered, until such time as another Format 1 REPLACE is encountered.
    3. If Replace Mode is currently active, a REPLACE LAST OFF. will replace the current replace specification(s) with those popped off the top of the stack. If there were no replace specification(s) on the stack, the effect will be as if a REPLACE OFF. had been coded.
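The stacking rules above can be illustrated with a small sketch. The program name, the data item WS-COUNT and the :TAG: and :STEP: markers are hypothetical names chosen for this illustration; free-format source is assumed.

```cobol
       >>SOURCE FORMAT IS FREE
IDENTIFICATION DIVISION.
PROGRAM-ID. repldemo.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  WS-COUNT PIC 9(3) VALUE 0.
PROCEDURE DIVISION.
*> Format 1: Replace Mode becomes active for the lines that follow.
REPLACE ==:TAG:== BY ==WS-COUNT==.
    ADD 1 TO :TAG:.
*> ALSO: the existing rule is pushed onto the stack and stays in
*> effect; the new rule is added to it.
REPLACE ALSO ==:STEP:== BY ==5==.
    ADD :STEP: TO :TAG:.
*> LAST OFF: pop the stack, so only the first rule remains.
REPLACE LAST OFF.
    DISPLAY :TAG:.
*> OFF: deactivate Replace Mode and discard anything still stacked.
REPLACE OFF.
    STOP RUN.
```

After text replacement, the compiler should see ordinary statements that refer to WS-COUNT and the literal 5. Once REPLACE OFF. has been processed, a stray :TAG: would no longer be substituted and would be reported as an error.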

3.4. >>DEFINE

CDF >>DEFINE Directive Syntax

 >>DEFINE [ CONSTANT ] cdf-variable-1 AS { OFF                    }
 ~~~~~~~~   ~~~~~~~~                     { ~~~                    }
                                         { literal-1 [ OVERRIDE ] }
                                         {             ~~~~~~~~   }
                                         { PARAMETER [ OVERRIDE ] }
                                           ~~~~~~~~~   ~~~~~~~~

Use the >>DEFINE CDF directive to create CDF variables and (optionally) assign them either literal or environment variable values.

  1. The reserved word AS is optional and may be included, or not, at the discretion of the programmer. The presence or absence of this word has no effect upon the program.
  2. CDF variables defined in this way become undefined once an END PROGRAM or END FUNCTION directive is encountered in the input source.
  3. The >>DEFINE CDF directive is one way to create CDF variables that may be processed by other CDF statements such as >>IF (see >>IF). The >>SET CDF directive (see >>SET) provides another way to create them.
  4. CDF variable names follow the rules for standard GnuCOBOL user-defined names, and may not duplicate any CDF reserved word. CDF variable names may duplicate COBOL reserved words, provided the CONSTANT option is not specified, but such names are not recommended.
  5. The CONSTANT option is valid only in conjunction with literal-1. When CONSTANT is specified, the CDF variable that is created may be used within your regular COBOL code as if it were a literal value. Without this option, the CDF variable may only be referenced on other CDF statements. The OFF option is used to create a variable without assigning it any value.
  6. The PARAMETER option is used to create a variable whose value is that of the environment variable of the same name. Note that this value assignment occurs at compilation time, not program execution time.
  7. In the absence of the OVERRIDE option, cdf-variable-1 must not yet have been defined. When the OVERRIDE option is specified, cdf-variable-1 will be created with the specified value, if it had not yet been defined. If it had already been defined, it will be redefined with the new value.
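A minimal sketch of the options described above, in free-format source. All variable names are hypothetical, and BUILDMODE is assumed to be an environment variable that is set when the compiler runs; the use of APP-VERSION in regular code relies on the CONSTANT behaviour described in point 5.

```cobol
       >>SOURCE FORMAT IS FREE
*> Create a variable with no value assigned.
>>DEFINE TRACING AS OFF
*> Create a variable holding a numeric literal.
>>DEFINE MAX-ROWS AS 100
*> Create a constant that regular COBOL code may reference.
>>DEFINE CONSTANT APP-VERSION AS "1.2"
*> Take the value from the environment variable of the same name
*> (assumed to exist) at compilation time.
>>DEFINE BUILDMODE AS PARAMETER
*> MAX-ROWS already exists, so OVERRIDE is needed to redefine it.
>>DEFINE MAX-ROWS AS 250 OVERRIDE
IDENTIFICATION DIVISION.
PROGRAM-ID. defdemo.
PROCEDURE DIVISION.
    DISPLAY "Version " APP-VERSION
    STOP RUN.
```

Without the OVERRIDE on the last directive, point 7 implies that the second definition of MAX-ROWS would be rejected.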

3.5. >>IF

CDF >>IF Directive Syntax

 >>IF CDF-Conditional-Expression-1
 ~~~~     [ Program-Source-Lines-1 ]

 [ >>ELIF CDF-Conditional-Expression-2
   ~~~~~~ [ Program-Source-Lines-2 ] ]...

 [ >>ELSE
   ~~~~~~ [ Program-Source-Lines-3 ] ]

 >>END-IF
 ~~~~~~~~

CDF-Conditional-Expression Syntax

 { cdf-variable-1 } IS [ NOT ] { DEFINED                      }
 { literal-1      }      ~~~   { ~~~~~~~                      }
                               { SET                          }
                               { ~~~                          }
                               { CDF-RelOp { cdf-variable-2 } }
                               {           { literal-2      } }

CDF-RelOp Syntax

 >=    or    GREATER THAN OR EQUAL TO
             ~~~~~~~      ~~ ~~~~~
 >     or    GREATER THAN
             ~~~~~~~
 <=    or    LESS THAN OR EQUAL TO
             ~~~~      ~~ ~~~~~
 <     or    LESS THAN
             ~~~~
 =     or    EQUAL TO
             ~~~~~
 <>    or    EQUAL TO (with "NOT")
             ~~~~~

The >>IF CDF directive causes the GnuCOBOL compiler to process or ignore COBOL source statements, CDF text-manipulation statements and/or CDF directives depending upon the value of one or more conditional expressions based upon CDF variables.

  1. The reserved words IS, THAN and TO are optional and may be omitted. The presence or absence of these words has no effect on the program.
  2. Each >>IF directive must be terminated by an >>END-IF directive.
  3. There may be any number of >>ELIF clauses following an >>IF, including zero.
  4. There may be no more than one >>ELSE clause following an >>IF. When >>ELSE is used, it must follow the >>IF and all >>ELIF clauses.
  5. Only one of the Program-Source-Lines-n blocks of statements that lie within the scope of the >>IF->>END-IF may be processed by the compiler. Which one (if any) gets processed is decided as follows:
    1. Each CDF-Conditional-Expression-n will be evaluated, in turn, in the sequence in which they are coded in the >>IF statement and any >>ELIF clauses that may be present until one evaluates to TRUE. Once one of them evaluates to TRUE, the Program-Source-Lines-n block of code that corresponds to the TRUE CDF-Conditional-Expression-n will be the one that is processed. All others within the >>IF->>END-IF scope will be ignored.
    2. If no CDF-Conditional-Expression evaluates to TRUE, and there is an >>ELSE clause, the Program-Source-Lines-3 block of statements following the >>ELSE clause will be processed by the compiler and all others within the >>IF->>END-IF scope will be ignored.
    3. If no CDF-Conditional-Expression-n evaluates to TRUE and there is no >>ELSE clause, then none of the Program-Source-Lines-n block of statements within the >>IF->>END-IF scope will be processed by the compiler.
    4. If the Program-Source-Lines-n statement block selected for processing is empty, no error results; there will simply be no code generated from the >>IF->>END-IF structure.
  6. A Program-Source-Lines-n block may contain any valid COBOL or CDF code.
  7. The following points pertain to any CDF-Conditional-Expression-n:
    1. The DEFINED option tests for whether cdf-variable-1 has been defined, but not yet assigned a value (>>DEFINE … OFF); use the NOT option to test for the variable not being defined.
    2. The SET option tests for whether cdf-variable-1 has been given a value, either via a >>SET statement or via a >>DEFINE without the OFF option.
    3. Two CDF variables, two literals or a single CDF variable and a single literal may be compared against each other using a relational operator. Unlike the standard GnuCOBOL IF statement (see IF), multiple comparisons cannot be ANDed or ORed together. You may, however, simulate an AND by nesting a second >>IF inside the first, and simulate an OR via the >>ELIF option.
    4. The <> symbol stands for NOT EQUAL TO.
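The AND and OR simulations mentioned above might look as follows. This is only a sketch in free-format source; TARGET-OS and WORD-SIZE are hypothetical CDF variables defined here for the illustration.

```cobol
       >>SOURCE FORMAT IS FREE
>>DEFINE TARGET-OS AS "LINUX"
>>DEFINE WORD-SIZE AS 64
IDENTIFICATION DIVISION.
PROGRAM-ID. ifdemo.
PROCEDURE DIVISION.
*> Simulated AND: the inner block is compiled only if both tests hold.
>>IF TARGET-OS = "LINUX"
  >>IF WORD-SIZE >= 64
    DISPLAY "64-bit Linux build"
  >>END-IF
>>END-IF
*> Simulated OR: two branches select the same statement.
>>IF TARGET-OS = "LINUX"
    DISPLAY "Unix-like build"
>>ELIF TARGET-OS = "BSD"
    DISPLAY "Unix-like build"
>>ELSE
    DISPLAY "Other build"
>>END-IF
    STOP RUN.
```

Because at most one block within each >>IF->>END-IF scope is processed, the statement duplicated in the two OR branches is compiled no more than once.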

3.6. >>SET

CDF >>SET Directive Syntax

 >>SET { [ CONSTANT ] cdf-variable-1 [ literal-1 ] }
 ~~~~~ {   ~~~~~~~~                                }
       { SOURCEFORMAT AS FIXED|FREE                }
       { ~~~~~~~~~~~~    ~~~~~ ~~~~                }
       { NOFOLDCOPYNAME                            }
       { ~~~~~~~~~~~~~~                            }
       { FOLDCOPYNAME AS UPPER|LOWER               }
         ~~~~~~~~~~~~    ~~~~~ ~~~~~

The >>SET CDF directive provides an alternate means of performing the actions of the >>DEFINE and >>SOURCE directives, as well as a means of controlling the compiler’s -free switch, -fixed switch and -ffold-copy switch from within program source code.

  1. The reserved word AS is optional (only on the SOURCEFORMAT and FOLDCOPYNAME clauses) and may be included, or not, at the discretion of the programmer. The presence or absence of this word has no effect upon the program.
  2. CDF variables defined in this way become undefined once an END PROGRAM or END FUNCTION directive is encountered in the input source.
  3. The FOLDCOPYNAME option provides the equivalent of specifying the compiler -ffold-copy=xxx switch, where xxx is either UPPER or LOWER.
  4. The NOFOLDCOPYNAME option turns off the effect of either the >>SET FOLDCOPYNAME statement or the compiler -ffold-copy=xxx switch.
  5. If the CONSTANT option is used, literal-1 must also be used. This option provides another means of defining constants that may be used anywhere in the program that a literal could be specified.
  6. The remaining options of the >>SET CDF directive provide equivalent functionality to the >>DEFINE and >>SOURCE directives, as follows:
    >>SET directive                            Equivalent directive

    >>SET cdf-variable-1                       >>DEFINE cdf-variable-1 AS OFF
    >>SET cdf-variable-1 AS literal-1          >>DEFINE cdf-variable-1 AS literal-1
    >>SET CONSTANT cdf-variable-1 literal-1    >>DEFINE CONSTANT cdf-variable-1 literal-1
    >>SET SOURCEFORMAT AS FIXED                >>SOURCE FORMAT IS FIXED
    >>SET SOURCEFORMAT AS FREE                 >>SOURCE FORMAT IS FREE
    >>SET XFD literal-1                        [to do]
    >>SET Micro-Focus-Directive                [to do]

3.7. >>SOURCE

CDF >>SOURCE Directive Syntax

 >>SOURCE FORMAT IS FIXED|FREE|VARIABLE
 ~~~~~~~~           ~~~~~ ~~~~ ~~~~~~~~

The >>SOURCE CDF directive puts the compiler into FIXED or FREE source-code format mode. This, in effect, provides yet another mechanism for controlling the compiler’s -free switch and -fixed switch.

  1. The reserved words FORMAT and IS are optional and may be included, or not, at the discretion of the programmer. The presence or absence of these words has no effect upon the program.
  2. You may switch between FIXED and FREE mode as desired.
  3. You may also use the >>SET CDF directive to perform this function.
  4. If the compiler is already in the specified mode, this statement will have no effect.
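As an illustration of switching modes within a single source file, consider the following sketch. The program name is hypothetical, and the first directive is indented on the assumption that the compiler starts out in FIXED mode.

```cobol
       >>SOURCE FORMAT IS FREE
IDENTIFICATION DIVISION.
PROGRAM-ID. srcdemo.
PROCEDURE DIVISION.
*> Free format: statements may begin in any column.
DISPLAY "free-format line".
>>SOURCE FORMAT IS FIXED
      *> Fixed format: the indicator area is column 7.
           DISPLAY "fixed-format line".
           STOP RUN.
```

Switching back and forth like this is mostly useful when fixed-format copybooks must be included in otherwise free-format programs.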

3.8. >>TURN

CDF >>TURN Directive Syntax

 >>TURN { exception-name-1 [ file-name-1 ]... }...
 ~~~~~~
    { OFF                           }
    { ~~~                           }
    { CHECKING ON [ WITH LOCATION ] }
      ~~~~~~~~ ~~        ~~~~~~~~

The directive will (de-)activate exception checks.

3.9. >>D

CDF >>D Directive Syntax

 >>D
 ~~~

When debugging mode is not active, this directive causes all floating debug lines to be removed. Otherwise, only the directive part of the line is ignored.

3.10. >>DISPLAY

CDF >>DISPLAY Directive Syntax

 >>DISPLAY source-text [ VCS = version-string ]
 ~~~~~~~~~               ~~~

The directive is a v1.0 extension and will display messages during compilation.

3.11. >>PAGE

CDF >>PAGE Directive Syntax

 >>PAGE
 ~~~~~~

The directive allows usage of the IBM paging controls EJECT, SKIP1, SKIP2, SKIP3 and TITLE.

3.12. >>LISTING

CDF >>LISTING Directive Syntax

 >>LISTING  {ON}
 ~~~~~~~~~  {OFF}

The directive allows the program listing to be (de-)activated.

3.13. >>LEAP-SECONDS

CDF >>LEAP-SECONDS Directive Syntax

 >>LEAP-SECONDS
 ~~~~~~~~~~~~~~

The >>LEAP-SECONDS CDF directive is syntactically recognized but is otherwise non-functional.

The directive is intended to allow for more than 60 seconds per minute.

3.14. $ Directives

CDF $ Directive Syntax

 $ (Dollar) Directives - Active.

 These directives are active and have the same function as ones starting with >>:

 $DEFINE
 $DISPLAY ON|OFF
 $IF
 $ELIF
 $ELSE
 $ELSE-IF
 $END
 $SET

 It is recommended to use the standard directives instead of the MF directives
 (when possible), as these have a higher chance of being portable.

 $ (Dollar) Directives - Not Active.
 These are NOT active and will produce a warning message:

 $DISPLAY VCS ...

 Recognised but otherwise ignored.

 @OPTIONS options-text

 Additional Micro-Focus directives accepted:

 ADDRSV | ADD-RSV literal-1
 ADDSYN | ADD-SYN literal-1 = literal-2
 ASSIGN  "EXTERNAL" | "DYNAMIC"
 BOUND
 CALLFH  literal-1
 COMP1  |  COMP-1  "BINARY" | "FLOAT"
 FOLDCOPYNAME | FOLD-COPY-NAME  AS "UPPER" | "LOWER"
 MAKESYN  |  MAKE-SYN
 NOBOUND  |  NO-BOUND
 NOFOLDCOPYNAME  |  NOFOLD-COPY-NAME  |  NO-FOLD-COPY-NAME
 OVERRIDE  literal-1 = literal-2
 REMOVE  literal-1
 SOURCEFORMAT | SOURCE-FORMAT "FIXED" | "FREE" | "VARIABLE"
 SSRANGE "2"
 NOSSRANGE  |  NO-SSRANGE

These directives offer support for MF (Micro Focus) compiler directives.

3.15. Predefined compilation variables

GnuCOBOL defines compilation variables when various conditions are true. If the condition associated with a variable is false, the variable is not defined.

DEBUG

The -d debug flag is specified.

EXECUTABLE

Module being compiled contains the main program.

GCCOMP

The size of a COMP item is determined according to the GnuCOBOL scheme, where for a PICTURE of length:

 1 - 2      item = 1 byte
 3 - 4      item = 2 bytes
 5 - 9      item = 4 bytes
 10 - 18    item = 8 bytes

GNUCOBOL

GnuCOBOL is compiling the source unit.

HOSTSIGNS

A signed packed decimal item’s value may be considered NUMERIC if sign = X"F".

IBMCOMP

The size of a COMP item is determined according to the IBM scheme, where for a PICTURE of length:

 1 - 4      item = 2 bytes
 5 - 9      item = 4 bytes
 10 - 18    item = 8 bytes

MODULE

The element being compiled does not contain the main program.

NOHOSTSIGNS

A signed packed decimal item’s value may not be considered NUMERIC if sign = X"F".

NOIBMCOMP

The size of a COMP item is not determined according to the IBM scheme.

NOSTICKY-LINKAGE

Sticky linkage (linkage section items remaining allocated between invocations) is not active.

NOTRUNC

Numeric data items are truncated according to their internal representation.

P64

Pointers are greater than 32 bits.

STICKY-LINKAGE

Sticky linkage (linkage section items remaining allocated between invocations) is active.

TRUNC

Numeric data items are truncated according to their PICTURE clauses.

While still supported, the following two variables may well be removed in the future and should not be used. Use GCCOMP and GNUCOBOL instead:

OCCOMP

The size of a COMP item is determined according to the GnuCOBOL scheme, where for a PICTURE of length:

 1 - 2      item = 1 byte
 3 - 4      item = 2 bytes
 5 - 9      item = 4 bytes
 10 - 18    item = 8 bytes

OPENCOBOL

GnuCOBOL is compiling the source unit.
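Since a predefined variable exists only while its condition is true, it can be tested with >>IF ... DEFINED. The following free-format sketch uses a hypothetical program name, and the messages are purely illustrative.

```cobol
       >>SOURCE FORMAT IS FREE
IDENTIFICATION DIVISION.
PROGRAM-ID. predef.
PROCEDURE DIVISION.
*> P64 is defined only when pointers are wider than 32 bits.
>>IF P64 IS DEFINED
    DISPLAY "pointers are wider than 32 bits"
>>ELSE
    DISPLAY "pointers are 32 bits or narrower"
>>END-IF
*> DEBUG is defined only when the debug flag was specified.
>>IF DEBUG IS DEFINED
    DISPLAY "compiled with the debug flag"
>>END-IF
    STOP RUN.
```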

4. IDENTIFICATION DIVISION

IDENTIFICATION DIVISION Syntax

[{ IDENTIFICATION } DIVISION. ]
 { ~~~~~~~~~~~~~~ } ~~~~~~~~
 { ID             }
   ~~
 { PROGRAM-ID.  } { program name } .
 { ~~~~~~~~~~   } { literal-1    } [ AS { literal-2 } ] [ Type-clause ] .
 { FUNCTION-ID. } { literal-3 } [ AS literal-4 ] .
   ~~~~~~~~~~~    { function-name } .
 { OPTIONS. }
   ~~~~~~~
 [ DEFAULT ROUNDED MODE IS {AWAY-FROM-ZERO         }
   ~~~~~~~ ~~~~~~~         {NEAREST-AWAY-FROM-ZERO }
                           {NEAREST-EVEN           }
                           {NEAREST-TOWARDS-ZERO   }
                           {PROHIBITED             }
                           {TOWARDS-GREATER        }
                           {TOWARDS-LESSER         }
                           {TRUNCATION             }]
 [ ENTRY-CONVENTION IS {COBOL   }
   ~~~~~~~~~~~~~~~~    {EXTERN  }
                       {STDCALL }]
 [ AUTHOR.        comment-1. ]
   ~~~~~~
 [ DATE-COMPILED. comment-2. ]
   ~~~~~~~~~~~~~
 [ DATE-MODIFIED. comment-3. ]
   ~~~~~~~~~~~~~
 [ DATE-WRITTEN.  comment-4. ]
   ~~~~~~~~~~~~
 [ INSTALLATION.  comment-5. ]
   ~~~~~~~~~~~~
 [ REMARKS.       comment-6. ]
   ~~~~~~~
 [ SECURITY.      comment-7. ]
   ~~~~~~~~

The AUTHOR, DATE-COMPILED, DATE-MODIFIED, DATE-WRITTEN, INSTALLATION, REMARKS and SECURITY paragraphs are supported by GnuCOBOL only to provide compatibility with programs written for the ANS1974 (or earlier) standards. As of the ANS1985 standard, these clauses have become obsolete and should not be used in new programs.

PROGRAM-ID Type Clause Syntax

 IS [ COMMON ] [ INITIAL|RECURSIVE PROGRAM ]
      ~~~~~~     ~~~~~~~ ~~~~~~~~~

The identification division provides basic identification of the program by giving it a name and optionally defining some high-level characteristics via the eight pre-defined paragraphs that may be specified.

  1. The paragraphs shown above may be coded in any sequence.
  2. The reserved words AS, IS and PROGRAM are optional and may be included, or not, at the discretion of the programmer. The presence or absence of these words has no effect upon the program.
  3. A Type Clause may be coded only when PROGRAM-ID is specified. If one is coded, either COMMON, COMMON INITIAL or COMMON RECURSIVE must be specified.
  4. While the actual IDENTIFICATION DIVISION or ID DIVISION header is optional, the PROGRAM-ID / FUNCTION-ID paragraphs are not; only one or the other, however, may be coded.
  5. The compiler’s -Wobsolete switch will cause the GnuCOBOL compiler to issue warning messages if these obsolete paragraphs (or any other obsolete syntax) are used in a program.
  6. If specified, literal-1 must be an actual alphanumeric literal and may not be a figurative constant.
  7. The PROGRAM-ID and FUNCTION-ID paragraphs serve to identify the program to the external (i.e. operating system) environment. If there is no AS clause present, the program-id will serve as that external identification. If there is an AS clause specified, that specified literal will serve as the external identification. For the remainder of this document, that "external identification" will be referred to as the primary entry-point name.
  8. The INITIAL, COMMON and RECURSIVE words are used only within subprograms serving as subroutines. Their purposes are as follows:
    1. COMMON should be used only within subprograms that are nested subprograms. A nested subprogram declared as COMMON may be called from any nested program in the source file being compiled, not just those "above" it in the nesting structure.
    2. The RECURSIVE clause, if any, will cause the compiler to generate different object code for the subprogram that will enable it to invoke itself and to properly return back to the program that invoked it.

      User-defined functions (i.e. FUNCTION-ID) are always recursive.

    3. The INITIAL clause, if specified, guarantees the subprogram will be in its initial (i.e. compiled) state each and every time it is executed, not just the first time.
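The effect of COMMON can be sketched with two sibling nested subprograms. All program names here are hypothetical, and free-format source is assumed.

```cobol
       >>SOURCE FORMAT IS FREE
IDENTIFICATION DIVISION.
PROGRAM-ID. mainprog.
PROCEDURE DIVISION.
    CALL "worker"
    STOP RUN.

IDENTIFICATION DIVISION.
PROGRAM-ID. worker.
PROCEDURE DIVISION.
*> worker and logger are siblings; this call relies on logger
*> being declared COMMON.
    CALL "logger"
    GOBACK.
END PROGRAM worker.

IDENTIFICATION DIVISION.
PROGRAM-ID. logger IS COMMON PROGRAM.
PROCEDURE DIVISION.
    DISPLAY "logger reached from worker"
    GOBACK.
END PROGRAM logger.

END PROGRAM mainprog.
```

Without the COMMON attribute, logger would be callable only from the program that directly contains it (mainprog in this sketch).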

5. ENVIRONMENT DIVISION

ENVIRONMENT DIVISION Syntax

   ENVIRONMENT DIVISION.
   ~~~~~~~~~~~ ~~~~~~~~
 [ CONFIGURATION SECTION. ]
   ~~~~~~~~~~~~~ ~~~~~~~~
 [ SOURCE-COMPUTER.         Compilation-Computer-Specification . ]
   ~~~~~~~~~~~~~~~
 [ OBJECT-COMPUTER.         Execution-Computer-Specification . ]
   ~~~~~~~~~~~~~~~
 [ SPECIAL-NAMES.           Program-Configuration-Specification . ]
   ~~~~~~~~~~~~~
 [ REPOSITORY.              Function-Specification... . ]
   ~~~~~~~~~~
 [ INPUT-OUTPUT SECTION. ]
   ~~~~~~~~~~~~ ~~~~~~~
 [ FILE-CONTROL.            General-File-Description... . ]
   ~~~~~~~~~~~~
 [ I-O-CONTROL.             File-Buffering Specification... . ]
   ~~~~~~~~~~~

This division defines the external computer environment in which the program will be operating. This includes defining any files that the program may be accessing.

  • If both optional sections of this division are coded, they must be coded in the sequence shown.
  • The paragraphs within the sections may be coded in any order.
  • These sections consist of a series of specific, pre-defined, paragraphs (SOURCE-COMPUTER and OBJECT-COMPUTER, for example), each of which serves a specific purpose. If no code is required for the purpose one of the paragraphs serves, the entire paragraph may be omitted.
  • If any of the paragraphs within one of the sections are coded, the section header itself must be coded.
  • If none of the paragraphs within one of the sections are coded, the section header itself may be omitted.
  • If none of the sections within the environment division are coded, the ENVIRONMENT DIVISION. header itself may be omitted.
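A minimal sketch showing both sections coded in the required sequence. The computer names, the file name OUT-FILE and the path "report.txt" are hypothetical; paragraphs that are not needed (SPECIAL-NAMES, REPOSITORY, I-O-CONTROL) are simply omitted.

```cobol
       >>SOURCE FORMAT IS FREE
IDENTIFICATION DIVISION.
PROGRAM-ID. envdemo.
ENVIRONMENT DIVISION.
*> The CONFIGURATION SECTION must precede the INPUT-OUTPUT SECTION.
CONFIGURATION SECTION.
SOURCE-COMPUTER. DEVBOX.
OBJECT-COMPUTER. PRODBOX.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
    SELECT OUT-FILE ASSIGN TO "report.txt"
        ORGANIZATION IS LINE SEQUENTIAL.
DATA DIVISION.
FILE SECTION.
FD  OUT-FILE.
01  OUT-LINE PIC X(80).
PROCEDURE DIVISION.
    OPEN OUTPUT OUT-FILE
    MOVE "hello" TO OUT-LINE
    WRITE OUT-LINE
    CLOSE OUT-FILE
    STOP RUN.
```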

5.1. CONFIGURATION SECTION

CONFIGURATION SECTION Syntax

   CONFIGURATION SECTION.
   ~~~~~~~~~~~~~ ~~~~~~~
 [ SOURCE-COMPUTER. Compilation-Computer-Specification . ]
   ~~~~~~~~~~~~~~~
 [ OBJECT-COMPUTER. Execution-Computer-Specification . ]
   ~~~~~~~~~~~~~~~
 [ SPECIAL-NAMES.   Program-Configuration-Specification . ]
   ~~~~~~~~~~~~~
 [ REPOSITORY.      Function-Specification... . ]
   ~~~~~~~~~~

This section defines the computer system upon which the program is being compiled and executed and also specifies any special environmental configuration or compatibility characteristics.

  1. The four paragraphs in this section may be specified in any order but if not in this order, a warning will be issued.
  2. The configuration section is not allowed in a nested subprogram. A nested program inherits the configuration section settings of its parent program.
  3. If none of the features provided by the configuration section are required by a program, the entire CONFIGURATION SECTION. header may be omitted from the program.

5.1.1. SOURCE-COMPUTER

SOURCE-COMPUTER Syntax

 SOURCE-COMPUTER. computer-name [ WITH DEBUGGING MODE ] .
 ~~~~~~~~~~~~~~~                       ~~~~~~~~~ ~~~~

This paragraph defines the computer upon which the program is being compiled and provides one way in which debugging code embedded within the program may be activated.

  1. The reserved word WITH is optional and may be omitted. The presence or absence of this word has no effect upon the program.
  2. This paragraph is not allowed in a nested subprogram. A nested program inherits the SOURCE-COMPUTER settings of its parent program.
  3. The value specified for computer-name is irrelevant, provided it is a valid COBOL word that does not match any GnuCOBOL reserved word. The computer-name value may include spaces. This need not match the computer-name used with the OBJECT-COMPUTER paragraph, if any.
  4. The DEBUGGING MODE clause, if present, will inform the compiler that debugging lines (those with a ‘D’ in column 7 if Fixed Source Mode is in effect, or those prefixed with a >>D if Free Source Mode is in effect) — normally treated as comments — are to be compiled.
  5. Even without the DEBUGGING MODE clause, it is still possible to compile debugging lines. Debugging lines may also be compiled by specifying the -fdebugging-line switch to the GnuCOBOL compiler.
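A short free-format sketch of a debugging line follows; the computer name DEVBOX and the program name are hypothetical.

```cobol
       >>SOURCE FORMAT IS FREE
IDENTIFICATION DIVISION.
PROGRAM-ID. dbgdemo.
ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
SOURCE-COMPUTER. DEVBOX WITH DEBUGGING MODE.
PROCEDURE DIVISION.
*> Compiled only because DEBUGGING MODE is in effect.
>>D DISPLAY "debug line compiled"
    DISPLAY "normal line"
    STOP RUN.
```

If the WITH DEBUGGING MODE clause is removed and the -fdebugging-line switch is not given, the >>D line should be treated as a comment.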

5.1.2. OBJECT-COMPUTER

OBJECT-COMPUTER Syntax

 OBJECT-COMPUTER.  [ computer-name ]
 ~~~~~~~~~~~~~~~
 [ MEMORY SIZE IS integer-1 WORDS|CHARACTERS ]
   ~~~~~~ ~~~~              ~~~~~ ~~~~~~~~~~
 [ PROGRAM COLLATING SEQUENCE IS alphabet-name-1 ]
           ~~~~~~~~~
 [ SEGMENT-LIMIT IS integer-2 ]
   ~~~~~~~~~~~~~
 [ CHARACTER CLASSIFICATION IS { locale-name-1  } ]
             ~~~~~~~~~~~~~~    { LOCALE         }
                               { ~~~~~~         }
                               { USER-DEFAULT   }
                               { ~~~~~~~~~~~~   }
                               { SYSTEM-DEFAULT }
                                 ~~~~~~~~~~~~~~
 .

The MEMORY SIZE and SEGMENT-LIMIT clauses are syntactically recognized but are otherwise non-functional.

This paragraph describes the computer upon which the program will execute.

  1. The computer-name, if specified, must immediately follow the OBJECT-COMPUTER paragraph name. The remaining clauses may be coded in any sequence.
  2. The reserved words CHARACTER, IS, PROGRAM and SEQUENCE are optional and may be omitted. The presence or absence of these words has no effect on the program.
  3. The value specified for computer-name, if any, is irrelevant provided it is a valid COBOL word that does not match any GnuCOBOL reserved word. The computer-name may include spaces. This need not match the computer-name used with the SOURCE-COMPUTER paragraph, if any.
  4. The OBJECT-COMPUTER paragraph is not allowed in a nested subprogram. A nested program inherits the OBJECT-COMPUTER settings of its parent program.
  5. The COLLATING SEQUENCE clause allows you to specify a customized character collating sequence to be used when alphanumeric values are compared to one another. Data will still be stored in the character set native to the computer, but the logical sequence in which characters are ordered for comparison purposes can be altered from that defined by the computer’s native character set. The alphabet-name-1 you specify needs to be defined in the SPECIAL-NAMES (see SPECIAL-NAMES) paragraph.
  6. If no COLLATING SEQUENCE clause is specified, the collating sequence implied by the character set native to the computer (usually ASCII) will be used.
  7. The optional CLASSIFICATION clause may be used to specify a locale for the environment in which the program will execute, for the purpose of influencing the upper-case and lower-case mappings of characters for the UPPER-CASE (see UPPER-CASE) and LOWER-CASE (see LOWER-CASE) intrinsic functions and the classification of characters for the ALPHABETIC, ALPHABETIC-LOWER and ALPHABETIC-UPPER class tests. The definitions of these classes are taken from the cultural convention specification (LC_CTYPE) from the specified locale.

    The meanings of the four locale specifications are as follows:

    1. locale-name-1 references a LOCALE (see SPECIAL-NAMES) definition.
    2. The keyword LOCALE refers to the current locale (in effect at the time the program is executed).
    3. The keyword USER-DEFAULT references the default locale specified for the user currently executing this program.
    4. The keyword SYSTEM-DEFAULT denotes the default locale specified for the computer upon which the program is executing.
  8. Absence of a CLASSIFICATION clause will cause character classification to occur according to the rules for the computer’s native character set (ASCII, EBCDIC, etc.).
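The COLLATING SEQUENCE clause might be used as in the following sketch, which asks for comparisons to follow EBCDIC ordering. The alphabet name EBCDIC-ORDER, the computer name and the data names are hypothetical.

```cobol
       >>SOURCE FORMAT IS FREE
IDENTIFICATION DIVISION.
PROGRAM-ID. colldemo.
ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
OBJECT-COMPUTER. PRODBOX
    PROGRAM COLLATING SEQUENCE IS EBCDIC-ORDER.
SPECIAL-NAMES.
*> The alphabet named above must be defined here.
    ALPHABET EBCDIC-ORDER IS EBCDIC.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  SMALL-A PIC X VALUE "a".
01  BIG-A   PIC X VALUE "A".
PROCEDURE DIVISION.
*> The data is still stored in the native character set; only the
*> ordering used for the comparison is intended to change.
    IF SMALL-A < BIG-A
        DISPLAY "lower-case compares low"
    ELSE
        DISPLAY "upper-case compares low"
    END-IF
    STOP RUN.
```

In EBCDIC, lower-case letters sort before upper-case letters, whereas in ASCII the reverse is true, so the branch taken indicates which collating sequence was applied.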

5.1.3. SPECIAL-NAMES

SPECIAL-NAMES Syntax

 SPECIAL-NAMES.
 ~~~~~~~~~~~~~
  [ CALL-CONVENTION integer-1 IS mnemonic-name-1 ]
    ~~~~~~~~~~~~~~~
  [ CONSOLE IS CRT ]
    ~~~~~~~    ~~~
  [ CRT STATUS IS identifier-1 ]
    ~~~ ~~~~~~
  [ CURRENCY SIGN IS literal-1 ]
    ~~~~~~~~ ~~~~
  [ CURSOR IS identifier-2 ]
    ~~~~~~
  [ DECIMAL-POINT IS COMMA ]
    ~~~~~~~~~~~~~    ~~~~~
  [ EVENT STATUS IS identifier-3 ]
    ~~~~~ ~~~~~~
  [ LOCALE locale-name-1 IS literal-2 ]...
    ~~~~~~
  [ NUMERIC SIGN IS TRAILING SEPARATE ]
    ~~~~~~~ ~~~~    ~~~~~~~~ ~~~~~~~~
  [ SCREEN CONTROL IS identifier-4 ]
    ~~~~~~ ~~~~~~~
  [ device-name-1 IS mnemonic-name-2 ]...

  [ feature-name-1 IS mnemonic-name-3 ]...

  [ Alphabet-Clause ]...

  [ Class-Definition-Clause ]...

  [ Switch-Definition-Clause ]...

  [ Symbolic-Characters-Clause ]...
  .

The EVENT STATUS and SCREEN CONTROL clauses are syntactically recognized but are otherwise non-functional.

Alphabet-Name-Clause, Class-Definition-Clause, Switch-Definition-Clause and Symbolic-Characters-Clause are discussed in detail in the next four sections.

The SPECIAL-NAMES paragraph provides a means for specifying various program and operating environment configuration options.

  1. The various clauses that may be specified within the SPECIAL-NAMES paragraph may be coded in any order.
  2. The reserved word IS is optional and may be omitted. The presence or absence of this word has no effect upon the program.
  3. The SPECIAL-NAMES paragraph is not allowed in a nested subprogram. A nested program inherits the SPECIAL-NAMES settings of its parent program.
  4. Only the final clause specified within this paragraph should be terminated with a period.
  5. The CALL-CONVENTION clause allows a decimal integer, representing a series of ON/OFF switch settings, to be associated with a mnemonic name which may then be coded on a CALL statement (see CALL). The switch settings defined by this mnemonic will then control how the linkage to a subroutine invoked by the CALL statement that references mnemonic-name-1 will be handled.
  6. The CONSOLE IS CRT clause, if specified, will cause a DISPLAY statement lacking an explicit UPON clause to be treated as a DISPLAY screen-data-item statement (see DISPLAY screen-data-item), and any ACCEPT statement lacking a FROM clause to be treated as an ACCEPT screen-data-item statement (see ACCEPT screen-data-item).
  7. If the CRT STATUS clause is not specified, an implicit COB-CRT-STATUS identifier (with a PICTURE 9(4)) will be allocated for the purpose of receiving screen ACCEPT statuses. If CRT STATUS is specified, then identifier-1 must be defined in the program as a PICTURE 9(4) field.
  8. The CURRENCY SIGN clause may be used to redefine the character to be used as a currency sign in a PICTURE (see PICTURE) clause. The default currency sign is a dollar-sign (‘$’). You may specify any character except 0-9, A-Z, a-z, +, -, ,, ., *, /, ;, (, ), =, \, quote (‘"’) or space.
  9. The CURSOR IS clause allows you to specify a 4- or 6-character data item into which the cursor screen location, at the time a screen ACCEPT is satisfied, will be returned. The value will be returned as rrcc or rrrccc, depending upon the length of the specified identifier-2, where rr and rrr represent the row number (starting at zero) and cc and ccc represent the column number (also starting at zero). There is no default data item allocated for this data if the CURSOR IS clause is not specified, and it is the programmer’s responsibility to define identifier-2 if the clause is specified.
  10. The DECIMAL-POINT IS COMMA clause reverses the definition of the ‘,’ and ‘.’ characters when they are used as PICTURE editing symbols and within numeric literals. This can have unwanted side-effects - see Punctuation.
  11. The LOCALE clause may be used to associate external OS-defined locale names (literal-2) with an internal name (locale-name-1) that may then be referenced within the program. Locale names are defined by the Operating System and/or C compiler GnuCOBOL will be utilizing on your computer.
  12. As an example, the following is the list of locale codes that would be available on a Windows computer running a GnuCOBOL version that was built utilizing the MinGW Unix-emulator and the GNU C compiler (gcc):
    A

    af_ZA, am_ET, ar_AE, ar_BH, ar_DZ, ar_EG, ar_IQ, ar_JO, ar_KW, ar_LB, ar_LY, ar_MA, ar_OM, ar_QA, ar_SA, ar_SY, ar_TN, ar_YE, arn_CL, as_IN, az_Cyrl_AZ, az_Latn_AZ

    B

    ba_R, be_BY, bg_BG, bn_IN, bo_BT, bo_CN, br_FR, bs_Cyrl_BA, bs_Latn_BA

    C

    ca_ES, cs_CZ, cy_GB

    D

    da_DK, de_AT, de_CH, de_DE, de_LI, de_LU, dsb_DE, dv_MV

    E

    el_GR, en_029, en_AU, en_BZ, en_CA, en_GB, en_IE, en_IN, en_JM, en_MY, en_NZ, en_PH, en_SG, en_TT, en_US, en_ZA, en_ZW, es_AR, es_BO, es_CL, es_CO, es_CR, es_DO, es_EC, es_ES, es_GT, es_HN, es_MX, es_NI, es_PA, es_PE, es_PR, es_PY, es_SV, es_US, es_UY, es_VE, et_EE, eu_ES

    F

    fa_IR, fi_FI, fil_PH, fo_FO, fr_BE, fr_CA, fr_CH, fr_FR, fr_LU, fr_MC, fy_NL

    G

    ga_IE, gbz_AF, gl_ES, gsw_FR, gu_IN

    H

    ha_Latn_NG, he_IL, hi_IN, hr_BA, hr_HR, hu_HU, hy_AM

    I

    id_ID, ig_NG, ii_CN, is_IS, it_CH, it_IT, iu_Cans_CA, iu_Latn_CA

    J

    ja_JP

    K

    ka_GE, kh_KH, kk_KZ, kl_GL, kn_IN, ko_KR, kok_IN, ky_KG

    L

    lb_LU, lo_LA, lt_LT, lv_LV

    M

    mi_NZ, mk_MK, ml_IN, mn_Cyrl_MN, mn_Mong_CN, moh_CA, mr_IN, ms_BN, ms_MY, mt_MT

    N

    nb_NO, ne_NP, nl_BE, nl_NL, nn_NO, ns_ZA

    O

    oc_FR, or_IN

    P

    pa_IN, pl_PL, ps_AF, pt_BR, pt_PT

    Q

    qut_GT, quz_BO, quz_EC, quz_PE

    R

    rm_CH, ro_RO, ru_RU, rw_RW

    S

    sa_IN, sah_RU, se_FI, se_NO, se_SE, si_LK, sk_SK, sl_SI, sma_NO, sma_SE, smj_NO, smj_SE, smn_FI, sms_FI, sq_AL, sr_Cyrl_BA, sr_Cyrl_CS, sr_Latn_BA, sr_Latn_CS, sv_FI, sv_SE, sw_KE, syr_SY

    T

    ta_IN, te_IN, tg_Cyrl_TJ, th_TH, tk_TM, tmz_Latn_DZ, tn_ZA, tr_IN, tr_TR, tt_RU

    U

    ug_CN, uk_UA, ur_PK, uz_Cyrl_UZ, uz_Latn_UZ

    V

    vi_VN

    W

    wen_DE, wo_SN

    X

    xh_ZA

    Y

    yo_NG

    Z

    zh_CN, zh_HK, zh_MO, zh_SG, zh_TW, zu_ZA

  13. The NUMERIC SIGN TRAILING SEPARATE specification causes all signed numeric USAGE DISPLAY data items to be created as if the SIGN IS TRAILING SEPARATE CHARACTER clause was included in their definitions.
  14. The device-name-1 IS mnemonic-name-2 clause allows you to specify an alternate name (mnemonic-name-2) for one of the built-in GnuCOBOL device names (device-name-1). The device names built into GnuCOBOL, and the physical device associated with each name, are as follows:
    CONSOLE

    This is the (screen-mode) display of the PC or Unix system.

    STDIN
    SYSIN
    SYSIPT

    These devices (they are all synonymous) represent standard system input (pipe 0). On a PC or UNIX system, this is typically the keyboard. The contents of a file may be delivered to a GnuCOBOL program for access via one of these device names by adding the sequence ‘0< filename’ to the end of the program’s execution command.

    PRINTER
    STDOUT
    SYSLIST
    SYSLST
    SYSOUT

    These devices (they are all synonymous) represent standard system output (pipe 1). On a PC or UNIX system, this is typically the display. Output sent to one of these devices by a GnuCOBOL program can be sent to a file by adding the sequence ‘1> filename’ to the end of the program's execution command.

    STDERR
    SYSERR

    These devices (they are synonymous) represent standard system error output (pipe 2). On a PC or UNIX system, this is typically the display. Output sent to one of these devices by a GnuCOBOL program can be sent to a file by adding the sequence ‘2> filename’ to the end of the program's execution command.

  15. The feature-name-1 IS mnemonic-name-3 clause allows mnemonic names to be assigned to any of the 13 printer channel (i.e. vertical page positioning) feature names: Cnn (nn = 01-12) and CSP. Once a channel position has been assigned a mnemonic name, statements of the form WRITE record-name AFTER ADVANCING mnemonic-name-3 may be coded to write the specified print record at the channel position assigned to mnemonic-name-3.

    Printers supporting channel positioning are generally mainframe-type line printers. When writing to printers that do not support channel positioning, a formfeed will be issued to the printer.

    The CSP positioning option stands for “No Spacing”. Testing on a MinGW build of GnuCOBOL shows that this too results in a formfeed being issued.
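
As an illustration of the device-name clause described in item 14, the following minimal sketch assigns mnemonic names to two of the built-in device names and then uses them on DISPLAY statements. The program name and the mnemonic names Log-Device and Error-Device are made up for this example; it has not been taken from the GnuCOBOL sources.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. devnames.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SPECIAL-NAMES.
      *    Log-Device and Error-Device are illustrative names only.
           SYSOUT IS Log-Device
           SYSERR IS Error-Device.
       PROCEDURE DIVISION.
       Main-Para.
      *    Sent to standard system output (pipe 1).
           DISPLAY "written to standard output" UPON Log-Device
      *    Sent to standard system error output (pipe 2).
           DISPLAY "written to standard error" UPON Error-Device
           STOP RUN.
```

If the program is run with ‘2> errors.log’ added to its execution command, the second message should go to that file while the first still appears on the display.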

5.1.3.1. Alphabet-Name-Clause

SPECIAL-NAMES Alphabet-Clause Syntax

 ALPHABET alphabet-name-1 IS { ASCII             }
 ~~~~~~~~                    { ~~~~~             }
                             { EBCDIC            }
                             { ~~~~~~            }
                             { NATIVE            }
                             { ~~~~~~            }
                             { STANDARD-1        }
                             { ~~~~~~~~~~        }
                             { STANDARD-2        }
                             { ~~~~~~~~~~        }
                             { Literal-Clause... }

SPECIAL-NAMES ALPHABET Literal-Clause Syntax

 literal-1 [ { THRU|THROUGH literal-2 } ]
             { ~~~~ ~~~~~~~           }
             { {ALSO literal-3}...    }
                ~~~~

The ALPHABET clause relates alphabet-name-1 to a specified character code set or collating sequence, including one you define yourself using the literal-1 option.

  1. The reserved word IS is optional and may be omitted. The presence or absence of this word has no effect upon the program.
  2. The reserved words THRU and THROUGH are interchangeable.
  3. GnuCOBOL considers ASCII, STANDARD-1 and STANDARD-2 to be interchangeable.
  4. NATIVE specifies the system default character set.
  5. The following points apply to using the literal-n specifications to compose a custom character set:
    1. The literal-n values are either integers or alphanumeric quoted characters. These represent a single character in the NATIVE character set, either by its actual text value (alphanumeric quoted character) or by ordinal position in the NATIVE character set (integer).
    2. The sequence in which characters are defined in this clause specifies the relative order those characters should have when comparisons are made using this alphabet.
    3. Character positions in this list do not affect the actual binary storage values used for the characters. Binary values will still be those of the NATIVE character set.
    4. You may specify any of the figurative constants SPACE, SPACES, ZERO, ZEROS, ZEROES, QUOTE, QUOTES, HIGH-VALUE, HIGH-VALUES, LOW-VALUE or LOW-VALUES for any of the literal-1, literal-2 or literal-3 specifications.
  6. Once you have defined an alphabet name, that alphabet name may be used on specifications in CODE-SET, COLLATING SEQUENCE, or SYMBOLIC CHARACTERS clauses elsewhere in the program.
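
The fragment below sketches how the Literal-Clause options might be combined. It is only a configuration fragment, not a complete program, and the alphabet names Letters-First, Caseless-Vowels and Host-Codes are hypothetical. How characters that are not listed collate is not shown here and should be checked against your GnuCOBOL version.

```cobol
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SPECIAL-NAMES.
      *    Ranges: upper-case letters collate first, then
      *    lower-case letters, then digits.
           ALPHABET Letters-First IS
               "A" THRU "Z"
               "a" THRU "z"
               "0" THRU "9"
      *    ALSO: each pair shares one position in the sequence.
           ALPHABET Caseless-Vowels IS
               "A" ALSO "a"
               "E" ALSO "e"
               "I" ALSO "i"
               "O" ALSO "o"
               "U" ALSO "u"
      *    A predefined character set instead of literals.
           ALPHABET Host-Codes IS EBCDIC.
```

In each case only the comparison order changes; as noted above, the stored binary values remain those of the NATIVE character set.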

5.1.3.2. Class-Definition-Clause

SPECIAL-NAMES Class-Definition-Clause Syntax

 CLASS class-name-1 IS { literal-1 [ THRU|THROUGH literal-2 ] }...
 ~~~~~                               ~~~~ ~~~~~~~
  1. The reserved word IS is optional and may be omitted. The presence or absence of this word has no effect upon the program.
  2. The reserved words THRU and THROUGH are interchangeable.
  3. Both literal-1 and literal-2 must be alphanumeric literals of length 1.
  4. The literal(s) specified on this clause define the possible characters that may be found in a data item’s value in order to be considered part of the class.
  5. For example, the following defines a class called Hexadecimal, the definition of which specifies the only characters that may be present in an alphanumeric data item if that data item is to be part of the Hexadecimal class:
    CLASS Hexadecimal IS '0' THRU '9'
                         'A' THRU 'F'
                         'a' THRU 'f'
    
  6. Once class Hexadecimal has been defined, program code could then use a statement such as IF input-item IS Hexadecimal to determine if the value of characters in a data item are valid according to that class.
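
Putting the two previous points together, a complete program using the Hexadecimal class could look like the sketch below. The program name, the data item Input-Item and its initial value are assumptions made for the example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. hexcheck.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SPECIAL-NAMES.
           CLASS Hexadecimal IS '0' THRU '9'
                                'A' THRU 'F'
                                'a' THRU 'f'.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *    Input-Item is an illustrative data item.
       01  Input-Item              PIC X(4) VALUE "1aF9".
       PROCEDURE DIVISION.
       Main-Para.
      *    True only if every character belongs to the class.
           IF Input-Item IS Hexadecimal
               DISPLAY "valid hexadecimal: " Input-Item
           ELSE
               DISPLAY "not hexadecimal: " Input-Item
           END-IF
           STOP RUN.
```

Note that a trailing space in the data item would fail the test, because the space character is not part of the class as defined.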

5.1.3.3. Switch-Definition-Clause

SPECIAL-NAMES Switch-Definition-Clause Syntax

 switch-name-1 [ IS mnemonic-name-1 ]

   [ ON STATUS IS condition-name-1 ]
     ~~
   [ OFF STATUS IS condition-name-2 ]
     ~~~

The switch-definition clause associates a condition-name with a run-time execution switch so that the status of that switch may be tested from within a program.

  1. The reserved words IS and STATUS are optional and may be omitted. The presence or absence of these words has no effect upon the program.
  2. The valid switch-name-1 names are SWITCH-n (n = 0-36).
  3. If the program is compiled with the -fsyntax-extension switch, the switch names SWn (n = 0-15) are also valid and correspond to SWITCH-0 through SWITCH-15, respectively. The same applies to SWITCH-16 through SWITCH-36, SWITCH 0 through SWITCH 26 and SWITCH A through SWITCH Z.
  4. At execution time, each switch will be associated with a COB_SWITCH_n run-time environment variable, where n will have the value ‘0’ through ‘15’. Any of these sixteen environment variables that have the value ON (regardless of upper- or lower-case value) will be considered to be set “on”. Any of these sixteen environment variables having no value at all or a value other than ON will be considered OFF.
  5. Each specified switch must have at least one of an IS mnemonic-name-1, ON STATUS or OFF STATUS option defined for it, otherwise there will be no way to reference the switch from within a GnuCOBOL program.
  6. The IS mnemonic-name-1 syntax provides a means for setting the switch to either an ON or OFF value via the SET statement (see SET).
  7. The ON STATUS and OFF STATUS syntax provides a way of associating a condition-name with either the on or off status of the switch, so that status may be tested at execution time via the IF statement (see IF).
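
The following sketch shows the three options working together: the switch is tested through its condition names and then changed through its mnemonic name. The names Trace-Switch, Trace-On and Trace-Off are invented for this example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. swdemo.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SPECIAL-NAMES.
      *    Trace-Switch, Trace-On and Trace-Off are illustrative.
           SWITCH-1 IS Trace-Switch
               ON STATUS IS Trace-On
               OFF STATUS IS Trace-Off.
       PROCEDURE DIVISION.
       Main-Para.
      *    The initial state comes from COB_SWITCH_1.
           IF Trace-On
               DISPLAY "switch 1 was set on at start-up"
           ELSE
               DISPLAY "switch 1 was off at start-up"
           END-IF
      *    The mnemonic name lets the program change the switch.
           SET Trace-Switch TO OFF
           IF Trace-Off
               DISPLAY "switch 1 is now off"
           END-IF
           STOP RUN.
```

To start the program with the switch on, set the environment variable COB_SWITCH_1 to ON before running it.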

5.1.3.4. Symbolic-Characters-Clause

SPECIAL-NAMES-Symbolic-Characters-Clause Syntax

 SYMBOLIC CHARACTERS
 ~~~~~~~~
   { symbolic-character-1... IS|ARE integer-1... }...

   [ IN alphabet-name-1 ]
     ~~

This clause may be used to define your own figurative constants.

  1. The reserved words ARE, CHARACTERS and IS are optional and may be omitted. The presence or absence of these words has no effect upon the program.
  2. There must be exactly as many integer-1 values specified as there are symbolic-character-1 names.

  3. Each symbolic character name will be associated with the corresponding integer-1th character in the alphabet named in the IN clause. The integer values are selecting characters from the alphabet by their ordinal position and not by their numeric value; thus, an integer of 15 will select the 15th character in the specified alphabet, regardless of the actual numeric value of the bit pattern that constitutes that character.
  4. If no alphabet-name-1 is specified, the system's native character set will be assumed.
  5. The following two code examples define the same set of figurative constant names for five ASCII control characters (assuming that ASCII is the system’s native character set). The two examples are identical in their effects, even though the way the figurative constants are defined is different.
    Individually
    SYMBOLIC CHARACTERS NUL IS 1
                        SOH IS 2
                        BEL IS 8
                        DC1 IS 18
                        DC2 IS 19
    
    Respectively
    SYMBOLIC CHARACTERS NUL SOH BEL DC1 DC2
                    ARE   1   2   8  18  19
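
Once defined, a symbolic character can be used wherever a figurative constant is allowed. The short program below is a sketch of that usage, assuming ASCII is the native character set; Ws-Char is a hypothetical data item.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. symdemo.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SPECIAL-NAMES.
           SYMBOLIC CHARACTERS NUL SOH BEL
                           ARE   1   2   8.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *    Ws-Char is an illustrative data item.
       01  Ws-Char                 PIC X.
       PROCEDURE DIVISION.
       Main-Para.
      *    BEL is the 8th character, whose ASCII value is 7.
           MOVE BEL TO Ws-Char
           IF Ws-Char = X"07"
               DISPLAY "BEL holds X'07'"
           ELSE
               DISPLAY "BEL holds some other value"
           END-IF
           STOP RUN.
```

The comparison against X"07" makes the ordinal-position rule visible: the integer 8 selects the eighth character, not the character whose value is 8.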
    

5.1.4. REPOSITORY

REPOSITORY Syntax

 REPOSITORY.
 ~~~~~~~~~~
    FUNCTION { function-prototype-name-1 [ AS literal-1 ] }...
    ~~~~~~~~ {                             ~~             }
             { intrinsic-function-name-1 [ AS literal-2 ] }
             {                             ~~             }
             { intrinsic-function-name-2 INTRINSIC        }
             { ALL INTRINSIC             ~~~~~~~~~        }
               ~~~ ~~~~~~~~~

The REPOSITORY paragraph provides a way to control access to the various built-in intrinsic functions and any user defined functions that your program will be using.

  1. The REPOSITORY paragraph is not allowed in a nested subprogram. A nested program inherits the REPOSITORY settings of its parent program.
  2. The INTRINSIC clause allows you to flag one or more (or ALL) built-in intrinsic functions as being usable without the need to code the keyword FUNCTION in front of the function names.
  3. As an alternative to using the ALL INTRINSIC clause, you may instead compile your GnuCOBOL programs using the -fintrinsics=ALL switch.
  4. The function-prototype-name-1 option is required to specify the name of a user-defined function your program will be using. You may optionally use the AS clause to specify an alias name by which you will reference that user-defined function. The AS clause may also be used to provide an alias name for a built-in intrinsic function.
  5. The following example
    • enables all intrinsic functions to be specified without the use of the FUNCTION keyword,
    • names two user-defined functions named MY-FUNCTION-1 and MY-FUNCTION-2 that will be used by the program and
    • specifies the alias names SIGMA for the intrinsic function STANDARD-DEVIATION and MF2 for MY-FUNCTION-2.
    REPOSITORY.
        FUNCTION ALL INTRINSIC.
        FUNCTION MY-FUNCTION-1.
        FUNCTION MY-FUNCTION-2 AS "MF2".
        FUNCTION STANDARD-DEVIATION AS "SIGMA".
    

A special note about user-defined functions — because you must name a user-defined function that your program will be using in the REPOSITORY paragraph, you may always reference that function from your program’s procedure division without needing to use the FUNCTION keyword.
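
As a small illustration of the INTRINSIC clause, the sketch below calls intrinsic functions both with and without the FUNCTION keyword. The program name and the data item Ws-Name are assumptions made for the example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. repodemo.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       REPOSITORY.
           FUNCTION ALL INTRINSIC.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *    Ws-Name is an illustrative data item.
       01  Ws-Name                 PIC X(10) VALUE "gnucobol".
       PROCEDURE DIVISION.
       Main-Para.
      *    No FUNCTION keyword is needed for intrinsic functions.
           DISPLAY UPPER-CASE(TRIM(Ws-Name))
      *    The keyword remains valid if you prefer to code it.
           DISPLAY FUNCTION LOWER-CASE("MIXED Case")
           STOP RUN.
```

Without the REPOSITORY paragraph (or the -fintrinsics=ALL switch), the first DISPLAY statement would need the FUNCTION keyword in front of each function name.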

5.2. INPUT-OUTPUT SECTION

INPUT-OUTPUT SECTION Syntax

 [ INPUT-OUTPUT SECTION. ]
   ~~~~~~~~~~~~ ~~~~~~~
 [ FILE-CONTROL. ]
   ~~~~~~~~~~~~
     [ SELECT-Statement... ]

 [ I-O-CONTROL. ]
   ~~~~~~~~~~~
     [ MULTIPLE-FILE-Statement ]

     [ SAME-RECORD-Statement ]

The INPUT-OUTPUT section provides for the definition of any files the program will be accessing as well as control of the I/O buffering process against those files through the FILE-CONTROL and I-O-CONTROL paragraphs, respectively.

  1. As the diagram shows, there are three types of statements that may occur in the two paragraphs of this section. If none of the statements are coded in a particular paragraph, the paragraph itself may be omitted, otherwise it is required.
  2. If neither paragraph is coded, the INPUT-OUTPUT SECTION. header itself may be omitted, otherwise it is normally required.
  3. If the compiler configuration file (see Compiler Configuration Files) you are using has relaxed-syntax-check set to ‘yes’, the FILE-CONTROL and I-O-CONTROL paragraphs may be specified without the INPUT-OUTPUT SECTION header having been coded.
  4. If both statement types are coded in the I-O-CONTROL paragraph, the order in which those statements are coded is irrelevant.

5.2.1. SELECT

SELECT Statement Syntax

 SELECT [ [ NOT ] OPTIONAL ] file-name-1
 ~~~~~~     ~~~   ~~~~~~~~
 [ ASSIGN { TO    } [{ EXTERNAL }] [{ DISC|DISK      }] [{ identifier-1 }] ]
   ~~~~~~ { USING }  { ~~~~~~~~ }   { ~~~~ ~~~~      }   { word-1       }
                     { DYNAMIC  }   { DISPLAY        }   { literal-1    }
                       ~~~~~~~      { ~~~~~~~        }
                                    { KEYBOARD       }
                                    { ~~~~~~~~       }
                                    { LINE ADVANCING }
                                    { ~~~~ ~~~~~~~~~ }
                                    { PRINTER        }
                                    { ~~~~~~~        }
                                    { RANDOM         }
                                    { ~~~~~~         }
                                    { TAPE           }
                                      ~~~~
 [ COLLATING SEQUENCE IS alphabet-name-1 ]
   ~~~~~~~~~
 [ [ FILE|SORT ] STATUS IS identifier-2 [ identifier-3 ] ]
     ~~~~ ~~~~   ~~~~~~
 [ LOCK MODE IS { MANUAL|AUTOMATIC                                } ]
   ~~~~         { ~~~~~~ ~~~~~~~~~                                }
                { EXCLUSIVE [ WITH { LOCK ON MULTIPLE RECORDS } ] }
                  ~~~~~~~~~        { ~~~~ ~~ ~~~~~~~~ ~~~~~~~ }
                                   { LOCK ON RECORD           }
                                   { ~~~~ ~~ ~~~~~~           }
                                   { ROLLBACK                 }
                                   { ~~~~~~~~                 }
 [ ORGANIZATION Clause ]
   ~~~~~~~~~~~~
 [ ORGANISATION Clause ]
   ~~~~~~~~~~~~
 [ RECORD DELIMITER IS STANDARD-1 ]
   ~~~~~~ ~~~~~~~~~    ~~~~~~~~~~
 [ RESERVE integer-1 AREAS ]
   ~~~~~~~
 [ SHARING WITH { ALL OTHER } ]
   ~~~~~~~      { ~~~       }
                { NO OTHER  }
                { ~~        }
                { READ ONLY }
                  ~~~~ ~~~~

The COLLATING SEQUENCE, RECORD DELIMITER, RESERVE and ALL OTHER clauses are syntactically recognized but are otherwise non-functional.

The SELECT statement creates a definition of a file and links that COBOL definition to the external operating system environment.

  1. The reserved words AREAS, IS, MODE, OTHER, SEQUENCE, TO, USING and WITH are optional and may be omitted. The presence or absence of these words has no effect upon the program.
  2. After file-name-1, the various clauses may be coded in any sequence.
  3. A period must follow the last coded clause.
  4. The OPTIONAL clause, to be used only for files that will be used to provide input data to the program, indicates the file may or may not actually be available at run-time. Attempts to OPEN an OPTIONAL file when the file does not exist will receive a special non-fatal file status value (see status 05 in the list of file status values below) indicating the file is not available; a subsequent attempt to READ that file will return an AT END (end-of-file) condition. Optionally, files may be designated as NOT OPTIONAL, if desired. This is useful when specifying the compiler’s -foptional-file switch, which automatically makes all files OPTIONAL except for those explicitly declared as NOT OPTIONAL.
  5. The file-name-1 value that you specify will be the name by which you will reference the file within your program. This name should be formed according to the rules for user-defined names (see User-Defined Words).
  6. The optional ASSIGN clause specifies how — at runtime, when file-name-1 is opened — either a logical device (STDIN, STDOUT) or a file anywhere in one of the currently-mounted file systems will be associated with file-name-1, as follows:
    1. There are three components to the ASSIGN clause:
      Type

      EXTERNAL, DYNAMIC or neither

      Device

      the list of device choices

      Locator

      shown as a choice between identifier-1, word-1 and literal-1.

    2. ASSIGN TO DISC file-name-1 will be assumed if there is no ASSIGN clause on a SELECT.
    3. If an ASSIGN clause is coded without a Device, the device DISC will be assumed.
    4. If a Locator clause is coded, the COBOL file file-name-1 will be attached to a data file within any file system that is mounted and available to the executing program at the time file-name-1 is opened. How that file is identified varies, depending upon the specified Locator, as follows:
      1. If literal-1 is coded, the value of the literal will serve as the File Location String that will identify the data file.
      2. If identifier-1 is coded, the value of the identifier will serve as the File Location String that will identify the data file.
      3. If word-1 (a syntactically valid word not duplicating a reserved or user-defined word) is coded, and a Type is EXTERNAL, then word-1 itself will serve as the File Location String that will identify the data file. If, however, a Type of EXTERNAL was not specified, the compiler will create a PIC X(1024) data item named word-1 within the program; the contents of that data item at the time the program opens file-name-1 will then serve as the File Location String that will identify the data file.
      4. File Location Strings will be discussed shortly.
    5. If no Locator is coded, file-name-1 will be attached to a logical device or a file based upon the specified (or implied) Device, as follows:
      1. DISC or DISK will assume an attachment to a file named file-name-1 in whatever directory is current at the time the file is opened.
      2. DISPLAY will assume an attachment to the STDOUT logical device; these files should only be used for output.
      3. KEYBOARD will assume an attachment to the STDIN logical device; these files should only be used for input.
      4. PRINTER will assume an attachment to the LPT1 logical device/port; these files should only be used for output.
      5. RANDOM or TAPE will behave exactly as DISC does. These two additional Devices are provided to facilitate the compilation of COBOL source from other COBOL implementations.
    6. The LINE ADVANCING device requires that a Locator be specified; these files should only be used for output. A COBOL Line Advancing file will allow carriage-control characters such as line-feeds and form-feeds to be written to the attached operating system file, via the ADVANCING clause of the WRITE statement (see WRITE).
    7. File Location Strings are used (at runtime) to identify the path and filename to the data file that must be attached to file-name-1 when that file is opened.
    8. If the compiler configuration file (see Compiler Configuration Files) used to compile the program had a filename-mapping value of yes, the GnuCOBOL runtime system will first attempt to identify a currently-defined environment variable whose value will serve as the data file’s path and filename, as follows:
      1. If the compiler configuration file (see Compiler Configuration Files) you used to compile the program specified mf as the assign-clause value, then the File Location String will be interpreted according to Micro Focus COBOL rules: everything before the last ‘-’ in the File Location String will be ignored, and the characters after the last ‘-’ will be treated as the base of an environment variable name. If there is no ‘-’ character in the File Location String then the entire File Location String will serve as the base of an environment variable name. This is the default behaviour for every config file except ibm.
      2. If, on the other hand, the compiler configuration file (see Compiler Configuration Files) you used to compile the program specified ibm as the assign-clause value, then the File Location String will be interpreted according to IBM COBOL rules: the File Location String is expected to be of the form S-xxx or AS-xxx, in which case the xxx will be treated as the base of an environment variable name. If there is no ‘-’ character in the File Location String then the entire File Location String will serve as the base of an environment variable name.
      3. Once an environment variable name base (let’s refer to it as bbbb) has been determined, the runtime system will look for the first one of the following environment variables that exists, in this sequence:
        DD_bbbb
        dd_bbbb
        bbbb
        

        Windows systems are case-insensitive with regard to environment variables, so there is no difference between the first two when using a GnuCOBOL implementation built for either Windows/MinGW or native Windows.

        If an environment variable was found, its value will serve as the path and filename to the data file.

    9. If no environment variable was found, or the configuration file (see Compiler Configuration Files) used to compile the program had a filename-mapping value of no, then the File Location String value will serve as the path and filename to the data file.
    10. Paths and file names may be absolute (C:\Data\datafile.dat, /Data/datafile.dat, …) or relative to the current directory (Data\datafile.dat, Data/datafile.dat, …). If no directory name is included (datafile.dat), the file must be in the current directory.
  7. The FILE STATUS or SORT STATUS clause (they are equivalent, and only one or the other, if any, should be specified) is used to specify the name of a two-digit numeric data item into which an I/O status code will be saved after every I/O verb that is executed against the file. This clause does not actually allocate the data item; you must define the item yourself somewhere in the data division. Note that the following list is not exhaustive: more codes may be added, so any status test should include a catch-all for unexpected non-zero values. Possible status codes that can be returned to a FILE STATUS data item are as follows:
    00

    Success

    02

    Success (Duplicate Record Key Written)

    04

    Success (Incomplete)

    05

    Success (Optional File Not Found)

    07

    Success (No Unit)

    10

    End of file reached if reading forward or beginning-of-file reached if reading backward

    14

    Out of key range

    21

    Key invalid

    22

    Key already exists

    23

    Key not found

    24

    Key boundary violation

    30

    Permanent I/O error

    31

    Inconsistent filename

    34

    Boundary violation

    35

    File not found

    37

    Permission denied

    38

    Closed with lock

    39

    Conflicting attribute

    41

    File already open

    42

    File not open

    43

    Read not done

    44

    Record overflow

    46

    Read error

    47

    OPEN INPUT denied (insufficient permissions to read file)

    48

    OPEN OUTPUT denied (insufficient permissions to write to file)

    49

    OPEN I-O denied (insufficient permissions to read and/or write file)

    51

    Record locked

    52

    End of page

    57

    LINAGE bad specification (I-O linage)

    61

    File sharing failure

    71

    Bad character

    91

    File not available

  8. The SHARING clause defines the conditions under which the program will be willing (or not) to allow other programs executing at the same time to access the file. See File Sharing, for the details.
  9. The LOCK clause defines how concurrent access to the file will be managed on a record-by-record basis. See Record Locking, for the details.
  10. For syntax details of the ORGANIZATION clause, see the sections that follow.
  11. A SELECT statement without an ORGANIZATION explicitly coded will be handled as if the following ORGANIZATION clause had been specified:
    ORGANIZATION IS SEQUENTIAL
    ACCESS MODE IS SEQUENTIAL
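
The points above about OPTIONAL, ASSIGN and FILE STATUS can be seen together in the following sketch, which reads a text file that may or may not exist. The file name INFILE, the record length and all data names are assumptions made for the example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. seldemo.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *    "INFILE" is a hypothetical File Location String.
           SELECT OPTIONAL Input-File ASSIGN TO "INFILE"
               ORGANIZATION IS LINE SEQUENTIAL
               FILE STATUS IS Input-Status.
       DATA DIVISION.
       FILE SECTION.
       FD  Input-File.
       01  Input-Record            PIC X(80).
       WORKING-STORAGE SECTION.
      *    The status item must be defined by the program itself.
       01  Input-Status            PIC 9(2).
           88  Input-Ok            VALUE 00.
           88  Input-Missing       VALUE 05.
           88  Input-Eof           VALUE 10.
       PROCEDURE DIVISION.
       Main-Para.
           OPEN INPUT Input-File
           EVALUATE TRUE
               WHEN Input-Ok
                   DISPLAY "file opened"
               WHEN Input-Missing
                   DISPLAY "optional file not found"
               WHEN OTHER
      *            Catch-all for any other non-zero status.
                   DISPLAY "open failed, status " Input-Status
                   STOP RUN
           END-EVALUATE
      *    Read until any status other than 00 is returned.
           PERFORM WITH TEST AFTER UNTIL NOT Input-Ok
               READ Input-File
               IF Input-Ok
                   DISPLAY Input-Record
               END-IF
           END-PERFORM
           IF NOT Input-Eof
               DISPLAY "read stopped, status " Input-Status
           END-IF
           CLOSE Input-File
           STOP RUN.
```

With filename mapping enabled, setting an environment variable named DD_INFILE to the path of a real text file before running the program should cause that file to be read; with no such variable and no file named INFILE in the current directory, the OPEN should report status 05 and the first READ should report end-of-file.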
    

5.2.1.1. ORGANIZATION SEQUENTIAL

ORGANIZATION SEQUENTIAL Clause Syntax

 [ ORGANIZATION|ORGANISATION IS ] RECORD BINARY SEQUENTIAL
   ~~~~~~~~~~~~ ~~~~~~~~~~~~                    ~~~~~~~~~~
    [ ACCESS MODE IS SEQUENTIAL ]
      ~~~~~~         ~~~~~~~~~~

Files declared as ORGANIZATION SEQUENTIAL will consist of records with no explicit end-of-record delimiter character sequences; records in such files are “delineated” by a calculated byte-offset (based on the maximum record length) into the file.
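
As a minimal sketch of such a file, the program below writes two fixed-length records containing a binary field and reads them back. The file name counts.dat and all data names are hypothetical. The counts 10 and 2570 were chosen because their binary forms contain the byte X'0A', which a text file would treat as a line-feed.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. seqdemo.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *    "counts.dat" is a hypothetical file name.
           SELECT Count-File ASSIGN TO "counts.dat"
               ORGANIZATION IS SEQUENTIAL
               ACCESS MODE IS SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  Count-File.
       01  Count-Record.
           05  Item-Code           PIC X(4).
           05  Item-Count          PIC S9(8) COMP.
       WORKING-STORAGE SECTION.
       01  Eof-Flag                PIC X VALUE "N".
           88  End-Of-File         VALUE "Y".
       PROCEDURE DIVISION.
       Main-Para.
           OPEN OUTPUT Count-File
           MOVE "A001" TO Item-Code
           MOVE 10 TO Item-Count
           WRITE Count-Record
           MOVE "B002" TO Item-Code
           MOVE 2570 TO Item-Count
           WRITE Count-Record
           CLOSE Count-File
      *    Records come back in the order they were written.
           OPEN INPUT Count-File
           PERFORM UNTIL End-Of-File
               READ Count-File
                   AT END
                       SET End-Of-File TO TRUE
                   NOT AT END
                       DISPLAY Item-Code " " Item-Count
               END-READ
           END-PERFORM
           CLOSE Count-File
           STOP RUN.
```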

  1. The reserved words BINARY, IS, MODE and RECORD are optional and may be omitted. The presence or absence of these words has no effect upon the program.
  2. The reserved words ORGANIZATION and ORGANISATION are interchangeable.
  3. The phrase ORGANIZATION IS (and its internationalized alternative, ORGANISATION IS) is optional to provide compatibility with those (few) COBOL implementations that consider ORGANIZATION to be optional. Most COBOL implementations do require the word ORGANIZATION, so it should be used in new programs.
  4. These files cannot be prepared with any standard text-editing or word processing software as all such programs will embed delimiter characters at the end of records (use ORGANIZATION IS LINE SEQUENTIAL instead).
  5. These files may contain either USAGE DISPLAY or USAGE COMPUTATIONAL (of any variety) data since no binary data sequence can be accidentally interpreted as an end-of-record delimiter.
  6. While records in an ORGANIZATION SEQUENTIAL file may be defined as variable-length, the file will be structured in such a manner as to reserve space for each record equal to the size of the largest possible record, based on the file’s description in the FILE SECTION.
  7. The ACCESS MODE SEQUENTIAL clause is optional because, if absent, it will be assumed anyway for this type of file. The internal structure of these files is such that they can only be processed in a sequential manner; in order to read the 100th record in such a file, for example, you first must read records 1 through 99.
  8. Sequential files are processed using the following statements:

5.2.1.2. ORGANIZATION LINE SEQUENTIAL

ORGANIZATION LINE SEQUENTIAL Clause Syntax

 [ ORGANIZATION|ORGANISATION IS ] LINE SEQUENTIAL
   ~~~~~~~~~~~~ ~~~~~~~~~~~~      ~~~~ ~~~~~~~~~~
    [ ACCESS MODE IS SEQUENTIAL ]
      ~~~~~~         ~~~~~~~~~~
    [ PADDING CHARACTER IS literal-1 | identifier-1 ]
      ~~~~~~~

The PADDING CHARACTER clause is syntactically recognized but is otherwise non-functional.

Files declared as ORGANIZATION LINE SEQUENTIAL will consist of records terminated by an end-of-record delimiter character or character sequence.
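
The following sketch creates a small text file in this format. The file name notes.txt, the record length and the data names are assumptions made for the example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. lsdemo.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *    "notes.txt" is a hypothetical file name.
           SELECT Notes-File ASSIGN TO "notes.txt"
               ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  Notes-File.
       01  Notes-Record            PIC X(40).
       PROCEDURE DIVISION.
       Main-Para.
           OPEN OUTPUT Notes-File
      *    Each WRITE adds one delimited line to the file.
           MOVE "first line" TO Notes-Record
           WRITE Notes-Record
           MOVE "second line" TO Notes-Record
           WRITE Notes-Record
           CLOSE Notes-File
           STOP RUN.
```

The resulting file should be readable with any text editor, and a file prepared in a text editor can be read back by a program using the same SELECT.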

  1. The reserved words CHARACTER, IS and MODE are optional and may be omitted. The presence or absence of these words has no effect upon the program.
  2. The reserved words ORGANIZATION and ORGANISATION are interchangeable.
  3. The phrase ORGANIZATION IS (and its internationalized alternative, ORGANISATION IS) is optional to provide compatibility with those (few) COBOL implementations that consider that word to be optional. Most COBOL implementations do require the word ORGANIZATION, so it should be used in new programs.
  4. This is the only ORGANIZATION valid for files that are assigned to the PRINTER device.
  5. These files may be created with any standard text-editing or word processing software capable of writing text files. Such files should not contain any USAGE COMPUTATIONAL or BINARY (of any variety) data since such fields could accidentally contain byte sequences that could be interpreted as an end-of-record delimiter.
  6. Both fixed- and variable-length record formats are supported.
  7. The end-of-record delimiter sequence will be X'0A' (an ASCII line-feed character) or X'0D0A' (an ASCII carriage-return + line-feed sequence). The former is used on Unix implementations of GnuCOBOL (including Windows/MinGW, Windows/Cygwin and OSX implementations) while the latter would be used with native Windows implementations.
  8. When reading a LINE SEQUENTIAL file, records in excess of the size implied by the file’s description in the FILE SECTION will be truncated while records shorter than that size will be padded to the right with SPACES.
  9. The ACCESS MODE SEQUENTIAL clause is optional because, if absent, it will be assumed anyway for this type of file. The internal structure of these files is such that the data can only be processed in a sequential manner; in order to read the 100th record in such a file, for example, you first must read records 1 through 99.
  10. Files assigned to PRINTER or CONSOLE should be specified as ORGANIZATION LINE SEQUENTIAL.
  11. Line Sequential files are processed using the following statements:

5.2.1.3. ORGANIZATION RELATIVE

ORGANIZATION RELATIVE Clause Syntax

 [ ORGANIZATION|ORGANISATION IS ] RELATIVE
   ~~~~~~~~~~~~ ~~~~~~~~~~~~      ~~~~~~~~
    [ ACCESS MODE IS { SEQUENTIAL } ]
      ~~~~~~         { ~~~~~~~~~~ }
                     { DYNAMIC    }
                     { ~~~~~~~    }
                     { RANDOM     }
                       ~~~~~~
    [ RELATIVE KEY IS identifier-1 ]
      ~~~~~~~~

Relative files have an internal organization that allows records to be processed either sequentially, based upon their physical location in the file, or randomly, by specifying the relative record number of the record to be read, written or updated.
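
A minimal sketch of random access by relative record number follows. The file name slots.dat and all data names are hypothetical; record 2 is deliberately never written so that the second READ exercises the INVALID KEY path.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. reldemo.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *    "slots.dat" is a hypothetical file name.
           SELECT Slot-File ASSIGN TO "slots.dat"
               ORGANIZATION IS RELATIVE
               ACCESS MODE IS RANDOM
               RELATIVE KEY IS Slot-Number
               FILE STATUS IS Slot-Status.
       DATA DIVISION.
       FILE SECTION.
       FD  Slot-File.
       01  Slot-Record             PIC X(20).
       WORKING-STORAGE SECTION.
      *    The relative key is defined outside the record.
       01  Slot-Number             PIC 9(4).
       01  Slot-Status             PIC 9(2).
       PROCEDURE DIVISION.
       Main-Para.
           OPEN OUTPUT Slot-File
           MOVE 1 TO Slot-Number
           MOVE "first slot" TO Slot-Record
           WRITE Slot-Record
           MOVE 3 TO Slot-Number
           MOVE "third slot" TO Slot-Record
           WRITE Slot-Record
           CLOSE Slot-File
           OPEN INPUT Slot-File
      *    Read record 3 directly, without reading 1 and 2.
           MOVE 3 TO Slot-Number
           PERFORM Read-Slot
      *    Record 2 was never written.
           MOVE 2 TO Slot-Number
           PERFORM Read-Slot
           CLOSE Slot-File
           STOP RUN.
       Read-Slot.
           READ Slot-File
               INVALID KEY
                   DISPLAY "no record at " Slot-Number
               NOT INVALID KEY
                   DISPLAY Slot-Number ": " Slot-Record
           END-READ.
```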

  1. The reserved words IS, KEY and MODE are optional and may be omitted. The presence or absence of these words has no effect upon the program.
  2. The reserved words ORGANIZATION and ORGANISATION are interchangeable.
  3. The phrase ORGANIZATION IS (and its internationalized alternative, ORGANISATION IS) is optional to provide compatibility with those (few) COBOL implementations that consider that word to be optional. Most COBOL implementations do require the word ORGANIZATION, so it should be used in new programs.
  4. ORGANIZATION RELATIVE files cannot be assigned to the CONSOLE, DISPLAY, LINE ADVANCING or PRINTER devices.
  5. The RELATIVE KEY clause is optional only if ACCESS MODE SEQUENTIAL is specified.
  6. While an ORGANIZATION RELATIVE file may be defined as having variable-length records, the file will be structured in such a manner as to reserve space for each record equal to the size of the largest possible record as defined by the file’s description in the FILE SECTION.
  7. ACCESS MODE SEQUENTIAL, the default ACCESS MODE if none is specified, indicates that the records of the file will be processed in a sequential manner, according to their physical sequence in the file.
  8. ACCESS MODE RANDOM means that records will be processed in random sequence by specifying their record number in the file every time the file is read or written.
  9. ACCESS MODE DYNAMIC indicates the program may switch back and forth between SEQUENTIAL and RANDOM mode during execution. The file starts out initially in SEQUENTIAL mode when first opened but the program may use the START statement (see START) to switch between sequential and random access.
  10. The RELATIVE KEY data item is a numeric data item that cannot be defined as a field within records of this file. Its purpose is to return the current relative record number of a relative file that is being processed in SEQUENTIAL access mode and to serve as a key that specifies the relative record number to be read or written when processing a relative file in RANDOM access mode.
  11. Relative files are processed using the following statements:

5.2.1.4. ORGANIZATION INDEXED

ORGANIZATION INDEXED Clause Syntax

 [ ORGANIZATION|ORGANISATION IS ] INDEXED
   ~~~~~~~~~~~~ ~~~~~~~~~~~~      ~~~~~~~
    [ ACCESS MODE IS { SEQUENTIAL } ]
      ~~~~~~         { ~~~~~~~~~~ }
                     { DYNAMIC    }
                     { ~~~~~~~    }
                     { RANDOM     }
                       ~~~~~~
    [ RECORD KEY IS { [ data-name-1       ]
      ~~~~~~
                    { [ record-key-name-1 ]
                      [ =|{SOURCE IS} data-name-2 ] ... ] }
                           ~~~~~~
    [ ALTERNATE RECORD KEY IS { [ data-name-3       ]
      ~~~~~~~~~ ~~~~~~
                              { [ record-key-name-2 ]
                                [ =|{SOURCE IS} data-name-4 ] ... ] }
                                     ~~~~~~
                              [ WITH DUPLICATES ] ]...
                                     ~~~~~~~~~~
                              [ SUPPRESS WHEN ALL literal     ]
                                ~~~~~~~~~~~~~~~~~
                              [ SUPPRESS WHEN SPACES | ZEROES ]
                                ~~~~~~~~~~~~~~~~~~~~   ~~~~~~

Indexed files, like relative files, may have their records processed in either a sequential or random manner. Unlike relative files, however, the actual location of a record in an indexed file is calculated automatically based upon the value(s) of one or more alphanumeric fields within records of the file. For example, an indexed file containing product data might use the product identification code as a record key. This means you may read, write or update the A6G4328th record or the Z8X7723th record directly, based upon the product id value of those records!
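
As a rough illustration of keyed access, the following sketch declares an indexed product file, writes one record and then reads it back directly by its product id. The program name, file name and all data names are hypothetical, and the sketch assumes a GnuCOBOL build that includes an indexed-file handler.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. IDXDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * Hypothetical file; the primary key must be unique, while
      * the alternate key is allowed to repeat.
           SELECT Product-File ASSIGN TO "products.dat"
               ORGANIZATION IS INDEXED
               ACCESS MODE IS RANDOM
               RECORD KEY IS Product-Id
               ALTERNATE RECORD KEY IS Product-Category
                   WITH DUPLICATES.
       DATA DIVISION.
       FILE SECTION.
       FD  Product-File.
       01  Product-Record.
           05 Product-Id        PIC X(7).
           05 Product-Category  PIC X(10).
           05 Product-Name      PIC X(30).
       PROCEDURE DIVISION.
      * Create the file and store one record.
           OPEN OUTPUT Product-File
           MOVE "A6G4328" TO Product-Id
           MOVE "HARDWARE" TO Product-Category
           MOVE "Left-handed widget" TO Product-Name
           WRITE Product-Record
           CLOSE Product-File
      * Read the record back by supplying only its key value.
           OPEN INPUT Product-File
           MOVE "A6G4328" TO Product-Id
           READ Product-File
               INVALID KEY DISPLAY "Not found"
               NOT INVALID KEY DISPLAY Product-Name
           END-READ
           CLOSE Product-File
           STOP RUN.
```

Both key fields are defined inside the record description of the file, which is what distinguishes them from the RELATIVE KEY item of a relative file.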

  1. The reserved words IS, KEY and MODE are optional and may be omitted. The presence or absence of these words has no effect upon the program.
  2. The reserved words ORGANIZATION and ORGANISATION are interchangeable.
  3. The phrase ORGANIZATION IS (and its internationalized alternative, ORGANISATION IS) is optional to provide compatibility with those (few) COBOL implementations that consider that word to be optional. Most COBOL implementations do require the word ORGANIZATION, so it should be used in new programs.
  4. ORGANIZATION INDEXED files cannot be assigned to CONSOLE, DISPLAY, KEYBOARD, LINE ADVANCING or PRINTER.
  5. ACCESS MODE SEQUENTIAL, the default ACCESS MODE if none is specified, indicates that the records of the file will be processed in a sequential manner with respect to the values of the RECORD KEY or the ALTERNATE RECORD KEY most-recently referenced on a START statement (see START).
  6. ACCESS MODE RANDOM means that records will be processed in random sequence by accessing the record with specific record key or alternate record key values.
  7. ACCESS MODE DYNAMIC allows the file to be processed in either RANDOM or SEQUENTIAL mode; the program may switch between the two modes as needed. The START statement is used to make the switch between modes.
  8. The RECORD KEY clause defines the field within the record used to provide the primary access to records within the file. No two records in the file will be allowed to have the same primary key value. The SOURCE IS clause is for use with Split Keys.
  9. The ALTERNATE RECORD KEY clause, if used, defines an additional field within the record that provides an alternate means of directly accessing records or an additional field by which the file’s contents may be processed sequentially. You have the choice of allowing records to have duplicate alternate key values, if necessary.
  10. There may be multiple ALTERNATE RECORD KEY clauses, each defining an additional alternate key for the file.
  11. The SUPPRESS WHEN clause is used when Sparse Keys are required. The suppression value may be given as a literal, as spaces or as zeroes.
  12. Indexed files are processed using the following statements:

5.2.2. SAME RECORD AREA

I-O-CONTROL SAME AREA Syntax

 SAME { SORT-MERGE } AREA FOR file-name-1... .
 ~~~~ { ~~~~~~~~~~ }
      { SORT       }
      { ~~~~       }
      { RECORD     }
        ~~~~~~

The SAME SORT-MERGE and SAME SORT clauses are syntactically recognized but are otherwise non-functional.

The SAME RECORD AREA clause allows you to specify that multiple files should share the same input and output memory buffers.

  1. The reserved words AREA and FOR are optional and may be omitted. The presence or absence of these words has no effect upon the program.
  2. This statement must be terminated with a period.
  3. While coding only a single file name (the repeated file-name-1 item) is syntactically valid, this statement will have no effect upon the program unless at least two files are specified.
  4. The effect of this statement will be to cause the specified files to share the same I/O buffer in memory. These buffers can sometimes get quite large, and by having multiple files share the same buffer memory you may significantly cut down the amount of memory the program is using (thus making “room” for more procedural code or data). If you do use this feature, take care to ensure that no more than one of the specified files is ever OPEN at any one time.
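
The rules above can be sketched as follows. This is a minimal example with hypothetical program, file and record names; it shares one record area between two files and keeps only one of them open at any moment.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SAMEDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT Stage1-File ASSIGN TO "stage1.dat"
               ORGANIZATION IS LINE SEQUENTIAL.
           SELECT Stage2-File ASSIGN TO "stage2.dat"
               ORGANIZATION IS LINE SEQUENTIAL.
       I-O-CONTROL.
      * Both files use one shared record buffer.
           SAME RECORD AREA FOR Stage1-File Stage2-File.
       DATA DIVISION.
       FILE SECTION.
       FD  Stage1-File.
       01  Stage1-Record           PIC X(80).
       FD  Stage2-File.
       01  Stage2-Record           PIC X(80).
       PROCEDURE DIVISION.
      * Stage 1 is finished and closed before stage 2 is opened.
           OPEN OUTPUT Stage1-File
           MOVE "written in stage 1" TO Stage1-Record
           WRITE Stage1-Record
           CLOSE Stage1-File
           OPEN OUTPUT Stage2-File
           MOVE "written in stage 2" TO Stage2-Record
           WRITE Stage2-Record
           CLOSE Stage2-File
           STOP RUN.
```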

5.2.3. MULTIPLE FILE

I-O-CONTROL MULTIPLE FILE Syntax

 MULTIPLE FILE TAPE CONTAINS
 ~~~~~~~~
    { file-name-1 [ POSITION integer-1 ] }...
                    ~~~~~~~~
    .

The MULTIPLE FILE TAPE clause is obsolete and is therefore recognized but not functional.

6. DATA DIVISION

DATA DIVISION Syntax

   DATA DIVISION.
   ~~~~ ~~~~~~~~
 [ FILE SECTION.
   ~~~~ ~~~~~~~
   { File/Sort-Description [ { FILE-SECTION-Data-Item } ]... }... ]
   {                         { 01-Level-Constant      }      }
   {                         { 78-Level-Constant      }      }
   { 01-Level-Constant                                       }
   { 78-Level-Constant                                       }
 [ WORKING-STORAGE SECTION.
   ~~~~~~~~~~~~~~~ ~~~~~~~
   [ { WORKING-STORAGE-SECTION-Data-Item } ]... ]
     { 01-Level-Constant                 }
     { 78-Level-Constant                 }
 [ LOCAL-STORAGE SECTION.
   ~~~~~~~~~~~~~ ~~~~~~~
   [ { LOCAL-STORAGE-SECTION-Data-Item } ]... ]
     { 01-Level-Constant               }
     { 78-Level-Constant               }
 [ LINKAGE SECTION.
   ~~~~~~~ ~~~~~~~
   [ { LINKAGE-SECTION-Data-Item } ]... ]
     { 01-Level-Constant         }
     { 78-Level-Constant         }
 [ REPORT SECTION.
   ~~~~~~ ~~~~~~~
   { Report-Description [ { Report-Group-Definition } ]... }... ]
   {                      { 01-Level-Constant       }      }
   {                      { 78-Level-Constant       }      }
   { 01-Level-Constant                                     }
   { 78-Level-Constant                                     }
 [ SCREEN SECTION.
   ~~~~~~ ~~~~~~~
   [ { SCREEN-SECTION-Data-Item } ]... ]
     { 01-Level-Constant        }
     { 78-Level-Constant        }

All data used by any COBOL program must be defined in one of the six sections of the data division, depending upon the purpose of the data.

  1. If no data will be described in one of the data division sections, that section header may be omitted.
  2. If no data division sections are needed, the DATA DIVISION. header itself may be omitted.
  3. If more than one section is needed in the data division (a common situation), the sections must be coded in the sequence they are presented above.

6.1. Data Definition Principles

GnuCOBOL data items, like those of other COBOL implementations, are described in a hierarchical manner. This accommodates the fact that data items frequently need to be able to be broken up into subordinate items. Take for example, the following logical layout of a portion of a data item named Employee:

                           Employee
                               |                     Additional
              :----------------:----------------:--> Data Items ...
              |                                 |
        Employee-Name                    Employment-Dates
              |                                 |
    :---------:-------------:           :-------:-------:
    |         |             |           |               |
Last-Name First-Name Middle-Initial From-Date        To-Date
                                        |               |
                                   :----:----:     :----:----:
                                   |    |    |     |    |    |
                                 Year Month Day  Year Month Day

The Employee data item consists of two subordinate data items: an Employee-Name and an Employment-Dates data item (presumably there would be a lot of others too, but we don’t care about them right now). As the diagram shows, each of those data items is, in turn, broken down into subordinate data items. This hierarchy of data items can get rather deep, and GnuCOBOL, like other COBOL implementations, can handle up to 49 levels of such hierarchical structures.

As was presented earlier (see Structured Data), a data item that is broken down into other data items is referred to as a group item, while one that isn’t broken down is called an elementary item.

COBOL uses the concept of a level number to indicate the level at which a data item occurs in a data structure such as the example shown above. When these data items are defined, they are all defined together with a number in the range 1-49 specified in front of their names. Over the years, a convention has come to exist among COBOL programmers that level numbers are always coded as two-digit numbers — they don’t need to be specified as two-digit numbers, but every example you see in this document will take that approach!

The data item at the top, also referred to as a record, always has a level number of 01. After that, you may assign level numbers as you wish (01–02–03–04…, 01–05–10–15…, etc.), as long as you follow these simple rules:

  1. Every data item at the same level of a hierarchy diagram such as the one you see here (if you were to make one, which you rarely will, if ever, once you get used to this concept) must have the same level number.
  2. Every new level uses a level number that is strictly greater than the one used in the parent (next higher) level.
  3. When describing data hierarchies, you may never use a level number greater than 49 (except for 66, 77, 78 and 88, which have very special meanings; see Special Data Items).

So, the definition of these data items in a GnuCOBOL program would go something like this:

    01  Employee
        05 Employee-Name
           10 Last-Name
           10 First-Name
           10 Middle-Initial
        05 Employment-Dates
           10 From-Date
              15 Year
              15 Month
              15 Day
           10 To-Date
              15 Year
              15 Month
              15 Day

The indentation is purely at the discretion of the programmer to make things easier for humans to read (the compiler couldn’t care less). Historically, COBOL implementations that required Fixed Format Mode source programs required that the 01 level number begin in Area A and that everything else begins in Area B. GnuCOBOL only requires that all data definition syntax occur in columns 8-72. In Free Format Mode, of course, there aren’t even those limitations.

Did you notice that there are two each of Year, Month and Day data names defined? That’s perfectly legal, provided that each can be uniquely qualified so as to be distinct from the other. Take for example the Year items. One is defined as part of the From-Date data item while the other is defined as part of the To-Date data item. In COBOL, we would actually code references to these two data items as either Year OF From-Date and Year OF To-Date or Year IN From-Date and Year IN To-Date (COBOL allows either IN or OF to be used). Since these references would clarify any confusion to us as to which Year might be referenced, the GnuCOBOL compiler won’t be confused either.

The coding example shown above is incomplete; it only describes the data item names and their hierarchical relationships to one another. In addition, any valid data item definition will also need to describe what type of data is to be contained in a data item (Numeric? Alphanumeric? Alphabetic?), how much data can be held in the data item and a multitude of other characteristics.

When group items are being defined, subordinate items may be assigned the “name” FILLER. There may be any number of FILLER items defined within a group item. A data item named FILLER cannot be referenced directly; these items are generally used to specify an unused portion of the total storage allocated to a group item. Note that it is possible that the name of the group item itself might be specified as FILLER if there is no need to ever refer directly to the group structure itself.
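
A fuller version of the Employee structure might look like the sketch below. The PICTURE sizes, the FILLER item and the program name are assumptions made purely for illustration. Because DAY is a COBOL reserved word (it appears in ACCEPT ... FROM DAY), the sketch uses the hypothetical names Date-Year, Date-Month and Date-Day in place of Year, Month and Day.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. EMPDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  Employee.
           05 Employee-Name.
              10 Last-Name         PIC X(20).
              10 First-Name        PIC X(15).
              10 Middle-Initial    PIC X.
      * Unused space that can never be referenced by name.
           05 FILLER               PIC X(4).
           05 Employment-Dates.
              10 From-Date.
                 15 Date-Year      PIC 9(4).
                 15 Date-Month     PIC 9(2).
                 15 Date-Day       PIC 9(2).
              10 To-Date.
                 15 Date-Year      PIC 9(4).
                 15 Date-Month     PIC 9(2).
                 15 Date-Day       PIC 9(2).
       PROCEDURE DIVISION.
      * Duplicate names must be qualified; OF and IN are equivalent.
           MOVE 2019 TO Date-Year OF From-Date
           MOVE 2024 TO Date-Year IN To-Date
           DISPLAY Date-Year OF From-Date " - " Date-Year OF To-Date
           STOP RUN.
```

Group items (Employee, Employee-Name, From-Date and so on) carry no PICTURE clause of their own; their size is the sum of the sizes of their subordinate items.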

6.2. FILE SECTION

FILE SECTION Syntax

 [ FILE SECTION.
   ~~~~ ~~~~~~~
   { File/Sort-Description [ { FILE-SECTION-Data-Item } ]... }... ]
   {                         { 01-Level-Constant      }      }
   {                         { 78-Level-Constant      }      }
   { 01-Level-Constant                                       }
   { 78-Level-Constant                                       }

Every file that has been referenced by a SELECT statement (see SELECT) must also be described in the file section of the data division.

Files destined for use as sort/merge work files must be described with a Sort/Merge File Description (SD) while every other file is described with a File Description (FD). Each of these descriptions will almost always be followed with at least one record description.

6.2.1. File/Sort-Description

File/Sort-Description Syntax

 FD|SD file-name-1 [ IS EXTERNAL|GLOBAL ]
 ~~ ~~                  ~~~~~~~~ ~~~~~~
 [ BLOCK CONTAINS [ integer-1 TO ] integer-2 CHARACTERS|RECORDS ]
   ~~~~~                      ~~             ~~~~~~~~~~ ~~~~~~~
 [ CODE-SET IS alphabet-name-1 ]
   ~~~~~~~~
 [ DATA { RECORD IS   } identifier-1... ]
   ~~~~ { ~~~~~~      }
        { RECORDS ARE }
          ~~~~~~~
 [ LABEL { RECORD IS   } OMITTED|STANDARD ]
   ~~~~~ { ~~~~~~      } ~~~~~~~ ~~~~~~~~
         { RECORDS ARE }
           ~~~~~~~
 [ LINAGE IS integer-3 | identifier-2 LINES
   ~~~~~~
     [ LINES AT BOTTOM integer-4 | identifier-3 ]
                ~~~~~~
     [ LINES AT TOP integer-5 | identifier-4 ]
                ~~~
     [ WITH FOOTING AT integer-6 | identifier-5 ] ]
            ~~~~~~~
 [ RECORD { CONTAINS [ integer-7 TO ] integer-8 CHARACTERS   } ]
   ~~~~~~ {                      ~~                          }
          { IS VARYING IN SIZE                               }
          {    ~~~~~~~                                       }
          {     [ FROM [ integer-7 TO ] integer-8 CHARACTERS }
          {                        ~~                        }
          {         DEPENDING ON identifier-6 ]              }
                    ~~~~~~~~~
 [ RECORDING MODE IS recording-mode ]
   ~~~~~~~~~
 [ { REPORT IS   } report-name-1... ]
   { ~~~~~~      }
   { REPORTS ARE }
     ~~~~~~~
 [ VALUE OF implementor-name-1 IS literal-1 | identifier-7 ] .
   ~~~~~ ~~

The BLOCK CONTAINS, DATA RECORD, LABEL RECORD, RECORDING MODE and VALUE OF clauses are syntactically recognized but are obsolete and non-functional. These clauses should not be coded in new programs.

  1. The reserved words ARE, AT, CHARACTERS (RECORD clause only), CONTAINS, FROM, IN, IS, ON and WITH are optional and may be included, or not, at the discretion of the programmer. The presence or absence of these words has no effect upon the program.
  2. The terms RECORD IS and RECORDS ARE are interchangeable.
  3. The terms REPORT IS and REPORTS ARE are interchangeable.
  4. Only files intended for use as work files for either the SORT (see SORT) or MERGE (see MERGE) statements should be coded with an SD — all others should be defined with a FD.
  5. The sequence in which files are defined via FD or SD, as compared to the sequence in which their SELECT statements were coded, is irrelevant.
  6. The name specified as file-name-1 must exactly match the name specified on the file’s SELECT statement.
  7. The CODE-SET clause allows a custom alphabet, defined in the SPECIAL-NAMES (see SPECIAL-NAMES) paragraph, to be associated with a file. This clause is valid only when used with sequential or line sequential files.
  8. The LINAGE clause may only be specified in the FD of a sequential or line sequential file. If used with a sequential file, the organization of that file will be implicitly changed to line sequential. The various components of the LINAGE clause define the layout of printed pages as follows:
    LINES AT TOP

    Number of unused (i.e. left blank) lines at the top of every page. The default, if this is not specified, is zero.

    LINES AT BOTTOM

    Number of unused (i.e. left blank) lines at the bottom of every page. The default, if this is not specified, is zero.

    LINAGE IS n LINES

    Total number of used/usable lines on the page.

    The sum of the previous three specifications should be the total number of possible lines available on one printed page.
    FOOTING AT

    Line number beyond which nothing may be printed except for any footing that is to appear on every page. The default for this if not specified is zero, meaning there will be no footings. This value cannot be larger than the LINAGE IS n LINES value.

  9. This page structure — once defined — can be automatically enforced by the WRITE statement (see WRITE).
  10. Specifying a LINAGE clause in an FD will cause the LINAGE-COUNTER special register to be created for the file. This automatically-created data item will always contain the current relative line number on the page being prepared which will serve as the starting point for a WRITE statement.
  11. The RECORD CONTAINS and RECORD IS VARYING clauses are ignored (with a warning message issued) when used with line sequential files. With other file organizations, these mutually-exclusive clauses define the length of data records within the file. The data item specified as identifier-6 must be defined within one of the record descriptions of file-name-1.
  12. The REPORT IS clause announces to the compiler that the file will be dedicated to the Report Writer Control System (RWCS); the clause names one or more reports, each to be described in the report section. The following special rules apply when the REPORT clause is used:
    1. The clause may only be specified in the FD of a sequential or line sequential file. If used with a sequential file, the organization of that file will be implicitly changed to line sequential.
    2. The FD cannot be followed by record descriptions. Detailed descriptions of data to be printed to the file will be defined in the REPORT SECTION (see REPORT SECTION).
    3. If a LINAGE clause is also specified, the values specified for LINAGE IS and FOOTING AT will be ignored. The values of LINES AT BOTTOM and LINES AT TOP, if any, will be honoured.
  13. The following special rules apply only to sort/merge work files:
    1. Sort/merge work files should be assigned to DISK (or DISC) on their SELECT statements.
    2. Sorts and merges will be performed in memory, if the amount of data being sorted allows.
    3. Should actual disk work files be necessary due to the amount of data being sorted or merged, they will be automatically allocated to disk in a folder defined by:
      • The TMPDIR run-time environment variable (see Run Time Environment Variables)
      • The TMP run-time environment variable
      • The TEMP run-time environment variable

      (in that order).

    4. These disk files will be automatically purged upon SORT or MERGE termination. They will also be purged if the program terminates abnormally before the SORT or MERGE finishes. Should you ever need to know, temporary sort/merge work files will be named cob*.tmp.
    5. If you specify a specific filename in the sort/merge work file’s SELECT, it will be ignored.
  14. See Data Description Clauses, for information on the EXTERNAL and GLOBAL options.
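
The LINAGE rules above can be sketched with a small print-style file. In this hypothetical example a 66-line page is split into 3 blank lines at the top, 60 usable lines and 3 blank lines at the bottom, with the footing area starting at line 58. The program name, file name and data names are assumptions for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LINDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT Report-File ASSIGN TO "report.txt"
               ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
      * The FD name must match the name used on the SELECT.
       FD  Report-File
           LINAGE IS 60 LINES
               WITH FOOTING AT 58
               LINES AT TOP 3
               LINES AT BOTTOM 3.
       01  Report-Line             PIC X(80).
       PROCEDURE DIVISION.
           OPEN OUTPUT Report-File
           MOVE "First detail line" TO Report-Line
           WRITE Report-Line
      * LINAGE-COUNTER is created automatically for this file.
           DISPLAY "Current line on page: " LINAGE-COUNTER
           CLOSE Report-File
           STOP RUN.
```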

6.2.2. FILE-SECTION-Data-Item

FILE-SECTION-Data-Item Syntax

 level-number [ identifier-1 | FILLER ] [ IS GLOBAL|EXTERNAL ]
                               ~~~~~~        ~~~~~~ ~~~~~~~~
 [ BLANK WHEN ZERO ]
   ~~~~~      ~~~~
 [ JUSTIFIED RIGHT ]
   ~~~~
 [ OCCURS [ integer-1 TO ] integer-2 TIMES
   ~~~~~~             ~~
        [ DEPENDING ON identifier-2 ]
          ~~~~~~~~~
        [ ASCENDING|DESCENDING KEY IS identifier-3 ]
          ~~~~~~~~~ ~~~~~~~~~~
        [ INDEXED BY identifier-4 ] ]
          ~~~~~~~
 [ PICTURE IS picture-string ]
   ~~~
 [ REDEFINES identifier-5 ]
   ~~~~~~~~~
 [ SIGN IS LEADING|TRAILING [ SEPARATE [CHARACTER] ] ]
   ~~~~    ~~~~~~~ ~~~~~~~~   ~~~~~~~~
 [ SYNCHRONIZED|SYNCHRONISED [ LEFT|RIGHT ] ]
   ~~~~         ~~~~           ~~~~ ~~~~~
 [ USAGE IS data-item-usage ] . [ FILE-SECTION-Data-Item ]...
   ~~~~~

The LEFT and RIGHT (SYNCHRONIZED) clauses are syntactically recognized but are otherwise non-functional.

Every file or sort description (FD or SD) must be followed by at least one 01-level data item, except for file descriptions containing the REPORT IS clause. These 01-level data items, in turn, may be broken down into subordinate group and elementary items. An 01-level data item defined here in the file section is also known as a Record, even if it is an elementary item, provided that elementary item lacks the CONSTANT attribute.

  1. The reserved words BY, IS, KEY, ON and WHEN are optional and may be included, or not, at the discretion of the programmer. The presence or absence of these words has no effect upon the program.
  2. The reserved words SYNCHRONIZED and SYNCHRONISED are interchangeable. Both may be abbreviated to SYNC.
  3. The reserved word PICTURE may be abbreviated to PIC.
  4. As the syntax diagram shows, the definition of a FILE-SECTION-Data-Item is a recursive one in that there may be any number of such specifications coded following a FD or SD. The first such specification must have a level number of 01, and will describe a specific format of data record within the file. Specifications that follow that one may have level numbers greater than 01, in which case they are defining a hierarchical breakdown of the record. The definition of a record is terminated when one of the following occurs:
    Another 01-level item is found

    signifies the start of another record layout for the file.

    Another FD or SD is found

    marks the completion of the detailed description of the file and begins another.

    A division or section header is found

    also marks the completion of the detailed description of the file and signifies the end of the file section as well.

  5. Every FILE-SECTION-Data-Item description must be terminated with a period.
  6. If there are multiple record descriptions present for a given FD or SD, the one with the longest length will define the size of the record buffer into which a READ statement (see READ) or a RETURN statement (see RETURN) will deliver data read from the file and from which a WRITE statement (see WRITE) or RELEASE statement (see RELEASE) will obtain the data to be written to the file.
  7. The various 01-level record descriptions for a file description implicitly share that one common record buffer (thus, they provide different ways to view the structure of data that can exist within the file). Record buffers can be shared between files by using the SAME RECORD AREA (see SAME RECORD AREA) clause.
  8. The only valid level numbers are 01-49, 66, 77, 78 and 88. Level numbers 66, 77, 78 and 88 all have special uses — See Special Data Items, for details.
  9. Not specifying an identifier-1 or FILLER immediately after the level number has the same effect as if FILLER were specified. A data item named FILLER cannot be referenced directly; these items are generally used to specify an unused portion of the total storage allocated to a group item or to describe a group item whose contents will only be referenced using the names of those items that belong to it.
  10. EXTERNAL cannot be combined with GLOBAL or REDEFINES.
  11. File section data buffers (and therefore all 01-level record layouts defined in the file section) are initialized to all binary zeros when the program is loaded into storage.
  12. See Data Description Clauses, for information on the usage of the various data description clauses.
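
The sharing of one record buffer by several 01-level layouts can be sketched as follows. Both hypothetical layouts describe the same 80 bytes; the program name, file name and data names are assumptions for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RECDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT Trans-File ASSIGN TO "trans.dat"
               ORGANIZATION IS SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  Trans-File.
      * Two views of the same 80-byte record buffer.
       01  Header-Record.
           05 Record-Type      PIC X.
           05 Batch-Date       PIC 9(8).
      * No name given, so this item is treated as FILLER.
           05                  PIC X(71).
       01  Detail-Record.
           05 FILLER           PIC X.
           05 Account-Number   PIC 9(10).
           05 Amount           PIC S9(7)V99
                               SIGN IS TRAILING SEPARATE.
           05 Item-Text        PIC X(59).
       PROCEDURE DIVISION.
           OPEN OUTPUT Trans-File
           MOVE SPACES TO Header-Record
           MOVE "H" TO Record-Type
           MOVE 20240131 TO Batch-Date
           WRITE Header-Record
      * Record-Type is the first byte of either layout.
           MOVE SPACES TO Detail-Record
           MOVE "D" TO Record-Type
           MOVE 1234567890 TO Account-Number
           MOVE 19.95 TO Amount
           MOVE "Sample item" TO Item-Text
           WRITE Detail-Record
           CLOSE Trans-File
           STOP RUN.
```

Amount occupies ten bytes here: nine digits plus the separate trailing sign character, which is why Item-Text is 59 bytes long rather than 60.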

6.3. WORKING-STORAGE SECTION

WORKING-STORAGE-SECTION-Data-Item Syntax

 level-number [ identifier-1 | FILLER ] [ IS GLOBAL | EXTERNAL ]
                               ~~~~~~        ~~~~~~   ~~~~~~~~
 [ BASED ]
   ~~~~~
 [ BLANK WHEN ZERO ]
   ~~~~~      ~~~~
 [ JUSTIFIED RIGHT ]
   ~~~~
 [ OCCURS [ integer-1 TO ] integer-2 TIMES
   ~~~~~~             ~~
       [ DEPENDING ON identifier-2 ]
         ~~~~~~~~~
       [ ASCENDING|DESCENDING KEY IS identifier-3 ]
         ~~~~~~~~~ ~~~~~~~~~~
       [ INDEXED BY identifier-4 ] ]
         ~~~~~~~
 [ PICTURE IS picture-string ]
   ~~~
 [ REDEFINES identifier-5 ]
   ~~~~~~~~~
 [ SIGN IS LEADING|TRAILING [ SEPARATE CHARACTER ] ]
   ~~~~    ~~~~~~~ ~~~~~~~~   ~~~~~~~~
 [ SYNCHRONIZED|SYNCHRONISED [ LEFT|RIGHT ] ]
   ~~~~         ~~~~           ~~~~ ~~~~~
 [ USAGE IS data-item-usage ]
   ~~~~~
 [ VALUE IS [ ALL ] literal-1 ] . [ WORKING-STORAGE-SECTION-Data-Item ]...
   ~~~~~      ~~~

The LEFT and RIGHT (SYNCHRONIZED) clauses are syntactically recognized but are otherwise non-functional.

The working-storage section is used to describe data items that are not part of files, screens or reports and whose data values persist throughout the execution of the program.

  1. The reserved words BY, CHARACTER, IS, KEY, ON, RIGHT (JUSTIFIED), TIMES and WHEN are optional and may be included, or not, at the discretion of the programmer. The presence or absence of these words has no effect upon the program.
  2. The reserved words SYNCHRONIZED and SYNCHRONISED are interchangeable. Both may be abbreviated as SYNC.
  3. The reserved word PICTURE may be abbreviated to PIC.
  4. The reserved word JUSTIFIED may be abbreviated to JUST.
  5. As the syntax diagram shows, the definition of a WORKING-STORAGE-SECTION-Data-Item is a recursive one in that there may be any number of such specifications coded following one another. The first such specification must have a level number of 01. Specifications that follow that one may have level numbers greater than 01, in which case they are defining a hierarchical breakdown of a record. The definition of a record is terminated when one of the following occurs:
    • Another 01-level item is found; this signifies the end of the definition of one record and the start of another.
    • A 77-level item is found; this signifies the end of the definition of the record and begins the definition of a special data item (see 77-Level Data Items for more information).
    • A division or section header is found; this also marks the completion of a record and signifies the end of the working-storage section as well.
  6. Every WORKING-STORAGE-SECTION-Data-Item description must be terminated with a period.
  7. The only valid level numbers are 01-49, 66, 77, 78 and 88. Level numbers 01 through 49 are used to define data items that may be part of a hierarchical structure. Level number 01 can also be used to define a constant — an item with an unchangeable value specified at compilation time.
  8. Level numbers 66, 77, 78 and 88 all have special uses — See Special Data Items, for details.
  9. Not specifying an identifier-1 or FILLER immediately after the level number has the same effect as if FILLER were specified. A data item named FILLER cannot be referenced directly; these items are generally used to specify an unused portion of the total storage allocated to a group item or to describe a group item whose contents will only be referenced using the names of those items that belong to it.
  10. Data items defined within the working-storage section are automatically initialized once, when the program in which the data is defined is loaded into memory. Subprograms may be loaded into memory more than once (see CANCEL), in which case initialization will happen each time they are loaded. See Data Initialization, for a discussion of the initialization rules.
  11. See Data Description Clauses, for information on the usage of the various data description clauses.
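
A few typical working-storage definitions are sketched below: a 77-level item, a small table and two items using VALUE and JUSTIFIED. All names, sizes and values are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. WSDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * A stand-alone elementary item.
       77  Entry-Count             PIC 9(2) VALUE 0.
      * A record holding a five-entry table.
       01  Score-Table.
           05 Score-Entry          OCCURS 5 TIMES
                                   INDEXED BY Score-Idx.
              10 Player-Name       PIC X(10) VALUE SPACES.
              10 Player-Score      PIC 9(5)  VALUE ZERO.
      * VALUE ALL repeats the literal to fill the whole item.
       01  Rule-Line               PIC X(20) VALUE ALL "=".
       01  Right-Label             PIC X(10) JUSTIFIED RIGHT.
       PROCEDURE DIVISION.
           ADD 1 TO Entry-Count
           MOVE "ALICE" TO Player-Name (1)
           MOVE 4200 TO Player-Score (1)
           MOVE "Total" TO Right-Label
           DISPLAY Rule-Line
           DISPLAY Player-Name (1) " " Player-Score (1)
           DISPLAY Right-Label
           STOP RUN.
```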

6.4. LOCAL-STORAGE SECTION

LOCAL-STORAGE-SECTION-Data-Item Syntax

 level-number [ identifier-1 | FILLER ] [ IS GLOBAL|EXTERNAL ]
                               ~~~~~~        ~~~~~~ ~~~~~~~~
 [ BASED ]
   ~~~~~
 [ BLANK WHEN ZERO ]
   ~~~~~      ~~~~
 [ JUSTIFIED RIGHT ]
   ~~~~
 [ OCCURS [ integer-1 TO ] integer-2 TIMES
   ~~~~~~             ~~
       [ DEPENDING ON identifier-2 ]
         ~~~~~~~~~
       [ ASCENDING|DESCENDING KEY IS identifier-3 ]
         ~~~~~~~~~ ~~~~~~~~~~
       [ INDEXED BY identifier-4 ] ]
         ~~~~~~~
 [ PICTURE IS picture-string ]
   ~~~
 [ REDEFINES identifier-5 ]
   ~~~~~~~~~
 [ SIGN IS LEADING|TRAILING [ SEPARATE CHARACTER ] ]
   ~~~~    ~~~~~~~ ~~~~~~~~   ~~~~~~~~
 [ SYNCHRONIZED|SYNCHRONISED [ LEFT|RIGHT ] ]
   ~~~~         ~~~~           ~~~~ ~~~~~
 [ USAGE IS data-item-usage ]
   ~~~~~
 [ VALUE IS [ ALL ] literal-1 ] . [ LOCAL-STORAGE-SECTION-Data-Item ]...
   ~~~~~      ~~~

The LEFT and RIGHT (SYNCHRONIZED) clauses are syntactically recognized but are otherwise non-functional.

The local-storage section is similar to working-storage, but describes data within a subprogram that will be dynamically allocated and initialized (automatically) each time the subprogram is executed. See Data Initialization, for the rules of data initialization.

  1. The reserved words BY, CHARACTER, IS, KEY, ON, RIGHT (JUSTIFIED), TIMES and WHEN are optional and may be included, or not, at the discretion of the programmer. The presence or absence of these words has no effect upon the program.
  2. The reserved words SYNCHRONIZED and SYNCHRONISED are interchangeable. Both may be abbreviated as SYNC.
  3. The reserved word PICTURE may be abbreviated to PIC.
  4. The reserved word JUSTIFIED may be abbreviated to JUST.
  5. As the syntax diagram shows, the definition of a LOCAL-STORAGE-SECTION-Data-Item is a recursive one in that there may be any number of such specifications coded following one another. The first such specification must have a level number of 01. Specifications that follow that one may have level numbers greater than 01, in which case they are defining a hierarchical breakdown of a record. The definition of a record is terminated when one of the following occurs:
    • Another 01-level item is found; this signifies the end of the definition of one record and the start of another.
    • A division or section header is found; this also marks the completion of a record and signifies the end of the local-storage section as well.
  6. Every LOCAL-STORAGE-SECTION-Data-Item description must be terminated with a period.
  7. The only valid level numbers are 01-49, 66, 77, 78 and 88. Level numbers 01 through 49 are used to define data items that may be part of a hierarchical structure. Level number 01 can also be used to define a constant — an item with an unchangeable value specified at compilation time.
  8. Level numbers 66, 77, 78 and 88 all have special uses — See Special Data Items, for details.
  9. Not specifying an identifier-1 or FILLER immediately after the level number has the same effect as if FILLER were specified. A data item named FILLER cannot be referenced directly; these items are generally used to specify an unused portion of the total storage allocated to a group item or to describe a group item whose contents will only be referenced using the names of those items that belong to it.
  10. Local-storage cannot be used in nested subprograms.
  11. See Data Description Clauses, for information on the usage of the various data description clauses.
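
The difference between working-storage and local-storage can be sketched with a subprogram that is called twice. The program names and data names are hypothetical, and the two programs are coded one after the other in a single source file rather than nested.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LSMAIN.
       PROCEDURE DIVISION.
           CALL "LSCOUNT"
           CALL "LSCOUNT"
           STOP RUN.
       END PROGRAM LSMAIN.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. LSCOUNT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Initialized once, when the subprogram is loaded.
       01  WS-Calls                PIC 9(3) VALUE 0.
       LOCAL-STORAGE SECTION.
      * Allocated and initialized again on every call.
       01  LS-Calls                PIC 9(3) VALUE 0.
       PROCEDURE DIVISION.
           ADD 1 TO WS-Calls
           ADD 1 TO LS-Calls
           DISPLAY "WS-Calls=" WS-Calls " LS-Calls=" LS-Calls
           GOBACK.
       END PROGRAM LSCOUNT.
```

On the second call the working-storage counter should have kept its earlier value and moved on to 2, while the local-storage counter should have started again from its VALUE and show 1.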

6.5. LINKAGE SECTION

LINKAGE-SECTION-Data-Item Syntax

 level-number [ identifier-1 | FILLER ] [ IS GLOBAL|EXTERNAL ]
                               ~~~~~~        ~~~~~~ ~~~~~~~~
 [ ANY LENGTH ]
   ~~~ ~~~~~~
 [ BASED ]
   ~~~~~
 [ BLANK WHEN ZERO ]
   ~~~~~      ~~~~
 [ JUSTIFIED RIGHT ]
   ~~~~
 [ OCCURS [ integer-1 TO ] integer-2 TIMES
   ~~~~~~             ~~
       [ DEPENDING ON identifier-3 ]
         ~~~~~~~~~
       [ ASCENDING|DESCENDING KEY IS identifier-4 ]
         ~~~~~~~~~ ~~~~~~~~~~
       [ INDEXED BY identifier-5 ] ]
         ~~~~~~~
 [ PICTURE IS picture-string ]
   ~~~
 [ REDEFINES identifier-6 ]
   ~~~~~~~~~
 [ SIGN IS LEADING|TRAILING [ SEPARATE CHARACTER ] ]
   ~~~~    ~~~~~~~ ~~~~~~~~   ~~~~~~~~
 [ SYNCHRONIZED|SYNCHRONISED [ LEFT|RIGHT ] ]
   ~~~~         ~~~~           ~~~~ ~~~~~
 [ USAGE IS data-item-usage ] . [ LINKAGE-SECTION-Data-Item ]...
   ~~~~~

The LEFT and RIGHT (SYNCHRONIZED) clauses are syntactically recognized but are otherwise non-functional.

The linkage section describes data within a subprogram that serves as either input arguments to or output results from the subprogram.

  1. The reserved words BY, CHARACTER, IS, KEY, ON and WHEN are optional and may be included, or not, at the discretion of the programmer. The presence or absence of these words has no effect upon the program.
  2. The reserved words SYNCHRONIZED and SYNCHRONISED are interchangeable. Both may be abbreviated as SYNC.
  3. The reserved word PICTURE may be abbreviated to PIC.
  4. The reserved word JUSTIFIED may be abbreviated to JUST.
  5. As the syntax diagram shows, the definition of a LINKAGE-SECTION-Data-Item is a recursive one in that there may be any number of such specifications coded following one another. The first such specification must have a level number of 01. Specifications that follow that one may have level numbers greater than 01, in which case they are defining a hierarchical breakdown of a record. The definition of a record is terminated when one of the following occurs:
    • Another 01-level item is found; this signifies the end of the definition of one record and the start of another.
    • A division or section header is found; this also marks the completion of a record and signifies the end of the linkage section as well.
  6. Every LINKAGE-SECTION-Data-Item description must be terminated with a period.
  7. The only valid level numbers are 01-49, 66, 77, 78 and 88. Level numbers 01 through 49 are used to define data items that may be part of a hierarchical structure. Level number 01 can also be used to define a constant — an item with an unchangeable value specified at compilation time.
  8. Level numbers 66, 77, 78 and 88 all have special uses — See Special Data Items, for details.
  9. It is expected that:
    1. A linkage section should occur only within a subprogram. The compiler will not prevent its use in a main program, however.
    2. All 01-level data items described within a subprogram’s linkage section should appear in a PROCEDURE DIVISION USING (see PROCEDURE DIVISION USING) or as arguments on an ENTRY statement.
    3. Each 01-level data item described within a subprogram’s linkage section should correspond to an argument passed on a CALL statement (see CALL) or an argument on a function call to the subprogram.
  10. Not specifying an identifier-1 or FILLER immediately after the level number has the same effect as if FILLER were specified. A data item named FILLER cannot be referenced directly; these items are generally used to specify an unused portion of the total storage allocated to a group item or to describe a group item whose contents will only be referenced using the names of those items that belong to it. In the linkage section, 01-level data items cannot be named FILLER.
  11. No storage is allocated for data defined in the linkage section; the data descriptions there merely define storage areas that will be passed to the subprogram by a calling program. Therefore, any discussion of the default initialization of such data is irrelevant. It is possible, however, to manually allocate linkage section data items that aren’t subprogram arguments via the ALLOCATE statement (see ALLOCATE). In such cases, initialization will take place as per the documentation of that statement.
  12. See Data Description Clauses, for information on the usage of the various data description clauses.
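
The expectations in rule 9 can be sketched with a caller and a subprogram held in one source file. This is a minimal illustration, not a definitive pattern; the program names caller and calctotal and all data names are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. caller.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Storage for the arguments is owned by the calling program.
       01  order-qty          PIC 9(3)    VALUE 12.
       01  unit-price         PIC 9(3)V99 VALUE 2.50.
       01  order-total        PIC 9(6)V99 VALUE 0.
       PROCEDURE DIVISION.
           CALL "calctotal" USING order-qty unit-price order-total
           DISPLAY "Total: " order-total
           STOP RUN.
       END PROGRAM caller.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. calctotal.
       DATA DIVISION.
       LINKAGE SECTION.
      * No storage is allocated here; each 01-level item describes
      * the argument passed in the same position by the caller.
       01  lk-qty             PIC 9(3).
       01  lk-price           PIC 9(3)V99.
       01  lk-total           PIC 9(6)V99.
       PROCEDURE DIVISION USING lk-qty lk-price lk-total.
           MULTIPLY lk-qty BY lk-price GIVING lk-total
           GOBACK.
       END PROGRAM calctotal.
```

Each linkage item is declared with the same PICTURE as the argument it receives. Since the subprogram works directly on the caller's storage, a mismatch in size or usage would cause the data to be misinterpreted.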

6.6. REPORT SECTION

REPORT SECTION Syntax

 [ REPORT SECTION.
   ~~~~~~ ~~~~~~~
   { Report-Description [ { Report-Group-Definition } ]... }... ]
   {                      { 01-Level-Constant       }      }
   {                      { 78-Level-Constant       }      }
   { 01-Level-Constant                                     }
   { 78-Level-Constant                                     }

Report-Description (RD) Syntax

 RD report-name [ IS GLOBAL ]
 ~~                  ~~~~~~
 [ CODE IS literal-1 | identifier-1 ]
   ~~~~
 [ { CONTROL IS   } { FINAL        }... ]
   { ~~~~~~~      } { ~~~~~        }
   { CONTROLS ARE } { identifier-2 }
     ~~~~~~~~
 [ PAGE [ { LIMIT IS   } ] [ { literal-2    } LINES ]
   ~~~~   { ~~~~~      }     { identifier-3 } ~~~~
          { LIMITS ARE }
            ~~~~~~
       [ literal-3 | identifier-4 COLUMNS|COLS ]
                                  ~~~~~~~ ~~~~
       [ HEADING IS literal-4 | identifier-5 ]
         ~~~~~~~
       [ FIRST DE|DETAIL IS literal-5 | identifier-6 ]
         ~~~~~ ~~ ~~~~~~
       [ LAST CH|{CONTROL HEADING} IS literal-6 | identifier-7 ]
         ~~~~ ~~  ~~~~~~~ ~~~~~~~
       [ LAST DE|DETAIL IS literal-7 | identifier-8 ]
         ~~~~ ~~ ~~~~~~
       [ FOOTING IS literal-8 | identifier-9 ] ] .
         ~~~~~~~

This section describes the layout of printed reports as well as many of the functional aspects of the generation of reports that will be produced via the Report Writer Control System. It is important to maintain the order of these clauses and ensure that all fields defined or referenced within this section are actually defined in the WORKING-STORAGE SECTION and not elsewhere.

  1. The reserved words ARE and IS are optional and may be omitted. The presence or absence of these words has no effect upon the program.
  2. The phrases CONTROL IS and CONTROLS ARE are interchangeable, as are the PAGE LIMIT and PAGE LIMITS phrases.
  3. The reserved word LINES may be abbreviated as LINE.
  4. The reserved word COLUMNS may be abbreviated as COLS.
  5. Each report referenced on a REPORT IS clause (see File/Sort-Description) must be described with a report description (RD).
  6. See GLOBAL, for information on the GLOBAL option.
  7. Please see Report Writer Features, if you have not read it already. It will familiarize you with the Report Writer terminology that follows.
  8. The following rules pertain to the PAGE LIMITS clause:
    1. If no PAGE LIMITS clause is specified, the entire report will be generated as if it consists of a single arbitrarily long page.
    2. All literals (literal-2 through literal-8) must be numeric with non-zero positive integer values.
    3. All identifiers (identifier-3 through identifier-9) must be numeric, unedited data items with non-zero positive integer values.
    4. Any value specified for literal-2 or identifier-3 will define the total number of available lines on any report page, not counting any unused margins at the top and/or bottom of the page (defined by the LINES AT TOP and LINES AT BOTTOM values specified on the LINAGE clause of the FD this RD is linked to; see File/Sort-Description).
    5. Any value specified for literal-3 or identifier-4 (the COLUMNS specification) will be ignored.
    6. The HEADING clause defines the first line number at which a report heading or page heading may be presented.
    7. The FIRST DETAIL clause defines the first line at which a detail group may be presented.
    8. The LAST CONTROL HEADING clause defines the last line at which any line of a control heading may be presented.
    9. The LAST DETAIL clause defines the last line at which any line of a detail group may be presented.
    10. The FOOTING clause defines the last line at which any line of a control footing group may be presented.
    11. The following rules establish default values for the various PAGE LIMIT clauses, assuming there is one:
      HEADING

      default is one (1)

      FIRST DETAIL

      HEADING value is used

      LAST CONTROL HEADING

      value from LAST DETAIL or, if that is absent, the value from FOOTING or, if that too is absent, the value from PAGE LIMIT

      LAST DETAIL

      value from FOOTING or, if that is absent, the value from PAGE LIMIT

      FOOTING

      value from LAST DETAIL or, if that is absent, the value from PAGE LIMIT

    12. For the values specified on a PAGE LIMIT clause to be valid, all of the following must be true:
      • FIRST DETAIL ≥ HEADING
      • LAST CONTROL HEADING ≥ FIRST DETAIL
      • LAST DETAIL ≥ LAST CONTROL HEADING
      • FOOTING ≥ LAST DETAIL
  9. The following rules pertain to the CONTROL clause:
    1. If there is no CONTROL clause, the report will contain no control breaks; this implies that there can be no CONTROL HEADING or CONTROL FOOTING report groups defined for this RD.
    2. Include the reserved word FINAL if you want to include a special control heading before the first detail line is generated (CONTROL HEADING FINAL) or after the last detail line is generated (CONTROL FOOTING FINAL).
    3. If you specify FINAL, it must be the first control break named in the RD.
    4. Any identifier-2 specifications included on the CONTROL clause must reference data names defined in any data division section except for the report section.
    5. There must be a CONTROL HEADING and/or CONTROL FOOTING report group defined in the report section for each identifier-2.
    6. At execution time:
      • Each time a GENERATE statement (see GENERATE) is executed against a detail report group defined for this RD, the RWCS will check the contents of each identifier-2 data item; whenever an identifier-2’s value has changed since the previous GENERATE, a control break condition will be in effect for that identifier-2.
      • Once the list of control breaks has been determined, the CONTROL FOOTING for each identifier-2 having a control break (if any such report group is defined) will be presented.
      • Next, the CONTROL HEADING for each identifier-2 having a control break (if any such report group is defined) will be presented.
      • The CONTROL FOOTING and CONTROL HEADING report groups will be presented in the sequence in which they are listed on the CONTROL clause.
      • Only after this processing has occurred will the detail report group specified on the GENERATE be presented.
  10. Each RD will have the following allocated for it:
    1. The PAGE-COUNTER special register (see Special Registers), which will contain the current report page number.
      • This register will be set to a value of 1 when an INITIATE statement (see INITIATE) is executed for the report and will be incremented by 1 each time the RWCS starts a new page of the report.
      • References to PAGE-COUNTER within the report section will be implicitly qualified with the name of the report to which the report group referencing the register belongs.
      • References to PAGE-COUNTER in the procedure division must be qualified with the appropriate report name if there are multiple RDs defined.
    2. The LINE-COUNTER special register, which will contain the current line number on the current page.
  11. The RD must be followed by at least one 01-level report group definition.
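
The PAGE LIMITS and CONTROL rules above can be sketched with a complete report program. This is a hedged illustration that assumes a GnuCOBOL release with Report Writer support (3.x or later); the file name sales.rpt, the report name and all data names are hypothetical, and the input is assumed to be already sorted by region. The page values satisfy the ordering in rule 8.12 (HEADING 1, FIRST DETAIL 3, LAST DETAIL 16, FOOTING 18, all within the 20-line page). LAST CONTROL HEADING is omitted, so it takes its default.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. rddemo.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT report-file ASSIGN TO "sales.rpt"
               ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  report-file
           REPORT IS sales-report.
       WORKING-STORAGE SECTION.
      * Hypothetical input: 4-character region, amount as 999V99.
       01  sales-data.
           05  FILLER         PIC X(9) VALUE "EAST01050".
           05  FILLER         PIC X(9) VALUE "EAST02000".
           05  FILLER         PIC X(9) VALUE "WEST00725".
           05  FILLER         PIC X(9) VALUE "WEST10000".
       01  sales-table REDEFINES sales-data.
           05  sale-entry     OCCURS 4 TIMES.
               10  se-region  PIC X(4).
               10  se-amount  PIC 9(3)V99.
       01  ws-region          PIC X(4).
       01  ws-amount          PIC 9(3)V99.
       01  idx                PIC 9(2).
       REPORT SECTION.
      * FINAL is named first, then the control data item.
       RD  sales-report
           CONTROLS ARE FINAL ws-region
           PAGE LIMIT IS 20 LINES
               HEADING 1
               FIRST DETAIL 3
               LAST DETAIL 16
               FOOTING 18.
       01  TYPE IS PAGE HEADING.
           05  LINE 1.
               10  COLUMN 1   PIC X(12) VALUE "SALES REPORT".
               10  COLUMN 30  PIC X(5)  VALUE "PAGE ".
               10  COLUMN 35  PIC ZZ9   SOURCE PAGE-COUNTER.
       01  detail-line TYPE IS DETAIL.
           05  LINE PLUS 1.
               10  COLUMN 1   PIC X(4)   SOURCE ws-region.
               10  COLUMN 10  PIC ZZ9.99 SOURCE ws-amount.
      * Presented whenever ws-region changes between GENERATEs.
       01  TYPE IS CONTROL FOOTING ws-region.
           05  LINE PLUS 1.
               10  COLUMN 1   PIC X(6)     VALUE "TOTAL:".
               10  COLUMN 9   PIC Z,ZZ9.99 SUM ws-amount.
      * Presented once, after the last detail line.
       01  TYPE IS CONTROL FOOTING FINAL.
           05  LINE PLUS 2.
               10  COLUMN 1   PIC X(6)     VALUE "GRAND:".
               10  COLUMN 9   PIC Z,ZZ9.99 SUM ws-amount.
       PROCEDURE DIVISION.
           OPEN OUTPUT report-file
           INITIATE sales-report
           PERFORM VARYING idx FROM 1 BY 1 UNTIL idx > 4
               MOVE se-region (idx) TO ws-region
               MOVE se-amount (idx) TO ws-amount
               GENERATE detail-line
           END-PERFORM
           TERMINATE sales-report
           CLOSE report-file
           STOP RUN.
```

The control data item ws-region is defined in working-storage rather than in the report section, and it is set before each GENERATE so that the RWCS can compare it with the value seen on the previous GENERATE.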

6.6.1. Report Group Definitions

Report-Group-Definition Syntax

 01 [ identifier-1 ]

 [ LINE NUMBER IS { integer-1   [ ON NEXT PAGE ] } ]
   ~~~~           {                  ~~~~ ~~~~   }
                  { +|PLUS integer-1             }
                  {   ~~~~                       }
                  { ON NEXT PAGE                 }
                       ~~~~ ~~~~
 [ NEXT GROUP IS { [ +|PLUS ] integer-2  } ]
   ~~~~ ~~~~~    {     ~~~~              }
                 { NEXT|{NEXT PAGE}|PAGE }
                   ~~~~  ~~~~ ~~~~  ~~~~
 [ TYPE IS { RH|{REPORT HEADING}                      } ]
   ~~~~    { ~~  ~~~~~~ ~~~~~~~                       }
           { PH|{PAGE HEADING}                        }
           { ~~  ~~~~ ~~~~~~~                         }
           { CH|{CONTROL HEADING} FINAL|identifier-2  }
           { ~~  ~~~~~~~ ~~~~~~~  ~~~~~               }
           { DE|DETAIL                                }
           { ~~ ~~~~~~                                }
           { CF|{CONTROL FOOTING} FINAL|identifier-2  }
           { ~~  ~~~~~~~ ~~~~~~~  ~~~~~               }
           { PF|{PAGE FOOTING}                        }
           { ~~  ~~~~ ~~~~~~~                         }
           { RF|{REPORT FOOTING}                      }
             ~~  ~~~~~~ ~~~~~~~
 . [ REPORT-SECTION-Data-Item ]...

The syntax shown here documents how a report group is defined to a report. This syntax is valid only in the report section, and only then after an RD.

  1. The reserved words IS, NUMBER and ON are optional and may be omitted. The presence or absence of these words has no effect upon the program.
  2. The RH and REPORT HEADING terms are interchangeable, as are PH and PAGE HEADING, CH and CONTROL HEADING, DE and DETAIL, CF and CONTROL FOOTING, PF and PAGE FOOTING as well as RF and REPORT FOOTING.
  3. The report group being defined will be a part of the most-recently coded RD.
  4. The TYPE (see TYPE) clause specifies the type of report group being defined.
  5. The level number used for a report group definition must be 01.
  6. The optional identifier-1 specification assigns a name to this report group so that the group may be referenced either by a GENERATE statement or on a USE BEFORE REPORTING.
  7. No two report groups in the same report (RD) may be named with the same identifier-1. There may, however, be multiple identifier-1 definitions in different reports. In such instances, references to identifier-1 must be qualified by the report name.
  8. There may only be one report heading, report footing, final control heading, final control footing, page heading and page footing defined per report.
  9. Report group declarations must be followed by at least one REPORT-SECTION-Data-Item with a level number in the range 02-49.
  10. See Data Description Clauses, for information on the usage of the various data description clauses.

6.6.2. REPORT SECTION Data Items

REPORT-SECTION-Data-Item Syntax

 level-number [ identifier-1 ]

 [ BLANK WHEN ZERO ]
   ~~~~~      ~~~~
 [ COLUMN [ { NUMBER IS   } ] [ +|PLUS ] integer-1 ]
   ~~~      { ~~~~~~      }       ~~~~
            { NUMBERS ARE }
              ~~~~~~~
 [ GROUP INDICATE ]
   ~~~~~ ~~~~~~~~
 [ JUSTIFIED RIGHT ]
   ~~~~
 [ LINE NUMBER IS { integer-2   [ ON NEXT PAGE ] } ]
   ~~~~           {                  ~~~~ ~~~~   }
                  { +|PLUS integer-2             }
                  {   ~~~~                       }
                  { ON NEXT PAGE                 }
                       ~~~~ ~~~~
 [ OCCURS [ integer-3 TO ] integer-4 TIMES
   ~~~~~~             ~~
     [ DEPENDING ON identifier-2 ]
       ~~~~~~~~~
     [ STEP integer-5 ]
       ~~~~
     [ VARYING identifier-3 FROM { identifier-4 } BY { identifier-5 } ]
       ~~~~~~~              ~~~~ { integer-6    } ~~ { integer-7    }
 [ PICTURE IS picture-string ]
   ~~~
 [ PRESENT WHEN condition-name ]
   ~~~~~~~ ~~~~
 [ SIGN IS LEADING|TRAILING [ SEPARATE CHARACTER ] ]
   ~~~~    ~~~~~~~ ~~~~~~~~   ~~~~~~~~
 [ { SOURCE IS literal-1|identifier-6 [ ROUNDED ]                   } ]
   { ~~~~~~                             ~~~~~~~                     }
   { SUM OF { identifier-7 }... [ { RESET ON FINAL|identifier-8 } ] }
   { ~~~    { literal-2    }      { ~~~~~    ~~~~~              }   }
   { VALUE IS [ ALL ] literal-3   { UPON identifier-9           }   }
     ~~~~~      ~~~                 ~~~~
 . [ REPORT-SECTION-Data-Item ]...

Data item descriptions describing the report lines and fields that make up the substance of a report group immediately follow the definition of that group.

  1. The reserved words IS, NUMBER, OF, ON, RIGHT, TIMES and WHEN (BLANK) are optional and may be omitted. The presence or absence of these words has no effect upon the program.
  2. The reserved word COLUMN may be abbreviated as COL.
  3. The reserved word JUSTIFIED may be abbreviated as JUST.
  4. The reserved word PICTURE may be abbreviated as PIC.
  5. The SOURCE (see SOURCE), SUM (see SUM) and VALUE (see VALUE) clauses, valid only on an elementary item, are mutually exclusive.
  6. Group items (those without PICTURE clauses) are frequently used to describe entire lines of a report, while elementary items (those with a picture clause) are frequently used to describe specific fields of information on the report. When this coding convention is being used, group items will have LINE (see LINE) clauses and no COLUMN (see COLUMN) clauses while elementary items will be specified the other way around.
  7. See Data Description Clauses, for information on the usage of the various data description clauses.
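
The coding convention described in rule 6 can be sketched as follows: group items carry LINE, and elementary items carry COLUMN and PICTURE together with either VALUE or SOURCE. This is an illustration under assumptions, not a definitive layout; it assumes a GnuCOBOL release with Report Writer support, and the file name items.rpt, the report name, the data names and the discount rule are all hypothetical. SUM is not shown here because the report has no control footing.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. itemdemo.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT report-file ASSIGN TO "items.rpt"
               ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  report-file
           REPORT IS item-report.
       WORKING-STORAGE SECTION.
       01  ws-item-no         PIC 9(2).
       01  ws-discount        PIC 9(2)V99.
       REPORT SECTION.
       RD  item-report
           PAGE LIMIT IS 20 LINES
               HEADING 1
               FIRST DETAIL 3
               LAST DETAIL 18.
       01  TYPE IS PAGE HEADING.
      * Group item: carries LINE, has no PICTURE or COLUMN.
           05  LINE 1.
      * Elementary items: COLUMN and PICTURE with a VALUE literal.
               10  COLUMN 1   PIC X(4)  VALUE "ITEM".
               10  COLUMN 10  PIC X(8)  VALUE "DISCOUNT".
       01  item-line TYPE IS DETAIL.
           05  LINE PLUS 1.
      * SOURCE copies the current contents of a data item.
               10  COLUMN 1   PIC Z9    SOURCE ws-item-no.
      * BLANK WHEN ZERO asks for spaces when the value is zero.
               10  COLUMN 10  PIC Z9.99 BLANK WHEN ZERO
                                        SOURCE ws-discount.
       PROCEDURE DIVISION.
           OPEN OUTPUT report-file
           INITIATE item-report
           PERFORM VARYING ws-item-no FROM 1 BY 1
                   UNTIL ws-item-no > 3
      * Hypothetical rule: only item 2 receives a discount.
               IF ws-item-no = 2
                   MOVE 5.25 TO ws-discount
               ELSE
                   MOVE ZERO TO ws-discount
               END-IF
               GENERATE item-line
           END-PERFORM
           TERMINATE item-report
           CLOSE report-file
           STOP RUN.
```

Each elementary item uses exactly one of VALUE or SOURCE, in line with rule 5.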

6.7. SCREEN SECTION

SCREEN-SECTION-Data-Item Syntax

 level-number [ identifier-1 | FILLER ]
                               ~~~~~~
 [ AUTO | AUTO-SKIP | AUTOTERMINATE ] [ BELL | BEEP ]
   ~~~~   ~~~~~~~~~   ~~~~~~~~~~~~~     ~~~~   ~~~~
 [ BACKGROUND-COLOR|BACKGROUND-COLOUR IS integer-1 | identifier-2 ]
   ~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~
 [ BLANK LINE|SCREEN ] [ ERASE EOL|EOS ]
   ~~~~~ ~~~~ ~~~~~~     ~~~~~ ~~~ ~~~
 [ BLANK WHEN ZERO ] [ JUSTIFIED RIGHT ]
   ~~~~~      ~~~~     ~~~~
 [ BLINK ] [ HIGHLIGHT | LOWLIGHT ] [ REVERSE-VIDEO ]
   ~~~~~     ~~~~~~~~~   ~~~~~~~~     ~~~~~~~~~~~~~
 [ COLUMN NUMBER IS [ +|PLUS ] integer-2 | identifier-3 ]
   ~~~                  ~~~~
 [ FOREGROUND-COLOR|FOREGROUND-COLOUR IS integer-3 | identifier-4 ]
   ~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~
 [ { FROM literal-1 | identifier-5 } ]
   { ~~~~                          }
   { TO identifier-5               }
   { ~~                            }
   { USING identifier-5            }
   { ~~~~~                         }
   { VALUE IS [ ALL ] literal-1    }
     ~~~~~      ~~~
 [ FULL | LENGTH-CHECK ] [ REQUIRED | EMPTY-CHECK ] [ SECURE | NO-ECHO ]
   ~~~~   ~~~~~~~~~~~~     ~~~~~~~~   ~~~~~~~~~~~     ~~~~~~