MVSDASD

Technical Details

The mvsdasd module is made of two components:

  1. A disk driver specific to the MVS DASD format.
  2. A file system which together with the above driver presents to Linux a standard, hierarchical file system.

To implement the driver, a special layer was added that on one side understands the underlying structure of the MVS disk and on the other side presents a coherent, Linux-compatible format. The Linux kernel is versatile in its ability to accept file systems from many operating systems, but it still expects file systems to adhere to certain standards. The most important ones are:

•  All of the file system blocks must have the same block size, and that should be a multiple of 512 bytes.
•  Files are organized in directories.
•  Each file is uniquely identified by a number called an "inode".
•  Each file system has one superblock that describes the file system.

The virtual layer in the disk driver is responsible for implementing the first two requirements, while the file system implements the other two.

Once the mvsdasd module is loaded into the kernel, it identifies any MVS disks and creates a shadow (virtual) device for each of them. This shadow device is the one that should be mounted by the file system. For example, if the system has a DASD device at /dev/dasdc which is MVS formatted, the driver will create a shadow device called /dev/mvsdasdc. It is the latter device which should be mounted using the standard mount command.

Once the disk has been mounted, all flat files on it can be read like standard Linux files.

For usage description, please refer to the Installation And Operation page.

Comparison between mvsdasd and NFS:

IBM has a component that implements NFS over MVS data sets. The following table summarizes the differences between NFS and mvsdasd.

Item                                                       mvsdasd                   NFS
Access means                                               Direct access from Linux  Communication (i.e., TCP/IP)
Supports file translation (EBCDIC/ASCII, line terminator)  Yes                       Yes
Write access / file create                                 No                        Yes
Can run even when MVS is down                              Yes                       No
Support for VSAM                                           No (1)                    Yes
DBCS support                                               No                        Yes
Speed                                                      Very fast                 Dependent on communication lines

(1) This feature will be implemented in the future.

Security

Initially, all files are mounted read-only with superuser ownership. To grant users other than the superuser access to a file, use the chown command. For example "chown user /sys2/source/myprog" or "chown user:group /sys2/source/myprog".

Note: this change is not persistent. Once the disk has been unmounted and mounted again, the ownership overrides are reset. You can automate the process with a simple script that mounts the disk and grants selected users access to certain files.
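As a rough illustration of such a script, the Python sketch below builds the mount and chown commands for one disk. It only prints the commands rather than running them. The mount point /mnt/mvs, the user name alice, the helper names, and the "-o ro" mount option are assumptions made for the example; the actual mount options (for instance a file system type) are described on the Installation And Operation page.

```python
import posixpath
import shlex


def shadow_device(dasd_device):
    """Map a DASD device such as /dev/dasdc to its shadow device /dev/mvsdasdc."""
    directory, name = posixpath.split(dasd_device)
    if not name.startswith("dasd"):
        raise ValueError(f"not a dasd device: {dasd_device}")
    return posixpath.join(directory, "mvs" + name)


def build_commands(dasd_device, mount_point, grants):
    """Return the mount and chown commands as argument lists (nothing is executed).

    grants maps a path relative to the mount point to the new owner.
    """
    commands = [["mount", "-o", "ro", shadow_device(dasd_device), mount_point]]
    for relative_path, owner in grants.items():
        commands.append(["chown", owner, posixpath.join(mount_point, relative_path)])
    return commands


if __name__ == "__main__":
    # Hypothetical values: adjust the device, mount point, and owners as needed.
    for command in build_commands("/dev/dasdc", "/mnt/mvs",
                                  {"sys2/source/myprog": "alice"}):
        print(shlex.join(command))
```

Because the ownership overrides are lost on unmount, a script of this kind would have to be rerun after every mount.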

MVS File System

z/OS is the current incarnation of the legacy MVS operating system (marketed as OS/390 in the 1990s). IBM still uses the name "MVS" for various components of this operating system, such as the disk structure. For its non-Unix implementation, z/OS uses a disk structure that differs from the one used in other systems. In fact, the MVS structure has very little in common with the file systems employed by today's operating systems. The most important differences are:

  1. Data sets on MVS can have various block sizes, and sometimes a single data set has blocks of more than one size. It is not only that the last block of a data set can be shorter than the rest: some file copy utilities do not re-block the output data set, so when several data sets are concatenated into one, parts of the resulting data set sometimes retain the block size of the data set from which they were copied.
  2. The hierarchy of files is not expressed in directories or folders. Instead, files (or data sets as they are referred to) are named with a qualification level embedded in their name. Each level is terminated by a dot (".") which is an integral part of the file's name.
  3. Data sets are "record oriented" and not "stream oriented". This means that data sets do not have a record terminator, and the size of a record is either:
    1. Fixed and defined during data set creation.
    2. Variable, in which case each logical record is preceded by a four-byte descriptor, of which two bytes represent the length of the record, enabling a maximum of 65,535 bytes per record. To overcome this limitation, "spanned" records were introduced, which can span more than one physical record (this is where the other two bytes come into play, describing the current record's position in the group of spanned records).

This means that in order to convert files to be "stream oriented", a line terminator must be appended to the end of each record. This also affects the perceived file size.
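As a rough illustration of this conversion, the following Python sketch turns fixed-length and unspanned variable-length records into a newline-terminated stream. It is not the module's own code. The function names, the cp037 code page, and the omission of block-level descriptors and spanned records are assumptions made for brevity. The length stored in the descriptor is taken to include the four descriptor bytes themselves.

```python
import struct

EBCDIC = "cp037"  # assumed code page; a given disk may use another EBCDIC variant


def fixed_records(data, lrecl):
    """Yield fixed-length records of lrecl bytes each."""
    for offset in range(0, len(data), lrecl):
        yield data[offset:offset + lrecl]


def variable_records(data):
    """Yield the payload of each unspanned variable-length record.

    Each record starts with a four-byte descriptor: two bytes of length
    (assumed to count the descriptor itself) and two bytes used for spanning.
    """
    offset = 0
    while offset + 4 <= len(data):
        length, _span_info = struct.unpack_from(">HH", data, offset)
        if length < 4 or offset + length > len(data):
            raise ValueError(f"bad record descriptor at offset {offset}")
        yield data[offset + 4:offset + length]
        offset += length


def to_stream(records):
    """Decode each record from EBCDIC and append a line terminator."""
    return "".join(record.decode(EBCDIC) + "\n" for record in records)


if __name__ == "__main__":
    fixed = "HELLO   WORLD   ".encode(EBCDIC)  # two 8-byte records
    print(repr(to_stream(fixed_records(fixed, 8))))
```

In this example a 16-byte data set holding two records becomes an 18-character stream, which is why the size seen through Linux differs from the number of data bytes on disk.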

  4. Data sets must be pre-allocated before being written to. Note that dynamically allocating a data set is unlike opening a file for write in other operating systems, since even with a dynamically allocated data set, some or all of the eventual size of the data set must be obtained from the operating system. Some flexibility is provided by defining "extents" for a data set, which are added to the data set's initial allocation. The size of the extents is also predetermined during data set creation. The important point is that data sets usually use less space than is allocated on the disk (see the next item about file sizes). With the introduction of SMS, this constraint has largely been made obsolete.
  5. File sizes are not recorded anywhere on a standard MVS DASD. A file (data set) can be allocated with a certain size and actually be empty from the application's point of view (e.g., when no one has written anything into that file yet). The only way to determine the exact size of the file is to read it through until reaching the last record. This issue is also addressed by IBM's NFS implementation. Appendix A in "Network File System Guide and Reference" (IBM publication SC26-7417-04) calls this process "read-for-size". mvsdasd uses a similar procedure to determine the effective file size (which is different from the physical size).
  6. There are special data sets which are commonly referred to as "libraries". These data sets are actually "a file system within a file system" in that they contain other files called "members". Each library can contain many members (sometimes thousands). All members in a library must share the same record length. Members have names composed of 1-8 characters.
  7. Data set names (and contents) on MVS disks are encoded in EBCDIC rather than ASCII.
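The idea of determining a file's effective size by reading it through can be sketched as follows. This is a minimal Python illustration, not the procedure used by mvsdasd or by IBM's NFS implementation; the function name and the one-byte line terminator are assumptions.

```python
def effective_size(records, terminator=b"\n"):
    """Return the stream size of a data set by visiting every record.

    records is any iterable of record payloads (bytes). Since the size is
    not recorded on disk, the whole data set has to be read to compute it.
    """
    total = 0
    for record in records:
        total += len(record) + len(terminator)
    return total


if __name__ == "__main__":
    # Three 80-byte records written into a (possibly much larger) allocation.
    print(effective_size([b" " * 80] * 3))
```

An allocated but never-written data set yields no records, so its effective size is zero regardless of how much disk space it occupies.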

In implementing mvsdasd, all the above discrepancies needed to be addressed.
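As one illustration, the naming differences could be bridged roughly as in the Python sketch below, which decodes an EBCDIC data set name and maps its qualifiers (and an optional library member) to a hierarchical path. The function names, the cp037 code page, the blank-padded 44-byte name field, and the lowercase mapping (suggested by the /sys2/source/myprog example) are assumptions, not a description of the module's actual code.

```python
import posixpath
from typing import Optional

EBCDIC = "cp037"  # assumed code page


def decode_name(raw: bytes) -> str:
    """Decode a blank-padded EBCDIC name field into a Python string."""
    return raw.decode(EBCDIC).rstrip()


def dataset_to_path(dsname: str, member: Optional[str] = None) -> str:
    """Map a data set name such as SYS2.SOURCE.MYPROG to sys2/source/myprog.

    Each dot-separated qualifier becomes one directory level. A library
    member, if given, becomes a file inside the library's directory.
    """
    parts = [qualifier.lower() for qualifier in dsname.split(".")]
    if member is not None:
        if not 1 <= len(member) <= 8:
            raise ValueError("member names have 1-8 characters")
        parts.append(member.lower())
    return posixpath.join(*parts)


if __name__ == "__main__":
    raw = "SYS2.SOURCE.MYPROG".ljust(44).encode(EBCDIC)  # hypothetical name field
    print(dataset_to_path(decode_name(raw)))
    print(dataset_to_path("SYS2.MACLIB", "GETMAIN"))     # hypothetical library member
```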