UDA(4) OpenBSD Programmer's Manual (VAX) UDA(4)
NAME
uda - UDA50 disk controller interface
SYNOPSIS
uda0 at uba? csr 0172150
uda1 at uba? csr 0160334
mscpbus* at uda?
DESCRIPTION
This is a driver for the DEC UDA50 disk controller and other compatible
controllers. The UDA50 communicates with the host through a packet
protocol known as the Mass Storage Control Protocol (MSCP). Consult the
file <vax/mscp.h> for a detailed description of this protocol.
The uda driver is a typical block-device disk driver; see physio(9) for a
description of block I/O. The script MAKEDEV(8) should be used to create
the uda special files; should a special file need to be created by hand,
consult mknod(8).
The MSCP_PARANOIA option enables runtime checking on all transfer
completion responses from the controller. This increases disk I/O
overhead and may be undesirable on slow machines, but is otherwise
recommended.
The first sector of each disk contains both a first-stage bootstrap
program and a disk label containing geometry information and partition
layouts (see disklabel(5)). This sector is normally write-protected, and
disk-to-disk copies should avoid copying this sector. The label may be
updated with disklabel(8), which can also be used to write-enable and
write-disable the sector. The next 15 sectors contain a second-stage
bootstrap program.
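The boot area layout described above can be sketched as follows. This is
only an illustration: the 512-byte sector size is an assumption (it is the
sector size the driver's `not 512 sectors' detail message implies), and the
names boot_area_bytes and copy_range are hypothetical helpers, not part of
the driver or of any utility.

```python
SECTOR_SIZE = 512             # assumption: 512-byte sectors

LABEL_SECTOR = 0              # first-stage bootstrap and disk label
SECOND_STAGE = range(1, 16)   # the next 15 sectors: second-stage bootstrap


def boot_area_bytes(sector_size=SECTOR_SIZE):
    """Size in bytes of the label sector plus the second-stage boot area."""
    return (1 + len(SECOND_STAGE)) * sector_size


def copy_range(total_sectors):
    """(start, count) for a disk-to-disk copy that skips the label sector."""
    if total_sectors < 1:
        raise ValueError("disk must have at least one sector")
    return (LABEL_SECTOR + 1, total_sectors - 1)
```

With these assumptions the boot area occupies the first 16 sectors (8192
bytes), and a copy of a 400176-sector pack would start at sector 1.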
DISK SUPPORT
During autoconfiguration, as well as when a drive is opened after all
partitions are closed, the first sector of the drive is examined for a
disk label. If a label is found, the geometry of the drive and the
partition tables are taken from it. If no label is found, the driver
configures the type of each drive when it is first encountered. A
default partition table in the driver is used for each type of disk when
a pack is not labelled. The origin and size (in sectors) of the default
pseudo-disks on each drive are shown below. Unlike on other drives, not
all partitions begin on cylinder boundaries, because previous drivers used
one partition table for all drive types. Variants of the partition
tables are common; check the driver and the file /etc/disktab
(disktab(5)) for other possibilities.
Special file names begin with `ra' and `rra' for the block and character
files respectively. The second component of the name, a drive unit
number in the range of zero to seven, is represented by a `?' in the disk
layouts below. The last component of the name is the file system
partition, designated by a letter from `a' to `h', which corresponds to
a minor device number set: zero to seven, eight to 15, 16 to 23 and so
forth for drive zero, drive one and drive two respectively (see
physio(9)). The location and size (in sectors) of the partitions:
RA60 partitions
disk    start     length
ra?a    0         15884
ra?b    15884     33440
ra?c    0         400176
ra?d    49324     82080     same as 4.2BSD ra?g
ra?e    131404    268772    same as 4.2BSD ra?h
ra?f    49324     350852
ra?g    242606    157570
ra?h    49324     193282
RA70 partitions
disk    start     length
ra?a    0         15884
ra?b    15972     33440
ra?c    0         547041
ra?d    34122     15884
ra?e    357192    55936
ra?f    413457    133584
ra?g    341220    205821
ra?h    49731     29136
RA80 partitions
disk    start     length
ra?a    0         15884
ra?b    15884     33440
ra?c    0         242606
ra?e    49324     193282    same as old Berkeley ra?g
ra?f    49324     82080     same as 4.2BSD ra?g
ra?g    49910     192696
ra?h    131404    111202    same as 4.2BSD
RA81 partitions
disk    start     length
ra?a    0         15884
ra?b    16422     66880
ra?c    0         891072
ra?d    375564    15884
ra?e    391986    307200
ra?f    699720    191352
ra?g    375564    515508
ra?h    83538     291346
RA81 partitions with 4.2BSD-compatible partitions
disk    start     length
ra?a    0         15884
ra?b    16422     66880
ra?c    0         891072
ra?d    49324     82080     same as 4.2BSD ra?g
ra?e    131404    759668    same as 4.2BSD ra?h
ra?f    412490    478582    same as 4.2BSD ra?f
ra?g    375564    515508
ra?h    83538     291346
RA82 partitions
disk    start     length
ra?a    0         15884
ra?b    16245     66880
ra?c    0         1135554
ra?d    375345    15884
ra?e    391590    307200
ra?f    669390    466164
ra?g    375345    760209
ra?h    83790     291346
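The special file naming and minor device numbering described earlier in
this section can be sketched as follows. This is a minimal illustration
that follows the text above (eight partitions per drive, unit numbers zero
to seven); the FILES section lists a wider range of names, so a running
system may use a different layout. The function names are hypothetical.

```python
PARTITIONS = "abcdefgh"   # partition letters `a' to `h', as in the text
MAX_UNIT = 7              # unit numbers range from zero to seven


def minor_number(unit, partition):
    """Minor device number for a drive unit and partition letter."""
    if not 0 <= unit <= MAX_UNIT:
        raise ValueError("unit number out of range")
    if len(partition) != 1 or partition not in PARTITIONS:
        raise ValueError("partition must be a letter from a to h")
    return unit * len(PARTITIONS) + PARTITIONS.index(partition)


def special_files(unit, partition):
    """(block, character) special file names for a unit and partition."""
    minor_number(unit, partition)   # called only to validate the arguments
    return (f"/dev/ra{unit}{partition}", f"/dev/rra{unit}{partition}")
```

Under this scheme drive zero uses minor numbers zero to seven and drive one
uses eight to 15, so minor_number(1, "a") is 8.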
The ra?a partition is normally used for the root file system, the ra?b
partition as a paging area, and the ra?c partition for pack-to-pack
copying (it maps the entire disk).
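As an illustration of how the default tables fit together, the sketch below
encodes the RA60 table from this page and checks that every partition lies
inside ra?c. The 512-byte sector size is an assumption, and the helper
names are hypothetical; none of this code is taken from the driver.

```python
SECTOR_SIZE = 512   # assumption: 512-byte sectors

# Default RA60 partitions from the table above: letter -> (start, length),
# both in sectors.
RA60 = {
    "a": (0, 15884),
    "b": (15884, 33440),
    "c": (0, 400176),
    "d": (49324, 82080),
    "e": (131404, 268772),
    "f": (49324, 350852),
    "g": (242606, 157570),
    "h": (49324, 193282),
}


def end_sector(table, name):
    """First sector past the end of the named partition."""
    start, length = table[name]
    return start + length


def inside_whole_disk(table):
    """True if every partition lies within the `c' partition."""
    c_start, _ = table["c"]
    c_end = end_sector(table, "c")
    return all(start >= c_start and start + length <= c_end
               for start, length in table.values())


def overlaps(table, p, q):
    """True if partitions p and q share at least one sector."""
    return table[p][0] < end_sector(table, q) and \
        table[q][0] < end_sector(table, p)


def size_bytes(table, name, sector_size=SECTOR_SIZE):
    """Partition size in bytes under the assumed sector size."""
    return table[name][1] * sector_size
```

The check also shows why the pseudo-disks cannot all be used at once: ra?a
and ra?b are disjoint, but ra?d and ra?f start at the same sector and
therefore overlap.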
FILES
/dev/ra[0-9][a-p]
/dev/rra[0-9][a-p]
DIAGNOSTICS
panic: udaslave No command packets were available while the driver was
looking for disk drives. The controller is not extending enough credits
to use the drives.
uda%d: no response to Get Unit Status request A disk drive was found,
but did not respond to a status request. This is either a hardware
problem or someone pulling unit number plugs very fast.
uda%d: unit %d off line While searching for drives, the controller found
one that seems to be manually disabled. It is ignored.
uda%d: unable to get unit status Something went wrong while trying to
determine the status of a disk drive. This is followed by an error
detail.
uda%d: unit %d, next %d This probably never happens, but I wanted to
know if it did. I have no idea what one should do about it.
uda%d: cannot handle unit number %d (max is %d) The controller found a
drive whose unit number is too large. Valid unit numbers are those in
the range [0..7].
uda%d: uballoc map failed UNIBUS resource map allocation failed during
initialization. This can only happen if you have 496 devices on a
UNIBUS.
uda%d: timeout during init The controller did not initialize within ten
seconds. A hardware problem, but it sometimes goes away if you try
again.
uda%d: init failed, sa=%b The controller refused to initialize.
uda%d: controller hung The controller never finished initialization.
Retrying may sometimes fix it.
uda%d: still hung When the controller hangs, the driver occasionally
tries to reinitialize it. This means it just tried, without success.
panic: udastart: bp==NULL A bug in the driver has put an empty drive
queue on a controller queue.
uda%d: command ring too small If you increase NCMDL2, you may see a
performance improvement. (See /sys/arch/vax/uba/uda.c.)
panic: udastart A drive was found marked for status or on-line functions
while performing status or on-line functions. This indicates a bug in
the driver.
uda%d: controller error, sa=0%o (%s) The controller reported an error.
The error code is printed in octal, along with a short description if the
code is known (see the UDA50 Maintenance Guide, DEC part number
AA-M185B-TC, pp. 18-22). If this occurs during normal operation, the
driver will reset it and retry pending I/O. If it occurs during
configuration, the controller may be ignored.
uda%d: stray intr The controller interrupted when it should have stayed
quiet. The interrupt has been ignored.
uda%d: init step %d failed, sa=%b The controller reported an error
during the named initialization step. The driver will retry
initialization later.
uda%d: version %d model %d An informational message giving the revision
level of the controller.
uda%d: DMA burst size set to %d An informational message showing the DMA
burst size, in words.
panic: udaintr Indicates a bug in the generic MSCP code.
uda%d: driver bug, state %d The driver has a bogus value for the
controller state. Something is quite wrong. This is immediately
followed by a `panic: udastate'.
uda%d: purge bdp %d A benign message tracing BDP purges. I have been
trying to figure out what BDP purges are for. You might want to comment
out this call to log() in /sys/arch/vax/uba/uda.c.
uda%d: SETCTLRC failed: `detail' The Set Controller Characteristics
command (the last part of the controller initialization sequence) failed.
The detail message tells why.
uda%d: attempt to bring ra%d on line failed: `detail' The drive could
not be brought on line. The detail message tells why.
uda%d: ra%d: unknown type %d The type index of the named drive is not
known to the driver, so the drive will be ignored.
uda%d: attempt to get status for ra%d failed: `detail' A status request
failed. The detail message should tell why.
panic: udareplace The controller reported completion of a REPLACE
operation. The driver never issues any REPLACEs, so something is wrong.
panic: udabb The controller reported completion of bad block related
I/O. The driver never issues any such, so something is wrong.
uda%d: lost interrupt The controller has gone out to lunch, and is being
reset to try to bring it back.
panic: mscp_go: AEB_MAX_BP too small You defined AVOID_EMULEX_BUG and
increased NCMDL2 and Emulex has new firmware. Raise AEB_MAX_BP or turn
off AVOID_EMULEX_BUG.
uda%d: unit %d: unknown message type 0x%x ignored The controller
responded with a mysterious message type. See /sys/vax/mscp.h for a list
of known message types. This is probably a controller hardware problem.
uda%d: unit %d out of range The disk drive unit number (the unit plug)
is higher than the maximum number the driver allows (currently 7).
uda%d: unit %d not configured, message ignored The named disk drive has
announced its presence to the controller, but was not, or cannot now be,
configured into the running system. Message is one of `available
attention' (an `I am here' message) or `stray response op 0x%x status
0x%x' (anything else).
Emulex SC41/MS screwup: uda%d, got %d correct, then changed 0x%x to
0x%x You turned on AVOID_EMULEX_BUG, and the driver successfully avoided
the bug. The number of correctly handled requests is reported, along
with the expected and actual values relating to the bug being avoided.
panic: unrecoverable Emulex screwup You turned on AVOID_EMULEX_BUG, but
Emulex was too clever and avoided the avoidance. Try turning on
MSCP_PARANOIA instead.
uda%d: bad response packet ignored You turned on MSCP_PARANOIA, and the
driver caught the controller in a lie. The lie has been ignored, and the
controller will soon be reset (after a `lost' interrupt). This is
followed by a hex dump of the offending packet.
uda%d: %s error datagram The controller has reported some kind of error,
either `hard' (unrecoverable) or `soft' (recoverable). If the controller
is going on (attempting to fix the problem), this message includes the
remark `(continuing)'. Emulex controllers wrongly claim that all soft
errors are hard errors. This message may be followed by one of the
following 5 messages, depending on its type, and will always be followed
by a failure detail message (also listed below).
memory addr 0x%x A host memory access error; this is the address
that could not be read.
unit %d: level %d retry %d, %s %d A typical disk error; the retry
count and error recovery levels are printed, along with the block
type (`lbn', or logical block; or `rbn', or replacement block) and
number. If the string is something else, DEC has been clever, or
your hardware has gone to Australia for vacation (unless you live
there; then it might be in New Zealand, or Brazil).
unit %d: %s %d Also a disk error, but an `SDI' error, whatever
that is. (I doubt it has anything to do with Ronald Reagan.) This
lists the block type (`lbn' or `rbn') and number. This is followed
by a second message indicating a microprocessor error code and a
front panel code. These latter codes are drive-specific, and are
intended to be used by field service as an aid in locating failing
hardware. The codes for RA81s can be found in the RA81 Maintenance
Guide, DEC order number AA-M879A-TC, in appendices E and F.
unit %d: small disk error, cyl %d Yet another kind of disk error,
but for small disks. (``That's what it says, guv'nor. Dunnask me
what it means.'')
unit %d: unknown error, format 0x%x A mysterious error: the given
format code is not known.
The detail messages are as follows:
success (%s) (code 0, subcode %d) Everything worked, but the
controller thought it would let you know that something went wrong.
No matter what subcode, this can probably be ignored.
invalid command (%s) (code 1, subcode %d) This probably cannot
occur unless the hardware is out; %s should be `invalid msg
length', meaning some command was too short or too long.
command aborted (unknown subcode) (code 2, subcode %d) This should
never occur, as the driver never aborts commands.
unit offline (%s) (code 3, subcode %d) The drive is offline,
either because it is not around (`unknown drive'), stopped (`not
mounted'), out of order (`inoperative'), has the same unit number
as some other drive (`duplicate'), or has been disabled for
diagnostics (`in diagnosis').
unit available (unknown subcode) (code 4, subcode %d) The
controller has decided to report a perfectly normal event as an
error. (Why?)
media format error (%s) (code 5, subcode %d) The drive cannot be
used without reformatting. The Format Control Table cannot be read
(`fct unread - edc'), there is a bad sector header (`invalid sector
header'), the drive is not set for 512-byte sectors (`not 512
sectors'), the drive is not formatted (`not formatted'), or the FCT
has an uncorrectable ECC error (`fct ecc').
write protected (%s) (code 6, subcode %d) The drive is write
protected, either by the front panel switch (`hardware') or via the
driver (`software'). The driver never sets software write protect.
compare error (unknown subcode) (code 7, subcode %d) A compare
operation showed some sort of difference. The driver never uses
compare operations.
data error (%s) (code 8, subcode %d) Something went wrong reading
or writing a data sector. A `forced error' is a software-asserted
error used to mark a sector that contains suspect data. Rewriting
the sector will clear the forced error. This is normally set only
during bad block replacement, and the driver does no bad block
replacement, so these should not occur. A `header compare' error
probably means the block is shot. A `sync timeout' presumably has
something to do with sector synchronisation. An `uncorrectable
ecc' error is an ordinary data error that cannot be fixed via ECC
logic. A `%d symbol ecc' error is a data error that can be (and
presumably has been) corrected by the ECC logic. It might indicate
a sector that is imperfect but usable, or that is starting to go
bad. If any of these errors recur, the sector may need to be
replaced.
host buffer access error (%s) (code %d, subcode %d) Something went
wrong while trying to copy data to or from the host (VAX). The
subcode is one of `odd xfer addr', `odd xfer count', `non-exist.
memory', or `memory parity'. The first two could be a software
glitch; the last two indicate hardware problems.
controller error (%s) (code %d, subcode %d) The controller has
detected a hardware error in itself. A `serdes overrun' is a
serialiser / deserialiser overrun; `edc' probably stands for `error
detection code'; and `inconsistent internal data struct' is
obvious.
drive error (%s) (code %d, subcode %d) Either the controller or
the drive has detected a hardware error in the drive. I am not
sure what an `sdi command timeout' is, but these seem to occur
benignly on occasion. A `ctlr detected protocol' error means that
the controller and drive do not agree on a protocol; this could be
a cabling problem, or a version mismatch. A `positioner' error
means the drive seek hardware is ailing; `lost rd/wr ready' means
the drive read/write logic is sick; and `drive clock dropout' means
that the drive clock logic is bad, or the media is hopelessly
scrambled. I have no idea what `lost recvr ready' means. A `drive
detected error' is a catch-all for drive hardware trouble; `ctlr
detected pulse or parity' errors are often caused by cabling
problems.
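When reading system logs, the message formats listed above can be picked
apart mechanically. The sketch below parses a `controller error' line,
whose sa value is printed in octal, and a failure detail message. It is an
illustration only: the regular expressions are derived from the format
strings shown on this page, the function names are hypothetical, and the
sample values used to exercise them are made up.

```python
import re

# "uda%d: controller error, sa=0%o (%s)": the sa value is octal.
_CTLR_ERR = re.compile(
    r"^uda(?P<ctlr>\d+): controller error, sa=(?P<sa>0[0-7]+)")

# "<name> (<subcode text>) (code %d, subcode %d)"
_DETAIL = re.compile(
    r"^(?P<name>.+?) \((?P<text>.*)\) "
    r"\(code (?P<code>\d+), subcode (?P<subcode>\d+)\)$")


def parse_controller_error(line):
    """Return (controller number, sa value) or None if no match."""
    m = _CTLR_ERR.match(line)
    if m is None:
        return None
    return int(m.group("ctlr")), int(m.group("sa"), 8)


def parse_detail(message):
    """Return the fields of a failure detail message, or None."""
    m = _DETAIL.match(message.strip())
    if m is None:
        return None
    return {
        "name": m.group("name"),
        "text": m.group("text"),
        "code": int(m.group("code")),
        "subcode": int(m.group("subcode")),
    }
```

For example, parse_detail("unit offline (not mounted) (code 3, subcode 1)")
yields the name `unit offline', the text `not mounted', code 3 and subcode
1; the subcode value in that sample line is invented for the example.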
SEE ALSO
intro(4), mscpbus(4), uba(4), disklabel(5), disklabel(8)
HISTORY
The uda driver appeared in 4.2BSD.
OpenBSD 4.9 February 18, 2010 OpenBSD 4.9