1 files changed, 1235 insertions, 0 deletions
diff --git a/contrib/perl5/pod/perlebcdic.pod b/contrib/perl5/pod/perlebcdic.pod
new file mode 100644
index 000000000000..12ea2f3ef4b1
--- /dev/null
+++ b/contrib/perl5/pod/perlebcdic.pod
@@ -0,0 +1,1235 @@
+=head1 NAME
+
+perlebcdic - Considerations for running Perl on EBCDIC platforms
+
+=head1 DESCRIPTION
+
+An exploration of some of the issues facing Perl programmers
+on EBCDIC based computers.  We do not cover localization, 
+internationalization, or multi byte character set issues (yet).
+
+Portions that are still incomplete are marked with XXX.
+
+=head1 COMMON CHARACTER CODE SETS
+
+=head2 ASCII
+
+The American Standard Code for Information Interchange is a set of
+integers running from 0 to 127 (decimal) that imply character 
+interpretation by the display and other system(s) of computers.  
+The range 0..127 can be covered by setting the bits in a 7-bit binary 
+number, hence the set is sometimes referred to as "7-bit ASCII".  
+ASCII was described by the American National Standards Institute 
+document ANSI X3.4-1986.  It was also described by ISO 646:1991 
+(with localization for currency symbols).  The full ASCII set is 
+given in the table below as the first 128 elements.  Languages that 
+can be written adequately with the characters in ASCII include 
+English, Hawaiian, Indonesian, Swahili and some Native American 
+languages.
+
+There are many character sets that extend the range of integers
+from 0..2**7-1 up to 2**8-1, or 8 bit bytes (octets if you prefer).
+One common one is the ISO 8859-1 character set.
+
+=head2 ISO 8859
+
+The ISO 8859-$n sets are a collection of character code sets from the 
+International Organization for Standardization (ISO), each of which 
+adds characters to the ASCII set that are typically found in European 
+languages, many of which are based on the Roman, or Latin, alphabet.
+
+=head2 Latin 1 (ISO 8859-1)
+
+A particular 8-bit extension to ASCII that includes grave and acute 
+accented Latin characters.  Languages that can employ ISO 8859-1 
+include all the languages covered by ASCII as well as Afrikaans, 
+Albanian, Basque, Catalan, Danish, Faroese, Finnish, Norwegian, 
+Portuguese, Spanish, and Swedish.  Dutch is covered albeit without 
+the ij ligature.  French is covered too but without the oe ligature. 
+German can use ISO 8859-1 but must do so without German-style
+quotation marks.  This set is based on Western European extensions 
+to ASCII and is commonly encountered in world wide web work.
+In IBM character code set identification terminology ISO 8859-1 is
+also known as CCSID 819 (or sometimes 0819 or even 00819).
+
+=head2 EBCDIC
+
+The Extended Binary Coded Decimal Interchange Code refers to a 
+large collection of slightly different single and multi byte 
+coded character sets that are different from ASCII or ISO 8859-1 
+and are typically used on host computers.  The EBCDIC encodings derive 
+from 8 bit byte extensions of Hollerith punched card encodings.
+The layout on the cards was such that high bits were set for the
+upper and lower case alphabet characters [a-z] and [A-Z], but there
+were gaps within each Latin alphabet range.
+
+Some IBM EBCDIC character sets may be known by character code set 
+identification numbers (CCSID numbers) or code page numbers.  Leading
+zero digits in CCSID numbers within this document are insignificant.
+E.g. CCSID 0037 may be referred to as 37 in places.
+
+=head2 13 variant characters
+
+Among IBM EBCDIC character code sets there are 13 characters that
+are often mapped to different integer values.  Those characters
+are known as the 13 "variant" characters and are:
+
+    \ [ ] { } ^ ~ ! # | $ @ ` 
+
+=head2 0037
+
+Character code set ID 0037 is a mapping of the ASCII plus Latin-1 
+characters (i.e. ISO 8859-1) to an EBCDIC set.  0037 is used 
+in North American English locales on the OS/400 operating system 
+that runs on AS/400 computers.  In the table below CCSID 37 differs 
+from ISO 8859-1 in 236 places; in other words they agree on only 20 
+code point values.
+
+=head2 1047
+
+Character code set ID 1047 is also a mapping of the ASCII plus 
+Latin-1 characters (i.e. ISO 8859-1) to an EBCDIC set.  1047 is 
+used under Unix System Services for OS/390, and OpenEdition for VM/ESA. 
+CCSID 1047 differs from CCSID 0037 in eight places.
+
+=head2 POSIX-BC
+
+The EBCDIC code page in use on Siemens' BS2000 system is distinct from
+1047 and 0037.  It is identified below as the POSIX-BC set.
+
+=head1 SINGLE OCTET TABLES
+
+The following tables list the ASCII and Latin 1 ordered sets including
+the subsets: C0 controls (00..1f), ASCII graphics (20..7e), delete (7f),
+C1 controls (80..9f), and Latin-1 (a.k.a. ISO 8859-1) (a0..ff), with the 
+ranges given in hexadecimal.  In the 
+table non-printing control character names as well as the Latin 1 
+extensions to ASCII have been labelled with character names roughly 
+corresponding to I<The Unicode Standard, Version 2.0> albeit with 
+substitutions such as s/LATIN// and s/VULGAR// in all cases, 
+s/CAPITAL LETTER// in some cases, and s/SMALL LETTER ([A-Z])/\l$1/ 
+in some other cases (the C<charnames> pragma names unfortunately do 
+not list explicit names for the C0 or C1 control characters).  The 
+"names" of the C1 control set (128..159 in ISO 8859-1) listed here are 
+somewhat arbitrary.  The differences between the 0037 and 1047 sets are 
+flagged with ***.  The differences between the 1047 and POSIX-BC sets 
+are flagged with ###.  All ord() numbers listed are decimal.  If you 
+would rather see this table listing octal values then run the table 
+(that is, the pod version of this document since this recipe may not 
+work with a pod2_other_format translation) through:
+
+=over 4
+
+=item recipe 0
+
+=back
+
+    perl -ne 'if(/(.{33})(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/)' \
+     -e '{printf("%s%-9o%-9o%-9o%-9o\n",$1,$2,$3,$4,$5)}' perlebcdic.pod
+
+If you would rather see this table listing hexadecimal values then
+run the table through:
+
+=over 4
+
+=item recipe 1
+
+=back
+
+    perl -ne 'if(/(.{33})(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/)' \
+     -e '{printf("%s%-9X%-9X%-9X%-9X\n",$1,$2,$3,$4,$5)}' perlebcdic.pod
+
+
+                                 8859-1
+    chr                          0819     0037     1047     POSIX-BC
+    ----------------------------------------------------------------
+    <NULL>                       0        0        0        0 
+    <START OF HEADING>           1        1        1        1
+    <START OF TEXT>              2        2        2        2
+    <END OF TEXT>                3        3        3        3
+    <END OF TRANSMISSION>        4        55       55       55
+    <ENQUIRY>                    5        45       45       45
+    <ACKNOWLEDGE>                6        46       46       46
+    <BELL>                       7        47       47       47
+    <BACKSPACE>                  8        22       22       22
+    <HORIZONTAL TABULATION>      9        5        5        5
+    <LINE FEED>                  10       37       21       21  ***
+    <VERTICAL TABULATION>        11       11       11       11
+    <FORM FEED>                  12       12       12       12
+    <CARRIAGE RETURN>            13       13       13       13
+    <SHIFT OUT>                  14       14       14       14
+    <SHIFT IN>                   15       15       15       15
+    <DATA LINK ESCAPE>           16       16       16       16
+    <DEVICE CONTROL ONE>         17       17       17       17
+    <DEVICE CONTROL TWO>         18       18       18       18
+    <DEVICE CONTROL THREE>       19       19       19       19
+    <DEVICE CONTROL FOUR>        20       60       60       60
+    <NEGATIVE ACKNOWLEDGE>       21       61       61       61
+    <SYNCHRONOUS IDLE>           22       50       50       50
+    <END OF TRANSMISSION BLOCK>  23       38       38       38
+    <CANCEL>                     24       24       24       24
+    <END OF MEDIUM>              25       25       25       25
+    <SUBSTITUTE>                 26       63       63       63
+    <ESCAPE>                     27       39       39       39
+    <FILE SEPARATOR>             28       28       28       28
+    <GROUP SEPARATOR>            29       29       29       29
+    <RECORD SEPARATOR>           30       30       30       30
+    <UNIT SEPARATOR>             31       31       31       31
+    <SPACE>                      32       64       64       64
+    !                            33       90       90       90
+    "                            34       127      127      127
+    #                            35       123      123      123
+    $                            36       91       91       91
+    %                            37       108      108      108
+    &                            38       80       80       80
+    '                            39       125      125      125
+    (                            40       77       77       77
+    )                            41       93       93       93
+    *                            42       92       92       92
+    +                            43       78       78       78
+    ,                            44       107      107      107
+    -                            45       96       96       96
+    .                            46       75       75       75
+    /                            47       97       97       97
+    0                            48       240      240      240
+    1                            49       241      241      241
+    2                            50       242      242      242
+    3                            51       243      243      243
+    4                            52       244      244      244
+    5                            53       245      245      245
+    6                            54       246      246      246
+    7                            55       247      247      247
+    8                            56       248      248      248
+    9                            57       249      249      249
+    :                            58       122      122      122
+    ;                            59       94       94       94
+    <                            60       76       76       76
+    =                            61       126      126      126
+    >                            62       110      110      110
+    ?                            63       111      111      111
+    @                            64       124      124      124
+    A                            65       193      193      193
+    B                            66       194      194      194
+    C                            67       195      195      195
+    D                            68       196      196      196
+    E                            69       197      197      197
+    F                            70       198      198      198
+    G                            71       199      199      199
+    H                            72       200      200      200
+    I                            73       201      201      201
+    J                            74       209      209      209
+    K                            75       210      210      210
+    L                            76       211      211      211
+    M                            77       212      212      212
+    N                            78       213      213      213
+    O                            79       214      214      214
+    P                            80       215      215      215
+    Q                            81       216      216      216
+    R                            82       217      217      217
+    S                            83       226      226      226
+    T                            84       227      227      227
+    U                            85       228      228      228
+    V                            86       229      229      229
+    W                            87       230      230      230
+    X                            88       231      231      231
+    Y                            89       232      232      232
+    Z                            90       233      233      233
+    [                            91       186      173      187 *** ###
+    \                            92       224      224      188 ### 
+    ]                            93       187      189      189 ***
+    ^                            94       176      95       106 *** ###
+    _                            95       109      109      109
+    `                            96       121      121      74  ###
+    a                            97       129      129      129
+    b                            98       130      130      130
+    c                            99       131      131      131
+    d                            100      132      132      132
+    e                            101      133      133      133
+    f                            102      134      134      134
+    g                            103      135      135      135
+    h                            104      136      136      136
+    i                            105      137      137      137
+    j                            106      145      145      145
+    k                            107      146      146      146
+    l                            108      147      147      147
+    m                            109      148      148      148
+    n                            110      149      149      149
+    o                            111      150      150      150
+    p                            112      151      151      151
+    q                            113      152      152      152
+    r                            114      153      153      153
+    s                            115      162      162      162
+    t                            116      163      163      163
+    u                            117      164      164      164
+    v                            118      165      165      165
+    w                            119      166      166      166
+    x                            120      167      167      167
+    y                            121      168      168      168
+    z                            122      169      169      169
+    {                            123      192      192      251 ###
+    |                            124      79       79       79
+    }                            125      208      208      253 ###
+    ~                            126      161      161      255 ###
+    <DELETE>                     127      7        7        7
+    <C1 0>                       128      32       32       32
+    <C1 1>                       129      33       33       33
+    <C1 2>                       130      34       34       34
+    <C1 3>                       131      35       35       35
+    <C1 4>                       132      36       36       36
+    <C1 5>                       133      21       37       37  ***
+    <C1 6>                       134      6        6        6
+    <C1 7>                       135      23       23       23
+    <C1 8>                       136      40       40       40
+    <C1 9>                       137      41       41       41
+    <C1 10>                      138      42       42       42
+    <C1 11>                      139      43       43       43
+    <C1 12>                      140      44       44       44
+    <C1 13>                      141      9        9        9
+    <C1 14>                      142      10       10       10
+    <C1 15>                      143      27       27       27
+    <C1 16>                      144      48       48       48
+    <C1 17>                      145      49       49       49
+    <C1 18>                      146      26       26       26
+    <C1 19>                      147      51       51       51
+    <C1 20>                      148      52       52       52
+    <C1 21>                      149      53       53       53
+    <C1 22>                      150      54       54       54
+    <C1 23>                      151      8        8        8
+    <C1 24>                      152      56       56       56
+    <C1 25>                      153      57       57       57
+    <C1 26>                      154      58       58       58
+    <C1 27>                      155      59       59       59
+    <C1 28>                      156      4        4        4
+    <C1 29>                      157      20       20       20
+    <C1 30>                      158      62       62       62
+    <C1 31>                      159      255      255      95  ###
+    <NON-BREAKING SPACE>         160      65       65       65
+    <INVERTED EXCLAMATION MARK>  161      170      170      170
+    <CENT SIGN>                  162      74       74       176 ###
+    <POUND SIGN>                 163      177      177      177
+    <CURRENCY SIGN>              164      159      159      159
+    <YEN SIGN>                   165      178      178      178
+    <BROKEN BAR>                 166      106      106      208 ###
+    <SECTION SIGN>               167      181      181      181
+    <DIAERESIS>                  168      189      187      121 *** ###
+    <COPYRIGHT SIGN>             169      180      180      180
+    <FEMININE ORDINAL INDICATOR> 170      154      154      154
+    <LEFT POINTING GUILLEMET>    171      138      138      138
+    <NOT SIGN>                   172      95       176      186 *** ###       
+    <SOFT HYPHEN>                173      202      202      202
+    <REGISTERED TRADE MARK SIGN> 174      175      175      175
+    <MACRON>                     175      188      188      161 ###
+    <DEGREE SIGN>                176      144      144      144
+    <PLUS-OR-MINUS SIGN>         177      143      143      143
+    <SUPERSCRIPT TWO>            178      234      234      234
+    <SUPERSCRIPT THREE>          179      250      250      250
+    <ACUTE ACCENT>               180      190      190      190
+    <MICRO SIGN>                 181      160      160      160
+    <PARAGRAPH SIGN>             182      182      182      182
+    <MIDDLE DOT>                 183      179      179      179
+    <CEDILLA>                    184      157      157      157
+    <SUPERSCRIPT ONE>            185      218      218      218
+    <MASC. ORDINAL INDICATOR>    186      155      155      155
+    <RIGHT POINTING GUILLEMET>   187      139      139      139
+    <FRACTION ONE QUARTER>       188      183      183      183
+    <FRACTION ONE HALF>          189      184      184      184
+    <FRACTION THREE QUARTERS>    190      185      185      185
+    <INVERTED QUESTION MARK>     191      171      171      171
+    <A WITH GRAVE>               192      100      100      100
+    <A WITH ACUTE>               193      101      101      101
+    <A WITH CIRCUMFLEX>          194      98       98       98
+    <A WITH TILDE>               195      102      102      102
+    <A WITH DIAERESIS>           196      99       99       99
+    <A WITH RING ABOVE>          197      103      103      103
+    <CAPITAL LIGATURE AE>        198      158      158      158
+    <C WITH CEDILLA>             199      104      104      104
+    <E WITH GRAVE>               200      116      116      116
+    <E WITH ACUTE>               201      113      113      113
+    <E WITH CIRCUMFLEX>          202      114      114      114
+    <E WITH DIAERESIS>           203      115      115      115
+    <I WITH GRAVE>               204      120      120      120
+    <I WITH ACUTE>               205      117      117      117
+    <I WITH CIRCUMFLEX>          206      118      118      118
+    <I WITH DIAERESIS>           207      119      119      119
+    <CAPITAL LETTER ETH>         208      172      172      172
+    <N WITH TILDE>               209      105      105      105
+    <O WITH GRAVE>               210      237      237      237
+    <O WITH ACUTE>               211      238      238      238
+    <O WITH CIRCUMFLEX>          212      235      235      235
+    <O WITH TILDE>               213      239      239      239
+    <O WITH DIAERESIS>           214      236      236      236
+    <MULTIPLICATION SIGN>        215      191      191      191
+    <O WITH STROKE>              216      128      128      128
+    <U WITH GRAVE>               217      253      253      224 ###
+    <U WITH ACUTE>               218      254      254      254
+    <U WITH CIRCUMFLEX>          219      251      251      221 ###
+    <U WITH DIAERESIS>           220      252      252      252
+    <Y WITH ACUTE>               221      173      186      173 *** ###
+    <CAPITAL LETTER THORN>       222      174      174      174
+    <SMALL LETTER SHARP S>       223      89       89       89
+    <a WITH GRAVE>               224      68       68       68
+    <a WITH ACUTE>               225      69       69       69
+    <a WITH CIRCUMFLEX>          226      66       66       66
+    <a WITH TILDE>               227      70       70       70
+    <a WITH DIAERESIS>           228      67       67       67
+    <a WITH RING ABOVE>          229      71       71       71
+    <SMALL LIGATURE ae>          230      156      156      156
+    <c WITH CEDILLA>             231      72       72       72
+    <e WITH GRAVE>               232      84       84       84
+    <e WITH ACUTE>               233      81       81       81
+    <e WITH CIRCUMFLEX>          234      82       82       82
+    <e WITH DIAERESIS>           235      83       83       83
+    <i WITH GRAVE>               236      88       88       88
+    <i WITH ACUTE>               237      85       85       85
+    <i WITH CIRCUMFLEX>          238      86       86       86
+    <i WITH DIAERESIS>           239      87       87       87
+    <SMALL LETTER eth>           240      140      140      140
+    <n WITH TILDE>               241      73       73       73
+    <o WITH GRAVE>               242      205      205      205
+    <o WITH ACUTE>               243      206      206      206
+    <o WITH CIRCUMFLEX>          244      203      203      203
+    <o WITH TILDE>               245      207      207      207
+    <o WITH DIAERESIS>           246      204      204      204
+    <DIVISION SIGN>              247      225      225      225
+    <o WITH STROKE>              248      112      112      112
+    <u WITH GRAVE>               249      221      221      192 ###
+    <u WITH ACUTE>               250      222      222      222
+    <u WITH CIRCUMFLEX>          251      219      219      219
+    <u WITH DIAERESIS>           252      220      220      220
+    <y WITH ACUTE>               253      141      141      141
+    <SMALL LETTER thorn>         254      142      142      142
+    <y WITH DIAERESIS>           255      223      223      223
+
+If you would rather see the above table in CCSID 0037 order rather than
+ASCII + Latin-1 order then run the table through:
+
+=over 4
+
+=item recipe 2
+
+=back
+
+    perl -ne 'if(/.{33}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}/)'\
+     -e '{push(@l,$_)}' \
+     -e 'END{print map{$_->[0]}' \
+     -e '          sort{$a->[1] <=> $b->[1]}' \
+     -e '          map{[$_,substr($_,42,3)]}@l;}' perlebcdic.pod
+
+If you would rather see it in CCSID 1047 order then change the number
+42 in the last line to 51, like this:
+
+=over 4
+
+=item recipe 3
+
+=back
+
+    perl -ne 'if(/.{33}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}/)'\
+     -e '{push(@l,$_)}' \
+     -e 'END{print map{$_->[0]}' \
+     -e '          sort{$a->[1] <=> $b->[1]}' \
+     -e '          map{[$_,substr($_,51,3)]}@l;}' perlebcdic.pod
+
+If you would rather see it in POSIX-BC order then change the number
+51 in the last line to 60, like this:
+
+=over 4
+
+=item recipe 4
+
+=back
+
+    perl -ne 'if(/.{33}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}/)'\
+     -e '{push(@l,$_)}' \
+     -e 'END{print map{$_->[0]}' \
+     -e '          sort{$a->[1] <=> $b->[1]}' \
+     -e '          map{[$_,substr($_,60,3)]}@l;}' perlebcdic.pod
+
+
+=head1 IDENTIFYING CHARACTER CODE SETS
+
+To determine the character set you are running under from perl one 
+could use the return value of ord() or chr() to test one or more 
+character values.  For example:
+
+    $is_ascii  = "A" eq chr(65);
+    $is_ebcdic = "A" eq chr(193);
+
+Also, "\t" is a C<HORIZONTAL TABULATION> character so that:
+
+    $is_ascii  = ord("\t") == 9;
+    $is_ebcdic = ord("\t") == 5;
+
+To distinguish EBCDIC code pages try looking at one or more of
+the characters that differ between them.  For example:
+
+    $is_ebcdic_37   = "\n" eq chr(37);
+    $is_ebcdic_1047 = "\n" eq chr(21);
+
+Or better still choose a character that is encoded differently in each
+of the code sets, e.g.:
+
+    $is_ascii           = ord('[') == 91;
+    $is_ebcdic_37       = ord('[') == 186;
+    $is_ebcdic_1047     = ord('[') == 173;
+    $is_ebcdic_POSIX_BC = ord('[') == 187;
+
+However, it would be unwise to write tests such as:
+
+    $is_ascii = "\r" ne chr(13);  #  WRONG
+    $is_ascii = "\n" ne chr(10);  #  ILL ADVISED
+
+Obviously the first of these will fail to distinguish most ASCII machines
+from either a CCSID 0037, a 1047, or a POSIX-BC EBCDIC machine since "\r" eq 
+chr(13) under all of those coded character sets.  But note too that 
+because "\n" is chr(13) and "\r" is chr(10) on the Macintosh (which is an 
+ASCII machine) the second C<$is_ascii> test will lead to trouble there.
+
+To determine whether or not perl was built under an EBCDIC 
+code page you can use the Config module like so:
+
+    use Config;
+    $is_ebcdic = $Config{'ebcdic'} eq 'define';
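+
+The C<ord('[')> tests can also be folded into a single lookup.  The
+following is only a sketch: C<guess_code_set> is an illustrative name,
+not part of perl, and it recognizes just the four code sets tabulated
+above.
+
+    # Hypothetical helper: map ord('[') to a code set name.
+    sub guess_code_set {
+        my %name_for = (
+            91  => 'ASCII or ISO 8859-1',
+            186 => 'EBCDIC 0037',
+            173 => 'EBCDIC 1047',
+            187 => 'EBCDIC POSIX-BC',
+        );
+        return $name_for{ ord('[') } || 'unknown';
+    }
+
+    print "Running under: ", guess_code_set(), "\n";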
+
+=head1 CONVERSIONS
+
+=head2 tr///
+
+In order to convert a string of characters from one character set to 
+another a simple list of numbers, such as in the right columns in the
+above table, along with perl's tr/// operator is all that is needed.  
+The data in the table are in ASCII order hence the EBCDIC columns 
+provide easy to use ASCII to EBCDIC operations that are also easily 
+reversed.
+
+For example, the $cp_037 string below lists, in code page 037 order, the 
+ISO 8859-1 code point (as a backslashed octal escape) that corresponds to 
+each 037 code point.  It can be built from the first numeric column of 
+the output of recipe 2 converted to octal.  Used as the replacement list 
+in tr/// it converts EBCDIC 037 to ASCII (more precisely ISO 8859-1):
+
+    $cp_037 = 
+    '\000\001\002\003\234\011\206\177\227\215\216\013\014\015\016\017' .
+    '\020\021\022\023\235\205\010\207\030\031\222\217\034\035\036\037' .
+    '\200\201\202\203\204\012\027\033\210\211\212\213\214\005\006\007' .
+    '\220\221\026\223\224\225\226\004\230\231\232\233\024\025\236\032' .
+    '\040\240\342\344\340\341\343\345\347\361\242\056\074\050\053\174' .
+    '\046\351\352\353\350\355\356\357\354\337\041\044\052\051\073\254' .
+    '\055\057\302\304\300\301\303\305\307\321\246\054\045\137\076\077' .
+    '\370\311\312\313\310\315\316\317\314\140\072\043\100\047\075\042' .
+    '\330\141\142\143\144\145\146\147\150\151\253\273\360\375\376\261' .
+    '\260\152\153\154\155\156\157\160\161\162\252\272\346\270\306\244' .
+    '\265\176\163\164\165\166\167\170\171\172\241\277\320\335\336\256' .
+    '\136\243\245\267\251\247\266\274\275\276\133\135\257\250\264\327' .
+    '\173\101\102\103\104\105\106\107\110\111\255\364\366\362\363\365' .
+    '\175\112\113\114\115\116\117\120\121\122\271\373\374\371\372\377' .
+    '\134\367\123\124\125\126\127\130\131\132\262\324\326\322\323\325' .
+    '\060\061\062\063\064\065\066\067\070\071\263\333\334\331\332\237' ;
+
+    my $ascii_string = $ebcdic_string;
+    eval '$ascii_string =~ tr/\000-\377/' . $cp_037 . '/';
+
+To convert from ASCII to EBCDIC 037 just reverse the order of the tr/// 
+arguments like so:
+
+    my $ebcdic_string = $ascii_string;
+    eval '$ebcdic_string =~ tr/' . $cp_037 . '/\000-\377/';
+
+Similarly one could take the first numeric column of the output of 
+recipe 3 to obtain a C<$cp_1047> table.  The first numeric column of the 
+output of recipe 4 could provide a C<$cp_posix_bc> table suitable for 
+transcoding as well.
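+
+The string eval of tr/// can be avoided by turning the table into
+arrays.  What follows is a minimal sketch, assuming C<$cp_037> is
+defined as above, that it is a one-to-one mapping, and that C<$string>
+holds the bytes to convert; C<@forward>, C<@reverse>, C<$out> and
+C<$back> are illustrative names only.
+
+    # $forward[$i] is the byte that tr/\000-\377/$cp_037/
+    # substitutes for chr($i).
+    my @forward = map { oct($_) } $cp_037 =~ /\\([0-7]{3})/g;
+    die "expected 256 table entries\n" unless @forward == 256;
+
+    # Invert the table for the opposite direction.
+    my @reverse;
+    $reverse[ $forward[$_] ] = $_ for 0 .. 255;
+
+    (my $out  = $string) =~ s/(.)/chr($forward[ord($1)])/ges;
+    (my $back = $out)    =~ s/(.)/chr($reverse[ord($1)])/ges;
+    # $back eq $string when the table is one-to-one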
+
+=head2 iconv
+
+XPG operability often implies the presence of an I<iconv> utility
+available from the shell or from the C library.  Consult your system's
+documentation for information on iconv.
+
+On OS/390 see the iconv(1) man page.  One way to invoke the iconv 
+shell utility from within perl would be to:
+
+    # OS/390 example
+    $ascii_data = `echo '$ebcdic_data'| iconv -f IBM-1047 -t ISO8859-1`;
+
+or the inverse map:
+
+    # OS/390 example
+    $ebcdic_data = `echo '$ascii_data'| iconv -f ISO8859-1 -t IBM-1047`;
+
+For other perl based conversion options see the Convert::* modules on CPAN.
+
+=head2 C RTL
+
+The OS/390 C run time library provides _atoe() and _etoa() functions.
+
+=head1 OPERATOR DIFFERENCES
+
+The C<..> range operator treats certain character ranges with 
+care on EBCDIC machines.  For example the following array
+will have twenty six elements on either an EBCDIC machine
+or an ASCII machine:
+
+    @alphabet = ('A'..'Z');   #  $#alphabet == 25
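+
+The code points behind that range are not contiguous on EBCDIC
+machines, which matters to any code that does arithmetic on ord()
+values.  A small sketch (C<@gaps> is an illustrative name), with the
+expected results read from the table above:
+
+    @alphabet = ('A'..'Z');
+    @gaps = grep { ord($alphabet[$_]) - ord($alphabet[$_ - 1]) != 1 }
+            1 .. $#alphabet;
+    # ASCII: @gaps is empty
+    # 0037, 1047, POSIX-BC: @gaps is (9, 18), the steps I->J and R->S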
+
+The bitwise operators such as & ^ | may return different results
+when operating on string or character data in a perl program running 
+on an EBCDIC machine than when run on an ASCII machine.  Here is
+an example adapted from the one in L<perlop>:
+
+    # EBCDIC-based examples
+    print "j p \n" ^ " a h";                      # prints "JAPH\n"
+    print "JA" | "  ph\n";                        # prints "japh\n" 
+    print "JAPH\nJunk" & "\277\277\277\277\277";  # prints "japh\n";
+    print 'p N$' ^ " E<H\n";                      # prints "Perl\n";
+
+An interesting property of the 32 C0 control characters
+in the ASCII table is that they can "literally" be constructed
+as control characters in perl, e.g. C<(chr(0) eq "\c@")>, 
+C<(chr(1) eq "\cA")>, and so on.  Perl on EBCDIC machines has been 
+ported to take "\c@" to chr(0) and "\cA" to chr(1) as well, but the
+thirty three characters that result depend on which code page you are
+using.  The table below uses the character names from the previous table 
+but with substitutions such as s/START OF/S.O./; s/END OF /E.O./; 
+s/TRANSMISSION/TRANS./; s/TABULATION/TAB./; s/VERTICAL/VERT./; 
+s/HORIZONTAL/HORIZ./; s/DEVICE CONTROL/D.C./; s/SEPARATOR/SEP./; 
+s/NEGATIVE ACKNOWLEDGE/NEG. ACK./;.  The POSIX-BC and 1047 sets are
+identical throughout this range and differ from the 0037 set at only 
+one spot (21 decimal).  Note that the C<LINE FEED> character
+may be generated by "\cJ" on ASCII machines but by "\cU" on 1047 or POSIX-BC 
+machines and cannot be generated as a C<"\c.letter."> control character on 
+0037 machines.  Note also that "\c\\" maps to two characters
+not one.
+
+    chr   ord  8859-1               0037                1047 && POSIX-BC     
+    ------------------------------------------------------------------------
+    "\c?" 127  <DELETE>             "                   "              ***><
+    "\c@"   0  <NULL>               <NULL>              <NULL>         ***><
+    "\cA"   1  <S.O. HEADING>       <S.O. HEADING>      <S.O. HEADING> 
+    "\cB"   2  <S.O. TEXT>          <S.O. TEXT>         <S.O. TEXT>
+    "\cC"   3  <E.O. TEXT>          <E.O. TEXT>         <E.O. TEXT>
+    "\cD"   4  <E.O. TRANS.>        <C1 28>             <C1 28> 
+    "\cE"   5  <ENQUIRY>            <HORIZ. TAB.>       <HORIZ. TAB.>    
+    "\cF"   6  <ACKNOWLEDGE>        <C1 6>              <C1 6>   
+    "\cG"   7  <BELL>               <DELETE>            <DELETE>   
+    "\cH"   8  <BACKSPACE>          <C1 23>             <C1 23>
+    "\cI"   9  <HORIZ. TAB.>        <C1 13>             <C1 13>
+    "\cJ"  10  <LINE FEED>          <C1 14>             <C1 14>
+    "\cK"  11  <VERT. TAB.>         <VERT. TAB.>        <VERT. TAB.>
+    "\cL"  12  <FORM FEED>          <FORM FEED>         <FORM FEED>    
+    "\cM"  13  <CARRIAGE RETURN>    <CARRIAGE RETURN>   <CARRIAGE RETURN> 
+    "\cN"  14  <SHIFT OUT>          <SHIFT OUT>         <SHIFT OUT>
+    "\cO"  15  <SHIFT IN>           <SHIFT IN>          <SHIFT IN>
+    "\cP"  16  <DATA LINK ESCAPE>   <DATA LINK ESCAPE>  <DATA LINK ESCAPE> 
+    "\cQ"  17  <D.C. ONE>           <D.C. ONE>          <D.C. ONE>
+    "\cR"  18  <D.C. TWO>           <D.C. TWO>          <D.C. TWO>
+    "\cS"  19  <D.C. THREE>         <D.C. THREE>        <D.C. THREE> 
+    "\cT"  20  <D.C. FOUR>          <C1 29>             <C1 29> 
+    "\cU"  21  <NEG. ACK.>          <C1 5>              <LINE FEED>    ***
+    "\cV"  22  <SYNCHRONOUS IDLE>   <BACKSPACE>         <BACKSPACE>
+    "\cW"  23  <E.O. TRANS. BLOCK>  <C1 7>              <C1 7>
+    "\cX"  24  <CANCEL>             <CANCEL>            <CANCEL>
+    "\cY"  25  <E.O. MEDIUM>        <E.O. MEDIUM>       <E.O. MEDIUM>
+    "\cZ"  26  <SUBSTITUTE>         <C1 18>             <C1 18>
+    "\c["  27  <ESCAPE>             <C1 15>             <C1 15>
+    "\c\\" 28  <FILE SEP.>\         <FILE SEP.>\        <FILE SEP.>\
+    "\c]"  29  <GROUP SEP.>         <GROUP SEP.>        <GROUP SEP.>
+    "\c^"  30  <RECORD SEP.>        <RECORD SEP.>       <RECORD SEP.>  ***><
+    "\c_"  31  <UNIT SEP.>          <UNIT SEP.>         <UNIT SEP.>    ***><
+
+
+=head1 FUNCTION DIFFERENCES
+
+=over 8
+
+=item chr()
+
+chr() must be given an EBCDIC code number argument to yield a desired 
+character return value on an EBCDIC machine.  For example:
+
+    $CAPITAL_LETTER_A = chr(193);
+
+=item ord()
+
+ord() will return EBCDIC code number values on an EBCDIC machine.
+For example:
+
+    $the_number_193 = ord("A");
+
+=item pack()
+
+The c and C templates for pack() are dependent upon character set 
+encoding.  Examples of usage on EBCDIC include:
+
+    $foo = pack("CCCC",193,194,195,196);
+    # $foo eq "ABCD"
+    $foo = pack("C4",193,194,195,196);
+    # same thing
+
+    $foo = pack("ccxxcc",193,194,195,196);
+    # $foo eq "AB\0\0CD"
+
+=item print()
+
+One must be careful with scalars and strings that are passed to
+print that contain ASCII encodings.  One common place
+for this to occur is in the output of the MIME type header for
+CGI script writing.  For example, many perl programming guides 
+recommend something similar to:
+
+    print "Content-type:\ttext/html\015\012\015\012"; 
+    # this may be wrong on EBCDIC
+
+Under the IBM OS/390 USS Web Server for example you should instead
+write that as:
+
+    print "Content-type:\ttext/html\r\n\r\n"; # OK for DGW et alia
+
+That is because the translation from EBCDIC to ASCII is done
+by the web server in this case (such code will not be appropriate for
+the Macintosh however).  Consult your web server's documentation for 
+further details.
+
+=item printf()
+
+The formats that can convert characters to numbers and vice versa
+will be different from their ASCII counterparts when executed
+on an EBCDIC machine.  Examples include:
+
+    printf("%c%c%c",193,194,195);  # prints ABC
+
+=item sort()
+
+EBCDIC sort results may differ from ASCII sort results especially for 
+mixed case strings.  This is discussed in more detail below.
+
+=item sprintf()
+
+See the discussion of printf() above.  An example of the use
+of sprintf would be:
+
+    $CAPITAL_LETTER_A = sprintf("%c",193);
+
+=item unpack()
+
+See the discussion of pack() above.
+
+=back
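+
+Because all of these functions work in terms of native code numbers,
+one way to keep a script portable is to derive the numbers from
+character literals rather than hard coding them.  The following is
+only a sketch of that idea (the variable names are illustrative):
+
+    # ord() of a literal yields the native code number:
+    # 65 on an ASCII machine, 193 on the EBCDIC code pages discussed here
+    my $code_for_A = ord("A");
+    my $A          = chr($code_for_A);                   # "A" everywhere
+    my $abcd = pack("C4", map { ord($_) } qw(A B C D));  # "ABCD" everywhere
+    my $abc  = sprintf("%c%c%c", map { ord($_) } qw(A B C));
+    print "$A $abcd $abc\n";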
+
+=head1 REGULAR EXPRESSION DIFFERENCES
+
+As of perl 5.005_03 letter range regular expressions such as
+[A-Z] and [a-z] have been specially coded to not pick up gap
+characters.  For example, characters such as E<ocirc> C<o WITH CIRCUMFLEX>
+that lie between I and J would not be matched by the
+regular expression range C</[H-K]/>.
+
+If you do want to match the alphabet gap characters in a single octet 
+regular expression try matching the hex or octal code such 
+as C</\313/> on EBCDIC or C</\364/> on ASCII machines to 
+have your regular expression match C<o WITH CIRCUMFLEX>.
+
+Another construct to be wary of is the inappropriate use of hex or
+octal constants in regular expressions.  Consider the following
+set of subs:
+
+    sub is_c0 {
+        my $char = substr(shift,0,1);
+        $char =~ /[\000-\037]/;
+    }
+
+    sub is_print_ascii {
+        my $char = substr(shift,0,1);
+        $char =~ /[\040-\176]/;
+    }
+
+    sub is_delete {
+        my $char = substr(shift,0,1);
+        $char eq "\177";
+    }
+
+    sub is_c1 {
+        my $char = substr(shift,0,1);
+        $char =~ /[\200-\237]/;
+    }
+
+    sub is_latin_1 {
+        my $char = substr(shift,0,1);
+        $char =~ /[\240-\377]/;
+    }
+
+The above would be adequate if the concern was only with numeric code points.
+However, the concern may be with characters rather than code points 
+and on an EBCDIC machine it may be desirable for constructs such as 
+C<if (is_print_ascii("A")) {print "A is a printable character\n";}> to print
+out the expected message.  One way to represent the above collection
+of character classification subs that is capable of working across the
+four coded character sets discussed in this document is as follows:
+
+    sub Is_c0 {
+        my $char = substr(shift,0,1);
+        if (ord('^')==94)  { # ascii
+            return $char =~ /[\000-\037]/;
+        } 
+        if (ord('^')==176) { # 37
+            return $char =~ /[\000-\003\067\055-\057\026\005\045\013-\023\074\075\062\046\030\031\077\047\034-\037]/;
+        }
+        if (ord('^')==95 || ord('^')==106) { # 1047 || posix-bc
+            return $char =~ /[\000-\003\067\055-\057\026\005\025\013-\023\074\075\062\046\030\031\077\047\034-\037]/;
+        }
+    }
+
+    sub Is_print_ascii {
+        my $char = substr(shift,0,1);
+        $char =~ /[ !"\#\$%&'()*+,\-.\/0-9:;<=>?\@A-Z[\\\]^_`a-z{|}~]/;
+    }
+
+    sub Is_delete {
+        my $char = substr(shift,0,1);
+        if (ord('^')==94)  { # ascii
+            return $char eq "\177";
+        }
+        else  {              # ebcdic
+            return $char eq "\007";
+        }
+    }
+
+    sub Is_c1 {
+        my $char = substr(shift,0,1);
+        if (ord('^')==94)  { # ascii
+            return $char =~ /[\200-\237]/;
+        }
+        if (ord('^')==176) { # 37
+            return $char =~ /[\040-\044\025\006\027\050-\054\011\012\033\060\061\032\063-\066\010\070-\073\040\024\076\377]/;
+        }
+        if (ord('^')==95)  { # 1047
+            return $char =~ /[\040-\045\006\027\050-\054\011\012\033\060\061\032\063-\066\010\070-\073\040\024\076\377]/;
+        }
+        if (ord('^')==106) { # posix-bc
+            return $char =~ 
+              /[\040-\045\006\027\050-\054\011\012\033\060\061\032\063-\066\010\070-\073\040\024\076\137]/;
+        }
+    }
+
+    sub Is_latin_1 {
+        my $char = substr(shift,0,1);
+        if (ord('^')==94)  { # ascii
+            return $char =~ /[\240-\377]/;
+        }
+        if (ord('^')==176) { # 37
+            return $char =~ 
+              /[\101\252\112\261\237\262\152\265\275\264\232\212\137\312\257\274\220\217\352\372\276\240\266\263\235\332\233\213\267\270\271\253\144\145\142\146\143\147\236\150\164\161-\163\170\165-\167\254\151\355\356\353\357\354\277\200\375\376\373\374\255\256\131\104\105\102\106\103\107\234\110\124\121-\123\130\125-\127\214\111\315\316\313\317\314\341\160\335\336\333\334\215\216\337]/;
+        }
+        if (ord('^')==95)  { # 1047
+            return $char =~
+              /[\101\252\112\261\237\262\152\265\273\264\232\212\260\312\257\274\220\217\352\372\276\240\266\263\235\332\233\213\267\270\271\253\144\145\142\146\143\147\236\150\164\161-\163\170\165-\167\254\151\355\356\353\357\354\277\200\375\376\373\374\272\256\131\104\105\102\106\103\107\234\110\124\121-\123\130\125-\127\214\111\315\316\313\317\314\341\160\335\336\333\334\215\216\337]/; 
+        }
+        if (ord('^')==106) { # posix-bc
+            return $char =~ 
+              /[\101\252\260\261\237\262\320\265\171\264\232\212\272\312\257\241\220\217\352\372\276\240\266\263\235\332\233\213\267\270\271\253\144\145\142\146\143\147\236\150\164\161-\163\170\165-\167\254\151\355\356\353\357\354\277\200\340\376\335\374\255\256\131\104\105\102\106\103\107\234\110\124\121-\123\130\125-\127\214\111\315\316\313\317\314\341\160\300\336\333\334\215\216\337]/;
+        }
+    }
+
+Note however that only the C<Is_print_ascii()> sub is really independent 
+of coded character set.  Another way to write C<Is_latin_1()> would be 
+to use the characters in the range explicitly:
+
+    sub Is_latin_1 {
+        my $char = substr(shift,0,1);
+        $char =~ /[ ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ]/;
+    }
+
+That form, however, may run into trouble in network transit (due to the 
+presence of 8 bit characters) or on non ISO-Latin character sets.
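+
+The repeated C<ord('^')> tests in the subs above could be factored into
+a single helper.  This is only a sketch: the sub name and the returned
+labels are invented here, and the code numbers are simply the ones
+already used in those subs:
+
+    sub native_code_page {
+        my $caret = ord('^');
+        return 'ascii'    if $caret == 94;
+        return '0037'     if $caret == 176;
+        return '1047'     if $caret == 95;
+        return 'posix-bc' if $caret == 106;
+        return 'unknown';    # some other coded character set
+    }
+
+    print "Running under: ", native_code_page(), "\n";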
+
+=head1 SOCKETS
+
+Most socket programming assumes ASCII character encodings in network
+byte order.  Exceptions can include CGI script writing under a
+host web server where the server may take care of translation for you.
+Most host web servers convert EBCDIC data to ISO-8859-1 or Unicode on
+output.
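+
+When no such translation is done for you, text can be converted to
+ASCII code points before it is written to a socket.  What follows is a
+minimal sketch, assuming an @e2a array that maps native code points to
+ASCII ones (the identity map below is only correct on an ASCII machine)
+and using an invented sub name:
+
+    my @e2a = (0 .. 255);    # replace with a real map on EBCDIC
+
+    sub to_ascii_octets {
+        my $text = shift;
+        # translate one character at a time, then repack as octets
+        return pack("C*", map { $e2a[ord($_)] } split(//, $text));
+    }
+
+    my $greeting = to_ascii_octets("HELLO\r\n");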
+
+=head1 SORTING
+
+One big difference between ASCII based character sets and EBCDIC ones
+is the relative positions of upper and lower case letters and of the
+letters compared to the digits.  If sorted on an ASCII based machine the
+two letter abbreviation for a physician comes before the two letter
+abbreviation for drive, that is:
+
+    @sorted = sort(qw(Dr. dr.));  # @sorted holds ('Dr.','dr.') on ASCII,
+                                  # but ('dr.','Dr.') on EBCDIC
+
+The property of lower case before uppercase letters in EBCDIC is
+even carried to the Latin 1 EBCDIC pages such as 0037 and 1047.
+An example would be that E<Euml> C<E WITH DIAERESIS> (203) comes 
+before E<euml> C<e WITH DIAERESIS> (235) on an ASCII machine, but 
+the latter (83) comes before the former (115) on an EBCDIC machine.  
+(Astute readers will note that the upper case version of E<szlig> 
+C<SMALL LETTER SHARP S> is simply "SS" and that the upper case version of 
+E<yuml> C<y WITH DIAERESIS> is not in the 0..255 range but it is 
+at U+0178 in Unicode, or C<"\x{178}"> in a Unicode enabled Perl).
+
+The sort order will cause differences between results obtained on
+ASCII machines versus EBCDIC machines.  What follows are some suggestions
+on how to deal with these differences.
+
+=head2 Ignore ASCII vs. EBCDIC sort differences.
+
+This is the least computationally expensive strategy.  It may require
+some user education.
+
+=head2 MONO CASE then sort data.
+
+In order to minimize the expense of mono casing mixed text try to
+C<tr///> towards the character set case most employed within the data.
+If the data are primarily UPPERCASE non Latin 1 then apply tr/a-z/A-Z/
+then sort().  If the data are primarily lowercase non Latin 1 then
+apply tr/A-Z/a-z/ before sorting.  If the data are primarily UPPERCASE
+and include Latin-1 characters then apply:
+
+    tr/a-z/A-Z/;
+    tr/àáâãäåæçèéêëìíîïðñòóôõöøùúûüýþ/ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞ/;
+    s/ß/SS/g;
+
+then sort().  Do note however that such Latin-1 manipulation does not 
+address the E<yuml> C<y WITH DIAERESIS> character that will remain at 
+code point 255 on ASCII machines, but 223 on most EBCDIC machines 
+where it will sort to a place less than the EBCDIC numerals.  With a 
+Unicode enabled Perl you might try:
+
+    tr/ÿ/\x{178}/;
+
+The strategy of mono casing data before sorting does not preserve the case 
+of the data and may not be acceptable for that reason.
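+
+If the original case must be kept, a variation is to fold case only
+inside the comparison and leave the data itself untouched.  This is a
+minimal sketch, assuming the strings contain only the letters a-z and
+A-Z (digits and punctuation still collate differently on ASCII and
+EBCDIC machines, and strings that differ only in case compare as equal):
+
+    my @words  = qw(banana Apple cherry);
+    my @sorted = sort { lc($a) cmp lc($b) } @words;
+    print "@sorted\n";    # Apple banana cherry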
+
+=head2 Convert, sort data, then re convert.
+
+This is the most expensive proposition that does not employ a network
+connection.
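+
+One way to sketch this strategy is to build a sort key holding the
+ASCII code points of each string, sort on the keys, and return the
+original strings, so that no separate re conversion pass is needed.
+The sub name below is invented for illustration, and the @e2a array
+is assumed to map native code points to ASCII ones (the identity map
+shown here is only correct on an ASCII machine):
+
+    my @e2a = (0 .. 255);    # replace with a real map on EBCDIC
+
+    sub sort_in_ascii_order {
+        return map  { $_->[1] }                  # recover the strings
+               sort { $a->[0] cmp $b->[0] }      # compare the ASCII keys
+               map  { [ pack("C*", map { $e2a[ord($_)] } split(//, $_)), $_ ] }
+               @_;
+    }
+
+    my @ascii_sorted = sort_in_ascii_order(qw(dr. Dr. 9 a));
+    print "@ascii_sorted\n";    # ASCII order: 9 Dr. a dr.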
+
+=head2 Perform sorting on one type of machine only.
+
+This strategy can employ a network connection.  As such
+it would be computationally expensive.
+
+=head1 TRANSFORMATION FORMATS
+
+There are a variety of ways of transforming data with an intra character set 
+mapping that serve a variety of purposes.  Sorting was discussed in the 
+previous section and a few of the other more popular mapping techniques are 
+discussed next.
+
+=head2 URL decoding and encoding
+
+Note that some URLs have hexadecimal ASCII code points in them in an
+attempt to overcome character or protocol limitation issues.  For example 
+the tilde character is not on every keyboard hence a URL of the form:
+
+    http://www.pvhp.com/~pvhp/
+
+may also be expressed as either of:
+
+    http://www.pvhp.com/%7Epvhp/
+
+    http://www.pvhp.com/%7epvhp/
+
+where 7E is the hexadecimal ASCII code point for '~'.  Here is an example
+of decoding such a URL under CCSID 1047:
+
+    $url = 'http://www.pvhp.com/%7Epvhp/';
+    # this array assumes code page 1047
+    my @a2e_1047 = (
+          0,  1,  2,  3, 55, 45, 46, 47, 22,  5, 21, 11, 12, 13, 14, 15,
+         16, 17, 18, 19, 60, 61, 50, 38, 24, 25, 63, 39, 28, 29, 30, 31,
+         64, 90,127,123, 91,108, 80,125, 77, 93, 92, 78,107, 96, 75, 97,
+        240,241,242,243,244,245,246,247,248,249,122, 94, 76,126,110,111,
+        124,193,194,195,196,197,198,199,200,201,209,210,211,212,213,214,
+        215,216,217,226,227,228,229,230,231,232,233,173,224,189, 95,109,
+        121,129,130,131,132,133,134,135,136,137,145,146,147,148,149,150,
+        151,152,153,162,163,164,165,166,167,168,169,192, 79,208,161,  7,
+         32, 33, 34, 35, 36, 37,  6, 23, 40, 41, 42, 43, 44,  9, 10, 27,
+         48, 49, 26, 51, 52, 53, 54,  8, 56, 57, 58, 59,  4, 20, 62,255,
+         65,170, 74,177,159,178,106,181,187,180,154,138,176,202,175,188,
+        144,143,234,250,190,160,182,179,157,218,155,139,183,184,185,171,
+        100,101, 98,102, 99,103,158,104,116,113,114,115,120,117,118,119,
+        172,105,237,238,235,239,236,191,128,253,254,251,252,186,174, 89,
+         68, 69, 66, 70, 67, 71,156, 72, 84, 81, 82, 83, 88, 85, 86, 87,
+        140, 73,205,206,203,207,204,225,112,221,222,219,220,141,142,223
+    );
+    # use the unsigned "C" template: values above 127 do not fit a signed "c"
+    $url =~ s/%([0-9a-fA-F]{2})/pack("C",$a2e_1047[hex($1)])/ge;
+
+Conversely, here is a partial solution for the task of encoding such 
+a URL under the 1047 code page:
+
+    $url = 'http://www.pvhp.com/~pvhp/';
+    # this array assumes code page 1047
+    my @e2a_1047 = (
+          0,  1,  2,  3,156,  9,134,127,151,141,142, 11, 12, 13, 14, 15,
+         16, 17, 18, 19,157, 10,  8,135, 24, 25,146,143, 28, 29, 30, 31,
+        128,129,130,131,132,133, 23, 27,136,137,138,139,140,  5,  6,  7,
+        144,145, 22,147,148,149,150,  4,152,153,154,155, 20, 21,158, 26,
+         32,160,226,228,224,225,227,229,231,241,162, 46, 60, 40, 43,124,
+         38,233,234,235,232,237,238,239,236,223, 33, 36, 42, 41, 59, 94,
+         45, 47,194,196,192,193,195,197,199,209,166, 44, 37, 95, 62, 63,
+        248,201,202,203,200,205,206,207,204, 96, 58, 35, 64, 39, 61, 34,
+        216, 97, 98, 99,100,101,102,103,104,105,171,187,240,253,254,177,
+        176,106,107,108,109,110,111,112,113,114,170,186,230,184,198,164,
+        181,126,115,116,117,118,119,120,121,122,161,191,208, 91,222,174,
+        172,163,165,183,169,167,182,188,189,190,221,168,175, 93,180,215,
+        123, 65, 66, 67, 68, 69, 70, 71, 72, 73,173,244,246,242,243,245,
+        125, 74, 75, 76, 77, 78, 79, 80, 81, 82,185,251,252,249,250,255,
+         92,247, 83, 84, 85, 86, 87, 88, 89, 90,178,212,214,210,211,213,
+         48, 49, 50, 51, 52, 53, 54, 55, 56, 57,179,219,220,217,218,159
+    );
+    # The following regular expression does not address the 
+    # mappings for: ('.' => '%2E', '/' => '%2F', ':' => '%3A') 
+    $url =~ s/([\t "#%&\(\),;<=>\?\@\[\\\]^`{|}~])/sprintf("%%%02X",$e2a_1047[ord($1)])/ge;
+
+where a more complete solution would split the URL into components 
+and apply a full s/// substitution only to the appropriate parts.
+
+In the remaining examples a @e2a or @a2e array may be employed
+but the assignment will not be shown explicitly.  For code page 1047
+you could use the @a2e_1047 or @e2a_1047 arrays just shown.
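+
+Since each table is a permutation of 0..255, either one can be derived
+from the other with an array slice assignment instead of being typed in
+by hand.  Here is a minimal sketch, shown with a made up four element
+table so that the result is easy to check:
+
+    my @a2e_toy = (2, 0, 3, 1);               # hypothetical forward map
+    my @e2a_toy;
+    @e2a_toy[@a2e_toy] = (0 .. $#a2e_toy);    # invert the permutation
+    print "@e2a_toy\n";                       # 1 3 0 2
+
+The same statement applied to the full tables would read
+C<@e2a_1047[@a2e_1047] = (0 .. 255);>.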
+
+=head2 uu encoding and decoding
+
+The C<u> template to pack() or unpack() will render EBCDIC data in EBCDIC 
+characters equivalent to their ASCII counterparts.  For example, the 
+following will print "Yes indeed\n" on either an ASCII or EBCDIC computer:
+
+    $all_byte_chrs = '';
+    for (0..255) { $all_byte_chrs .= chr($_); }
+    $uuencode_byte_chrs = pack('u', $all_byte_chrs);
+    ($uu = <<'    ENDOFHEREDOC') =~ s/^\s*//gm;
+    M``$"`P0%!@<("0H+#`T.#Q`1$A,4%187&!D:&QP='A\@(2(C)"4F)R@I*BLL
+    M+2XO,#$R,S0U-C<X.3H[/#T^/T!!0D-$149'2$E*2TQ-3D]045)35%565UA9
+    M6EM<75Y?8&%B8V1E9F=H:6IK;&UN;W!Q<G-T=79W>'EZ>WQ]?G^`@8*#A(6&
+    MAXB)BHN,C8Z/D)&2DY25EI>8F9J;G)V>GZ"AHJ.DI::GJ*FJJZRMKJ^PL;*S
+    MM+6VM[BYNKN\O;Z_P,'"P\3%QL?(R<K+S,W.S]#1TM/4U=;7V-G:V]S=WM_@
+    ?X>+CY.7FY^CIZNOL[>[O\/'R\_3U]O?X^?K[_/W^_P``
+    ENDOFHEREDOC
+    if ($uuencode_byte_chrs eq $uu) {
+        print "Yes ";
+    }
+    $uudecode_byte_chrs = unpack('u', $uuencode_byte_chrs);
+    if ($uudecode_byte_chrs eq $all_byte_chrs) {
+        print "indeed\n";
+    }
+
+Here is a very spartan uudecoder that will work on EBCDIC provided
+that the @e2a array is filled in appropriately:
+
+    #!/usr/local/bin/perl
+    @e2a = ( # this must be filled in
+           );
+    $_ = <> until ($mode,$file) = /^begin\s*(\d*)\s*(\S*)/;
+    open(OUT, "> $file") if $file ne "";
+    while(<>) {
+        last if /^end/;
+        next if /[a-z]/;
+        next unless int(((($e2a[ord()] - 32 ) & 077) + 2) / 3) ==
+            int(length() / 4);
+        print OUT unpack("u", $_);
+    }
+    close(OUT);
+    chmod oct($mode), $file;
+
+
+=head2 Quoted-Printable encoding and decoding
+
+On ASCII encoded machines it is possible to encode characters outside of
+the printable set using:
+
+    # This QP encoder works on ASCII only
+    $qp_string =~ s/([=\x00-\x1F\x80-\xFF])/sprintf("=%02X",ord($1))/ge;
+
+Whereas a QP encoder that works on both ASCII and EBCDIC machines 
+would look somewhat like the following (where the EBCDIC branch @e2a 
+array is omitted for brevity):
+
+    if (ord('A') == 65) {    # ASCII
+        $delete = "\x7F";    # ASCII
+        @e2a = (0 .. 255);   # ASCII to ASCII identity map
+    }
+    else {                   # EBCDIC
+        $delete = "\x07";    # EBCDIC
+        @e2a = (             # EBCDIC to ASCII map (as shown above)
+               );            # this must be filled in
+    }
+    $qp_string =~
+      s/([^ !"\#\$%&'()*+,\-.\/0-9:;<>?\@A-Z[\\\]^_`a-z{|}~$delete])/sprintf("=%02X",$e2a[ord($1)])/ge;
+
+(although in production code the substitutions might be done
+in the EBCDIC branch with the @e2a array and separately in the 
+ASCII branch without the expense of the identity map).
+
+Such QP strings can be decoded with:
+
+    # This QP decoder is limited to ASCII only
+    $string =~ s/=([0-9A-Fa-f][0-9A-Fa-f])/chr hex $1/ge;
+    $string =~ s/=[\n\r]+$//;
+
+Whereas a QP decoder that works on both ASCII and EBCDIC machines 
+would look somewhat like the following (where the @a2e array is
+omitted for brevity):
+
+    $string =~ s/=([0-9A-Fa-f][0-9A-Fa-f])/chr $a2e[hex $1]/ge;
+    $string =~ s/=[\n\r]+$//;
+
+=head2 Caesarian cyphers
+
+The practice of shifting an alphabet one or more characters for encipherment
+dates back thousands of years and was explicitly detailed by Gaius Julius
+Caesar in his B<Gallic Wars> text.  A single alphabet shift is sometimes 
+referred to as a rotation and the shift amount is given as a number $n after
+the string 'rot' or "rot$n".  Rot0 and rot26 would designate identity maps 
+on the 26 letter English version of the Latin alphabet.  Rot13 has the 
+interesting property that alternate subsequent invocations are identity maps 
+(thus rot13 is its own non-trivial inverse in the group of 26 alphabet 
+rotations).  Hence the following is a rot13 encoder and decoder that will 
+work on ASCII and EBCDIC machines:
+
+    #!/usr/local/bin/perl
+
+    while(<>){
+        tr/n-za-mN-ZA-M/a-zA-Z/;
+        print;
+    }
+
+In one-liner form:
+
+    perl -ne 'tr/n-za-mN-ZA-M/a-zA-Z/;print'
+
+
+=head1 HASHING ORDER AND CHECKSUMS
+
+XXX
+
+=head1 I18N AND L10N
+
+Internationalization(I18N) and localization(L10N) are supported at least 
+in principle even on EBCDIC machines.  The details are system dependent 
+and discussed under the L<perlebcdic/OS ISSUES> section below.
+
+=head1 MULTI OCTET CHARACTER SETS
+
+Multi byte EBCDIC code pages; Unicode, UTF-8, UTF-EBCDIC, XXX.
+
+=head1 OS ISSUES
+
+There may be a few system dependent issues 
+of concern to EBCDIC Perl programmers.
+
+=head2 OS/400 
+
+The PASE environment.
+
+=over 8
+
+=item IFS access
+
+XXX.
+
+=back
+
+=head2 OS/390 
+
+Perl runs under UNIX System Services or USS.
+
+=over 8
+
+=item chcp
+
+B<chcp> is supported as a shell utility for displaying and changing 
+one's code page.  See also L<chcp>.
+
+=item dataset access
+
+For sequential data set access try:
+
+    my @ds_records = `cat //DSNAME`;
+
+or:
+
+    my @ds_records = `cat //'HLQ.DSNAME'`;
+
+See also the OS390::Stdio module on CPAN.
+
+=item OS/390 iconv
+
+B<iconv> is supported as both a shell utility and a C RTL routine.
+See also the iconv(1) and iconv(3) manual pages.
+
+=item locales
+
+On OS/390 see L<locale> for information on locales.  The L10N files
+are in F</usr/nls/locale>.  $Config{d_setlocale} is 'define' on OS/390.
+
+=back
+
+=head2 VM/ESA?
+
+XXX.
+
+=head2 POSIX-BC?
+
+XXX.
+
+=head1 BUGS
+
+This pod document contains literal Latin 1 characters and may encounter 
+translation difficulties.  In particular one popular nroff implementation 
+was known to strip accented characters to their unaccented counterparts 
+while attempting to view this document through the B<pod2man> program 
+(for example, you may see a plain C<y> rather than one with a diaeresis 
+as in E<yuml>).  Another nroff truncated the resultant man page at
+the first occurrence of 8 bit characters.
+
+Not all shells will allow multiple C<-e> string arguments to perl to
+be concatenated together properly as recipes 2, 3, and 4 might seem
+to imply.
+
+Perl does not yet work with any Unicode features on EBCDIC platforms.
+
+=head1 SEE ALSO
+
+L<perllocale>, L<perlfunc>.
+
+=head1 REFERENCES
+
+http://anubis.dkuug.dk/i18n/charmaps
+
+http://www.unicode.org/
+
+http://www.unicode.org/unicode/reports/tr16/
+
+http://www.wps.com/texts/codes/
+B<ASCII: American Standard Code for Information Infiltration> Tom Jennings,
+September 1999.
+
+B<The Unicode Standard Version 2.0> The Unicode Consortium, 
+ISBN 0-201-48345-9, Addison Wesley Developers Press, July 1996. 
+
+B<The Unicode Standard Version 3.0> The Unicode Consortium, Lisa Moore ed., 
+ISBN 0-201-61633-5, Addison Wesley Developers Press, February 2000. 
+
+B<CDRA: IBM - Character Data Representation Architecture - 
+Reference and Registry>, IBM SC09-2190-00, December 1996. 
+
+"Demystifying Character Sets", Andrea Vine, Multilingual Computing 
+& Technology, B<#26 Vol. 10 Issue 4>, August/September 1999;
+ISSN 1523-0309; Multilingual Computing Inc. Sandpoint ID, USA.
+
+B<Codes, Ciphers, and Other Cryptic and Clandestine Communication>
+Fred B. Wrixon, ISBN 1-57912-040-7, Black Dog & Leventhal Publishers,
+1998.
+
+=head1 AUTHOR
+
+Peter Prymmer pvhp@best.com wrote this in 1999 and 2000 
+with CCSID 0819 and 0037 help from Chris Leach and 
+AndrE<eacute> Pirard A.Pirard@ulg.ac.be as well as POSIX-BC 
+help from Thomas Dorner Thomas.Dorner@start.de.
+Thanks also to Vickie Cooper, Philip Newton, William Raffloer, and 
+Joe Smith.  Trademarks, registered trademarks, service marks and 
+registered service marks used in this document are the property of 
+their respective owners.
+
+