IBM Education Assistance for z/OS V2R1 Item: ASCII Unicode Option
Element/Component: UNIX Shells and Utilities (S&U)
Material is current as of June 2013
© 2013 IBM Corporation
Filename: zOS V2R1 USS S&U ASCII Unicode Option

Agenda
■ Trademarks
■ Presentation Objectives
■ Overview
■ Usage & Invocation
■ Migration & Coexistence Considerations
■ Presentation Summary
■ Appendix

Trademarks
■ See http://www.ibm.com/legal/copytrade.shtml for a list of trademarks.

Presentation Objectives
■ Introduce the features and benefits of the new z/OS UNIX Shells and Utilities (S&U) support for working with ASCII/Unicode files.

Overview
■ Problem Statement
  – As a z/OS UNIX Shells & Utilities user, I want the ability to control the text conversion of input files used by the S&U commands.
  – As a z/OS UNIX Shells & Utilities user, I want the ability to run tagged shell scripts (tcsh scripts and SBCS sh scripts) under different SBCS locales.
■ Solution
  – Add the -W filecodeset=codeset,pgmcodeset=codeset option to several S&U commands to enable text conversion, consistent with support added to vi and ex in V1R13.
  – Add the -B option to several S&U commands to disable automatic text conversion, consistent with other commands that already have this override support.
  – Add a new _TEXT_CONV environment variable to enable or disable text conversion.

Overview
■ Solution (continued)
  – With automatic conversion enabled, tagged shell scripts (tcsh scripts and SBCS sh scripts) can be run under different SBCS locales.
  Note: Tagged non-SBCS sh scripts (e.g., DBCS, MBCS) cannot be run.
■ Benefits
  – More detailed control of text conversion
    • No file tagging required
    • No environment or system setup required
  – Easily override the system’s automatic text conversion
  – Easily enable or disable text conversion for all S&U commands that provide control of text conversion
  – Easily run tagged shell scripts (tcsh scripts and SBCS sh scripts) under SBCS locales

Usage & Invocation
■ The -W filecodeset=codeset,pgmcodeset=codeset option was added to the following commands:
    cat      ed       head     unexpand
    cmp      egrep    more     uniq
    comm     expand   paste    wc
    cut      fgrep    sed
    diff     file     strings
    dircmp   grep     tail
  ● Consistent with support added to vi and ex in V1R13.
  ● Option keywords are case sensitive.
  ● The only supported values for pgmcodeset are IBM-1047 and 1047.

Usage & Invocation
■ -W filecodeset=codeset,pgmcodeset=codeset option details
  – Performs text conversion from one code set to another when reading from or writing to the file. filecodeset=codeset names the coded character set of the file; pgmcodeset=codeset names the coded character set of the program (command).
  – The filecodeset and pgmcodeset options can be used on files with any file tag.
  – If pgmcodeset is specified but filecodeset is omitted, the default file code set is ISO8859-1, even if the file is tagged with a different code set. The default program code set is IBM-1047.
  – When standard input (stdin) is used as an input text file and stdin is not associated with a terminal, the -W filecodeset and pgmcodeset options are applied to stdin.
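
The -W argument described above can be assembled by a small helper. This is only a sketch under stated assumptions: text_conv_arg is a hypothetical function name (not part of z/OS), and the file names and search pattern in the comments are placeholders. It defaults pgmcodeset to IBM-1047 and rejects any program code set other than IBM-1047 or 1047, the two supported values.

```shell
# text_conv_arg FILECODESET [PGMCODESET]
# Hypothetical helper: prints the argument string for the -W option.
text_conv_arg() {
    tca_file_cs=${1:-}
    tca_pgm_cs=${2:-IBM-1047}      # default program code set
    if [ -z "$tca_file_cs" ]; then
        echo "text_conv_arg: file code set required" >&2
        return 1
    fi
    case $tca_pgm_cs in
        IBM-1047|1047) ;;          # only supported pgmcodeset values
        *)  echo "text_conv_arg: unsupported pgmcodeset: $tca_pgm_cs" >&2
            return 1 ;;
    esac
    printf 'filecodeset=%s,pgmcodeset=%s\n' "$tca_file_cs" "$tca_pgm_cs"
}

# Intended use on z/OS V2R1 (shown as comments; not run here):
#   grep -W "$(text_conv_arg ISO8859-1)" pattern myAsciiFile
#   wc -l -W "$(text_conv_arg ISO8859-1)" < myAsciiFile
```

The second commented command reads the file through stdin, which is not a terminal there, so the -W setting would apply to stdin as well.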

Usage & Invocation
■ The -B option was added to the following commands:
    cat      egrep    sed
    comm     expand   unexpand
    cut      fgrep    uniq
    diff     grep     wc
    dircmp   more
    ed       paste
  ● Disables the automatic text conversion of tagged input files. This option is ignored if the filecodeset or pgmcodeset options (-W option) are specified.
  ● When standard input (stdin) is used as an input text file and stdin is not associated with a terminal, -B disables the automatic conversion of stdin.
  ● The head, strings, and tail commands were changed so that -B also disables automatic conversion of stdin.

Usage & Invocation
■ Support for the new _TEXT_CONV environment variable was added to the following commands:
    cat      cmp      comm     cut      diff     dircmp
    ed       egrep    ex       expand   fgrep    file
    grep     head     more     pack     paste    sed
    strings  tail     unexpand uniq     wc       vi
  ● Contains text conversion information for the command.
  ● Supported value keywords are FILECODESET, PGMCODESET, and DISABLE (disable automatic conversion of tagged files).
  ● Applies to all commands that support the filecodeset and pgmcodeset options (-W option) and the -B option.
  ● _TEXT_CONV is ignored when the filecodeset or pgmcodeset options (-W option) or the -B option are specified.
  ● The pack command supports only _TEXT_CONV=DISABLE.

Usage & Invocation
■ _TEXT_CONV environment variable (continued)
  User beware! Every command that supports either the -W option or the -B option performs the requested text conversion (from FILECODESET to PGMCODESET, or DISABLE) regardless of the file being used, because all automatic text conversion and file tagging are ignored.
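
Because an exported _TEXT_CONV affects every supporting command in the session, one way to limit the risk is to set it for a single command only. The following is a minimal sketch, not an IBM-provided facility: with_text_conv is a hypothetical helper name, and it relies only on the standard env utility to scope the variable to one invocation.

```shell
# with_text_conv FILECODESET PGMCODESET command [args...]
# Hypothetical helper: runs one command with _TEXT_CONV set for that
# command only, leaving the calling shell's environment unchanged.
with_text_conv() {
    wtc_file_cs=$1
    wtc_pgm_cs=$2
    shift 2
    env _TEXT_CONV="FILECODESET($wtc_file_cs),PGMCODESET($wtc_pgm_cs)" "$@"
}

# Intended use on z/OS V2R1 (shown as a comment; not run here):
#   with_text_conv ISO8859-1 IBM-1047 cat myAsciiFile
```

After the command returns, later commands in the same shell see no _TEXT_CONV setting unless one was exported separately.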

Usage & Invocation
■ The -W option, -B option, and _TEXT_CONV environment variable apply only to the primary input text file(s) processed by the command.
■ Text conversion for files that the command uses for reference purposes (file lists, configuration, control information, etc.) is not affected by the -W option, -B option, or _TEXT_CONV.
■ Any output produced by these commands, whether standard output (stdout) or output files, is not affected by the new support. The only exception is output files that are the same as, or associated with, the primary input files. For example, the editor commands (ex, vi, ed, sed and more) exploit this exception.

Usage & Invocation
■ Note the following precedence rules:
  – The -W filecodeset=codeset,pgmcodeset=codeset option overrides the -B option, the _TEXT_CONV environment variable, and the system’s automatic text conversion.
  – The -B option overrides the _TEXT_CONV environment variable and the system’s automatic text conversion.
  – The _TEXT_CONV environment variable overrides the system’s automatic text conversion. If the DISABLE value keyword is used along with either the FILECODESET or PGMCODESET value keyword, the DISABLE value keyword is ignored.
  – If none of the -W filecodeset=codeset,pgmcodeset=codeset option, the -B option, and the _TEXT_CONV environment variable is specified, the system’s automatic text conversion rules apply.
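
The precedence rules above can be modeled as a small decision function. This is an illustrative sketch only, not how the S&U commands are implemented: effective_text_conv is a hypothetical name, the yes/no flags stand in for whether -W or -B appeared on the command line, and treating an unrecognized _TEXT_CONV value as "fall back to automatic conversion" is an assumption.

```shell
# effective_text_conv W_GIVEN B_GIVEN TEXT_CONV_VALUE
# Hypothetical model: prints which mechanism decides text conversion.
effective_text_conv() {
    etc_w=$1        # "yes" if -W filecodeset/pgmcodeset was specified
    etc_b=$2        # "yes" if -B was specified
    etc_env=$3      # value of _TEXT_CONV, or empty if unset
    if [ "$etc_w" = yes ]; then
        echo "-W option"
    elif [ "$etc_b" = yes ]; then
        echo "-B option"
    elif [ -n "$etc_env" ]; then
        case $etc_env in
            # DISABLE is ignored when combined with either code set keyword
            *FILECODESET*|*PGMCODESET*) echo "_TEXT_CONV conversion" ;;
            *DISABLE*)                  echo "_TEXT_CONV disable" ;;
            # Assumption: unrecognized values leave automatic conversion in effect
            *)                          echo "automatic conversion" ;;
        esac
    else
        echo "automatic conversion"
    fi
}
```

For example, calling it with no -W, no -B, and a value of DISABLE,FILECODESET(ISO8859-1) reports that _TEXT_CONV conversion applies, because DISABLE is ignored in that combination.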

Usage & Invocation
■ Example #1: To display the type of an untagged text file containing ISO8859-1 characters, issue:
    file -W filecodeset=ISO8859-1,pgmcodeset=IBM-1047 myAsciiFile
■ Example #2: To display the <newline> count of a file containing EBCDIC characters when automatic conversion has been enabled and the file is incorrectly tagged as UTF-8, issue:
    wc -lB myMisTaggedFile
■ Example #3: To perform text conversion from the ASCII code set ISO8859-1 to the EBCDIC code set IBM-1047 for all supported commands, issue:
    export _TEXT_CONV="FILECODESET(ISO8859-1),PGMCODESET(IBM-1047)"

Usage & Invocation
■ Shell scripts were previously limited by the following rule: the code page in which a shell script is encoded must match the code page of the locale in which it is run. With this new support, shell scripts (tcsh scripts and SBCS sh scripts) can be tagged and run correctly when automatic conversion is enabled and the locale is SBCS. Tagged non-SBCS sh scripts (e.g., DBCS, MBCS) cannot be run.
■ Example #1: To run a sh script encoded in ASCII under the locale IBM-1047:
    export _BPXK_AUTOCVT=ALL
    chtag -tc ISO8859-1 ascii.sh
    ascii.sh

Usage & Invocation
■ Unicode conversion environment variables
  User beware! These environment variables affect the conversion result.
  – _BPXK_UNICODE_TECHNIQUE=x (x=R,E,C,L,M,0-9) can override the default conversion technique when Unicode Services is called. The default value is LMREC.
  – _BPXK_UNICODE_SUB=(YES|NO) indicates whether the Unicode Services substitute character action is to be applied during translation.
  – _BPXK_UNICODE_MAL=(YES|NO) indicates whether the Unicode Services malformed character action is to be applied during translation.

Migration & Coexistence Considerations
■ Several commands already supported the -B option. Three of them did not disable autoconversion of tagged files for standard input:
  – head -B < myTaggedFile
  – strings -B < myTaggedFile
  – tail -B < myTaggedFile
■ The head, strings, and tail commands were changed to support the -B option for standard input.

Presentation Summary
■ Several additional S&U commands now provide more detailed control of text conversion to assist S&U users when working with ASCII/Unicode files.
■ Shell scripts (tcsh scripts and SBCS sh scripts) can be tagged and run when automatic conversion is enabled and the locale is SBCS.

Appendix
■ See z/OS V2R1 “UNIX System Services Command Reference” (SA23-2280) for the S&U command updates.
■ See the appendix “Controlling text conversion for z/OS UNIX shell commands” in the z/OS V2R1 “UNIX System Services Command Reference” for details on controlling text conversion for S&U.
■ See z/OS V2R1 “Unicode Services User's Guide and Reference” (SA38-0680) for the details of Unicode data conversion.